Each year, some students continue working on their projects after completing CS229 and submit their work to conferences or journals.
Thus, for inspiration, you might also look at some recent machine learning research papers.
Supervised machine learning is widely used across fields, but major issues are arising around biased, inaccurate, and incomplete training data. A team of undergraduate research interns is reviewing a large corpus of papers and recording answers to questions such as: does the paper report how many human labelers were involved, what their qualifications were, whether they were given formal instructions or definitions, whether they independently checked each other’s work, how often they agreed or disagreed, and how they dealt with disagreements?
In this project, we investigate to what extent published machine learning application papers give specific details about the training data they used, focusing heavily on papers that involve humans labeling specific cases. Much of machine learning focuses on what to do once labeled training data is obtained, but this project tackles the equally important issue of whether such data is reliable in the first place.
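One of the questions above is how often labelers agreed or disagreed. As a rough illustration only, the sketch below computes Cohen's kappa, one common chance-corrected agreement statistic for two labelers. The project description does not say which agreement measure, if any, is used, so the choice of kappa, the function name `cohens_kappa`, and the example labels are all assumptions made here for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two labelers who labeled the same items.

    Assumption: this statistic is chosen only for illustration; it is not
    taken from the project description.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)

    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability both labelers pick the same class
    # if each labeled independently at their own observed class rates.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if p_e == 1.0:
        # Both labelers used a single identical label for every item.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels for six items (not real data).
labeler_1 = ["spam", "spam", "ham", "ham", "spam", "ham"]
labeler_2 = ["spam", "ham", "ham", "ham", "spam", "spam"]
kappa = cohens_kappa(labeler_1, labeler_2)
```

In this made-up example the two labelers agree on four of six items, while chance agreement is one half, so kappa is well below the raw agreement rate. That gap is why a paper reporting only raw agreement can overstate label reliability.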
We’ll announce when submissions are open for each part.
You should submit on Gradescope as a group: that is, for each part, please make one submission for your entire project group and tag your team members.
(Just be sure to ask us for help if you're uncertain how to best get started.) Alternatively, if you're already working on a research or industry project that machine learning might apply to, then you may already have a great project idea.
A very good CS229 project will be a publishable or nearly-publishable piece of work.
Most students do one of three kinds of projects: application, algorithmic, or theoretical. Some projects will also combine elements of applications, algorithms, and theory.
Many fantastic class projects come from students picking either an application area that they're interested in, or picking some subfield of machine learning that they want to explore more.