The Machine Learning Ph.D. Program

About the Ph.D. in Machine Learning

The Ph.D. Program in Machine Learning is for students interested in research in machine learning and computational statistics. The program is operated jointly by faculty in the School of Computer Science and Department of Statistics and Data Science.

The extraordinary spread of computers and online data is forever changing the way that important decisions are made in many organizations. Hospitals now analyze online medical records to decide which treatments to apply to future patients, banks analyze past financial records to learn to spot future fraud, and factories analyze past operations to learn to produce higher quality goods. Scientific research in many fields is also undergoing significant change as a result of dramatic increases in online data.

Understanding the most effective ways of using that data is a significant challenge to society — and therefore to science and technology — as it seeks to obtain a return on the huge investment being made in computerization and data collection. Advances in the development of automated techniques for data analysis and decision-making require interdisciplinary work in areas such as machine learning algorithms, the statistical and computational principles that underlie these algorithms, database and data warehousing methods, complexity analysis, data visualization, privacy and security issues, and application areas such as business, marketing and public policy.

Carnegie Mellon University's doctoral program in machine learning is designed to train students to become tomorrow's leaders in this rapidly growing area. The program is part of CMU's Machine Learning Department, which is made up of a multidisciplinary team of faculty and students across several academic departments. Machine learning is dedicated to furthering scientific understanding of automated learning, and to producing the next generation of tools for data analysis and decision-making based on that understanding.

Today's demand for expertise in machine learning far exceeds the supply, and this imbalance will become more severe over the coming decade. Through a combination of interdisciplinary coursework, hands-on applications and cutting-edge research, graduates of the Ph.D. Program in Machine Learning will be uniquely positioned to pioneer new developments in this field, and to be leaders in both industry and academia.

Ph.D. Program Requirements

General Overview

Students must complete the required courses; demonstrate proficiency in teaching, conference presentation and research skills; and successfully defend a Ph.D. thesis.

Courses

The curriculum for the Machine Learning Ph.D. is built on a foundation of five core courses and two electives. 

A typical full-time graduate course load during the first two years consists of two classes each term (at 12 units per class) plus 24 units of advanced research. Thus, during the first two years, a student has the opportunity to take several elective classes in addition to the five required courses.

The ML curriculum joins courses with a computer science main theme and those with a probability and statistics main theme. These may be grouped, as follows:

  • In CS, relevant subfields include: databases, machine learning, data mining, algorithms, and applications in areas such as robotics, information retrieval and AI.
  • In statistics (including philosophy), the subfields include: statistical modeling (e.g., hierarchical and time series), Bayes' nets, causation, and experimental design.

The curriculum is based on core academic courses on intermediate statistics, machine learning, statistical machine learning and discovery, multimedia databases, and algorithms.

These two required core courses provide a foundation in machine learning, statistics, probability and algorithms:

  • 10-715: Advanced Introduction to Machine Learning
  • 36-705: Intermediate Statistics

Plus three menu core courses, one from each of the following categories:

  • One Theory course: mathematical foundations and proofs
  • One Methods course: algorithms and implementation
  • One Practice course: applications and aspects of ML in practice

Categories for Menu Courses:

Theory (choose one)

  • 10-708 Probabilistic Graphical Models 
  • 10-716 Advanced ML: Theory and Methods 
  • 10-725 Optimization for Machine Learning
  • 10-734 Foundations of Autonomous Decision Making under Uncertainty 
  • 36-709 Advanced Statistical Theory I 
  • 36-710 Advanced Statistical Theory II 

Methods (choose one)

  • 10-723 Generative AI 
  • 10-703 Deep Reinforcement Learning & Control 
  • 10-707 Advanced Deep Learning 
  • 10-714 Deep Learning Systems 
  • 15-750 Algorithms in the Real World 
  • 15-850 Advanced Algorithms  
  • 15-780 Graduate Artificial Intelligence
  • 36-707 Regression Analysis 

Practice (choose one)

  • 10-718 ML in Practice  
  • 10-805 ML with Large Datasets  

Plus two electives:

  • An additional course from the menu core list above.
  • Any course at the 700 or higher level in SCS or statistics (36-xxx).
  • Other courses by approval.
  • MLD Ph.D. students must take both electives while enrolled in the program.
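
The course rules above (two required courses, one menu course from each of the three categories, and two electives) can be sketched as a small checker. This is only an illustrative sketch, not an official degree audit: the course numbers come from the lists above, but the names `REQUIRED`, `MENU`, `NUM_ELECTIVES` and `check_plan` are hypothetical. It also assumes that an elective cannot be a course already counted as a core course, and it does not check elective eligibility (700 level or higher, or approval), which still goes through the adviser and the Ph.D. Director.

```python
# Hypothetical sketch of the course requirements listed above.
# Course numbers are from this page; all names are illustrative only.
REQUIRED = {"10-715", "36-705"}
MENU = {
    "Theory": {"10-708", "10-716", "10-725", "10-734", "36-709", "36-710"},
    "Methods": {"10-723", "10-703", "10-707", "10-714",
                "15-750", "15-850", "15-780", "36-707"},
    "Practice": {"10-718", "10-805"},
}
NUM_ELECTIVES = 2


def check_plan(required_taken, menu_choices, electives):
    """Return a list of unmet requirements (an empty list means none found).

    required_taken: set of required course numbers completed
    menu_choices:   dict mapping category name -> chosen course number
    electives:      iterable of elective course numbers
    """
    problems = []

    # Both required core courses must be present.
    for course in sorted(REQUIRED - set(required_taken)):
        problems.append(f"missing required course {course}")

    # Exactly one valid choice is expected per menu category.
    for category, options in MENU.items():
        if menu_choices.get(category) not in options:
            problems.append(f"no valid {category} course chosen")

    # Assumption: electives must differ from courses counted as core.
    used_as_core = set(required_taken) | set(menu_choices.values())
    distinct = set(electives) - used_as_core
    if len(distinct) < NUM_ELECTIVES:
        problems.append(
            f"need {NUM_ELECTIVES} electives beyond core, found {len(distinct)}"
        )
    return problems


example = check_plan(
    required_taken={"10-715", "36-705"},
    menu_choices={"Theory": "10-725", "Methods": "10-707", "Practice": "10-718"},
    electives=["10-708", "10-714"],
)
print(example)  # prints []
```

In this example the two electives are additional courses from the menu core list, which is the first elective option listed above.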

Notes

  • Some students will have taken similar courses at other universities before entering the ML Ph.D. program. Based on such equivalent coursework, any student can apply to replace (not reduce) up to two courses with either menu cores or electives. All electives must be supported by the advisor and will be evaluated by the Ph.D. Director.
  • Ph.D. students who want to replace a course should send a formal request to the Ph.D. Program Manager. The request should include the transcripts and describe the contents of the equivalent courses. The student must also identify the replacement course. The course instructor and the Ph.D. Director of the program must approve the course replacement. (Courses can only be used for a single master's degree, though.)

View a sample schedule and roadmap for the Ph.D. program.

Your Third Year

By the third year, a Ph.D. student should have completed all coursework. Students seeking an academic position after completing the ML Ph.D. or those pursuing certain subfields may choose to take additional advanced electives in the allied disciplines of SCS, the Mellon College of Science, the Philosophy Department, the Tepper School of Business, or other schools and departments in consultation with their adviser. As in each of the first two years, any coursework is supplemented by research, for a minimum total of 36 units/semester.

Your Fourth Year and Beyond

A Ph.D. student typically presents a thesis proposal no later than the start of the fourth year, and then spends the fourth and sometimes fifth year working on their thesis research.

Conference Presentation Skills

To satisfy the Speaking Skills requirement, students must give a successful 20-minute talk in the Speaking Skills course (10-905). In the first week of class, the instructor gives a presentation on what constitutes a good talk. All second-year students are expected to take the course in the fall of their second year, along with any students who have yet to complete this requirement.

Teaching

Ph.D. students are required to serve as teaching assistants for two semesters in machine learning courses (10-xxx), beginning in their second year. This fulfills their teaching skills requirement.

Research

It is expected that all Ph.D. students engage in active research from their first semester. In fact, adviser selection occurs within one month of entering the Ph.D. program, with the option to change at a later time. Roughly half of a student's time should be allocated to research and lab work, and half to courses until these are completed.

Other Requirements

Students must follow all university policies and procedures.

Financial Support

Machine Learning is committed to providing full tuition and stipend support for the academic year, for each full-time ML Ph.D. student, for a period of five years. Research opportunities are constrained by funding availability. ML's funding commitments assume that the student is making satisfactory progress in the program, as reported to the student at the end of each academic term. Students are strongly encouraged to compete for outside fellowships and other sources of financial support. MLD will supplement these outside awards in order to fulfill its obligations for tuition and stipend support.

Application Information

Apply using our online application, which opens in September.

  • GRE General test scores are optional.
  • You must speak English well. If you are not a native speaker, we recommend a combined TOEFL score of 100, with no subscore below 25, although we will make exceptions to this cutoff in special cases.
  • Unofficially, we recommend a high level of comfort with math (particularly linear algebra, probability and proofs) and computer programming (at the level of an undergraduate degree in computer science, although many of our applicants get the necessary experience without majoring in CS). It is possible to fill in some of this background on the fly, but you will be working hard to do so!
  • The program is very competitive, so successful applications always stand out in some way from their peers — for example in terms of grades, research experience or recommendation letters.