| Wk | Date | Topic | Reading | Assignments | Comments |
|---|---|---|---|---|---|
| 1 | W 8/30 | Introduction, Framing a Learning Problem | Flach: Prologue & Ch. 1 | Sign up for Piazza | Optional reading (but strongly suggested): The Discipline of Machine Learning |
| 2 | M 9/4 | Labor Day (no class) | | | |
| | W 9/6 | Supervised Learning & Decision Trees | Handout: Decision Trees | Assignment 1 out [HW1 Skeleton] [HW1 Submission Instructions] | Background material: mathematical foundations |
| 3 | M 9/11 | Decision Trees & Overfitting, k-Nearest Neighbor (Dr. Eaton out of town; guest lecturer: Dr. James Stokes) | Slides from Python tutorial | | |
| | W 9/13 | Evaluation | Flach 2.1-3.2 | | |
| 4 | M 9/18 | (No class -- please watch videos instead) Evaluation (continued) [video]; Linear Regression and Gradient Descent [video] | LfD 1.1, 3.1-3.2.1; Optional linear algebra review: Barber 29.1.1-29.1.9 | | Alternative reading if you don't have LfD: Flach 7.1-7.2 |
| | W 9/20 | Regularization (see above slides) [video] | LfD 3.3 | | Alternative reading if you don't have LfD: Flach 7.4 |
| 5 | M 9/25 | Linear Classification using the Perceptron [video]; Logistic Regression [video] | LfD 4.1-4.2 | Assignment 1 due | |
| | W 9/27 | Why Machine Learning Works: VC Dimension & Generalization Bounds [video] | Learning Theory notes, LfD 3.2.2 | | For extra help on learning theory, read LfD Ch. 2 |
| 6 | M 10/2 | Support Vector Machines & Kernels [video - SVMs] [video - Kernels] | Flach 7.3; LfD 3.4 | | Alternative reading if you don't have LfD: Flach 7.5 |
| | W 10/4 | | Flach 7.5; Bennett article | | |
| 7 | M 10/9 | Ensemble Methods | Flach 11.2, LfD 4.3 | | |
| | W 10/11 | Review for Midterm | | Assignment 2 due 10/13 [HW2 Skeleton] [HW2 Submission Instructions] | Project Proposal due 10/13 |
| 8 | M 10/16 | Midterm Exam | | Old exams: Fall 2015 Midterm Exam, Fall 2016 Midterm Exam | |
| | W 10/18 | Probability Review | Generative Model notes (Section 2 only) | | |
| 9 | M 10/23 | Naive Bayes | Skim Flach 9.1; Generative Model notes | | For extra help with naive Bayes, read Flach 9.2 |
| | W 10/25 | Text Classification & Evaluation | | | |
| 10 | M 10/30 | Text Classification & Evaluation (continued) | | | |
| | W 11/1 | | | Assignment 3 due 11/3 @ 11:59pm [HW3 Skeleton] [HW3 Submission Instructions] | |
| 11 | M 11/6 | Neural Networks | | | |
| | W 11/8 | Deep Learning | Handout: Deep Learning | | For more detail on deep learning, see: Bengio article (optional reading) |
| 12 | M 11/13 | Deep Learning (continued) | | | |
| | W 11/15 | Unsupervised Learning: K-Means & GMMs | Flach 8.4-8.6, 10.3 | | |
| 13 | M 11/20 | Reinforcement Learning | Sutton & Barto Ch. 3, Ch. 4 | Assignment 4 due [HW4 Skeleton] [HW4 Submission Instructions] | |
| | W 11/22 | No Class (Friday class schedule) | | Project Status Report due [LaTeX template] | |
| 14 | M 11/27 | Reinforcement Learning (continued) | Sutton & Barto Ch. 6; RL notes | | |
| | W 11/29 | Principal Components Analysis, Image Features | | | |
| 15 | M 12/4 | Learning on Networks, Machine Learning for Big Data | | Assignment 5 due [HW5 Skeleton] [HW5 Submission Instructions] | |
| | W 12/6 | | | | |
| 16 | M 12/11 | Review for Final Exam | | Final Project Report and Summary Slides Due [LaTeX template] | |
| | Wed Dec. 20 12-2pm | Final Exam | | Old exams: Fall 2015 Final Exam, Fall 2016 Final Exam | |
INSTRUCTOR
Contact Information: All communication about the course should be posted to Piazza.
Teaching Assistant
Yogitha Chilukuri (Master's Student, CIS)
Teaching Assistant
Harshal Godhia (Master's Student, CIS)
Teaching Assistant
Reno Kriz (PhD Student, CIS)
Office Hours: Mondays 1:30-3:30pm, Levine 5th floor bump space
Teaching Assistant
Jorge Mendez (PhD Student, CIS)
Office Hours:
Monday/Wednesday 5:50-6:50pm, Levine 4th floor bump space
Research Interests: Transfer learning, robotics
Teaching Assistant
Francisco Selame (Senior Undergraduate, CIS)
Machine learning has been essential to
the success of many recent technologies, including autonomous
vehicles, search engines, genomics, automated medical diagnosis,
image recognition, and social network analysis, among many others.
This course will introduce the fundamental concepts and algorithms
that enable computers to learn from experience, with an emphasis
on their practical application to real problems. This course
will introduce supervised learning (decision trees, logistic
regression, support vector machines, Bayesian methods, neural
networks and deep learning), unsupervised learning (clustering,
dimensionality reduction), and reinforcement
learning. Additionally, the course will discuss evaluation
methodology and recent applications of machine learning, including
large-scale learning for big data and network analysis.
Prerequisites:
CIS 121
Course Website: http://www.seas.upenn.edu/~cis519/
Time: Monday/Wednesday,
noon to 1:30 pm
Location: Wu and Chen Auditorium (Levine 101)
Due to overwhelming demand, Penn is
offering two different machine learning courses: CIS 419/519
(Introduction to Machine Learning) and CIS 520 (Machine
Learning). This section briefly describes the differences
between these courses.
CIS 419/519 Introduction to Machine
Learning (this course!) is an introductory-level course in machine
learning (ML) with an emphasis on applying ML techniques. The
course is cross-listed between undergraduate (419) and graduate
(519) versions; the graduate course 519 has somewhat different
requirements as described below. CIS 419/519 is intended for
students who are interested in the practical application of
existing machine learning methods to real problems, rather
than in the statistical foundations and theory of ML covered in
CIS 520. Just because it is listed as "introductory" does
not necessarily mean that it is "easier".
CIS 520
Machine Learning is a more mathematically rigorous
course in statistical machine learning that provides
the background necessary to design and use new ML algorithms.
Consequently, CIS 520 requires students to have basic knowledge of
linear algebra (matrices, eigenvectors, etc.). It uses MATLAB and
is said to require a lot of work, but prepares students to conduct
ML research.
CIS 519 is NOT a prerequisite for CIS 520. However, it makes
little sense to take CIS 519 after having already taken CIS
520. You certainly may take CIS 419/519 first and then later
take CIS 520.
Essentially, you should take CIS
419/519 if:
And, you should take CIS 520 if you're confident in your
mathematical background and:
Students registered for the graduate version of this course (CIS
519) will be required to complete additional work throughout the
semester. This work will include additional components to
the homework, additional requirements on the course project, and
(possibly) different or additional questions on the exams.
Since the two versions have different requirements, you cannot
complete the course as CIS 419 and later petition to have it
changed to CIS 519 for graduate credit; if you're considering
changing this course to CIS 519 for graduate credit, you should
register for the graduate version now.
CIS 419/519 Course Reading Packet: This is a collection of readings that will be used throughout the course. There is NOT a single reading packet you need to obtain -- readings will be distributed incrementally throughout the semester, either in hard copy or posted online.
Learning From Data by Y. S. Abu-Mostafa, M. Magdon-Ismail, and H.T. Lin.
Machine Learning: The Art and Science of Algorithms That Make Sense of Data by Peter Flach.
OPTIONAL: Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurélien Géron. 1st Edition, O'Reilly Media, 2017.
For a more advanced treatment of machine learning topics, I would
recommend one of the following books:
Attendance and active participation are
expected in every class. Participation includes asking questions,
contributing answers, proposing ideas, and providing constructive
comments.
As you will discover, I am a proponent of two-way communication
and I welcome feedback during the semester about the course. I am
available to answer student questions, listen to concerns, and
talk about any course-related topic (or otherwise!). Come to
office hours! This helps me get to know you. You are welcome to
stop by and chat. There are many more exciting topics to talk
about that we won't have time to cover in class.
Please send all course communications through Piazza. Your
post should be public for general questions, private to all
instructors (which includes the TAs) for any student-specific
issues (e.g., grading, etc.), and private to Dr. Eaton for
extremely personal matters.
Although computer science work can be intense and solitary, please
stay in touch with me and the TAs, particularly if you feel stuck
on a topic or project and can't figure out how to proceed. Often a
quick face-to-face conference or Piazza post can reveal solutions
to problems and generate renewed creative and scholarly energy. It
is essential that you begin assignments and projects early, since
we will be covering a variety of challenging topics in this
course.
Your grade will be based upon five
homework assignments, two exams, and a course project.
Assignments must be submitted according to the assignment
submission instructions.
At the end of the semester, final grades will be calculated as a
weighted average of all grades according to the following weights:
| Component | Weight |
|---|---|
| Assignments | 40% (8% each) |
| Midterm Exam | 15% |
| Final Exam | 20% |
| Project | 25% |
| Total | 100% |
The project grade will be broken down further in the Project Description.
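As a rough illustration of the weighted average described above, here is a minimal Python sketch. The weights are copied from the syllabus; the function name `final_percentage`, the dictionary keys, and the example scores are hypothetical, and the sketch assumes the assignments score is the mean of the five homework percentages (each homework counts equally at 8%).

```python
# Weights copied from the syllabus; everything else here is illustrative.
WEIGHTS = {
    "assignments": 0.40,  # five homeworks at 8% each
    "midterm": 0.15,
    "final": 0.20,
    "project": 0.25,
}


def final_percentage(scores):
    """Return the weighted average of component percentages (each 0-100)."""
    # Sanity check: the weights should account for the whole grade.
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: the five homework percentages are averaged first.
homeworks = [92.0, 88.0, 95.0, 85.0, 90.0]
example_scores = {
    "assignments": sum(homeworks) / len(homeworks),
    "midterm": 80.0,
    "final": 85.0,
    "project": 95.0,
}
overall = final_percentage(example_scores)
print(f"Overall percentage: {overall:.2f}")
```

For these made-up scores the contributions are 36 + 12 + 17 + 23.75, so the overall percentage comes out to about 88.75.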
Incomplete grades will be given only for verifiable medical
illness or other such dire circumstances.
All graded work will receive a percentage grade between 0% and
100%. Here is how the percentage grades will map to final
letter grades; percentages are not rounded:
| Percentage | Letter grade | Percentage | Letter grade |
|---|---|---|---|
| >= 97% | A+ (4.0) | >= 77% | C+ (2.3) |
| >= 93% | A (4.0) | >= 73% | C (2.0) |
| >= 90% | A- (3.7) | >= 70% | C- (1.7) |
| >= 87% | B+ (3.3) | >= 67% | D+ (1.3) |
| >= 83% | B (3.0) | >= 60% | D (1.0) |
| >= 80% | B- (2.7) | < 60% | F (0.0) |
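The letter-grade mapping above can be sketched as a simple threshold lookup. This is only an illustration: the cutoffs and grade points are copied from the table, while the names `CUTOFFS` and `letter_grade` are hypothetical. Because percentages are not rounded, a score just below a cutoff falls into the lower grade.

```python
# (minimum percentage, letter, grade points), copied from the syllabus table
# and listed from highest cutoff to lowest.
CUTOFFS = [
    (97, "A+", 4.0), (93, "A", 4.0), (90, "A-", 3.7),
    (87, "B+", 3.3), (83, "B", 3.0), (80, "B-", 2.7),
    (77, "C+", 2.3), (73, "C", 2.0), (70, "C-", 1.7),
    (67, "D+", 1.3), (60, "D", 1.0),
]


def letter_grade(percentage):
    """Map an unrounded percentage to (letter, grade points)."""
    for minimum, letter, points in CUTOFFS:
        # No rounding: 89.99 does not reach the 90% cutoff.
        if percentage >= minimum:
            return letter, points
    return "F", 0.0


print(letter_grade(89.99))  # falls in the B+ band
print(letter_grade(90.0))   # exactly at the A- cutoff
```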
There will be two exams in this course. The exams will be
closed-book and closed-notes. They will cover material from
lectures, homeworks, and assigned readings (including topics not
discussed in class). So, keep up with those readings!
I want to encourage you to discuss the material and work together
to understand it. Here are my thoughts on collaborating with other
students:
The readings and lecture topics are group work. Please discuss the readings and associated topics with each other. Work together to understand the material. I highly recommend forming a reading group to discuss the material -- we will explore many challenging ideas and it helps to have multiple people working together to understand them.
It is fine to discuss the topics covered in the homeworks, to discuss approaches to problems, and to sketch out general solutions. However, you MUST write up the homework answers, solutions, and programs individually. You are not permitted to share specific solutions, mathematical results, program code, knowledge representations, experimental results, etc. If you made any notes or worked out something on a white board with another person while you were discussing the homework, you shouldn't use those notes while writing up your answer, however tempted you may be to do so.
You are fully permitted to (and should!) discuss projects
with members of your team. I also encourage you to work
outside of your team to understand the other topics in the
course.
Exams and papers, of course, must be your own individual work.
If you have any questions as to what types of collaborations are
allowed and which are dishonest, please ask me before you
make a mistake.
I have no problem with you using computers or tablets to take
notes or consult reference materials during class. Tempting
though it may be, please do not check e-mail or visit websites
that are not relevant to the course during class. It is a
distraction, both for you and (more importantly) for your fellow
classmates. Please silence your phones and computers when
you enter class.