Pattern Recognition
- Course: ENCE 3630 and ENCE 4630 2017
- Instructor: Pooran Singh Negi, [email protected]
- Office: 248
- Office Hours: After the class T, Th, 3.50- 4.50. Email for 1-on-1 help.
- Required:
- Machine Learning: a Probabilistic Perspective by Kevin Patrick Murphy. Online version is available in the DU library.
- An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani. It is available online in pdf format
- Optional: For more advance treatment The Elements of Statistical Learning. It is available online in pdf format
Pattern recognition is an exciting area. We will go though theory behind pattern recognition using tool from probability and linear algebra. We will use python, its scientific libraries (numpy, scipy, matplotlib etc.) and scikit-learn: Machine Learning in Python during the course. For deep neural network part, we will use highly popular tensorflow Machine Intelligence library from the Google. For assignments, starter code or hint will be given. One can use other languages and tools as per comfort level. At the end of the course one should be able to understand theory behind various machine learning algorithms with a unifying perspective from probability theory, be comfortable using open source tools for building machine learning systems.
Please install Anaconda for Python 2.7 data science platform. Please install it before coming in the class on Tuesday. See the youtube link Installing Anaconda, Jupyter Notebook. You can also go to my python for reproducible research github repository and start by running pythonBasic.ipynb notebook. I will go over basic of python and jupyter notebook. For Deep Neural networks, we will go over installation instruction later in the course.
- try python notebook online without installing anything
- Runs and visualizes your python code
- The Python Tutorial
Student should be comfortable with linear algebra, probability, statistics, optimization and some programming experience. We will cover basics in the class.
Here are the main topics for the class. More topics can be added as per class interest and available time.
- Basic idea of machine learning, and probability
- Generative models, parametric estimation and supervised learning.
- Naive Bayes classifier etc.
- Gaussian models
- Linear and logistic regression
- Support vector machine, Kernels
- Decision tree.
- Bagging, boosting and model selection etc.
- Unsupervised learning
- Clustering etc.
- Deep learning
- Artificial Neural Networks(ANN), End to end learning, cost function
- Convolutional Neural Networks(CNN) for classification(image) and regression
- Recurrent Neural Networks for natural language processing(NLP) and time series data
You can use any dataset you are interested in. Here is some listing of open datasets.
There will be one mid term, a final exam, homework assignments, in class quizzes. Worst 2 of your homework assignments and 2 quizzes will be dropped. Students enrolled in ENCE 4630 will be asked to do slightly more homework questions.
Homework + Quizzes | 30(20+10) % |
---|---|
Midterm exam, 17th October | 25% |
Final exam, 21 Nov | 25% |
Final Project, Due 23 Nov | 20% |
Quiz | solution |
---|---|
1 | sol |
Homework numbers are as per Kevin Murphy ebook from the library
HW | Due Date | sol | |
---|---|---|---|
This homework is meant to dust off your probability hat. | |||
1 | Chapter 2, 2.1(use bayes rule, condition on event actually observed. like in part a say N_b = number of boys, N_b no of girls), 2.3, 2.6, 2.12 | sep 21 | HW1 sol |
2.16(look for beta distribution definition and use gamma function definition) | |||
2 | 3.6, 3.7, 3.11(only if enrolled in ENCE 4630), 3.20, 4.1(look into section 2.5.1 for definition of correlation coefficient), 4.14 | sep 28 | HW2 sol |
3 | 4.21, 4.22, 5.2, 5.3 | oct 5th | |
hint 5.3(b) if r/s ->0 then there in no cost of rejecting. so what should be done? | |||
For other part analyze the inequality. (it become trivial even for most probable class). What should we do then? | |||
4 | jupyter notebook for spam filter. answer the questions in the notebook. | oct 10 | |
5 | From the book using equation 7.30, 7.31 derive equation 7.32(ridge regression formulation), 7.2, 7.4, 8.3, | oct 12 | |
7.9(only if enrolled in ENCE 4630. will talk about it in the class for hint), | |||
Date | Announcement |
---|---|
29 sep | There will be quiz on 3rd oct (Quiz1) and 5th oct(Quiz2) |
They will include all the material covered till previous class. | |
Date | Reading assignment | uploaded slides |
---|---|---|
12 sep | Read chapter 1 of Kevin Murphy | overview of pattern recognition, linear algebra overview |
14 sep | section 2.2, 2.3, 2.4[.1, .2, .3, .4, .5, .6], 2.5[.1, .2, .4], 2.6.1, 2.8 of kevin Murphy | overview of probability and information theory from the Murphy book + Review of Probability Theory |
Here is the link of jupyter notebook created in the class jupyter notebook introduction | ||
19 Sep | chapter 3 of Kevin Murphy | slides |
[ optional Bayesian parameter estimation] | ||
21 Sep | k. M. book 4.1 upto 4.2.5 | We covered navie bayes and looked into mutlivariate gaussian(MVG). Modelling class conditional densities |
[optional] ISLR book from 4.4.1 upto 4.4.4. | as MVN lead to quadratic discriminant analysis(QDA) and linear discriminant analysis (LDA) when covariance matrices are tied i.e. |
|
Here is the link to mechanics of Lagrangian multiplier. For more detail see this link of metacademy. Go over free section | ||
26 Sep | K. M. book 5.7 upto 5.7.2.2 | class scribed note lecture 4 and 5 slides. Covered Bayesian decision theory, confusion matrix and ROC(AUC) curve |
7.1- 7.3.3, 7.5.1 | ||
28 Sep | class scribed note . We covered linear model and informally discussed how linear model can model non-linear model. Derived the MLE(least square solution) of parameter W | |
Oct 3 | class scribed note. | |
Oct 5 | 8.1, 8.2, 8.3.1-8.3.3, 8.3.6, 8.3.7 | class scribed note. Covered first discriminative classifier (logistic regression), iterative optimization algorithms |
issue related to choosing step size and using penalizing terms W^2 or W on parameter in the objective(cost, loss) function. | ||