Lecture 1: Introduction, Optimization Problems
Source: https://www.youtube.com/watch?v=C1lhuz6pZC0
- Computer Models
- Optimization models
- Knapsack problem
- Brute force for optimization problems
- Greedy algorithm for optimization problems
Lecture 2: Optimization problems
Source: https://www.youtube.com/watch?v=uK5yvoXnkSk
Lecture 3: Graph-theoretic models
Source: https://www.youtube.com/watch?v=V_TulH374hw
Lecture 4: Stochastic Thinking
Source: https://www.youtube.com/watch?v=-1BnXEwHUok
- Uncertainty
- Stochastic processes
- Probability
- Random numbers
- Sample probability
- The Birthday Problem
- Simulation Models
Lecture 5: Random Walks
Source: https://www.youtube.com/watch?v=6wUD_gp5WeE
Lecture 6: Monte Carlo Simulation
Source: https://www.youtube.com/watch?v=OgO1gpXSUzU
- Monte Carlo Simulations
- Inferential Statistics
- Confidence intervals
- Law of Large Numbers
- Gambler’s Fallacy
- Regression to the Mean
- Variance
- Empirical Rule
- Probability Distributions
- Probability Density Function
- Normal Distributions
Lecture 7: Confidence Intervals
Source: https://www.youtube.com/watch?v=rUxP7TM8-wo
Lecture 8: Sampling and standard error
Source: https://www.youtube.com/watch?v=soZv_KKax3E
- Inferential Statistics
- Monte Carlo Simulations
- Sampling
- Confidence intervals
- Standard Error of the Mean
- Skew
Lecture 9: Understanding Experimental Data
Source: https://www.youtube.com/watch?v=vIFKGFl1Cn8
- Data
- Modelling a spring
- Objective Functions
- Least Squares Objective Function
- Linear regression
- Coefficient of Determination
Lecture 10: Understanding Experimental Data (Cont.)
Source: https://www.youtube.com/watch?v=fQvg-hh9dUw
Lecture 11: Introduction to Machine Learning
Source: https://www.youtube.com/watch?v=h0e2HAPTGF4
You could say that all computer programs learn a little; the degree varies with the kind of algorithm. Here we are interested in programs that learn from experience: they see examples and generalize from them, instead of us having to program that generalization ourselves.
In “regular” programming we write a program that processes the data we provide to generate output. In machine learning, we provide the data and the desired output, and the computer generates the program.
Memorization is declarative knowledge: the accumulation of individual facts. It is limited by the time needed to observe them and the memory required to store them.
Generalization, instead, is imperative knowledge: deducing new facts from old facts. It is limited only by the accuracy of the deduction process, and it assumes that the past predicts the future.
Observations: training data.
Supervised learning: for each example we have a label, and we’ll find a way to predict that label associated with the input.
Unsupervised learning: we have a set of feature vectors without labels, and we’ll try to group them into “natural clusters” (or find labels for those groups). In some cases we’ll know how many clusters there should be; in others we’ll have to find the best number of them.
Clustering examples into groups:
- Pick examples (at random?) as exemplars
- Cluster remaining samples by minimizing distance between samples in same cluster (objective function) — put sample in group with closest exemplar
- Find median example in each cluster as new exemplar
- Repeat until there is no change
This works with unlabeled data. If the data were labeled, we’d instead want to find a subsurface (e.g. a line for 2D data) that naturally separates the labeled groups.
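The clustering steps listed above can be sketched in Python as follows. This is a minimal sketch under some assumptions that are not in the lecture notes: the function names `euclidean` and `cluster` are hypothetical, distance is Euclidean, and the "median example" is interpreted as the cluster member with the smallest total distance to the other members.

```python
def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def cluster(points, initial_exemplars, max_iters=100):
    """Group points around exemplars until the grouping stops changing.

    initial_exemplars is the starting pick; random.sample(points, k)
    is one way to choose it at random.
    """
    exemplars = list(initial_exemplars)
    k = len(exemplars)
    assignment = None
    for _ in range(max_iters):
        # Put each point in the group of its closest exemplar.
        new_assignment = [
            min(range(k), key=lambda i: euclidean(p, exemplars[i]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no change, so we are done
        assignment = new_assignment
        # Assumption: the "median" is the member closest to all others.
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:
                exemplars[i] = min(
                    members,
                    key=lambda m: sum(euclidean(m, q) for q in members),
                )
    clusters = [
        [p for p, a in zip(points, assignment) if a == i] for i in range(k)
    ]
    return exemplars, clusters


# Made-up 2D data with two visibly separate groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
exemplars, clusters = cluster(points, [(0, 0), (10, 10)])
```

The result depends on the initial exemplars: a poor starting pick can settle on a worse grouping, so it is common to run the procedure several times with different starting picks and keep the best result.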
Features are the pieces of information we can gather from our examples. They never fully describe the situation. Extra features might actually hurt the model, as there is a danger of finding spurious correlations. They can also lead to overfitting, depending on how our feature engineering combines them to separate instances.
Feature engineering is the process of representing examples by feature vectors that will facilitate generalization.
During the construction of the model we might need to make design choices about which kinds of error the model will make, like prioritizing minimizing false positives.
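Minkowski Metric:
$$\text{dist}(X_1, X_2, p) = \left( \sum_{k=1}^{n} |X_{1,k} - X_{2,k}|^p \right)^{1/p}$$
where $n$ is the number of features. When $p = 1$, we get the Manhattan distance. When $p = 2$, we get the Euclidean distance.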
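As a small illustration of the Minkowski metric, the sketch below computes the distance between two feature vectors for a given p. The function name `minkowski_dist` and the sample vectors are assumptions made for this example.

```python
def minkowski_dist(v1, v2, p):
    """Minkowski distance between two equal-length numeric vectors."""
    if len(v1) != len(v2):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(v1, v2)) ** (1 / p)


# Made-up vectors: the legs of a 3-4-5 right triangle.
manhattan = minkowski_dist((0, 0), (3, 4), 1)  # 3 + 4 = 7
euclidean = minkowski_dist((0, 0), (3, 4), 2)  # sqrt(9 + 16) = 5
```

With p = 1 the differences along each axis are simply added up, while p = 2 gives the straight-line distance, so the same pair of points can be "closer" or "farther" relative to other pairs depending on the choice of p.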
Accuracy: the fraction of all instances that the model labeled correctly.
PPV (positive predictive value): of the instances the model labeled positive, the fraction that are truly positive.
Sensitivity: the percentage of actual positives that the model correctly found.
Specificity: the percentage of actual negatives that the model correctly rejected.
There is a trade-off between sensitivity and specificity: improving one generally costs some of the other.
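These four measures can all be computed from the counts of true positives (tp), false positives (fp), true negatives (tn) and false negatives (fn). The sketch below assumes those counts are already available; the function names and the example counts are made up for illustration.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all instances labeled correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def ppv(tp, fp):
    """Of everything labeled positive, the fraction that is truly positive."""
    return tp / (tp + fp)


def sensitivity(tp, fn):
    """Fraction of actual positives that were found."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of actual negatives that were rejected."""
    return tn / (tn + fp)


# Made-up counts for 100 instances, 13 of which are actually positive.
tp, fp, tn, fn = 8, 2, 85, 5
acc = accuracy(tp, fp, tn, fn)   # 93 / 100
pos_pred = ppv(tp, fp)           # 8 / 10
sens = sensitivity(tp, fn)       # 8 / 13
spec = specificity(tn, fp)       # 85 / 87
```

In this made-up example the accuracy is high mostly because negatives are common, while the sensitivity shows that 5 of the 13 actual positives were missed. This is why the choice of which error to minimize matters.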
Lecture 12: Clustering
Source: https://www.youtube.com/watch?v=esmzYhuFnds
(Pending)
Lecture 13: Classification
Source: https://www.youtube.com/watch?v=eg8DJYwdMyg
(Pending)
Lecture 14: Classification and statistical sins
Source: https://www.youtube.com/watch?v=K2SC-WPdT6k
(Pending)
Lecture 15: Statistical Sins and Wrap Up
Source: https://www.youtube.com/watch?v=iOZVbILaIZc
(Pending)