[CS229] 01 and 02: Introduction, Regression Analysis and Gradient Descent

2018-10-21


definition: a computer program is said to learn from experience E with respect to some task T and some performance measure P if its performance on T, as measured by P, improves with experience E. (Tom Mitchell, 1998)
supervised learning:
- the “right answers” (labels) are given for every training example
- regression: predict continuous valued output (e.g., house price)
- classification: predict discrete valued output (e.g., cancer type)
unsupervised learning:
- unlabelled data; clustering methods are used to find structure in it
- examples: Google News, gene expression analysis, organising computer clusters, social network analysis, astronomical data analysis
- cocktail party problem: several voices overlap in one recording; how can the individual sources be separated?
linear regression with one variable (univariate):
- m : number of training examples
- X’s : input variable / features
- Y’s : output variable / target variable
- cost function (squared error function), with hypothesis \(h_\theta(x) = \theta^\top x\): \(J(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 = \frac{1}{2} \sum_{i=1}^{m} \left( \theta^\top x^{(i)} - y^{(i)} \right)^2\)
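A minimal sketch of the squared error cost in plain Python may make the formula more concrete. It assumes the univariate hypothesis is written as \(h_\theta(x) = \theta_0 + \theta_1 x\) (the usual convention of an intercept term with \(x_0 = 1\)). The function names and the toy data are hypothetical and chosen only for illustration.

```python
def hypothesis(theta0, theta1, x):
    """Univariate linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def squared_error_cost(theta0, theta1, xs, ys):
    """J(theta) = 1/2 * sum over all m examples of (h_theta(x_i) - y_i)^2."""
    return 0.5 * sum(
        (hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)
    )


# Hypothetical toy data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Parameters that fit the data exactly give zero cost.
cost_perfect = squared_error_cost(1.0, 2.0, xs, ys)

# All-zero parameters predict 0 everywhere, so the cost is 1/2 * sum(y_i^2).
cost_zero = squared_error_cost(0.0, 0.0, xs, ys)
```

The cost is zero only when every prediction matches its target, and it grows quadratically with each residual.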
parameter estimation: gradient descent algorithm
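As a rough illustration, batch gradient descent for the univariate case could be sketched as follows. Each step applies \(\theta_j := \theta_j - \alpha \, \partial J / \partial \theta_j\) to the squared error cost \(J(\theta) = \frac{1}{2} \sum_i (h_\theta(x^{(i)}) - y^{(i)})^2\) with \(h_\theta(x) = \theta_0 + \theta_1 x\). The learning rate `alpha`, the iteration count `n_iters`, the zero initialisation, and the toy data are all assumptions made for this example, not values from the notes.

```python
def gradient_descent(xs, ys, alpha=0.05, n_iters=2000):
    """Batch gradient descent for h_theta(x) = theta0 + theta1 * x.

    Minimises J(theta) = 1/2 * sum_i (h_theta(x_i) - y_i)^2.
    alpha and n_iters are illustrative defaults; too large an alpha diverges.
    """
    theta0, theta1 = 0.0, 0.0  # assumed starting point
    for _ in range(n_iters):
        # Residuals h_theta(x_i) - y_i under the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of J with respect to theta0 and theta1.
        grad0 = sum(errors)
        grad1 = sum(e * x for e, x in zip(errors, xs))
        # Update both parameters simultaneously.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Hypothetical toy data lying exactly on the line y = 1 + 2x.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [1.0, 3.0, 5.0, 7.0]

fitted_theta0, fitted_theta1 = gradient_descent(train_x, train_y)
```

On this toy data the fitted parameters approach the intercept 1 and slope 2 that generated it. Because this cost sums over examples rather than averaging, a workable `alpha` shrinks as the number of examples grows.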

If you link to this blog, please cite this page. Thanks!
Post link: https://tsinghua-gongjing.github.io/posts/CS229-01-02.html
