Machine Learning Notes 02

 

Linear Regression with one variable

  • Hypothesis: \(h_{\theta}(x) = {\theta}_{0} + {\theta}_{1} x\)

  • Cost function: \(J(\theta_{0},\theta_{1})=\frac{1}{2m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})^2\), where \(m\) is the number of training examples (a code sketch follows this list)

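As a concrete reference, here is a minimal NumPy sketch of the hypothesis and cost function above; the training data and parameter values are made up purely for illustration.

    import numpy as np

    def hypothesis(theta0, theta1, x):
        # h_theta(x) = theta0 + theta1 * x
        return theta0 + theta1 * x

    def cost(theta0, theta1, x, y):
        # J(theta0, theta1) = (1 / (2m)) * sum_i (h_theta(x^(i)) - y^(i))^2
        m = len(y)
        errors = hypothesis(theta0, theta1, x) - y
        return np.sum(errors ** 2) / (2 * m)

    # Made-up training set generated by y = 1 + 2x
    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = np.array([3.0, 5.0, 7.0, 9.0])
    print(cost(1.0, 2.0, x, y))   # 0.0 for the perfect fit
    print(cost(0.0, 0.0, x, y))   # 20.5 for the all-zero parameters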

Gradient descent

Goal: minimize \(J(\theta_{0},\theta_{1})\), or in the general case \(J(\theta_{0},\theta_{1},\cdots,\theta_{n})\)

  • Gradient descent algorithm:

   Repeat until convergence {
      \(\theta_{j}:=\theta_{j}-\alpha\frac{\partial }{\partial \theta_{j}}J(\theta_{0},\theta_{1})\)   (for \(j=0\) and \(j=1\))
   }
    ( \(:=\) - assignment     \(\alpha\) - learning rate )

Warning: \(\theta_{0}\) and \(\theta_{1}\) must be updated simultaneously, i.e. compute both new values from the current parameters before overwriting either one!
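The pattern below sketches one update step; `numerical_gradient` is a hypothetical helper (a central-difference approximation, not part of these notes) standing in for \(\frac{\partial }{\partial \theta_{j}}J\), so the point is simply that both partial derivatives are computed from the current \((\theta_{0},\theta_{1})\) before either parameter is overwritten.

    def numerical_gradient(J, theta0, theta1, j, eps=1e-6):
        # Hypothetical helper: central-difference approximation of dJ/dtheta_j
        if j == 0:
            return (J(theta0 + eps, theta1) - J(theta0 - eps, theta1)) / (2 * eps)
        return (J(theta0, theta1 + eps) - J(theta0, theta1 - eps)) / (2 * eps)

    def gradient_descent_step(J, theta0, theta1, alpha):
        # Both partial derivatives are evaluated at the CURRENT (theta0, theta1) ...
        grad0 = numerical_gradient(J, theta0, theta1, 0)
        grad1 = numerical_gradient(J, theta0, theta1, 1)
        # ... and only then are both parameters overwritten (simultaneous update).
        return theta0 - alpha * grad0, theta1 - alpha * grad1

    # Example: one step on J(t0, t1) = t0^2 + t1^2, starting from (1, 1)
    J = lambda t0, t1: t0 ** 2 + t1 ** 2
    print(gradient_descent_step(J, 1.0, 1.0, alpha=0.1))   # approximately (0.8, 0.8)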

  In particular, when applying gradient descent to linear regression, the partial derivatives work out to

     \(\frac{\partial }{\partial \theta_{j}}J(\theta_{0},\theta_{1})=\frac{1}{m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})\,x_{j}^{(i)}\)

where for \(j=0\) we define \(x_{0}^{(i)}=1\), so that term reduces to \(\frac{1}{m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})\)
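Putting the update rule and these derivatives together, the sketch below runs batch gradient descent for one-variable linear regression; the training data, learning rate, and iteration count are assumptions chosen for illustration.

    import numpy as np

    def gradient_descent(x, y, alpha=0.05, iterations=2000):
        m = len(y)
        theta0, theta1 = 0.0, 0.0
        for _ in range(iterations):
            h = theta0 + theta1 * x              # h_theta(x^(i)) for every example
            grad0 = np.sum(h - y) / m            # dJ/dtheta0, since x_0^(i) = 1
            grad1 = np.sum((h - y) * x) / m      # dJ/dtheta1
            # Simultaneous update of both parameters
            theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
        return theta0, theta1

    # Made-up data generated by y = 1 + 2x; the result approaches (1.0, 2.0)
    x = np.array([1.0, 2.0, 3.0, 4.0])
    y = np.array([3.0, 5.0, 7.0, 9.0])
    print(gradient_descent(x, y))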



This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
