norm of <math>\mathbf{w}</math>. Due to the minimization of the weight vector norm, the solution will be regularized in the sense of Tikhonov <ref name="Tikhonov1977">A. Tikhonov, V. Arsenin, Solutions of Ill-Posed Problems, V.H. Winston & Sons, 1977.</ref>, improving the generalization performance.
The minimization has to be subject to the constraints

<math>\varepsilon</math> to be zero. This is equivalent to the minimization of the so-called <math>\varepsilon</math>-insensitive or Vapnik loss function <ref name="Vapnik1998">V. Vapnik, Statistical Learning Theory, Adaptive and Learning Systems for Signal Processing, Communications, and Control, John Wiley & Sons, 1998.</ref>, given by

<center><math>L_{\varepsilon}(\epsilon)=
\begin{cases}
0, & |\epsilon| \leq \varepsilon \\
|\epsilon| - \varepsilon, & |\epsilon| > \varepsilon
\end{cases}</math></center>
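As an illustrative aside, the <math>\varepsilon</math>-insensitive loss is only a few lines of Python. This is a minimal sketch; the function name and sample values are illustrative, not from the article:

```python
def vapnik_loss(e, eps):
    """Epsilon-insensitive (Vapnik) loss: zero inside the eps-tube, linear outside."""
    return max(abs(e) - eps, 0.0)

# Errors inside the tube (|e| <= eps) cost nothing; larger errors grow linearly.
inside = vapnik_loss(0.05, 0.1)    # within the tube, loss is zero
outside = vapnik_loss(0.30, 0.1)   # outside the tube, loss is |e| - eps
```

Because errors inside the tube cost nothing, the solution ends up depending only on the samples lying on or outside the tube (the support vectors).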
subject to <math>\xi_n, \xi'_n \geq 0</math>, where <math>C</math> is the trade-off between the minimization of the norm (to improve generalization ability) and the minimization of the errors <ref name="Vapnik1998"/>.

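To make the role of <math>C</math> concrete, here is a sketch of the usual SVR primal objective, ½||w||² + C Σ(ξ + ξ'). The ½ factor and the names are the common convention, assumed here rather than taken verbatim from the article:

```python
def primal_objective(w, xi, xi_prime, C):
    """Usual SVR primal: 0.5*||w||^2 plus C times the total slack."""
    norm_term = 0.5 * sum(wi * wi for wi in w)
    slack_term = C * (sum(xi) + sum(xi_prime))
    return norm_term + slack_term

# A large C penalizes slack (training errors) heavily; a small C favors a
# smaller norm, i.e. stronger Tikhonov-style regularization.
loose = primal_objective([1.0, 0.0], [0.1], [0.2], C=1.0)
tight = primal_objective([1.0, 0.0], [0.1], [0.2], C=100.0)
```

Sweeping <math>C</math> in such a sketch shows the trade-off directly: as <math>C</math> grows, the slack term dominates and the fit tightens onto the data.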
The optimization of the above constrained problem through Lagrange multipliers <math>\alpha_i</math>, <math>\alpha'_i</math> leads to the dual formulation <ref name="Scholkopf1988">A. Smola, B. Schölkopf, A Tutorial on Support Vector Regression, NeuroCOLT Technical Report NC-TR-98-030, Royal Holloway College, University of London, UK (1998).</ref>

<center><math>L_d=-({\boldsymbol \alpha}-{\boldsymbol \alpha'})^T{\mathbf{R}}({\boldsymbol \alpha}-{\boldsymbol \alpha'})+({\boldsymbol \alpha}-{\boldsymbol \alpha'})^T{\mathbf{y}}-\varepsilon({\boldsymbol \alpha}+{\boldsymbol \alpha'})^T\mathbf{1}</math></center>
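The dual objective can likewise be evaluated numerically. The sketch below assumes the standard <math>\varepsilon</math>-SVR dual with Gram matrix <math>\mathbf{R}</math> and target vector <math>\mathbf{y}</math>; variable names and the exact scaling follow common convention and are not necessarily the article's:

```python
def dual_objective(alpha, alpha_p, R, y, eps):
    """Standard eps-SVR dual: negative quadratic term in (alpha - alpha'),
    a linear data term, and an eps penalty on the total multiplier mass."""
    d = [a - b for a, b in zip(alpha, alpha_p)]
    quad = sum(d[i] * R[i][j] * d[j]
               for i in range(len(d)) for j in range(len(d)))
    lin = sum(di * yi for di, yi in zip(d, y))
    return -quad + lin - eps * (sum(alpha) + sum(alpha_p))

# Toy example: identity Gram matrix (two orthonormal samples).
R = [[1.0, 0.0], [0.0, 1.0]]
val = dual_objective([1.0, 0.0], [0.0, 1.0], R, [1.0, -1.0], eps=0.1)
```

Maximizing this objective over the box <math>0 \leq \alpha_n, \alpha'_n \leq C</math> is what a standard QP solver does in practice.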
the samples which are mainly affected by thermal noise (i.e., for which the quadratic cost is Maximum Likelihood). The linear cost is then