Support Vector Machine and its Applications in Classification Problems
A slecture by Xing Liu
Partially based on the ECE662 Spring 2014 lecture material of Prof. Mireille Boutin.
Outline of the slecture
- Background in Linear Classification Problem
- Support vector machine
- Summary
- References
Background in Linear Classification Problem
In a linear classification problem, the feature space can be divided into different regions by hyperplanes. In this lecture, we will use a two-category case to illustrate. Given training samples $ \vec{y}_1,\vec{y}_2,...,\vec{y}_n \in \mathbb{R}^p $, each $ \vec{y}_i $ is a p-dimensional vector that belongs to either class $ w_1 $ or $ w_2 $. The goal is to find the maximum-margin hyperplane that separates the points in the feature space belonging to class $ w_1 $ from those belonging to class $ w_2 $. The discriminant function can be written as

$ g(\vec{y}) = \vec{c}\cdot\vec{y} $
We want to find $ \vec{c}\in\mathbb{R}^{p} $ so that a testing data point $ \vec{y}_i $ is labelled

$ w_1 $ if $ \vec{c}\cdot\vec{y}_i>0 $

$ w_2 $ if $ \vec{c}\cdot\vec{y}_i<0 $
We can apply a trick here: replace all $ \vec{y} $'s in class $ w_2 $ by $ -\vec{y} $; the task then becomes finding $ \vec{c} $ so that

$ \vec{c}\cdot \vec{y}>0, \forall \vec{y} \in $ the new sample space.
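To make this concrete, here is a minimal Python sketch of the sign-normalization trick, assuming NumPy; the toy data and variable names below are illustrative assumptions, not part of the lecture:

```python
import numpy as np

# Illustrative toy data: rows are training samples y_i in R^2
Y1 = np.array([[2.0, 1.0], [1.5, 2.0]])      # samples from class w1
Y2 = np.array([[-1.0, -2.0], [-2.0, -0.5]])  # samples from class w2

# The trick: replace every y in class w2 by -y, so that a separating
# vector c must satisfy c . y > 0 for ALL rows of the stacked matrix Y.
Y = np.vstack([Y1, -Y2])

def separates(c, Y):
    """Return True if c . y > 0 for every normalized sample y."""
    return bool(np.all(Y @ c > 0))

c = np.array([1.0, 1.0])  # a candidate weight vector
print(separates(c, Y))    # True: this c separates the toy data
```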
The hyperplane through the origin is then defined by $ \vec{c}\cdot \vec{y} = 0 $, where $ \vec{c} $ is the normal of the plane. Equivalently, each training sample $ \vec{y}_i $ defines a hyperplane $ \vec{c}\cdot \vec{y}_i = 0 $, and a solution vector $ \vec{c} $ must lie on the positive side of every one of these hyperplanes.
You might have already observed the ambiguity of $ \vec{c} $ in the above discussion: if $ \vec{c} $ separates the data, then $ \lambda \vec{c} $ with $ \lambda>0 $ also separates the data. One solution is to set $ |\vec{c}|=1 $. Another solution is to introduce the concept of a "margin", which we denote by b, and require

$ \vec{c}\cdot\vec{y}\geqslant b > 0, \forall \vec{y} $.
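Continuing the sketch above (reusing the same Y, c, and separates; the margin value b is an illustrative assumption), the scaling ambiguity and the margin requirement look like this:

```python
def satisfies_margin(c, Y, b):
    """Return True if c . y >= b > 0 for every normalized sample y."""
    return b > 0 and bool(np.all(Y @ c >= b))

# Scaling ambiguity: any positive multiple of c also separates the data.
print(separates(2.5 * c, Y))          # True as well
# The margin requirement rules out vanishingly small solutions:
print(satisfies_margin(c, Y, b=1.0))  # True for this c and b
```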
In this scenario, every sample lies at a distance of at least $ \frac{b}{\|\vec{c}\|} $ from the separating hyperplane. However, it is not always possible to solve for $ \vec{c} $ directly. An alternative approach is to find the $ \vec{c} $ that minimizes a criterion function $ J(\vec{c}) $ subject to the constraints $ \vec{c}\cdot \vec{y}_i>0 $.
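The lecture does not fix a particular $ J $ here; one classical choice from the linear-classification literature is the perceptron criterion $ J_p(\vec{c}) = \sum (-\vec{c}\cdot\vec{y}) $, summed over the misclassified samples, which can be minimized by gradient descent. A sketch, continuing from the code above:

```python
def perceptron_train(Y, lr=0.1, max_iter=1000):
    """Minimize the perceptron criterion J(c) = sum(-c . y) over the
    misclassified (normalized) samples by gradient descent."""
    c = np.zeros(Y.shape[1])
    for _ in range(max_iter):
        misclassified = Y[Y @ c <= 0]   # samples violating c . y > 0
        if len(misclassified) == 0:
            break                       # all constraints satisfied
        # grad J = -sum(y) over misclassified samples, so the descent
        # step is c <- c + lr * sum(y)
        c += lr * misclassified.sum(axis=0)
    return c

c_hat = perceptron_train(Y)
print(separates(c_hat, Y))  # True when the data are linearly separable
```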