The non-parametric density estimate is
P(x) = k/(NV)
where k is the number of samples falling inside V, N is the total number of samples, and V is the volume of the region surrounding x.
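As a concrete illustration (the numbers are not from the original notes): if k = 5 out of N = 100 samples fall inside a region of volume V = 0.2 around x, the estimate is P(x) = 5/(100 × 0.2) = 0.25.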
This estimate can be computed by two approaches (a code sketch of both is given after this list):
1) Parzen window approach
- Fixing the volume V and determining the number k of data points inside V
2) KNN (K-Nearest Neighbor)
- Fixing the value of k and determining the minimum volume V that encompasses k points in the dataset
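A minimal Python sketch of both approaches, assuming a hypercube window for the Parzen estimate and a Euclidean ball for the KNN estimate; the function names and the toy 1-D Gaussian data are illustrative choices, not part of the original notes.

import numpy as np
from math import gamma, pi

def parzen_window_estimate(x, data, h):
    """Parzen window: fix the volume V = h^d (a hypercube of side h centred
    at x) and count the number k of samples falling inside it."""
    data = np.asarray(data, dtype=float)
    N, d = data.shape
    V = h ** d
    # a sample lies inside the hypercube if every coordinate is within h/2 of x
    k = np.sum(np.all(np.abs(data - x) <= h / 2.0, axis=1))
    return k / (N * V)

def knn_density_estimate(x, data, k):
    """KNN: fix k and take V as the volume of the smallest Euclidean ball
    around x that contains the k nearest samples."""
    data = np.asarray(data, dtype=float)
    N, d = data.shape
    dists = np.sort(np.linalg.norm(data - x, axis=1))
    r = dists[k - 1]                                  # distance to the k-th nearest neighbour
    V = (pi ** (d / 2) / gamma(d / 2 + 1)) * r ** d   # volume of a d-dimensional ball
    return k / (N * V)

# Toy usage: both estimates of a 1-D standard normal density at x = 0
# should approach 1/sqrt(2*pi) ~ 0.399 as N grows.
rng = np.random.default_rng(0)
samples = rng.standard_normal((1000, 1))
x0 = np.array([0.0])
print(parzen_window_estimate(x0, samples, h=0.5))
print(knn_density_estimate(x0, samples, k=50))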
- The advantages of non-parametric techniques
  - No assumption about the distribution is required ahead of time
  - With enough samples, the estimate converges to the target density
- The disadvantages of non-parametric techniques
  - Obtaining a good estimate for classification may require a very large number of samples
  - Computationally expensive