Margin Influence Analysis (MIA)

MIA is a variable selection strategy designed specifically for support vector machines (SVMs). It statistically identifies informative variables, that is, variables whose inclusion should in principle reduce the classification risk of an SVM. As an example, the paired SVM margin distributions of an informative gene and an uninformative gene, based on the colon data, are shown below:
[Figure: SVM margin distributions for an informative gene (Plot A) and an uninformative gene (Plot B)]

Plot A of the figure above shows two distributions of SVM model margins for an individual variable (say, Gene A): the black peak (denoted by 1) is the distribution for a population of models that contain Gene A, while the white peak (denoted by 0) is the distribution for a different population of models that DO NOT include Gene A. In general, the larger the margin of an SVM model, the better its prediction accuracy is expected to be. Based on Plot A, it can therefore be concluded that including Gene A in a model would ON AVERAGE increase the margin of the SVM model, and hence Gene A is considered informative.

In contrast, a gene with overlapping margin distributions, as shown in Plot B, is considered uninformative, since including this gene in a model (Peak 1) decreases the margin compared with SVM models that do not include it (Peak 0).
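
The comparison described for Plots A and B can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the margins are synthetic numbers rather than values from the colon data, the function name margin_influence is hypothetical, and the one-sided permutation test is a stand-in for whatever statistical test the MIA package itself applies.

```python
import random
from statistics import fmean


def margin_influence(margins_with, margins_without, n_perm=2000, seed=0):
    """Return (mean margin shift, one-sided permutation p-value).

    Assumption: a permutation test on the difference of means is used here
    purely for illustration; it is not taken from the MIA package.
    """
    rng = random.Random(seed)
    observed = fmean(margins_with) - fmean(margins_without)
    pooled = list(margins_with) + list(margins_without)
    n_with = len(margins_with)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign models to the "with" and "without" groups.
        rng.shuffle(pooled)
        diff = fmean(pooled[:n_with]) - fmean(pooled[n_with:])
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Synthetic margins (hypothetical values, not from the colon data).
rng = random.Random(42)
gene_a_with = [rng.gauss(1.30, 0.10) for _ in range(200)]     # Peak 1, Plot A
gene_a_without = [rng.gauss(1.00, 0.10) for _ in range(200)]  # Peak 0, Plot A
gene_b_with = [rng.gauss(0.98, 0.10) for _ in range(200)]     # Peak 1, Plot B
gene_b_without = [rng.gauss(1.00, 0.10) for _ in range(200)]  # Peak 0, Plot B

diff_a, p_a = margin_influence(gene_a_with, gene_a_without)
diff_b, p_b = margin_influence(gene_b_with, gene_b_without)
print(f"Gene A: mean margin shift = {diff_a:+.3f}, p = {p_a:.4f}")
print(f"Gene B: mean margin shift = {diff_b:+.3f}, p = {p_b:.4f}")
```

A clearly positive shift with a small p-value corresponds to the situation in Plot A (informative), while a shift near zero or below it with a large p-value corresponds to Plot B (uninformative).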

Download

MIA_1.5.zip, Windows

MIA_1.5.zip, Mac (MATLAB2010a)

Usage

After unzipping the package, you will find a demo script (Demo_package_functions.m). It shows how to run MIA, how to plot the margin distributions of each variable, and so on.

Reference

If you use this method, please cite the following paper:

H.-D. Li, Y.-Z. Liang, Q.-S. Xu, et al., Recipe for Uncovering Predictive Genes using Support Vector Machines based on Model Population Analysis, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 8 (2011) 1633.



libPLS: an Integrated Library for Partial Least Squares Regression and Discriminant Analysis. libPLS is under continuous development.