CSE569 Project 2

In this project, you will study how to use a common SVM package by carrying out some classification tasks.
Data Set: The given dataset contains 50 categories/classes. The training set has 4786 samples in the file ‘trainData.mat’, and the testing set has 1833 samples in the file ‘testData.mat’. Each sample is described by the rows of 3 different feature matrices, i.e., 𝑿1, 𝑿2, and 𝑿3, in the corresponding file, and the category vector is always 𝒀. All 3 features are normalized histograms, which means the elements are nonnegative and each feature sums to 1 (i.e., ∑𝑗 𝑿𝑘(𝑖, 𝑗) ≡ 1).


import scipy.io
# loadmat returns a dict keyed by variable name, e.g., data['X1'], data['Y']
data = scipy.io.loadmat('trainData.mat')

SVM Package:
There are many SVM toolboxes; in this project, you will use libSVM. You may use either Python or Matlab.

For Python users, installation instructions are provided at the following link:
http://www.csie.ntu.edu.tw/~cjlin/libsvm/

For Matlab users, you can directly use the code in folder ‘libSVM’. Run ‘make.m’ to install on your PC. The instructions and examples on how to use the package can be found in the file ‘README0’.

We will use the function svm_train in Python (svmtrain in Matlab) for training and svm_predict in Python (svmpredict in Matlab) for prediction.

NOTE: Fix the penalty parameter to ‘-c 10’ and use the linear kernel ‘-t 0’ for svm_train/svmtrain.
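For reference, here is a minimal Python sketch of one train/predict call with these options on toy data. It assumes libSVM was installed as the libsvm-official pip package (an older installation may instead use ‘from svmutil import *’); the toy labels and features are made up for illustration only.

from libsvm.svmutil import svm_train, svm_predict

# Toy data: 4 two-dimensional samples from 2 classes (illustration only).
y = [1, 1, 2, 2]
x = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]

model = svm_train(y, x, '-c 10 -t 0')             # C = 10, linear kernel
p_label, p_acc, p_val = svm_predict(y, x, model)  # p_acc[0] is the accuracy in percent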

Step 0: Classification by individual features.
Output: The classification accuracy for the testing set in the following cases (1) and (2).
Instructions:
(1) For each of the 3 features in the training set, 𝑿𝑘 (1 ≤ 𝑘 ≤ 3), train a multi-class linear SVM classifier, i.e., ℎ𝑘(𝐱). Obtain the predictions of ℎ𝑘(𝐱) using the same feature 𝑿𝑘 in the testing set and compare them to 𝒀 to compute the classification accuracy.

(2) Based on the SVM classifiers ℎ𝑘(𝐱), we can also obtain 𝑝𝑘(𝑤𝑖|𝐱), the (posterior) probability that sample 𝐱 belongs to the 𝑖-th category (𝑤𝑖) according to feature 𝑿𝑘 (1 ≤ 𝑘 ≤ 3). This is done by using the ‘-b 1’ option in training and testing (check http://www.csie.ntu.edu.tw/~cjlin/libsvm/ for more details). Train the SVM classifiers with this option and report the classification accuracies on the testing set based on the 3 features respectively. A Python sketch covering both cases (1) and (2) follows below.
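One possible Python sketch for both cases, assuming the .mat variables are stored under the names X1, X2, X3, and Y as in the data description, and the libsvm-official import path from above; this is only a sketch, not the required implementation.

import scipy.io
from libsvm.svmutil import svm_train, svm_predict

train = scipy.io.loadmat('trainData.mat')
test = scipy.io.loadmat('testData.mat')
y_tr = train['Y'].ravel().tolist()
y_te = test['Y'].ravel().tolist()

probs = []                                      # per-feature probability estimates, kept for Step 1
for k in ('X1', 'X2', 'X3'):
    x_tr, x_te = train[k].tolist(), test[k].tolist()

    # Case (1): plain multi-class linear SVM h_k(x)
    h_k = svm_train(y_tr, x_tr, '-c 10 -t 0')
    _, acc1, _ = svm_predict(y_te, x_te, h_k)

    # Case (2): the same classifier trained with probability estimates ('-b 1')
    h_k_prob = svm_train(y_tr, x_tr, '-c 10 -t 0 -b 1')
    _, acc2, p_val = svm_predict(y_te, x_te, h_k_prob, '-b 1')
    probs.append(p_val)                         # one row per test sample, one column per class

    print('%s: accuracy %.2f%% (case 1), %.2f%% (case 2)' % (k, acc1[0], acc2[0]))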

Step 1: Feature combination by fusion of classifiers.
Output: The classification accuracy on the testing set; compare it to that of (2) in Step 0.
Instructions: Directly combine the 3 SVM classifiers with probability output, i.e., 𝑝𝑘(𝑤𝑖|𝐱) (1 ≤ 𝑘 ≤ 3), from (2) of Step 0, by probability fusion: 𝑝(𝑤𝑖|𝐱) = ∑𝑘 𝑝𝑘(𝑤𝑖|𝐱)⁄3. The final recognition result is 𝑤𝑖∗ = argmax𝑖 𝑝(𝑤𝑖|𝐱).
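Continuing the Step 0 sketch above (reusing probs, h_k_prob, and y_te from there), the fusion step might look as follows. The columns of each probability matrix follow the model's internal label order, which is the same for all three models because they were trained on the same 𝒀.

import numpy as np

fused = np.mean([np.asarray(p) for p in probs], axis=0)   # p(w_i|x) = sum_k p_k(w_i|x) / 3
labels = h_k_prob.get_labels()                            # column index -> class label
y_pred = [labels[i] for i in np.argmax(fused, axis=1)]    # w_i* = argmax_i p(w_i|x)
accuracy = 100.0 * np.mean(np.asarray(y_pred) == np.asarray(y_te))
print('Fusion accuracy: %.2f%%' % accuracy)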

Step 2: Feature combination by simple concatenation.
Output: The classification accuracy on the testing set; compare it to that of (1) in Step 0.
Instructions: Directly concatenate the 3 features 𝐗𝑘 (1 ≤ 𝑘 ≤ 3) to form a single feature, i.e., 𝐗 = [𝐗1, 𝐗2, 𝐗3]; train a linear SVM classifier based on 𝐗 and obtain the classification accuracy for the testing set.
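Again continuing the Step 0 sketch (reusing train, test, y_tr, and y_te), the concatenation can be done with a horizontal stack before training a single linear SVM; this is one possible way to do it, under the same variable-name assumptions as before.

import numpy as np

# Stack the three histograms side by side so each sample becomes one long feature vector.
x_tr_cat = np.hstack([train['X1'], train['X2'], train['X3']]).tolist()
x_te_cat = np.hstack([test['X1'], test['X2'], test['X3']]).tolist()

model_cat = svm_train(y_tr, x_tr_cat, '-c 10 -t 0')
_, acc_cat, _ = svm_predict(y_te, x_te_cat, model_cat)
print('Concatenation accuracy: %.2f%%' % acc_cat[0])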


What to submit:

1. Your code for the above steps.
2. A report summarizing the results with the following format:
a. Introduction – start with problem statement, data description etc.
b. Method – your understanding of using this svm package, steps followed
c. Results and observations – the results asked for in each of the steps (plus any intermediate results you want to show), along with your observations
d. Conclusion
Note: There is no minimum or maximum length requirement for the report. Writing the report is the opportunity for you to reflect on your understanding of the problems/tasks through organizing your results.
3. The report should be typed (handwritten reports are not allowed) and in a .pdf format (to be submitted as separate document, not included within the code file).
4. Do not submit a .zip file. Submit multiple individual files on Canvas instead.

The data files for the project are uploaded to the Files/Assignments folder:
trainData.mat, testData.mat
