ITE4005-Programming Assignment 2 Solved

1. Build a decision tree, and then classify the test set using it
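As a rough illustration of this goal, the sketch below grows a tree with information gain (one of the attribute selection measures the requirements allow) and then classifies a tuple with it. It is only one possible approach, not a reference solution: the function names, the dict-based node layout, and the toy rows are all assumptions made for this example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def info_gain(rows, attr, label):
    """Entropy reduction obtained by splitting `rows` on `attr`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[label])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([r[label] for r in rows]) - remainder

def build_tree(rows, attrs, label):
    """Return a leaf (a class label string) or an internal node (a dict)."""
    labels = [r[label] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority  # pure node, or no attribute left to split on
    best = max(attrs, key=lambda a: info_gain(rows, a, label))
    rest = [a for a in attrs if a != best]
    children = {}
    for value in set(r[best] for r in rows):
        subset = [r for r in rows if r[best] == value]
        children[value] = build_tree(subset, rest, label)
    # "default" covers attribute values never seen on this branch in training
    return {"attr": best, "children": children, "default": majority}

def classify(tree, row):
    """Walk from the root to a leaf and return the predicted class label."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["attr"]], tree["default"])
    return tree

# Toy rows invented for this sketch (not taken from the provided datasets).
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
tree = build_tree(rows, ["outlook", "windy"], "play")
prediction = classify(tree, {"outlook": "rainy", "windy": "no"})
```

Falling back to the majority label for unseen attribute values is a design choice of this sketch; since the program will be evaluated on other datasets, some such fallback is probably needed.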

3. Requirements
The program must meet the following requirements:

● Execution file name: dt.exe

● Execute the program with three arguments: training file name, test file name, output file name

■ Example: dt.exe dt_train.txt dt_test.txt dt_result.txt

- Training file name=‘dt_train.txt’, test file name=‘dt_test.txt’, output file name=‘dt_result.txt’

- If using Python, you are allowed to use a 'dt.py' file instead of 'dt.exe'.
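For a Python submission, the three arguments could be read as in the sketch below. The helper name `parse_args` and the usage message are assumptions of this example; the assignment only fixes the order of the arguments.

```python
import sys

def parse_args(argv):
    """Return (train_path, test_path, output_path) from a sys.argv-style list."""
    if len(argv) != 4:  # script name plus exactly three arguments
        raise SystemExit(f"usage: {argv[0]} <training file> <test file> <output file>")
    return argv[1], argv[2], argv[3]

# In dt.py this would be called as parse_args(sys.argv); a literal list is
# used here so the sketch runs without command-line arguments.
train_path, test_path, output_path = parse_args(
    ["dt.py", "dt_train.txt", "dt_test.txt", "dt_result.txt"]
)
```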

● Dataset

■ We provide you with 2 datasets

- Buy_computer: dt_train.txt, dt_test.txt

- Car_evaluation: dt_train1.txt, dt_test1.txt

■ Your program must be able to handle any dataset

■ We will evaluate your program with other datasets.

● File format for a training set

[attribute_name_1]\t[attribute_name_2]\t … [attribute_name_n]\n

[attribute_1]\t[attribute_2]\t … [attribute_n]\n

[attribute_1]\t[attribute_2]\t … [attribute_n]\n

[attribute_1]\t[attribute_2]\t … [attribute_n]\n

■ [attribute_name_1] ~ [attribute_name_n]: n attribute names

■ [attribute_1] ~ [attribute_n-1]

- n-1 attribute values of the corresponding tuple

- All the attributes are categorical (not continuous-valued)

■ [attribute_n]: a class label that the corresponding tuple belongs to

■ Example 1 (dt_train.txt):

Figure 1. An example of the first training set.

■ Example 2 (dt_train1.txt):



Figure 2. An example of the second training set.

- Title: car evaluation database

- Attribute values

● Buying: vhigh, high, med, low

● Maint: vhigh, high, med, low

● Doors: 2, 3, 4, 5more

● Persons: 2, 4, more

● Lug_boot: small, med, big

● Safety: low, med, high

- Class labels: unacc, acc, good, vgood

- Number of instances: training set - 1,382; test set - 346
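The training file format described above (a tab-separated header line followed by one tuple per line, with the class label in the last column) could be parsed roughly as follows. The function names, the column name `label`, and the sample rows are made up for this sketch; only the attribute values come from the car evaluation description.

```python
def parse_table(lines):
    """Split tab-separated lines into (header, rows); blank lines are skipped."""
    table = [line.rstrip("\r\n").split("\t") for line in lines if line.strip()]
    return table[0], table[1:]

def to_records(header, rows):
    """Pair each value with its attribute name; the last column is the class label."""
    attrs, label = header[:-1], header[-1]
    records = [dict(zip(header, row)) for row in rows]
    return attrs, label, records

# Invented sample in the training format (two attributes plus a class label).
sample = [
    "Buying\tSafety\tlabel\n",
    "vhigh\tlow\tunacc\n",
    "low\thigh\tvgood\n",
]
header, rows = parse_table(sample)
attrs, label, records = to_records(header, rows)
```

Because `parse_table` accepts any iterable of lines, an open file can be passed directly, for example `with open(train_path) as f: header, rows = parse_table(f)`. Nothing in the sketch depends on specific attribute names, which matters because the program will be evaluated on other datasets.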

● Attribute selection measure: information gain, gain ratio, or Gini index

● File format for a test set

[attribute_name_1]\t[attribute_name_2]\t … [attribute_name_n-1]\n

[attribute_1]\t[attribute_2]\t … [attribute_n-1]\n

[attribute_1]\t[attribute_2]\t … [attribute_n-1]\n

[attribute_1]\t[attribute_2]\t … [attribute_n-1]\n

■ The test set does not have [attribute_name_n] (class label)

■ Example 1 (dt_test.txt):

Figure 3. An example of the first test set.

■ Example 2 (dt_test1.txt):



Figure 4. An example of the second test set.

● Output file format

[attribute_name_1]\t[attribute_name_2]\t … [attribute_name_n]\n

[attribute_1]\t[attribute_2]\t … [attribute_n]\n

[attribute_1]\t[attribute_2]\t … [attribute_n]\n

[attribute_1]\t[attribute_2]\t … [attribute_n]\n

■ Output file name: dt_result.txt (for the 1st dataset), dt_result1.txt (for the 2nd dataset)

■ You must print the following values:

- [attribute_1] ~ [attribute_n-1]: given attribute values in the test set

- [attribute_n]: a class label predicted by your model for the corresponding tuple

■ Please DO NOT CHANGE the order of the tuples in each test set.

- You should print your outputs to match the order of correct answers.

■ Please be sure to use \t to separate your attributes.
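One way to produce output in this format is sketched below: the header is the test header plus the class-label name taken from the training file, and each test tuple is written back in its original order with its predicted label appended. The function name `format_output` and the sample values are assumptions of this example.

```python
def format_output(test_header, test_rows, label_name, predictions):
    """Build the output text: tab-separated, one tuple per line, input order kept."""
    lines = ["\t".join(test_header + [label_name])]
    for row, predicted in zip(test_rows, predictions):
        lines.append("\t".join(row + [predicted]))
    return "\n".join(lines) + "\n"

# Invented sample: two test tuples and the labels a model might predict.
text = format_output(
    ["Buying", "Safety"],
    [["vhigh", "low"], ["low", "high"]],
    "label",
    ["unacc", "vgood"],
)
```

The returned string can then be written to the output file given as the third argument, for example `with open(output_path, "w") as f: f.write(text)`. Iterating over the test rows in the order they were read is what keeps the tuple order unchanged.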
