1. Find association rules using the Apriori algorithm
2. Requirements
The program must meet the following requirements:
● Execution file name: apriori.exe
● Execute the program with three arguments: minimum support, input file name, output file name
  ■ Example:
    - Minimum support = 5%, input file name = ‘input.txt’, output file name = ‘output.txt’
    - If you use Python, you are allowed to use an ‘apriori.py’ file instead of ‘apriori.exe’
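
The three-argument interface above could be handled as in the following minimal Python sketch. It assumes the arguments appear in the listed order and that the minimum support is passed as a percentage number (for example `5` for 5%). The helper name `parse_args` is hypothetical and not part of the assignment.

```python
import sys


def parse_args(argv):
    """Return (min_support_fraction, input_path, output_path).

    `argv` is expected to look like sys.argv: the script name followed by
    exactly three arguments.
    """
    if len(argv) != 4:
        raise SystemExit(
            "usage: apriori.py <min_support(%)> <input_file> <output_file>"
        )
    # Assumption: the minimum support is given in percent, so "5" means 5%.
    min_support = float(argv[1]) / 100.0
    return min_support, argv[2], argv[3]
```

Under these assumptions, running `python apriori.py 5 input.txt output.txt` and calling `parse_args(sys.argv)` would give a support threshold of 0.05 together with the two file names.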
● Input file format (.txt)
[item_id]\t[item_id]\n
[item_id]\t[item_id]\t[item_id]\t[item_id]\t[item_id]\n
[item_id]\t[item_id]\t[item_id]\t[item_id]\n
  ■ Row: transaction
  ■ item_id is a numerical value
  ■ There is no duplication of items in each transaction
  ■ Example:
Figure 1. Input file example (.txt)
● Output file format
[item_set]\t[associative_item_set]\t[support(%)]\t[confidence(%)]\n
[item_set]\t[associative_item_set]\t[support(%)]\t[confidence(%)]\n
  ■ [item_set]\t[associative_item_set]: association rules with minimum support
    - [item_set] → [associative_item_set]
    - Use braces to represent item sets: {[item_id],[item_id],…} (Important!!)
      ● e.g., {0}, {0,4}, {0,3,1}
  ■ Support: probability that a transaction contains [item_set] ∪ [associative_item_set]
  ■ Confidence: conditional probability that a transaction having [item_set] also contains [associative_item_set]
  ■ The order of output is unimportant.
  ■ The value of support and confidence should be rounded to two decimal places.
    - e.g., 24.631 rounded to two decimal places should become 24.63.
  ■ An additional penalty will be imposed if you don’t keep the output file format.
  ■ Example:
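
As an unofficial illustration of the input and output formats above, the Python sketch below parses transactions, computes support and confidence for a single rule, and builds one output line. The function names and the sample transactions are made up for illustration, and the sketch assumes that Python's `.2f` formatting is an acceptable way to round to two decimal places. It does not implement the Apriori search itself.

```python
def parse_transactions(text):
    """Parse one transaction per line into a set of integer item ids."""
    transactions = []
    for line in text.splitlines():
        tokens = line.split()  # whitespace split, which covers the tab separators
        if tokens:
            transactions.append({int(tok) for tok in tokens})
    return transactions


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def format_itemset(items):
    """Render items in brace notation without spaces, e.g. {0,3,1}."""
    return "{" + ",".join(str(i) for i in items) + "}"


def format_rule(lhs, rhs, transactions):
    """Build one output line for the rule lhs -> rhs, values in percent."""
    # Support: share of transactions containing lhs and rhs together.
    sup = support(set(lhs) | set(rhs), transactions)
    # Confidence: support of the union divided by support of lhs alone.
    conf = sup / support(lhs, transactions)
    return (
        f"{format_itemset(lhs)}\t{format_itemset(rhs)}"
        f"\t{sup * 100:.2f}\t{conf * 100:.2f}\n"
    )


# Hypothetical sample data: four transactions in the input file format.
sample = "1\t2\n1\t2\t3\n2\t3\n1\t3\t4\n"
transactions = parse_transactions(sample)
line = format_rule([1], [2], transactions)
# line == "{1}\t{2}\t50.00\t66.67\n"
```

In this sample, items 1 and 2 appear together in 2 of 4 transactions (support 50.00%), and item 1 appears in 3 transactions, so the confidence of {1} → {2} is 2/3, written as 66.67.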