Homework
If too much is never enough, you can implement the following for extra credit. Each extra credit
task will be worth 25% of a homework assignment.
EC Task 1: Using the training data file, implement the Apriori association rules algorithm for use
with the congressional voting data. The original paper is probably the most helpful for
implementation details; note that it outlines other algorithms as well.
Your program should be executable from the command line, with a corresponding file named
“Apriori.java” or “apriori.py”, and should take a filename and a minimum support parameter on
the command line. For example:
%> python apriori.py /path/to/congress_train.csv 19
For this exercise, consider only positive “Yea” votes when forming frequent item sets. Your
program should output a series of rules, one per line, like so:
3 -> 29
19, 28 -> 16
24, 31, 40 -> 33
The numbers are the column indices in the training file (starting with 0).
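As a rough illustration of the level-wise search at the heart of Apriori, here is a minimal sketch. It is not a full solution. It assumes each row has already been reduced to the set of column indices where the vote was “Yea”, and it treats the minimum support parameter as an absolute row count; both are assumptions, as is the function name frequent_itemsets. Rule generation and file parsing are left out.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {frozenset: count} for every itemset seen in at least min_support rows.

    transactions: list of sets of column indices (assumed: the "Yea" columns per row).
    min_support: treated here as an absolute count (an assumption).
    """
    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into size-k candidates.
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Count step: one pass over the data per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent
```

Rules can then be read off each frequent item set by splitting it into a left-hand side and a right-hand side and printing the column indices in the format shown above.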
EC Task 2: Using the training data file, implement the PC structure learning algorithm (the notes
from lecture 10 include some background on slides 37-47 that we didn’t cover in class; it’s
also described on slide 45 here). Your program should conduct hypothesis tests using the
chi-square statistic to determine conditional independence (you are welcome to use a stats
library such as scipy for these calculations).
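One common way to set up a chi-square test of conditional independence is to compute the statistic separately within each observed configuration of the conditioning columns and sum the statistics and degrees of freedom. The sketch below does that with the standard library only. The function name conditional_chi2 and the row format (a list of sequences of discrete values) are assumptions for illustration, not something the assignment prescribes.

```python
from collections import Counter, defaultdict


def conditional_chi2(rows, x, y, cond=()):
    """Return (statistic, dof) for testing column x against column y given columns cond."""
    # One stratum per observed configuration of the conditioning columns.
    strata = defaultdict(list)
    for r in rows:
        strata[tuple(r[c] for c in cond)].append((r[x], r[y]))

    stat, dof = 0.0, 0
    for pairs in strata.values():
        n = len(pairs)
        joint = Counter(pairs)
        x_tot = Counter(a for a, _ in pairs)
        y_tot = Counter(b for _, b in pairs)
        for a in x_tot:
            for b in y_tot:
                # Expected count under independence within this stratum.
                expected = x_tot[a] * y_tot[b] / n
                stat += (joint[(a, b)] - expected) ** 2 / expected
        dof += (len(x_tot) - 1) * (len(y_tot) - 1)
    return stat, dof
```

With scipy available, a p-value can be obtained from scipy.stats.chi2.sf(stat, dof) whenever dof is positive, and compared against alpha. How to treat strata with very few rows, or a total dof of zero, is a design decision this sketch does not settle.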
Your program should be executable from the command line, with a corresponding file named
“Pc.java” or “pc.py”, and should take a filename and an alpha parameter value on the command
line. For example:
%> java Pc /path/to/congress_train.csv 0.01
You should output the “skeleton” of the system as a series of edges, one per line (you do not
need to orient the edges):
19 - 42
22 - 25
22 - 28
Again, the numbers here should correspond to the column indices found in the training file
(starting with 0 for the first column).
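The skeleton phase of PC can be sketched independently of the statistical test. The version below starts from a complete undirected graph and removes an edge as soon as some conditioning set drawn from a node's current neighbours makes the pair independent, growing the conditioning set size by one each round. It is a simplified sketch: the function name pc_skeleton and the callback is_independent(x, y, cond) are hypothetical, and the callback stands in for whatever chi-square test at level alpha you implement.

```python
from itertools import combinations


def pc_skeleton(n_vars, is_independent):
    """Return sorted undirected edges (x, y), x < y, left after the PC skeleton search.

    is_independent(x, y, cond) -> bool, where cond is a tuple of variable indices.
    """
    # Start with every pair of variables connected.
    adj = {v: set(range(n_vars)) - {v} for v in range(n_vars)}
    size = 0
    # Stop once no node has enough other neighbours to form a conditioning set.
    while any(len(adj[x]) - 1 >= size for x in adj):
        for x in range(n_vars):
            for y in list(adj[x]):
                others = adj[x] - {y}
                if len(others) < size:
                    continue
                for cond in combinations(sorted(others), size):
                    if is_independent(x, y, cond):
                        # Drop the edge in both directions and move on.
                        adj[x].discard(y)
                        adj[y].discard(x)
                        break
        size += 1
    return sorted((x, y) for x in adj for y in adj[x] if x < y)


if __name__ == "__main__":
    # Toy oracle for a chain 0 - 1 - 2: 0 and 2 are independent given 1.
    oracle = lambda x, y, cond: {x, y} == {0, 2} and 1 in cond
    for a, b in pc_skeleton(3, oracle):
        print(f"{a} - {b}")
```

In a real run, is_independent would return True when the p-value of the conditional chi-square test exceeds alpha.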