COMMAND LINE ARGUMENTS
folder: stores the string "email/".
files: stores the name of each email file.
emails: stores the concatenated string, i.e. folder + files.
words: stores each word of emails.
words_dict: a dictionary that maps each word to its number of occurrences.
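The names above could fit together roughly as in the following sketch. It is not the project's actual code: the function name build_words_dict, the use of collections.Counter, and splitting on whitespace are assumptions made for illustration.

```python
import os
from collections import Counter


def build_words_dict(folder):
    """Count how often each word occurs across all files in `folder`.

    Hypothetical helper: the name and the whitespace tokenisation are
    assumptions, not taken from the original code.
    """
    words_dict = Counter()
    for files in sorted(os.listdir(folder)):
        # emails is the path built from folder + files
        emails = os.path.join(folder, files)
        if not os.path.isfile(emails):
            continue  # skip sub-directories
        with open(emails, encoding="utf-8", errors="ignore") as handle:
            words = handle.read().split()  # split on whitespace
        words_dict.update(words)
    return dict(words_dict)
```

Calling build_words_dict("email/") would then return a plain dictionary of word counts for every file in that folder.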
CODE IDEA
The idea is that this code determines whether an email is spam or not. First, each word of every email is extracted. Then the words whose presence makes an email spam are identified. After that, the whole dataset is divided into a training dataset and a testing dataset.
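The final step, dividing the dataset, might look like the sketch below. The function name split_dataset, the 25% test fraction, and the fixed seed are assumptions for illustration; the original code may use a different ratio or a library routine.

```python
import random


def split_dataset(samples, test_fraction=0.25, seed=0):
    """Shuffle `samples` and split them into (training, testing) lists.

    Hypothetical helper: the split ratio and seed are assumed values.
    """
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

Each sample can be any object, for example a (word counts, label) pair per email. Shuffling before splitting avoids putting all spam or all non-spam emails in one part when the files are stored in sorted order.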