task.py is a Python program that extracts questions, multiple-choice options, and answers from a given PDF.
Its dependencies are listed in requirements.txt.
It uses three built-in Python modules and one additional package, pdfminer.six (PDFMiner), to read PDF files.
Regular expressions do the extraction and form the core of the logic.
The final output is a CSV file named data_file.csv containing each question along with its options and answer.
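
To illustrate the regex-then-CSV approach described above, here is a minimal sketch. It is not the code in task.py: the text layout in SAMPLE_TEXT, the pattern QUESTION_RE, the helper names extract_rows and write_csv, and the CSV column names are all assumptions made for the example, and the real PDF text may need different patterns.

```python
import csv
import io
import re

# Hypothetical layout of the text pulled out of the PDF (an assumption).
SAMPLE_TEXT = """\
1. Which of these is a taxonomic category?
(a) Species
(b) Cell
(c) Tissue
(d) Organ
Answer: (a)
2. Who proposed binomial nomenclature?
(a) Darwin
(b) Linnaeus
(c) Mendel
(d) Haeckel
Answer: (b)
"""

# One numbered question line, four option lines, then an answer line.
QUESTION_RE = re.compile(
    r"^\d+\.\s*(?P<question>.+?)\n"
    r"\(a\)\s*(?P<a>.+?)\n"
    r"\(b\)\s*(?P<b>.+?)\n"
    r"\(c\)\s*(?P<c>.+?)\n"
    r"\(d\)\s*(?P<d>.+?)\n"
    r"Answer:\s*\((?P<answer>[a-d])\)",
    re.MULTILINE,
)


def extract_rows(text):
    """Return one [question, a, b, c, d, answer] list per matched question."""
    return [
        [m["question"], m["a"], m["b"], m["c"], m["d"], m["answer"]]
        for m in QUESTION_RE.finditer(text)
    ]


def write_csv(rows, out):
    """Write a header row (assumed column names) followed by the data rows."""
    writer = csv.writer(out)
    writer.writerow(
        ["question", "option_a", "option_b", "option_c", "option_d", "answer"]
    )
    writer.writerows(rows)
```

To produce a file such as data_file.csv, open it with `open("data_file.csv", "w", newline="")` and pass the file object to write_csv together with the rows from extract_rows.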
- Install PDFMiner with the command below, or install the dependencies from requirements.txt.
pip3 install pdfminer.six
- Run task.py, passing the PDF file name as a command-line argument.
python3 task.py The_Living_World.pdf
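
As a rough sketch of how a script can accept the PDF name on the command line and read its text, the example below uses argparse and the `extract_text` function from `pdfminer.high_level`. How task.py itself reads its argument is not shown here, so the use of argparse and the names parse_args, read_pdf_text, and main are assumptions for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the single positional argument: the path to the input PDF."""
    parser = argparse.ArgumentParser(
        description="Extract multiple-choice questions from a PDF."
    )
    parser.add_argument("pdf_path", help="path to the input PDF file")
    return parser.parse_args(argv)


def read_pdf_text(pdf_path):
    """Return the text of the PDF as one string, using pdfminer.six."""
    # Imported here so argument parsing works even without pdfminer.six installed.
    from pdfminer.high_level import extract_text

    return extract_text(pdf_path)


def main(argv=None):
    """Read the PDF named on the command line and report how much text it holds."""
    args = parse_args(argv)
    text = read_pdf_text(args.pdf_path)
    print(f"Read {len(text)} characters from {args.pdf_path}")
```

Calling `main(["The_Living_World.pdf"])` mirrors the command shown above; calling `main()` with no argument makes argparse read the file name from the actual command line.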