Parse files in different formats (XML, CSV, XLSX) and export the data as JSON.
- Parsed data is exported as JSON to current_working_directory/parsing_result/simple.json.
- If the parsing_result folder does not exist, it is created.
- If simple.json already exists, it is overwritten.
- If simple.json does not exist, it is created.
- The following software is required to run this project:
- Git version control system
- Python 3.8.5
- pip3, the Python 3 package installer
- Clone the project from GitHub:
git clone https://github.com/BakrFrag/File-Parser
- Go into the project directory:
cd File-Parser
- Install the libraries used by the project:
pip3 install -r requirements.txt
or install them individually:
pip3 install xlrd==1.2.0
pip3 install untangle==1.1.1
- Add executable permission to parser.py:
sudo chmod +x parser.py
- To run the tests:
python3.8 -m unittest
- Use parser.py to parse files:
- The script reads its arguments from the command line.
- For the CSV and XLSX formats, the customers file must come first, before the vehicles file.
- The headers of the files must match.
- If an exception occurs while parsing (for example mismatched headers, an invalid file, or a file that does not exist), it is printed to the terminal.
- If the files are parsed successfully, the script returns the path to the parsing output.
- For XLSX, customers must be in the first sheet of the first workbook, and vehicles must be in the first sheet of the second workbook.
- For the CSV and XLSX formats, the vehicles belonging to each customer are found by joining on vehicle.owner_by == customer.id.
- To parse an XML file:
./parser.py xml xml_file.xml
- To parse CSV files:
./parser.py csv customers_file.csv vehicles_file.csv
- To parse XLSX workbooks:
./parser.py xlsx customers_file.xlsx vehicles_file.xlsx
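The rules above can be illustrated for the CSV case with a minimal sketch: check the headers, join vehicles to customers on owner_by == id, and write the result to parsing_result/simple.json. This is not the project's actual code. The column names (id, name, make), the helper names read_rows, join_vehicles, and write_result, and the base_dir parameter are assumptions made for illustration; only owner_by, parsing_result, and simple.json come from this README.

```python
import csv
import io
import json
import tempfile
from pathlib import Path

# Hypothetical header sets; the real files may use different columns.
CUSTOMER_HEADERS = ["id", "name"]
VEHICLE_HEADERS = ["id", "make", "owner_by"]


def read_rows(text, expected_headers):
    """Read CSV text into a list of dicts, rejecting mismatched headers."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != expected_headers:
        raise ValueError(f"headers mismatch: {reader.fieldnames}")
    return list(reader)


def join_vehicles(customers, vehicles):
    """Attach to each customer the vehicles whose owner_by equals its id."""
    by_owner = {}
    for vehicle in vehicles:
        by_owner.setdefault(vehicle["owner_by"], []).append(vehicle)
    return [dict(c, vehicles=by_owner.get(c["id"], [])) for c in customers]


def write_result(data, base_dir):
    """Write data to base_dir/parsing_result/simple.json and return the path."""
    out_dir = Path(base_dir) / "parsing_result"
    out_dir.mkdir(exist_ok=True)  # created only if it is missing
    out_file = out_dir / "simple.json"
    out_file.write_text(json.dumps(data, indent=2))  # overwrites an existing file
    return out_file


# Small in-memory example; a temporary directory stands in for the
# current working directory so nothing is written next to the script.
customers = read_rows("id,name\n1,Ann\n2,Bob\n", CUSTOMER_HEADERS)
vehicles = read_rows("id,make,owner_by\n10,Fiat,1\n11,Opel,1\n", VEHICLE_HEADERS)
output_path = write_result(join_vehicles(customers, vehicles), tempfile.mkdtemp())
print(output_path)
```

Grouping the vehicles by owner_by first means each customer is matched with a single dictionary lookup instead of a scan over all vehicles. A customer with no vehicles gets an empty list.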
- The parsers directory contains the parser classes for the supported file formats (XLSX, CSV, and XML).
- parser_helper contains utility functions used by all parsers.
- The tests directory contains unit tests for the parser classes.
- parser.py handles the script arguments and applies the right parser to them.
- This script was developed on Ubuntu 20.04.
- All files are formatted according to PEP 8.
- Each method and class includes a description of what it does.
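The argument handling attributed to parser.py above can be sketched as follows. This is a rough illustration, not the script's actual implementation. The name parse_args and the EXPECTED_FILES table are hypothetical; the file counts per format are taken from the example commands in this README (one file for xml, two for csv and xlsx).

```python
import sys

# Number of input files each format expects, per the example commands.
EXPECTED_FILES = {"xml": 1, "csv": 2, "xlsx": 2}


def parse_args(argv):
    """Validate [format, file, ...] and return (format, files)."""
    if not argv:
        raise ValueError("usage: parser.py <xml|csv|xlsx> <file> [<file>]")
    file_format, files = argv[0], argv[1:]
    if file_format not in EXPECTED_FILES:
        raise ValueError(f"unsupported format: {file_format}")
    expected = EXPECTED_FILES[file_format]
    if len(files) != expected:
        raise ValueError(
            f"{file_format} expects {expected} file(s), got {len(files)}"
        )
    return file_format, files


if __name__ == "__main__":
    try:
        print(parse_args(sys.argv[1:]))
    except ValueError as error:
        # Report the problem on the terminal instead of a traceback.
        print(error)
```

Keeping the expected file count in one table means a new format needs only one extra entry before its parser is wired in.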