This is my GitHub repo. Contact me for support; PRs are welcome.
- Install the package with pip.
$ pip3 install smart-search
- NOTE: Keep the pickle file in the same folder as the Python script in which you use this pip package.
Here I use the glove.6B.zip file from Stanford's GitHub repository (from the hyperlink) and pickled the model for easier loading. Download it from here.
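For readers curious how such a pickle could be produced, here is a minimal sketch. It assumes the standard GloVe text layout (one word per line followed by space-separated floats, as in the `.txt` files inside glove.6B.zip). The function names, the dict layout, and the output filename are hypothetical; the layout that smart_search itself expects is not documented here, so prefer the provided download for actual use.

```python
import pickle


def load_glove_text(path):
    """Parse a GloVe-style text file into a {word: [float, ...]} dict."""
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip().split(" ")
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def pickle_vectors(vectors, out_path):
    """Serialize the vectors so later runs can skip the slow text parse."""
    with open(out_path, "wb") as handle:
        pickle.dump(vectors, handle, protocol=pickle.HIGHEST_PROTOCOL)


# Hypothetical usage (file names are assumptions, not the package's own):
# vectors = load_glove_text("glove.6B.50d.txt")
# pickle_vectors(vectors, "glove_vectors.pkl")
```

Loading a pickle is usually much faster than re-parsing the text file on every run, which is presumably why the model is distributed this way.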
- Import the library.
>> import smart_search
- Create an object of the class `smart_search.model()`, say `functioncaller`.
>> functioncaller = smart_search.model()
- To convert a PDF into a list of lists containing the page number and the words left after stop-word removal, use the built-in function `getting_list_of_words()`. It accepts one argument, the path to the PDF, and returns the list to be fed to the model.
>> pdf_list = functioncaller.getting_list_of_words('path to your pdf')
- Pass this list to the model, along with the word you want to search for, using the `perform_skip()` function. It accepts two arguments, the list produced by the previous function and the search word, and returns the top 5 relevant locations of that word.
>> location = functioncaller.perform_skip(pdf_list, input_word)[0:5]
- You can use Python's subprocess module to navigate to the page if you want to.
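As a rough sketch of that last step, the snippet below builds a viewer command and launches it with `subprocess`. It assumes the Evince PDF viewer is installed and that its `--page-index` option is the one you want; other viewers use different flags. The helper names are hypothetical, and it further assumes you have already extracted a page number from the search results, since the exact format returned by `perform_skip()` is not shown here.

```python
import subprocess


def build_viewer_command(pdf_path, page, viewer="evince"):
    """Return the argument list that opens pdf_path at the given page."""
    # Assumption: the viewer accepts --page-index=N (Evince does);
    # adjust the flag for a different viewer.
    return [viewer, f"--page-index={int(page)}", pdf_path]


def open_pdf_at_page(pdf_path, page, viewer="evince"):
    """Launch the viewer without blocking the calling script."""
    return subprocess.Popen(build_viewer_command(pdf_path, page, viewer))


# Hypothetical usage, with a page number taken from your search results:
# open_pdf_at_page("path to your pdf", 12)
```

Passing the command as a list rather than a single shell string avoids quoting problems when the PDF path contains spaces.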