
musehd / multidefine


Compiles the definitions of multiple words into a single definitions list.

License: MIT License

Python 100.00%
definition-generation definition-list hacktoberfest python scraper selenium vocabulary vocabulary-lists

multidefine's People

Contributors

deepsourcebot, delaguardianick, lpuv, metavinayak, musehd, suvanbalu


multidefine's Issues

Inconsistent Formatting/Alignment of Results

The current formatting of words is quite inconsistent: spaces are added for some results but not for others.
I believe this is because different sources format their text differently. This results in unaligned output, as shown:

image

This should be easy to fix by stripping extra spaces and/or newlines from the collected data and making sure that all of the results are aligned.
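
A minimal sketch of that idea, assuming the results are held as a word-to-definition mapping. The names `normalize` and `format_results` are hypothetical and not taken from the project:

```python
def normalize(text: str) -> str:
    """Collapse runs of spaces, tabs and newlines into single spaces."""
    return " ".join(text.split())


def format_results(results: dict[str, str]) -> str:
    """Return one 'word : definition' line per entry with aligned columns."""
    cleaned = {normalize(w): normalize(d) for w, d in results.items()}
    # Pad every word to the length of the longest one so the colons line up.
    width = max((len(w) for w in cleaned), default=0)
    return "\n".join(f"{w.ljust(width)} : {d}" for w, d in cleaned.items())


raw = {
    "cat ": "  a small domesticated\nfeline",
    "ephemeral": "lasting a short time  ",
}
print(format_results(raw))
```

Normalizing before measuring the width matters here: otherwise a stray trailing space from one source would shift the whole column.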

Fix Duplication for Wikipedia

Definitions often break when searching for specific phrases due to Google's new design.
Try searching by id and/or using next siblings in Selenium to make sure that the right element is selected and displayed.

New driver not updated

update.py downloads the required zip file and extracts it, but doesn't replace the old driver. I updated to the new driver in #8, so please check it.
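
One possible way to make the extraction step also overwrite the old driver, using only the standard library. This is a sketch, not the actual update.py: the function name `install_driver` and its parameters are assumptions, and the driver file name is passed in because the issue does not say which driver is used.

```python
import os
import tempfile
import zipfile
from pathlib import Path


def install_driver(zip_path: Path, driver_name: str, target_dir: Path) -> Path:
    """Extract driver_name from zip_path and overwrite the copy in target_dir."""
    target = target_dir / driver_name
    # Extract into a temp folder inside target_dir so the final move stays
    # on the same filesystem.
    with tempfile.TemporaryDirectory(dir=target_dir) as tmp:
        with zipfile.ZipFile(zip_path) as archive:
            # Match by base name, since the archive may nest the driver in a folder.
            member = next(n for n in archive.namelist() if Path(n).name == driver_name)
            extracted = Path(archive.extract(member, tmp))
        os.chmod(extracted, 0o755)  # zipfile does not restore the executable bit
        os.replace(extracted, target)  # overwrites the old driver if present
    return target
```

`os.replace` overwrites an existing file, which is the step the issue reports as missing. On Windows it can still fail if the old driver is in use by a running process.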

Async Code execution

One of the major problems right now is performance. The program has to go through each step for each of the words before it moves on to the next one.

I've wanted to look into async calls but haven't gotten around to doing it. The current code will most likely need to be restructured, as the order of operations needs to be taken into account. That is, the program should only retrieve the definition for a word from one source, rather than getting definitions from several sources. Ideally, it should also detect how much additional load is being put on the system and add threads accordingly, allowing for more performance as well as accessibility.

If anyone is interested in implementing this, please let me know.
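
As a starting point for discussion, here is a rough sketch of looking up several words concurrently while still stopping at the first source that answers for each word. It uses a thread pool rather than async calls, and a fixed worker count rather than load detection. `define`, `define_all` and the `sources` list of callables are hypothetical names, not part of the current code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Optional

# A source takes a word and returns a definition, or None if it has none.
Source = Callable[[str], Optional[str]]


def define(word: str, sources: Iterable[Source]) -> Optional[str]:
    """Try each source in order and stop at the first one that answers."""
    for source in sources:
        try:
            result = source(word)
        except Exception:
            continue  # this source failed; fall through to the next one
        if result:
            return result
    return None


def define_all(
    words: list[str], sources: list[Source], workers: int = 4
) -> dict[str, Optional[str]]:
    """Look up all words concurrently; results keep the input word order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda w: define(w, sources), words)
        return dict(zip(words, results))
```

If the sources share a single Selenium driver, this would probably not work as-is, since a WebDriver instance is not meant to be used from several threads at once; each worker would likely need its own driver.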

Messy Code

Currently, everything lives in a single function called get_ans(), which is not ideal from a development perspective.
It would be worth breaking up the statements in the try and except blocks into separate functions to make the code easier to maintain and debug.
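
A sketch of what that split could look like. Only the name get_ans() comes from the project; the helper names, the "dictionary" source, and the `fetch` callable (a stand-in for the Selenium scraping code) are assumptions made for illustration.

```python
from typing import Callable, Optional

# fetch(source_name, word) stands in for the Selenium scraping code.
Fetch = Callable[[str, str], Optional[str]]


def from_dictionary(word: str, fetch: Fetch) -> Optional[str]:
    """Hypothetical helper holding what used to be the try block."""
    return fetch("dictionary", word)


def from_wikipedia(word: str, fetch: Fetch) -> Optional[str]:
    """Hypothetical helper holding what used to be the except block."""
    return fetch("wikipedia", word)


def get_ans(word: str, fetch: Fetch) -> Optional[str]:
    """Thin coordinator: try each helper in turn, return the first hit."""
    for lookup in (from_dictionary, from_wikipedia):
        try:
            answer = lookup(word, fetch)
        except Exception as exc:
            # Report which step failed instead of hiding it in nested excepts.
            print(f"{lookup.__name__} failed for {word!r}: {exc}")
            continue
        if answer:
            return answer
    return None
```

With this shape, each helper can be tested on its own with a fake `fetch`, and adding a new source means adding one function and one entry in the tuple.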

New search doesn't clear issues from previous search

When the definition for a word cannot be obtained, the program adds the word to the list of failed words. When the program is re-run by pressing any key, the list is not cleared, so the failed words stay in it for the rest of the session. This should be easy to fix by clearing the list every time the user reruns the program.
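
A small sketch of the fix, assuming the failed words are kept in a module-level list. The names `failed_words`, `run_search` and `lookup` are hypothetical:

```python
failed_words: list[str] = []  # assumed to live at module level for the session


def run_search(words, lookup):
    """Run one search; failures from earlier runs are discarded first."""
    failed_words.clear()  # the missing step: reset state before each run
    results = {}
    for word in words:
        definition = lookup(word)
        if definition is None:
            failed_words.append(word)
        else:
            results[word] = definition
    return results
```

Calling `clear()` keeps the same list object, so any other code holding a reference to it sees the reset as well.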
