
niyathic / bio-stock-picker


Pick biotech and pharma stocks based on recent trial announcements. Planned features: webscraping for articles and investing through brokerage accounts.

Python 100.00%

bio-stock-picker's Introduction

bio-stock-picker

Niyathi Chakrapani, Anshuman Konuru

Modules used:

  • Datetime - used in getStockPeaks() and get_google_finance_intraday() to format and use datetime information for pharma trial announcement publications (e.g. line 346)
  • Pandas - used in get_google_finance_intraday() to make dataframes to handle information (e.g. line 453)
  • Requests - used in get_content() and get_google_finance_intraday() for webscraping (e.g. line 423)
  • Re - used in get_google_finance_intraday() for matching (line 435)
  • csv - used in get_google_finance_intraday() to read and parse data (line 425)
  • RequestException - used in get_content() to handle and log errors during webscraping (line 55)
  • closing - used in get_content() and findTopOccurrences() during webscraping (line 95)
  • BeautifulSoup - used in findTopOccurrences() to parse html data (line 103)
  • urllib - used in findTopOccurrences() to read html (line 101)

Used for data structures:

  • defaultdict - used in findTopOccurrences() (line 99)
  • OrderedDict - used in findTopOccurrences() (line 113)
  • Counter - used in findTopOccurrences() (line 144)
  • string - used in clean() (line 213)
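As a rough illustration of how the data-structure modules listed above might fit together, the sketch below counts word occurrences in a piece of already-downloaded text. It is not the project's code: `count_words`, `top_n`, and the sample text are hypothetical names and data chosen for the example, and the fetching and HTML parsing that the project does with Requests and BeautifulSoup are left out so that the example needs only the standard library.

```python
from collections import Counter, OrderedDict
import string


def count_words(text):
    """Return a Counter mapping each lowercased word in `text` to its count."""
    # Strip ASCII punctuation so that "endpoint." and "endpoint" match.
    table = str.maketrans("", "", string.punctuation)
    words = text.lower().translate(table).split()
    return Counter(words)


def top_n(counts, n):
    """Return the n most common words as an OrderedDict, most frequent first."""
    return OrderedDict(counts.most_common(n))


# Hypothetical sample text standing in for a scraped announcement.
sample = "Trial met primary endpoint. The trial met all secondary endpoints."
counts = count_words(sample)
top = top_n(counts, 2)
```

Here `Counter` does the counting and `OrderedDict` keeps the most frequent words in ranked order, which is one plausible reason for using both.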

Functions from top to bottom:

  • get_content(), is_good_response() and log_error() work in conjunction to get the content at a url by making an HTTP request. is_good_response() tests whether the response is HTML, and log_error() prints errors.
  • findTopOccurrences() creates a dictionary mapping every word at a url to its number of occurrences.
  • populate() calls findTopOccurrences() to apply it to many urls.
  • clean() calls populate() to get this dictionary, then "cleans" it by removing entries that have irrelevant words or special characters.
  • testStocks() gets a csv with information on stock tickers, urls for their trial announcements, the date and time of those announcements, and whether the announcements are positive or negative. It parses this info, cleans it (by calling clean()), and reduces the input to just the very positive stocks and their urls, dates and times. It then calculates the precision and calls getStockPeaks().
  • getStockPeaks() finds, for all the positive trial announcement tickers from testStocks(), the average time at which the stock peak occurs. It then calculates how much money you would make from buying all these stocks and selling them at that time.
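To make two of the steps above more concrete, here is a minimal sketch of an HTML check and a cleaning pass. It only approximates what is_good_response() and clean() are described as doing. The names `looks_like_html`, `clean_counts`, and `IRRELEVANT_WORDS` are hypothetical, the project's actual list of irrelevant words is not shown in this README, and the check takes a plain status code and content type rather than a real response object.

```python
import string

# Hypothetical set of "irrelevant" words; the project's real list is unknown.
IRRELEVANT_WORDS = {"the", "a", "and", "of", "in", "to"}


def looks_like_html(status_code, content_type):
    """Return True if a response with these values appears to be HTML."""
    return (
        status_code == 200
        and content_type is not None
        and "html" in content_type.lower()
    )


def clean_counts(word_counts):
    """Drop entries whose word is irrelevant or contains non-letter characters."""
    cleaned = {}
    for word, count in word_counts.items():
        if word in IRRELEVANT_WORDS:
            continue
        # Keep only non-empty words made entirely of ASCII letters.
        if not word or any(ch not in string.ascii_letters for ch in word):
            continue
        cleaned[word] = count
    return cleaned


# Hypothetical word counts standing in for the output of populate().
raw = {"the": 40, "endpoint": 7, "p<0.05": 3, "significant": 5, "&amp;": 2}
cleaned = clean_counts(raw)
```

In this sketch `cleaned` keeps only "endpoint" and "significant", since the other keys are either in the irrelevant-word set or contain special characters.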

No generators/decorators were used.

Brief Synopsis: We used a train/test ML approach to identify which keywords correspond to positive pharmaceutical trial announcements. After cleaning the dictionary of these keywords, we used them to identify which announcements in our test data set were very strongly positive. We then tested the precision of our positive labels against manually determined announcement results; we measured precision rather than recall because we wanted to ensure that every announcement we labeled positive was indeed positive. Finally, we found the average time of the stock peak after the announcement within the same day, and calculated the amount of money that would be made by buying the stocks at the announcement time and selling them at this calculated average time. Combining the results from multiple runs, this number is overwhelmingly positive.
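The three quantities mentioned in the synopsis (precision, average time to peak, and profit) can be sketched as follows. This is an illustration under stated assumptions, not the code in testStocks() or getStockPeaks(): the function names, the tickers, the delays, and the prices are all made up, and the profit calculation assumes one share per stock.

```python
from datetime import timedelta


def precision(predicted_positive, actually_positive):
    """Fraction of predicted-positive tickers that are truly positive."""
    predicted = set(predicted_positive)
    if not predicted:
        return 0.0
    true_positives = len(predicted & set(actually_positive))
    return true_positives / len(predicted)


def average_delay(delays):
    """Mean of a list of timedeltas (announcement time to intraday peak)."""
    return sum(delays, timedelta()) / len(delays)


def total_profit(trades):
    """Sum of (sell - buy) over (buy_price, sell_price) pairs, one share each."""
    return sum(sell - buy for buy, sell in trades)


# Made-up tickers: the model labeled four as positive, three of them correctly.
predicted = ["AAA", "BBB", "CCC", "DDD"]
actual = ["AAA", "BBB", "CCC", "EEE"]
p = precision(predicted, actual)

# Made-up delays between each announcement and that day's peak.
avg = average_delay([timedelta(minutes=30), timedelta(minutes=90)])

# Made-up (buy at announcement, sell at average peak time) prices.
profit = total_profit([(10.0, 12.5), (20.0, 19.0)])
```

Precision counts only how many of the positive labels were right, which matches the stated goal: a missed positive announcement costs nothing, while a false positive means buying a stock that may fall.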

To launch the project, simply build/run it and you will see the step outputs, as well as the final output of how much money you made!

Feel free to use this code for personal use. For commercial use please contact niyathic.

