aghilesazzoug / sciwatch

SciWatch is a Python package designed to facilitate scientific monitoring for researchers

Home Page: https://aghilesazzoug.github.io/SciWatch/

License: MIT License

sciwatch's Introduction

SciWatch is a Python package that facilitates scientific monitoring, mainly for data scientists and AI researchers. It retrieves scientific papers and technical blog posts that match your queries and emails them to you, so you can keep up with new developments in your field.

Usage

  1. Set up senders

See the senders documentation for details.

For example, with Gmail, set the following environment variables:

export [email protected]
export gmail_token=your_token
  2. Write a config (scrapping_config.toml)
title = "LLM & AL Watch" # Will be used as email title

end_date = "now" # will search content up to now (exec. time)
time_delta = "02:00:00:00" # will look for content up to two days ago

recipients = ["[email protected]"]

# define your queries
[[query]]
title = "LLM" # LLM query
raw_content = """intitle:(GPT* OR LLM* OR prompt* OR "Large language models"~2) AND incontent:(survey OR review OR evaluation* OR benchmark* OR optimization*)"""

[[query]]
title = "AL" # Active Learning on VRD (or benchmarks/surveys)
raw_content = """intitle:("active learning") AND incontent:(VRD OR documents OR survey* OR benchmark*)"""

# define your sources
[[source]]
type = "arxiv" # check for Computer Science papers on Arxiv
use_abstract_as_content = true
search_topic = "cs"
max_documents = 200

[[source]]
type = "openai_blog" # check for latest blogs on OpenAI blog (mainly for GPT updates)
max_documents = 20
  3. Run the watcher
from sci_watch.sci_watcher import SciWatcher

watcher = SciWatcher.from_toml("scrapping_config.toml")

watcher.exec()  # if relevant content is retrieved, the recipients receive an email
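
Since sending depends on the environment variables from step 1, a small pre-flight check can make a scheduled run fail with a clear message. The following is a minimal sketch, not part of SciWatch: `missing_env_vars` and `run_watch` are hypothetical helpers, and the list of required variable names is an assumption that you should adapt to your sender. Only `SciWatcher.from_toml` and `exec` come from the usage shown above.

```python
import os
import sys


def missing_env_vars(names, environ=os.environ):
    """Return the names that are unset or empty in the given environment."""
    return [name for name in names if not environ.get(name)]


def run_watch(config_path, required_vars):
    """Run the watcher if the sender variables are set; return an exit code."""
    missing = missing_env_vars(required_vars)
    if missing:
        print(f"Missing environment variables: {', '.join(missing)}", file=sys.stderr)
        return 1
    # Imported lazily so the check above works even if SciWatch is not installed.
    from sci_watch.sci_watcher import SciWatcher

    watcher = SciWatcher.from_toml(config_path)
    watcher.exec()
    return 0


if __name__ == "__main__":
    # "gmail_token" is the variable from step 1; add your sender's other variables.
    sys.exit(run_watch("scrapping_config.toml", ["gmail_token"]))
```

Returning an exit code instead of raising makes the script easy to call from cron or a CI job, where a non-zero status flags the failed run.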

You might get an email like this:

Documentation

For full documentation, including the query grammar syntax, see the docs (project home page: https://aghilesazzoug.github.io/SciWatch/).

Contributing

Contributions are welcome, whether by reporting issues or by opening pull requests. For major changes, please open an issue first to discuss what you would like to change.

  1. Fork the project
  2. Create your feature branch following the convention feature/feature-name (git checkout -b feature/feature-name)
  3. Run pre-commit (make pre-commit)
  4. Commit your changes (git commit -m "a meaningful message please")
  5. Push to the branch (git push origin feature/feature-name)
  6. Open a Pull Request

Roadmap

  • (feat) Add GPT support for paper summarization
  • (feat) Add better error handling (while scraping, calling the OpenAI API, etc.)
  • (refactor) Refactor configuration file parsing (and a lot of other things)
  • (perf) Add short-circuit evaluation for queries
  • (perf) Run sources only once for all queries
  • (perf) Process queries asynchronously

Feel free to open an issue or send an email if you have any ideas :)

License

Copyright 2024 Aghiles Azzoug

SciWatch is free and open-source software distributed under the terms of the MIT license.

Contact

Aghiles Azzoug - LinkedIn - [email protected]
