
uw-ssec / tutorials


SSEC tutorials for various topics

Home Page: https://uw-ssec-tutorials.readthedocs.io

License: BSD 3-Clause "New" or "Revised" License

Jupyter Notebook 94.73% Python 5.27%

tutorials's Introduction

ℹ️ SciPy Tutorial Attendees: You can find the tutorial materials in the SciPy2024 directory.

Tutorials

This is the repository for a Jupyter Book website with tutorial materials for the University of Washington Scientific Software Engineering Center (UW-SSEC). The tutorials are written in Jupyter notebooks and are organized by topic or workshop.

The website is hosted at https://uw-ssec-tutorials.readthedocs.io.

GitHub Codespaces

This tutorial can be run in GitHub Codespaces. A codespace is "a development environment that's hosted in the cloud".

GitHub currently gives every user 120 vCPU hours per month for free; beyond that, you must pay. Be sure to explicitly stop or shut down your codespace when you are done by going to https://github.com/codespaces.

Contributing

If you would like to contribute to the tutorials, please feel free to open issues or pull requests regarding current or future materials.

Thanks to our contributors so far!

tutorials's People

Contributors

anantmittal, lsetiawan, vanitech, uwcdc, anujsinha3

Stargazers

Henry Hamon

Watchers

Anshul T. Tambay

Forkers

vanitech madhavmk

tutorials's Issues

chore: Research on the best embedding models for the tutorial

Overview

There are many embedding models, as seen here: https://huggingface.co/sentence-transformers. The embedding model is critical for creating the best vectors, which then get fed into the VectorDB. Finding the best-suited embedding model is key, and a good understanding of the options will probably benefit participants, so let's look into it. @vanitech and @anantmittal, do you have any insights on this?

Resources

feat: Create Embedding Tutorial

Overview

Now that @anujsinha3 has done such a thorough investigation of embedding models in issue #6, we need to put together a comprehensive tutorial on this.

Outline

  • What are embeddings?
  • Importance of embeddings in RAG applications
  • How to choose the right open-source embedding model for your RAG application.
  • Evaluating embedding models: what to look out for? performance considerations?
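To make the first two outline items concrete, here is a minimal sketch of how embeddings support retrieval in a RAG application. It is not one of the models under discussion: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and the documents and query are made up for illustration. Only the shape of the workflow (embed, compare with cosine similarity, rank) carries over.

```python
import math
from collections import Counter


def toy_embed(text, vocabulary):
    """Map text to a word-count vector (hypothetical stand-in for a model)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query, documents):
    """Return (score, document) pairs, most similar to the query first."""
    words = {w for text in [query, *documents] for w in text.lower().split()}
    vocabulary = sorted(words)
    query_vec = toy_embed(query, vocabulary)
    scored = [
        (cosine_similarity(query_vec, toy_embed(doc, vocabulary)), doc)
        for doc in documents
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up sample documents, for illustration only.
documents = [
    "dark matter explains galaxy rotation curves",
    "a supernova is an exploding star",
    "jupyter notebooks mix code and text",
]
ranking = rank_documents("what is dark matter", documents)
best_score, best_doc = ranking[0]
print(best_doc)
```

A real embedding model would replace `toy_embed` and capture meaning rather than exact word overlap, which is the reason the choice of model matters for retrieval quality.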

feat: Create tutorial on parsing arXiv dataset

Overview

@vanitech presented a great dataset that would be well worth inserting into the vector database as context for the RAG approach. This dataset is essentially all of the papers in arXiv: https://www.kaggle.com/datasets/Cornell-University/arxiv/data

After some investigation, it looks like this data lives in Google Cloud Storage, so it can easily be retrieved with fsspec without any credentials!

The Kaggle repository contains an index of this data. We can either grab the data from GCS or load it directly from arXiv using the LangChain arXiv loader: https://python.langchain.com/docs/integrations/document_loaders/arxiv
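As a rough sketch of the fsspec route, the snippet below reads a JSON Lines index one record at a time. Several things here are assumptions rather than facts from this issue: that the index is stored as one JSON object per line, that records carry `id`, `title`, and `abstract` fields, and that anonymous access works through `token="anon"` (which requires the `gcsfs` package). The helper names `iter_papers` and `open_remote` are hypothetical, the remote URL is left to the reader, and the sample records are placeholders.

```python
import io
import json


def iter_papers(lines):
    """Yield one dict per non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def open_remote(url):
    """Open a remote text file anonymously via fsspec (needs fsspec, gcsfs).

    The import is local so the rest of this sketch runs without fsspec.
    """
    import fsspec

    # token="anon" asks gcsfs for unauthenticated access (an assumption
    # about how the bucket is exposed).
    return fsspec.open(url, mode="rt", token="anon").open()


# Placeholder records standing in for the real index file.
sample = io.StringIO(
    '{"id": "0000.00001", "title": "A placeholder title", '
    '"abstract": "Placeholder text."}\n'
    "\n"
    '{"id": "0000.00002", "title": "Another placeholder", '
    '"abstract": "More text."}\n'
)
papers = list(iter_papers(sample))
print(len(papers), papers[0]["title"])
```

With a real URL, `iter_papers(open_remote(url))` would stream records without loading the whole file into memory, which matters for an index covering all of arXiv.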

chore: Add word wrapping to text sections as necessary

Especially for natural language outputs, needing to scroll sideways seems (to me, at least) to diminish readability.

In addition, limiting code lines to 80 characters (where it makes sense) would likely let code snippets be displayed without sideways scrolling.
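One possible way to wrap natural language output is the standard library's `textwrap` module. The sketch below is only a suggestion: `wrap_output` is a hypothetical helper name, and `long_answer` is made-up sample text rather than real tutorial output.

```python
import textwrap


def wrap_output(text, width=80):
    """Wrap each blank-line-separated paragraph to at most `width` columns."""
    paragraphs = text.split("\n\n")
    wrapped = [
        textwrap.fill(" ".join(paragraph.split()), width=width)
        for paragraph in paragraphs
    ]
    return "\n\n".join(wrapped)


# Made-up sample text standing in for a long model response.
long_answer = "Dark matter is inferred from its gravitational effects. " * 5
wrapped = wrap_output(long_answer)
print(wrapped)
```

Wrapping paragraph by paragraph keeps blank lines between paragraphs intact, so multi-paragraph answers stay readable.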

feat: Incorporate astro questions into the tutorial

Questions to ask

• What observations show evidence for dark matter?
• What are the most popular theories for what dark matter might be made of?
• Does dark matter need to be a particle?
• What is the expected mass of dark matter?
• What experiments are currently ongoing and searching for dark matter?
• How do they work?
• What type of dark matter would the LHC be able to detect?
• Can you modify gravity to explain dark matter? If so, how?

• How do astronomers detect the presence of planets around nearby stars?
• What is an eclipsing binary star?
• Can you describe a taxonomy for different classes of variable stars?
• What would the light curve of a tidal disruption event look like? How would it be different from a supernova?
• How do supernovae explode?
• How is a Type II SN different from a Type Ia?

TODO: define evaluation criteria

ci: Setup Jupyterbook

Overview

We need to set up Jupyter Book infrastructure to build this repo and host it on Read the Docs!
