
joshniemela / ku-courses


A better version of KU's course catalogue

Home Page: https://disku.jniemela.dk

License: MIT License

Nix 0.63% Dockerfile 0.81% HTML 0.18% Svelte 25.31% JavaScript 1.67% TypeScript 6.34% Clojure 24.55% CSS 0.04% Rust 40.48%
catalogue clojure reitit svelte swagger disku

ku-courses's Introduction

KU-Courses

Example of KU-Courses

The entire application is governed through the docker-compose.yml file and is built with docker compose:

Starting the application

  1. Install docker and docker-compose. This may require a restart of your system, since Docker is a very low-level program.
  2. Run docker compose up --build, either as a user with permissions to docker or with sudo/doas. The --build flag is required if the backend or frontend code has been changed. Adding -d will detach it from the terminal.
  3. Wait for the scraper in the backend to finish scraping pages. This may take about 15 minutes.
  4. Run docker compose restart. This is required so that the parser runs and the vector store can create new embeddings.
  5. ???
  6. PROFIT!!!

db-manager

The backend is built with Clojure, a functional programming language based on Lisp which runs on the Java Virtual Machine.
This part serves multiple purposes: it is responsible for scraping the course pages from KU as well as the statistics from STADS.
The backend also serves the frontend, contains the "datascript" database, and is responsible for occasionally refreshing the data and various services (this feature is partially broken at the moment).

vector_store

This service is responsible for the semantic searches used in the get_course_overviews route. Instead of using trigrams or full-text search, we decided to use vector searches for their lower latency.
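
As a rough illustration of what a vector search does (this is not the actual vector_store code, whose embedding model and index are not described here), the sketch below ranks courses by cosine similarity between a query embedding and one embedding per course. The names cosine_similarity and rank_courses, the course IDs, and the two-dimensional vectors are all made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction, so treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_courses(query_vec, course_vecs, top_k=3):
    """Return the IDs of the top_k courses most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), course_id)
        for course_id, vec in course_vecs.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [course_id for _, course_id in scored[:top_k]]


# Hypothetical embeddings; real ones would come from an embedding model.
course_vecs = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [0.7, 0.7]}
print(rank_courses([1.0, 0.1], course_vecs, top_k=2))
```

A brute-force scan like this is only meant to show the idea. A real service would typically use an index built for nearest-neighbour lookups.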

rust_parser

This service is the parser that takes the scraped course pages and parses them into a format we can use in the database for searching and for serving to the frontend.

frontend

The frontend is built in Svelte/TypeScript. It is a highly responsive SPA that shows the courses as cards, which can be clicked to get a more detailed view of the course.

Credits

  • Thanks to Jákup Lützen for creating the original course parser in Python.
  • Thanks to Kristian Pedersen for creating the original frontend, and help in designing the architecture and first database schema.
  • Thanks to Zander Bournonville for creating the statistics parser.

ku-courses's People

Contributors

dependabot[bot], drzder, joshniemela, julebarn, kristiandampedersen, miguelmagueijo, xatha, zeyudeng36


ku-courses's Issues

Filtering exams removes other exams

Advanced Computational Geophysics, for instance, has both an oral exam and a written assignment. When filtering on oral exam, the matching courses are shown with only their oral exams and the secondary exam is filtered away. This happens in the SQL code of db-manager.

TODO:

  • test if manually querying postgresql causes the same issue
  • find fix?
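
The difference between the observed and the intended behaviour can be sketched in Python. This is only an illustration of the filtering semantics, not the SQL in db-manager. The function names, the dictionary layout, and the second course are hypothetical.

```python
def filter_matching_exams_only(courses, exam_type):
    """Observed behaviour: non-matching exams are dropped from each course."""
    result = []
    for course in courses:
        matching = [e for e in course["exams"] if e == exam_type]
        if matching:
            result.append({**course, "exams": matching})
    return result


def filter_courses_by_exam(courses, exam_type):
    """Intended behaviour: select courses by exam type, keep all their exams."""
    return [c for c in courses if exam_type in c["exams"]]


courses = [
    {"title": "Advanced Computational Geophysics",
     "exams": ["oral", "written assignment"]},
    # Hypothetical course without an oral exam.
    {"title": "Some Other Course", "exams": ["written assignment"]},
]

print(filter_matching_exams_only(courses, "oral"))
print(filter_courses_by_exam(courses, "oral"))
```

In SQL terms, the second variant corresponds to filtering on the existence of a matching exam instead of filtering the joined exam rows themselves.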

Back button Z level

Back button on the course pages is below the stats graph and cannot be clicked

A complete rewrite of the python parser (Urgent)

The parser has to be completely rewritten because it is completely impossible to navigate and because newer courses generated by KU no longer parse, as the panel-bodies have been restructured. (The previous statement is wrong: due to an acute case of stupidity I forgot to check whether the HTMLs that were being scraped were actually HTML files. They were not.)

Margins

The pages need left-right margin to look slightly better for the user

Parser missing features (M):

These issues are in decreasing order of priority

  • It would be very wise to start writing unit tests (or use property-based testing) to check how the parser behaves and whether it does what is expected for the items below
  • Almost everything is a list that contains 1 element. We need to figure out whether these can be turned into key-vals or whether they appear as larger items often enough to warrant being lists
  • workload is currently a list of the shape [..., "exams", 40, "total", 60, ...]. This needs to be a list of key-vals: [..., "exams": 40, "total": 60, ...]. The first two values, category and type, can be omitted entirely since we already implicitly know what they should be
  • WARNING DIV is a placeholder that needs to be removed once we know the parser is doing stuff as intended with divs
  • Some courses contain escape codes for Danish letters instead of the actual Danish letters: \uXXXX instead of æøå. Is this reproducible?
  • Some keys have not been localised to English, e.g. Engelsk titel → english title
  • Schedule possibly still squishes divs together by calling .text too early, confirm?
  • Nulls appear in one of the fields in LBIK10214U, why does this happen and does it appear in other courses too?
  • duration & placement: match the number with a regex so we can save them as INT(1) in the database
  • credit: extract the number of credits with a regex so we can save it as FLOAT in the database
  • language: change the keys to en or dk so we can save them as CHAR(2)
  • level: save only the abbreviations of the titles so it can be CHAR(3)
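
A minimal sketch of how the workload and number-extraction items above could look, assuming the parser output is plain Python lists and strings. The function names, the example inputs ("7,5 ECTS", "Block 3"), and the first two workload values are hypothetical and only serve the illustration.

```python
import re


def pair_workload(flat, skip=2):
    """Turn [category, type, "exams", 40, "total", 60] into a dict.

    The first `skip` values (category and type) are dropped, and the rest
    is read as alternating name/hours pairs.
    """
    items = flat[skip:]
    if len(items) % 2 != 0:
        raise ValueError("expected alternating name/hours values")
    return dict(zip(items[0::2], items[1::2]))


# A number with an optional decimal part, using "." or "," as separator.
_NUMBER = re.compile(r"\d+(?:[.,]\d+)?")


def extract_credits(text):
    """Return the first number in the text as a float, or None if absent."""
    match = _NUMBER.search(text)
    if match is None:
        return None
    return float(match.group().replace(",", "."))


def extract_first_int(text):
    """Return the first run of digits as an int, or None if absent."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None


print(pair_workload(["workload", "hours", "exams", 40, "total", 60]))
print(extract_credits("7,5 ECTS"))
print(extract_first_int("Block 3"))
```

Functions this small are also easy to cover with the unit tests suggested in the first item.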
