data-science's People

Contributors

ahmedabbas11, akshat02, attackgnome, daniel-web-developer, elahi-cs, ericdouglas, johnaoss, meboler, pulkitkrishna00, raincrash, royshouvik, sidgupta234, smcgb, tawfiq9009, torrontogosh, tuskydev, waciumawanjohi, zayd-r

data-science's Issues

Natural Language Processing

Hi, it seems the Natural Language Processing course linked does not exist anymore.

I recommend that this course be used in its place.

If you agree, I can make a pull-request.

SDK update required to sign up via Web App

Steps to reproduce

  1. Navigate to https://ossu.firebaseapp.com/#/
  2. Click Login with Github
  3. Error message encountered: {"code":"SERVICE_UNAVAILABLE","statusCode":410,"message":"SDK Update Required: See https://goo.gl/2cKQWm.","details":"SDK Update Required: https://goo.gl/2cKQWm. "}

More details on updating the SDK, as of December 2018, are given in the Google doc below.
https://docs.google.com/document/d/1vpQV8DBQLkIZci7Vh8N4LUHyTkBOsuW5eXE1x8IAqvw/edit?usp=sharing

I would like to know if we can update the SDK as per the above instructions. Thanks

Request for Comments: Data Science Curriculum v2

Problem:
The curriculum has not been maintained and does not represent best practice.

Duration:
2020-08-31

Background:
OSSU recommends courses that would constitute an undergraduate major in Data Science. It is our responsibility to ensure that we follow best practice. To do so, we must bring the curriculum into alignment with external guidelines. A candidate set of guidelines has been identified and previously proposed.

In 2017, the Annual Review of Statistics and Its Application published the report "Curriculum guidelines for undergraduate programs in data science." The report was authored by “25 undergraduate faculty from a variety of institutions in the United States, primarily from the disciplines of mathematics, statistics, and computer science.” It had a goal of providing “structure for institutions planning for or revising a major in data science.”

The current state of OSSU Data Science is one of disrepair. The curriculum has had 1 change in 3 years. That change deleted a link to a broken application. But there remained many links to courses that are no longer offered. A list of these can be found here. Prospective students have posted in the issues asking if the Data Science curriculum is still maintained. Updating the curriculum must ensure that all courses are available for students.

Proposal:
OSSU Data Science should adopt “Curriculum guidelines for undergraduate programs in data science” (CGUPDS) as our guidelines. The curriculum should be updated to match. The exact changes can be reviewed in this pull request.

RFC: Remove Patreon Link

Problem:
The OSSU Patreon link at the top of the curriculum does not navigate to an active contribution page on Patreon and may be Eric's individual page. (The same link is listed on the bioinformatics curriculum).

Duration:
Until 11/23/2021

Proposal:
Remove the Patreon link or replace it with an updated OSSU Patreon account.

Not all courses listed here are free

The majority of courses that link to Coursera from the Databases and Data Science Tools & Methods section are not free because they belong to some specializations. Is there any way to access them for free or alternatives to them?

Data Wrangling

Hi
I'm currently a Udacity Data Analyst Nanodegree student. One of the tougher projects that I encountered recently involved data munging and wrangling. As The New York Times once noted, data wrangling can take 50-80% of a data scientist's time in data science projects. So I think it is important for people to bring their data wrangling skills up to par when learning data science.
If you'd like, you can include the following course under a new 'Data Wrangling' section:
Data Wrangling with MongoDB

Or maybe any other courses from EdX, Coursera or elsewhere.

Thanks
Akshat Tickoo

Fixing the link of Probability and Statistics

The link "Probability and Statistics with R" currently points to
https://github.com/open-source-society/data-science#probability-and-statistics-with-r

But the section "Probability and Statistics with R" has href "#probability-and-statistics".
One of them needs to be changed.

I am learning to contribute. Please help me out in creating a pull request because I want to fix the issue myself. I want this to be my first contribution towards open-source.
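
A mismatch like the one described in this issue can be caught mechanically. The sketch below is a minimal check under stated assumptions: `github_slug` and `broken_anchor_links` are hypothetical helper names, and the slug rule only approximates how GitHub derives heading anchors (for example, it ignores the numeric suffixes added to duplicate headings).

```python
import re


def github_slug(heading: str) -> str:
    """Approximate a GitHub heading anchor: lowercase, drop punctuation, spaces to hyphens."""
    text = heading.strip().lower()
    text = re.sub(r"[^\w\- ]", "", text)  # keep word characters, hyphens and spaces
    return text.replace(" ", "-")


def broken_anchor_links(markdown: str) -> list[str]:
    """Return in-page link targets (#...) that match no heading in the same document."""
    headings = re.findall(r"^#{1,6}[ \t]+(.+?)[ \t]*$", markdown, flags=re.MULTILINE)
    anchors = {github_slug(h) for h in headings}
    links = re.findall(r"\]\(#([^)]+)\)", markdown)
    return [link for link in links if link not in anchors]


# Illustrative input reproducing the mismatch from the issue (not the real README).
sample = (
    "- [Probability and Statistics with R](#probability-and-statistics-with-r)\n"
    "\n"
    "## Probability and Statistics\n"
)
print(broken_anchor_links(sample))
```

Running this on the sample reports the link target that has no matching heading, which is the situation the issue describes. Either the link or the heading can then be changed so that the two agree.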

RFC:

Problem:
From what I can see, there is no way to track progress unless users track it all by themselves. Could we create a README.md in which the user enters a command and the program marks that class as "completed" when the file is rendered?

There are more things I would like to add to this like specific courses for specialties but I didn't know how active or dynamic the project is.
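
One way the progress-tracking idea could work is sketched below. It assumes the curriculum is stored as a GitHub task list (`- [ ] Course name`), which the current README does not necessarily use, and `mark_completed` is a hypothetical helper name rather than an existing tool.

```python
import re


def mark_completed(markdown: str, course: str) -> str:
    """Tick the task-list checkbox on every unchecked line that mentions `course`."""
    pattern = re.compile(
        r"^([ \t]*[-*] )\[ \]( .*" + re.escape(course) + r".*)$",
        flags=re.MULTILINE | re.IGNORECASE,
    )
    # Keep the bullet and the course text, and swap "[ ]" for "[x]".
    return pattern.sub(r"\1[x]\2", markdown)


# Illustrative README fragment (course names are placeholders).
readme = "- [ ] Linear Algebra\n- [ ] Intro to Python\n"
print(mark_completed(readme, "linear algebra"))
```

A small command-line wrapper could read README.md, call this function with the course name given by the user, and write the result back, so the rendered file shows the class as checked off.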

our curricular guidelines. Examples are:

  • OSSU lists course X as required when the course's topics are elective in our curricular guidelines.
  • OSSU does not have a course to cover required topic X from our curricular guidelines.
  • OSSU lists courses X, Y and Z that cover the same topics when fewer courses could suffice.
  • OSSU recommends course X to teach a topic, but there exists a higher quality course that covers the same material.

Duration:
This should most often be 1 month from the date of posting.

Background:
Give an in depth description of the problem. Describe a solution to the problem. Describe the advantages and disadvantages of this solution. This section should be a few paragraphs.

Proposal:
Give a bullet point list of changes that are being proposed. These can link to a Pull Request.

Alternatives:
Give a bullet point list of alternative ways to address the problem.

Add official badge

Maybe we can put an official badge in this repository, so students will be able to link back to this in their own projects.

Example: Open Source Society University - Data Science

  • Markdown: [![Open Source Society University - Data Science](https://img.shields.io/badge/OSSU-data--science-blue.svg)](https://github.com/open-source-society/data-science)
  • HTML: <a href="https://github.com/open-source-society/data-science"><img alt="Open Source Society University - Data Science" src="https://img.shields.io/badge/OSSU-data--science-blue.svg"></a>

😄
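
The badge snippets above follow the shields.io static badge URL pattern, in which a literal hyphen inside the label or message is written as `--`. The sketch below builds the same Markdown programmatically. The function names `shields_escape` and `badge_markdown` are hypothetical, chosen only for illustration.

```python
def shields_escape(text: str) -> str:
    """Escape text for a shields.io static badge path segment."""
    # Underscores and hyphens are doubled; spaces become single underscores.
    return text.replace("_", "__").replace("-", "--").replace(" ", "_")


def badge_markdown(label: str, message: str, color: str, alt: str, link: str) -> str:
    """Return a Markdown image link for a static shields.io badge."""
    url = (
        "https://img.shields.io/badge/"
        f"{shields_escape(label)}-{shields_escape(message)}-{color}.svg"
    )
    return f"[![{alt}]({url})]({link})"


print(
    badge_markdown(
        "OSSU",
        "data-science",
        "blue",
        "Open Source Society University - Data Science",
        "https://github.com/open-source-society/data-science",
    )
)
```

With these arguments the output matches the Markdown line proposed in the issue, which makes it easy to produce consistent badges for other curricula.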

Is this course still relevant?

The last update to this course was made 3 years ago, so is it still relevant? Are there better free courses out there than the ones listed here? I am planning on starting this. Is there anything I need to know?

Question

Should I take the computer science curriculum first? Or can I start with this one?

LICENSE

The repository is missing a LICENSE. I propose adding the MIT license, consistent with the ossu/computer-science repository.

Linear Algebra

Is there not a better Linear Algebra course? This LAFF course is painfully obtuse and dull.

Courses in the web app are different

The Readme says that I can track my progress in the web app ("my progress" tab). However, there are different courses there: e.g. the data science track features Calculus from EdX and MIT, but in the web app there is only Calculus from Coursera. Am I missing something?

a lot of Python Courses

There are 4 courses for Python, which is more than enough.
There are 2 courses for probability,
and more duplicated courses.
Please refactor the content like the Computer Science track.

RFC: Add python alternative for algorithms course

Problem:
Our curricular guidelines do not require learning multiple languages, but our curriculum asks students to learn a language just to take the algorithms classes.

Duration:
2021 Aug 15

Background:
Our curricular guidelines make only a few references to programming languages. Students are expected to know SQL, the "language" of math, and "a suitable high-level language". (emphasis mine) The introductory courses, and most courses, use Python as this high-level language.
But students are directed to Robert Sedgewick's Algorithms course, which is taught in Java.

Students have asked for a Python alternative in issues and in the Discord (example).

A possible option is the free interactive textbook Problem Solving with Algorithms and Data Structures using Python. The book links to a set of supporting lectures. It also has some exercises in the text, which are paired with YouTube video solutions. Each chapter ends with a set of exercises; there does not seem to be an official solution set, but some student solutions can be found on GitHub.

This free textbook is used by dozens of college courses. It is well rated by Goodreads and by pythonbooks (which is really a measure of popularity on Amazon).

It is not clear that the book is of the same quality as the Sedgewick course. For one, the Sedgewick course provides an autograder. For another, user ratings of the Sedgewick Algorithms book are notably higher.

Proposal:
Offer Problem Solving with Algorithms and Data Structures using Python as an alternative Algorithms course for students who want to study in Python.

Alternatives:

  1. Stick with the status quo.
  2. Replace the Sedgewick course rather than offer an alternative.

Q: How long does this program typically take?

Hi, sorry if this is the wrong place to post this question but I can’t seem to find an answer. What is the typical duration for the ossu data science course (assuming a full-time study load)? Thanks!

Probability and Statistics with R

curriculum title - "Probability and Statistics with R"

This part has nothing to do with R programming. I think the term "R" should be removed from the title, and courses about R and statistics should be added.

Trello Board

When accessing the Data Science Trello Board and clicking on the menu, we are unable to copy the board.

RFC: Overhaul Statistics

Summary

OSSU should undertake a search for a number of new courses in statistics.

Background

OSSU currently recommends 2 courses on statistics:

The first of these is no longer offered.

Guidelines

OSSU Data Science uses the report Curriculum Guidelines for Undergraduate Programs in Data Science as our guide for course recommendation.

Section 6 "Transitioning To A Data Science Major Using Typical Existing Courses" states:

...The courses shown in bold are the ten courses that cover the bare minimum of the basic skills needed for data science...

Subsection 6.3 "Courses in Statistics" states:

Content in the Introduction to Statistics course should follow the revised Guidelines for Assessment and Instruction in Statistics Education (GAISE) for college courses

  • Introduction to Statistics
  • Statistical Modeling/Regression
  • Machine Learning/Data Mining
  • Theory of Statistics (requires Probability Theory)

GAISE

For reference, the K-12 GAISE report uses a framework of 3 levels of sophistication with stats expected of K-12 students. This can be found on page 24.

The GAISE College Report includes goals, recommendations, and suggestions for topics that might be omitted.

Goals (summarized)

  1. Critique stats-based results/conclusions.
  2. Recognize when statistics would be useful and carry out investigations using stats.
  3. Produce graphical displays and numerical summaries. Interpret them.
  4. Explain the role of variability in statistics.
  5. Explain the central role of randomness in designing studies and drawing conclusions.
  6. Use statistical models, including multivariable models.
  7. Understand and use hypothesis tests and interval estimation in a variety of settings.
  8. Interpret and draw conclusions from output of statistical software packages.
  9. Demonstrate an awareness of ethical issues associated with sound statistical practice.

Recommendations

These are largely recommendations for how statistics courses should be taught.

  1. Teach statistical thinking
  2. Focus on conceptual understanding
  3. Integrate real data with a context and a purpose
  4. Foster active learning
  5. Use technology to explore concepts and analyze data
  6. Use assessments to improve and evaluate student learning

Suggestions for Topics that Might be Omitted from Introductory Statistics Courses

  • Probability theory
  • Constructing plots by hand
  • Basic statistics
  • Drills with z-, t-, χ²-, and F-tables
  • Advanced training on a statistical software program

Of note, the basic statistics section reads:

Histograms, pie charts, scatterplots, means, and medians are now taught in middle and high school and are a prominent part of the Common Core State Standards in Mathematics. Classes taught to adults continuing their education or to students with a different high school background may need to spend a bit more time on basic statistics. No matter the audience, instructors will want to be sure that students truly understand these concepts, but should not dwell on them more than is necessary. Instructors may want to briefly review them to be sure terminology and notation are consistent, but this should take little time.

Assertions

  • OSSU Data Science curriculum should not recommend a descriptive stats course. This is prerequisite material; OSSU's focus is requisite material for undergraduate learners.
  • OSSU should identify a suitable Introduction to Statistics course, replacing the two current recommendations.
  • After identifying the appropriate Introduction to Statistics course, OSSU should determine if a Statistical Modeling/Regression course is necessary. I would be unsurprised if a suitably rigorous Intro course, paired with our existing ML courses, proves sufficient.
  • OSSU should identify an optional Theory of Statistics course.

Request for Comments

This RFC is asking specifically for comments on the assertions above. Are these the right steps? Are there other implications for OSSU's curriculum that are not identified?

There will be other RFCs for carrying out the individual steps (e.g. there will be a separate RFC for Identify an Introduction to Statistics course).

Programming languages

As a data scientist, Python is your native language, so it's a bit annoying to learn another programming language just for data structures and algorithms. I think you may need to find alternative data structures and algorithms courses that use Python.

Capstone in Coursera and Nanodegree of Udacity are not free.

As far as I know, on Coursera all courses in a specialization except the capstone project can be found and enrolled in separately. To access the capstone project, you need a certificate from every other course in the specialization.
Also, on Udacity the Nanodegree programs are not free.

I think you should add these points to the introduction, as many people may think all those specializations are free. Unless I'm wrong, in which case I would like to know the procedure for accessing them for free.

Thanks

PROJECTS file

Hi,

It seems the projects file of the computer-science path has been moved to its own repository, and I was told that the file in open-source-society/help is not used anymore. It seems convenient to have our own PROJECTS file in this repository.

We could use the file in open-source-society/help as a starting point, but that seems too general as it includes courses from other paths, too.

related: https://github.com/open-source-society/help/pull/7

mentioning: @ericdouglas

5 years duration!

I just calculated the duration for all courses and it was 5 years! 😄
