data-lessons / library-sql-deprecated

SQLite lesson for librarians NOW MOVED > https://github.com/LibraryCarpentry/lc-sql

Home Page: https://github.com/LibraryCarpentry/lc-sql

License: Other

Makefile 3.09% HTML 36.30% CSS 3.36% JavaScript 0.95% Python 52.76% R 3.12% Shell 0.20% Ruby 0.22%

library-sql-deprecated's People

Contributors

abbycabs, amsichani, bkatiemills, c-martinez, christinalk, cmacdonell, danmichaelo, elainewong, emanuelelanzani, erinbecker, evanwill, fmichonneau, gdevenyi, gvwilson, icecjan, jpallen, kyrretl, mattrob, maxim-belkin, mkuzak, naupaka, neon-ninja, pbanaszkiewicz, pipitone, rgaiacs, steltenpower, synesthesiam, twitwi, weaverbel, wking

library-sql-deprecated's Issues

Library challenges

The challenges in lessons 02-sql-aggregation.md and 03-sql-joins-aliases.md still use ecology terminology. They should be migrated to library terminology.

00-sql-introduction.md: data types not clear

Some of the entries in the table of "Data types" have the same description, e.g. INTEGER(p), SMALLINT, INTEGER and BIGINT are all described as "Integer numerical (no decimal)". That's a little confusing: why have different data types that appear to be the same?

Also, perhaps add examples of data types (at least, for some).
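
As a possible starting point for such examples, here is a minimal sketch of how SQLite itself treats these type names. It assumes SQLite is the engine in use (other database systems typically give SMALLINT, INTEGER and BIGINT different storage sizes and value ranges, which is why the names exist); the `demo` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: three columns declared with different integer type names,
# plus one TEXT and one REAL column for comparison.
conn.execute("CREATE TABLE demo (a SMALLINT, b INTEGER, c BIGINT, d TEXT, e REAL)")
conn.execute("INSERT INTO demo VALUES (1, 2, 3, 'four', 5.5)")

# typeof() reports the storage class SQLite actually used for each value.
storage_classes = conn.execute(
    "SELECT typeof(a), typeof(b), typeof(c), typeof(d), typeof(e) FROM demo"
).fetchone()
print(storage_classes)
conn.close()
```

In SQLite, all three integer declarations end up with the same storage class, which may be why the lesson's table gives them identical descriptions.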

Lesson episode headers

I've added headers to all the lessons, but they are very basic and based on no knowledge of SQL... Could someone who knows more take a look? (Who is maintaining this lesson, @mpfl @mkuzak?)

Proposed changes for 00-sql-introduction.md

Suggested corrections/clarifications

Relational databases

  • Introduction should be fleshed out, include concept of the relationships between tables.

Dataset description

  • Perhaps a diagram of how the three tables relate to each other.

Import

  • Step 1: There are four CSV files at the target URL. The three to use (plots, species, surveys) should be explicitly named in step 1 (they appear again in step 5)
  • Step 1: Link to Portal Project via DOI for long-term link stability: https://dx.doi.org/10.6084/m9.figshare.1314459.v5
  • Step 2: Does not mention that SQLite will ask for a database name. Suggest "portal"
  • Step 2: Does not mention that SQLite will want to save the database to a folder.
  • Step 6: How do we know if the first row contains column headers?
  • Step 10: INTEGER, not INT. What about the Primary Key? What about data types for columns not mentioned?
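
One possible reason the INTEGER/INT distinction in Step 10 matters in SQLite is the primary key: a column declared exactly as INTEGER PRIMARY KEY is filled in automatically, while INT PRIMARY KEY is not. The sketch below illustrates this under that assumption; the table and column names are hypothetical and are not taken from the lesson.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two hypothetical tables that differ only in how the key column is declared.
conn.execute("CREATE TABLE with_integer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE with_int (id INT PRIMARY KEY, name TEXT)")

# Insert a row into each table without supplying an id.
conn.execute("INSERT INTO with_integer (name) VALUES ('a')")
conn.execute("INSERT INTO with_int (name) VALUES ('a')")

integer_id = conn.execute("SELECT id FROM with_integer").fetchone()[0]
int_id = conn.execute("SELECT id FROM with_int").fetchone()[0]

print(integer_id)  # SQLite assigned a key automatically
print(int_id)      # no key was assigned, so the value is NULL (None in Python)
conn.close()
```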

Data Types

  • Some explanatory text required.

00-sql-introduction.md: steps under "import" not clear

The steps under "Import" are not completely clear: they do not mention that you have to specify the location where your database will be saved.
Also, steps 6 and 10 assume that you have already seen the contents of the files: how do I know whether the first row has column headings, or which columns contain which data type, if I haven't seen the file? It would be better to explain how to get a quick overview of the contents.
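
A quick overview of a CSV file could be obtained along the following lines. This is only a sketch: the sample text stands in for the first lines of one of the CSV files and is made up for illustration, and the header and type checks are simple heuristics, not something the lesson prescribes.

```python
import csv
import io

# Hypothetical sample standing in for the first lines of a CSV file.
sample = io.StringIO(
    "record_id,month,day,year,species_id,weight\n"
    "1,7,16,1977,NL,\n"
    "2,7,16,1977,DM,40\n"
)


def looks_numeric(cell):
    """Return True if the cell can be read as a number."""
    try:
        float(cell)
        return True
    except ValueError:
        return False


def guess_type(values):
    """Guess a column type from its non-empty cells."""
    values = [v for v in values if v != ""]
    if values and all(v.lstrip("-").isdigit() for v in values):
        return "INTEGER"
    if values and all(looks_numeric(v) for v in values):
        return "REAL"
    return "TEXT"


rows = list(csv.reader(sample))
first_row, data_rows = rows[0], rows[1:]

# Heuristic: a header row has no numeric cells, while the data rows have some.
has_header = not any(looks_numeric(c) for c in first_row) and any(
    looks_numeric(c) for row in data_rows for c in row
)

# Transpose the data rows into columns and guess a type for each one.
guessed = {name: guess_type(col) for name, col in zip(first_row, zip(*data_rows))}

print(has_header)
print(guessed)
```

Opening the file in a text editor or spreadsheet gives the same information; the point is only that the learner needs to look at the first few rows before steps 6 and 10.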

Installation of SQL needs to be better explained

I am working through the lesson prior to teaching it and have stumbled across a few gotchas. Setting up SQLite needs to be a documented step-through process, as the language used on the SQLite download page is pretty hostile for Windows people. Install binaries? Huh? Install where? And then what? What I got when I installed was a command-line SQLite3, but then the lesson suddenly talks about SQLite Manager, which is a Firefox add-on. We need to walk people through getting that working, as that was slightly complicated as well. It is really important in these lessons that we do not alienate people by assuming knowledge or leaving steps out so that people feel stranded.

Lesson maintainers

This lesson has no maintainers listed. Maintainers perform a number of important tasks:

  • make sure their lesson is consistent with the other Library Carpentry lessons. For example, that the Readme or License pages are correct and consistent (indeed the readme does need a little work data-lessons/library-webscraping-DEPRECATED#28)
  • address any issues that are raised against the lesson
  • deal with any pull requests that are made for the lesson.
  • after a lesson is taught, make sure that suggestions for improvement from learners and instructors are integrated
  • as this is a new lesson, helping it get through the (new) incubator process data-lessons/librarycarpentry#22
  • and, ideally, keep up with general Library Carpentry chatter at https://gitter.im/weaverbel/LibraryCarpentry

The lesson needs two maintainers, but the more the merrier, especially if we can ensure a good mix of timezones. Anyone up for it?

Rewrite sections "Filtering" and "Building more complex queries"

02-basic-queries SAYS:

Databases can also filter data – selecting only the data meeting certain criteria. For example, let’s say we only want data for a specific ISSN for the Theory and Applications of Mathematics & Computer Science journal, which has a ISSN code 2067-2764|2247-6202. We need to add a WHERE clause to our query:

SELECT *
FROM articles
WHERE issns='2067-2764|2247-6202';

But 2067-2764|2247-6202 is not an ISSN code; it is a combination of two ISSNs, separated by a pipe.
If you want to match both ISSNs at the same time, this is not the way to do it, because

  • in a given row of the table they might appear in a different order within the field
  • these two could be among three or more ISSNs in the field

If you are also looking for rows that contain only one of the two, this query would not return them either.

Perhaps focus first on fields that have a single entry?
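
The problem can be reproduced with a small sketch. The table below is a hypothetical stand-in for the lesson's articles table, reduced to an id and the issns field, and the fourth ISSN is made up. The LIKE pattern is one possible workaround, not necessarily what the lesson should teach at this point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, issns TEXT)")
conn.executemany(
    "INSERT INTO articles (issns) VALUES (?)",
    [
        ("2067-2764|2247-6202",),  # the order used in the lesson
        ("2247-6202|2067-2764",),  # the same two ISSNs, reversed
        ("2067-2764",),            # only one of the two
        ("1234-5678",),            # unrelated, made-up ISSN
    ],
)

# Exact comparison only finds the row whose field matches character for character.
exact = conn.execute(
    "SELECT id FROM articles WHERE issns = '2067-2764|2247-6202' ORDER BY id"
).fetchall()

# Wrapping the field in pipes lets LIKE match one whole ISSN wherever it sits.
single = conn.execute(
    "SELECT id FROM articles "
    "WHERE '|' || issns || '|' LIKE '%|2067-2764|%' ORDER BY id"
).fetchall()

print(exact)
print(single)
conn.close()
```

The exact match misses the reversed row and the single-ISSN row, which is the behaviour described above.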

It might be useful to acoit

Proposed changes for 02-sql-aggregation.md

The HAVING keyword

  • Why do we suddenly reference table.column (surveys.species_id)? Not needed until we start using JOIN
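
To illustrate the point, the sketch below runs the same HAVING query with and without the table prefix on a single table. The data is made up; only the names surveys and species_id come from the lesson.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (record_id INTEGER PRIMARY KEY, species_id TEXT)")
conn.executemany(
    "INSERT INTO surveys (species_id) VALUES (?)",
    [("DM",), ("DM",), ("DM",), ("NL",), ("PF",), ("PF",)],  # made-up rows
)

# Column name without the table prefix.
unqualified = conn.execute(
    "SELECT species_id, COUNT(*) FROM surveys "
    "GROUP BY species_id HAVING COUNT(*) > 1 ORDER BY species_id"
).fetchall()

# The same query with the table.column form.
qualified = conn.execute(
    "SELECT surveys.species_id, COUNT(*) FROM surveys "
    "GROUP BY surveys.species_id HAVING COUNT(*) > 1 ORDER BY surveys.species_id"
).fetchall()

print(unqualified)
print(qualified)
conn.close()
```

With only one table in the FROM clause, both forms return the same rows, so the prefix could wait until JOIN is introduced.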

Saving queries for future use

  • In the Challenge, perhaps remind students of ORDER BY ... ASC/DESC
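
A reminder could be as short as the following sketch, which sorts an aggregated result both ways. The rows are made up, and the alias n is a hypothetical name for the count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (record_id INTEGER PRIMARY KEY, species_id TEXT)")
conn.executemany(
    "INSERT INTO surveys (species_id) VALUES (?)",
    [("DM",), ("DM",), ("DM",), ("NL",), ("PF",), ("PF",)],  # made-up rows
)

query = "SELECT species_id, COUNT(*) AS n FROM surveys GROUP BY species_id ORDER BY n "

# DESC puts the largest count first, ASC the smallest.
descending = conn.execute(query + "DESC").fetchall()
ascending = conn.execute(query + "ASC").fetchall()

print(descending)
print(ascending)
conn.close()
```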

Rephrasing needed for second exercise on aggregation

Had a question today about this exercise: "How many citations that were counted each month a) in total; b) per journal"

"How many citations that were counted each month" can be interpreted as "How many citations were made each month". We don't have any citation time data in the dataset, only article publication time, so that question would be impossible to answer with the data we have, but I still think we should try to rephrase it to make it more clear.

I struggle a little bit to come up with a good way of phrasing it though. Could be because the question is a bit artificial. Perhaps something like "the number of citations per (publication) month; a) ..."? Not sure if that is easy to understand. Help needed :)
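
It may help to pin down the intended reading with a query. The sketch below assumes the exercise means "sum the citation counts of the articles published in each month"; the column names journal, month and citation_count and all the rows are hypothetical, not taken from the lesson's dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (journal TEXT, month INTEGER, citation_count INTEGER)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [
        ("Journal A", 1, 4),
        ("Journal A", 1, 1),
        ("Journal B", 1, 2),
        ("Journal B", 2, 3),
    ],
)

# a) citations per publication month, in total
per_month = conn.execute(
    "SELECT month, SUM(citation_count) FROM articles GROUP BY month ORDER BY month"
).fetchall()

# b) citations per publication month and journal
per_month_and_journal = conn.execute(
    "SELECT month, journal, SUM(citation_count) FROM articles "
    "GROUP BY month, journal ORDER BY month, journal"
).fetchall()

print(per_month)
print(per_month_and_journal)
conn.close()
```

If this is the intended meaning, a wording such as "the total number of citations for articles published in each month" would match the query more closely.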

Library-centric dataset

Might be nice to find a dataset that's more library-centric. Perhaps some fake bib records and a fake patron borrowing history?

Add authors to AUTHORS file

We now have a workflow for releasing citable versions of our lessons (with DOIs) every 6 months via Zenodo. This makes our lessons more discoverable and sustainable and ensures that everyone involved gets the credit they deserve. For more on this work see data-lessons/librarycarpentry#5

In order to make this happen we need to make one crucial change: all AUTHORS files need to change so that they list names of contributors in the following format:

James Allen
James Baker
Piotr Banaszkiewicz
Erin Becker

@jt14den will run a script that strips names from lesson logs and edits AUTHORS across all Library Carpentry repos.

When this is actioned (hopefully, soon!), lesson maintainers are asked to eyeball the AUTHORS file to see if anyone obvious is missing (for example, people who contributed to discussions but didn't edit any lessons). Note: template developers are credited in this process; this is in line with Software Carpentry best practice.

In the future, lesson maintainers are encouraged to ensure that those who contribute to lessons are added manually to AUTHORS files (encourage contributors to do it so they see where and how we give credit!)

backup when SQL not working

Sometimes things don't work on an attendee's laptop: the software is installed incorrectly, they've installed the wrong thing, or it all just unfathomably doesn't work.

Now, peer programming is great from a pedagogical point of view, so "work with someone else" is a good option. But prompted by @weaverbel, we should consider adding a backup to our Instructor Notes.

For shell data-lessons/library-shell-DEPRECATED#53 and refine data-lessons/library-openrefine-DEPRECATED#153 there are web based options we can use. What is possible with this lesson? (if nothing, fine - it is probably worth putting that in the notes!)

Proposed changes for 01-sql-basic-queries.md

Writing my first query

  • Requires explicit instructions to go to "Execute SQL" in SQLite

Calculated values

  • Suggest expand g into grams and kg into kilograms. Likewise for mg into milligrams for the Challenge underneath.
  • Suggest explaining why we use 1000.0 instead of 1000 for calculating weight in kilograms

Functions

  • Suggest introducing ROUND in a little more detail and how ROUND(weight/1000.0, 2) will take the value of weight/1000.0 and round to two decimal places.
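
The explanations suggested above for 1000.0 and ROUND could be backed by a small demonstration like the one below. It is a sketch that uses the literal value 1234 as a made-up weight in grams and does not touch the lesson's tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 1234 stands in for a weight in grams.
integer_division, real_division, rounded = conn.execute(
    "SELECT 1234 / 1000, 1234 / 1000.0, ROUND(1234 / 1000.0, 2)"
).fetchone()

print(integer_division)  # both operands are integers, so the fraction is dropped
print(real_division)     # 1000.0 is a real number, so the fraction is kept
print(rounded)           # the real result rounded to two decimal places
conn.close()
```

Dividing by 1000 gives a whole number of kilograms, while dividing by 1000.0 keeps the decimals that ROUND then trims.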

Escaping STRING

  • Suggest adding an explanatory box about why we escape STRING but not other data types. Can use both single quotes (') and double quotes (") for escaping.
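
An explanatory box might include something like the following sketch, which shows that quoting changes how SQLite reads a value and how a single quote inside a string is written. One caveat worth checking before the box is written: in standard SQL, double quotes denote identifiers such as column names, and SQLite accepts double-quoted strings only as a compatibility behaviour, so this example sticks to single quotes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

quoted_type, unquoted_type, escaped = conn.execute(
    "SELECT typeof('5'), typeof(5), 'O''Reilly'"
).fetchone()

print(quoted_type)    # '5' in single quotes is read as text
print(unquoted_type)  # 5 without quotes is read as a number
print(escaped)        # a single quote inside a string is written twice
conn.close()
```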

Filtering

  • Suggest adding explanatory box about operators i.e. <, <=, >, >=, =, !=
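
As raw material for such a box, the sketch below counts how many rows each operator selects when comparing against the value 20. The surveys table here is a hypothetical four-row stand-in with made-up weights.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (record_id INTEGER PRIMARY KEY, weight INTEGER)")
conn.executemany(
    "INSERT INTO surveys (weight) VALUES (?)",
    [(10,), (20,), (30,), (40,)],  # made-up weights
)

# The operators come from this fixed list, so building the SQL text here is safe.
operators = ["<", "<=", ">", ">=", "=", "!="]
matches = {
    op: conn.execute(f"SELECT COUNT(*) FROM surveys WHERE weight {op} 20").fetchone()[0]
    for op in operators
}

for op, count in matches.items():
    print(f"weight {op} 20 matches {count} row(s)")
conn.close()
```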

Handout

This lesson might benefit from making a handout of reference materials.

To do this add detail of commands/terminology under the keypoints headers for each lesson: for example, https://github.com/data-lessons/library-data-intro/blob/gh-pages/_episodes/04-regular-expressions.md. This effectively then builds a handout at - for example http://data-lessons.github.io/library-data-intro/reference/ - which can be printed out in advance of the session (librarians love handouts!)

Make sure you make a note of this in your Instructor Notes #49
