sfu-db / apiconnectors
A curated list of example code to collect data from Web APIs using DataPrep.Connector.
Home Page: https://github.com/sfu-db/dataprep#connector
Website Description
Search for content within the iTunes Store and Apple Books Store.
Reason(s) to Support the Website
Developers will be able to search for a variety of content, including books, movies, podcasts, music, music videos, audiobooks, and TV shows.
Endpoints
https://itunes.apple.com/search?term=jack+johnson
https://itunes.apple.com/search?term=jack+johnson&entity=musicVideo
https://itunes.apple.com/search?term=jim+jones&country=ca
https://itunes.apple.com/lookup?id=909253&entity=album
https://itunes.apple.com/lookup?isbn=9780316069359
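The endpoints above differ only in the path (`search` or `lookup`) and the query string. A minimal sketch of building them with the standard library follows; the helper name `itunes_url` is hypothetical and not part of DataPrep.Connector, and no request is actually sent.

```python
from urllib.parse import urlencode

ITUNES_BASE = "https://itunes.apple.com"


def itunes_url(endpoint, **params):
    """Build an iTunes URL for the 'search' or 'lookup' endpoint (hypothetical helper)."""
    if endpoint not in ("search", "lookup"):
        raise ValueError(f"unsupported endpoint: {endpoint}")
    # urlencode turns spaces into '+', which matches the URLs listed above.
    return f"{ITUNES_BASE}/{endpoint}?{urlencode(params)}"


print(itunes_url("search", term="jack johnson", entity="musicVideo"))
print(itunes_url("lookup", isbn="9780316069359"))
```

Keeping the parameters as keyword arguments means a new filter such as `country="ca"` needs no change to the helper.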
Questions
Website Description
TheSportsDB is an open, crowd-sourced database of sports artwork and metadata with a free API.
Reason(s) to Support the Website
Developers for sports-related apps can use this API to support their products.
Endpoints
https://www.thesportsdb.com/api/v1/json/1/eventspastleague.php
https://www.thesportsdb.com/api/v1/json/1/searchteams.php
https://www.thesportsdb.com/api/v1/json/1/all_sports.php
https://www.thesportsdb.com/api/v1/json/1/lookup_all_teams.php
https://www.thesportsdb.com/api/v1/json/1/lookupteam.php
Questions
Website Description
https://developers.amadeus.com/
Collect travel data, including airline, hotel, points of interest, activities.
Reason(s) to Support the Website
We do not yet have an API for collecting travel data, which is worth analyzing.
Endpoints
https://test.api.amadeus.com/v2/shopping/flight-offers
https://test.api.amadeus.com/v2/shopping/hotel-offers
https://test.api.amadeus.com/v1/shopping/activities
https://test.api.amadeus.com/v1/reference-data/locations/pois
https://test.api.amadeus.com/v1/safety/safety-rated-locations
Questions
What are the cheapest flights from Madrid to Paris on June 1st?
What are the best hotel deals during my trip?
What are the best tours and activities in this location?
How safe is this location?
What are the best places to visit in Barcelona?
Website Description
https://api.nasa.gov/
The objective of this site is to make NASA data, including imagery, eminently accessible to application developers.
Reason(s) to Support the Website
We do not yet have an API for NASA data.
Endpoints
https://api.nasa.gov/planetary/apod
https://api.nasa.gov/DONKI/CME
https://api.nasa.gov/DONKI/GST
https://api.nasa.gov/DONKI/FLR
https://api.nasa.gov/DONKI/SEP
Questions
What are the titles of the Astronomy Picture of the Day from 2020-01-01 to 2020-01-10?
How to get Coronal Mass Ejection (CME) data from 2020-01-01 to 2020-02-01?
How many Geomagnetic Storms (GST) occurred from 2020-01-01 to 2021-01-01? When did they occur?
How many Solar Flares (FLR) occurred and completed from 2020-01-01 to 2021-01-01? How long did they last?
How to get Solar Energetic Particle (SEP) data from 2019-01-01 to 2021-01-01?
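The DONKI questions above all reduce to the same request shape: an event type plus a date range. The sketch below only builds the URL. It assumes the query parameter names `startDate`, `endDate`, and `api_key` and the public `DEMO_KEY` from NASA's API documentation, which should be checked before relying on them; `donki_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

NASA_BASE = "https://api.nasa.gov"


def donki_url(event_type, start_date, end_date, api_key="DEMO_KEY"):
    """Build a DONKI URL for an event type such as 'CME', 'GST', 'FLR' or 'SEP'.

    Parameter names are assumed from NASA's public docs, not from this issue.
    """
    query = urlencode(
        {"startDate": start_date, "endDate": end_date, "api_key": api_key}
    )
    return f"{NASA_BASE}/DONKI/{event_type}?{query}"


print(donki_url("GST", "2020-01-01", "2021-01-01"))
```

If the response is a JSON list of events, as assumed here, counting the storms in the range is then a matter of taking the length of the parsed list.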
The folder containing the images for the README was renamed without updating the README, resulting in the images not being visible in the document.
Website Description
IPLegit is a service that gives information on IP addresses.
Reason(s) to Support the Website
This can be used to detect fraudulent IP addresses.
Endpoints
Questions
Given a list of IP addresses from people who have visited my website...
There should be a tutorial on how users can build their own config files for the data connector.
Website Description
Provides data of recent arrests, jail inmate search and mugshots.
Reason(s) to Support the Website
Developers can use this data to analyze criminal activity.
Endpoints
Questions
Website Description
OurAirport is a service that gives information on airports in certain regions and nearest airports.
Reason(s) to Support the Website
Developers making travel applications can use this website for accessing airport data based on certain properties.
Endpoints
Questions
Add python test code for config file schema correctness.
Every time a config file is generated, in the test case, the developer should call the test code and make sure there is no error.
The PR can be merged only when the test code passes.
The test code should also be merged in the GUI back-end and run by default.
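A minimal sketch of what such a check could look like is below. The required sections and keys (`version`, `request.url`, `request.method`, `response.ctype`, `response.tablePath`, `response.schema`) are assumptions about the config layout and should be replaced by the real DataPrep.Connector schema; `validate_config` and `validate_config_file` are hypothetical names.

```python
import json

# Assumed layout of a config file; adjust to the real schema before use.
REQUIRED_KEYS = {
    "request": ("url", "method"),
    "response": ("ctype", "tablePath", "schema"),
}


def validate_config(config):
    """Return a list of problems; an empty list means the config passed."""
    errors = []
    if "version" not in config:
        errors.append("missing key: version")
    for section, keys in REQUIRED_KEYS.items():
        body = config.get(section)
        if not isinstance(body, dict):
            errors.append(f"missing section: {section}")
            continue
        for key in keys:
            if key not in body:
                errors.append(f"missing key: {section}.{key}")
    return errors


def validate_config_file(path):
    """Load a JSON config file from disk and validate it."""
    with open(path, encoding="utf-8") as handle:
        return validate_config(json.load(handle))
```

Returning a list of messages rather than raising on the first problem lets a CI job or the GUI back-end report every issue in a generated file at once.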
I would like to add some new APIs to this project but ran into problems while using the generator. I have read the guide on adding APIs by writing a configuration file that describes the API, but I am more interested in the generator function mentioned in the paper.
I used the Harvard Art Museums (HVM) API to test the generator.
As described in the API's documentation, the apikey is passed as a query parameter. I built my request in Postman and got a correct response, but problems arose when I used the generator to send a sample request.
Initially, in the 3rd box (Authorization), I selected QueryParam as the authentication option and filled in the key-value pair apikey: <my API key>. However, an error message reported a KeyError in ui.py, as in the figure below.
I believed client_id and client_secret are not required for accessing the HVM API, and the error looked like an attempt to delete a missing key from a dictionary, so I tried two ways to bypass it.
First, I commented out that part of the code locally, but the error persisted.
Second, I gave these 3 keys default values, but that caused more errors in the pydantic package. I think these fields might be compulsory in this framework.
Moreover, if I choose the "No Authorization" option and put the query pair apikey: <my API key> into the 2nd box, the generator returns Unauthorized, as I expected.
So I would like to ask how to fill in the 3 fields in the Authorization box (client_id, client_secret and access_token) if I want to use the generator to send sample requests and generate an API configuration.
The image folder is superfluous, and its name makes it look like a website name.
We should move it out of the root folder to a more appropriate place.
Website Description
A RapidAPI service for getting Instagram data.
Reason(s) to Support the Website
Instagram is a popular social app whose data can be used for marketing, gathering useful information, and so on.
Endpoints
https://instagram47.p.rapidapi.com/search/{search}
https://instagram47.p.rapidapi.com/get_user_id/{username}
https://instagram47.p.rapidapi.com/user_posts/{username}
https://instagram47.p.rapidapi.com/user_following/{userid}
https://instagram47.p.rapidapi.com/user_followers/{userid}
https://instagram47.p.rapidapi.com/public_user_posts/{user_id}
https://instagram47.p.rapidapi.com/post_comments/{post_id}
https://instagram47.p.rapidapi.com/location_search/{search}
https://instagram47.p.rapidapi.com/location_posts/{locationid}
Questions
What is the result of searching for 'Kobe' on Instagram?
What is the user id of username='stephencurry30'?
What are the posts for username='stephencurry30'?
Who does Curry follow?
Who follows Curry?
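Each endpoint above takes a single path parameter, and RapidAPI-hosted services generally expect the key in request headers. The sketch below assembles the URL and headers without sending anything. It assumes the standard RapidAPI header names `X-RapidAPI-Key` and `X-RapidAPI-Host`; `instagram_request` is a hypothetical helper and `"MY_KEY"` is a placeholder.

```python
from urllib.parse import quote

RAPIDAPI_HOST = "instagram47.p.rapidapi.com"


def instagram_request(endpoint, value, api_key):
    """Return (url, headers) for an endpoint such as 'user_posts' or 'search'."""
    # Percent-encode the path parameter so spaces or slashes cannot break the URL.
    url = f"https://{RAPIDAPI_HOST}/{endpoint}/{quote(str(value), safe='')}"
    headers = {
        "X-RapidAPI-Key": api_key,  # assumed RapidAPI header name
        "X-RapidAPI-Host": RAPIDAPI_HOST,  # assumed RapidAPI header name
    }
    return url, headers


url, headers = instagram_request("user_posts", "stephencurry30", "MY_KEY")
print(url)
```

The returned pair can be handed to any HTTP client; answering "Who follows Curry?" would first call `get_user_id` and then pass the resulting id to `user_followers`.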
Website Description
APIsguru is a service that gives information on web APIs.
Reason(s) to Support the Website
Developers can use this service to filter for APIs they need to perform certain tasks.
Endpoints
https://{region}.api.riotgames.com/lol/clash/v1/tournaments
https://{region}.api.riotgames.com/lor/ranked/v1/leaderboards
https://{region}.api.riotgames.com/tft/league/v1/master
https://{region}.api.riotgames.com/tft/league/v1/grandmaster
https://{region}.api.riotgames.com/tft/league/v1/challenger
Questions
Website Description
Covid Tracking is an API that provides COVID data for the US, both nationally and by state.
Reason(s) to Support the Website
Developers working with COVID data can easily use it to get up-to-date as well as historical data about COVID's impact.
Endpoints
This API supports 6 endpoints
https://api.covidtracking.com/v2/states/{state}.json
http://api.covidtracking.com/v2/us/daily.json
http://api.covidtracking.com/v2/states.json
https://api.covidtracking.com/v2/us/daily/{date}.json
https://api.covidtracking.com/v2/states/{state}/daily.json
https://api.covidtracking.com/v2/states/{state}/{date}.json
Questions
What is the number of cases in the US?
What was the number of cases in the US on 2nd March?
What was the number of cases in each state on 2nd March?
What is the number of cases in New York?
What was the number of cases in New York on 2nd March?
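The daily endpoints listed above follow one pattern: the national feed when no state is given, a state feed otherwise, and a single day when a date is added. A minimal sketch of that mapping follows. It covers four of the six endpoints; `covid_url` is a hypothetical helper, and the state code `"ny"` and the date `"2021-03-02"` are assumed formats (the questions do not give a year).

```python
COVID_BASE = "https://api.covidtracking.com/v2"


def covid_url(state=None, date=None):
    """Map an optional state code and date onto the daily endpoints."""
    if state is None and date is None:
        return f"{COVID_BASE}/us/daily.json"
    if state is None:
        return f"{COVID_BASE}/us/daily/{date}.json"
    if date is None:
        return f"{COVID_BASE}/states/{state}/daily.json"
    return f"{COVID_BASE}/states/{state}/{date}.json"


print(covid_url())                               # national history
print(covid_url(date="2021-03-02"))              # national, one day
print(covid_url(state="ny", date="2021-03-02"))  # one state, one day
```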
Website Description
The OMDB API is a RESTful web service to obtain movie information.
Reason(s) to Support the Website
Developers can use this data to make movie database websites and apps.
Endpoints
The API has only 1 endpoint, http://www.omdbapi.com/, but it supports two sets of parameters for accessing information, so I will make 2 config files, 1 for each set of parameters.
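To illustrate how one endpoint can serve two parameter sets, the sketch below builds a title lookup and a text search against the same URL. It assumes the two sets are the "by title" (`t`) and "by search" (`s`, `page`) parameters from the OMDb documentation, with the key passed as `apikey`; the issue itself does not name them, and both function names are hypothetical.

```python
from urllib.parse import urlencode

OMDB_URL = "http://www.omdbapi.com/"


def omdb_by_title(title, api_key):
    """Parameter set 1 (assumed): fetch a single movie by its title."""
    return f"{OMDB_URL}?{urlencode({'t': title, 'apikey': api_key})}"


def omdb_search(text, api_key, page=1):
    """Parameter set 2 (assumed): search for movies matching some text."""
    return f"{OMDB_URL}?{urlencode({'s': text, 'page': page, 'apikey': api_key})}"


print(omdb_by_title("Blade Runner", "MY_KEY"))
print(omdb_search("Blade Runner", "MY_KEY", page=2))
```

Splitting the two parameter sets into separate functions mirrors the plan of writing one config file per set.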
Questions
Design and add a table to show the contributors of each configuration file in the readme page.
Please refer to this presentation here https://github.com/all-contributors/all-contributors#contributors-
Website Description
Provides data of Canada's economy, society and environment.
Reason(s) to Support the Website
Developers will be able to identify relationships between types of Canadian demographic data.
Endpoints
https://www12.statcan.gc.ca/rest/census-recensement/CR2016geo.json
https://www12.statcan.gc.ca/rest/census-recensement/CPR2016.json
https://www150.statcan.gc.ca/n1/dai-quo/ssi/homepage/ind-econ.json
https://www150.statcan.gc.ca/n1/dai-quo/ssi/homepage/ind-hp.json
https://www150.statcan.gc.ca/n1/dai-quo/ssi/homepage/schedule-key_indicators-eng.json
Questions
Website Description
It exposes city, region, and country data via both GraphQL and REST APIs.
Reason(s) to Support the Website
Collecting country and city information.
Endpoints
...
Questions
Filter cities by name prefix, countries, location, time-zone, and minimum population
Get all country regions, states, and provinces
Get all cities in a given region
Get all countries supporting a currency
...
Hi, at present a query parameter is required to get Twitter data. But many Twitter projects just need random tweets (for example, 1000 random tweets from Twitter). Can we make the q parameter optional?
Describe the bug
First of all, this is not a critical issue.
While reviewing, I noticed that the developer was using a rudimentary way of getting the data. On further examination, I found that the system throws an assertion error when the config file is changed, as shown below.
To Reproduce
Steps to reproduce the behavior:
from dataprep.connector import connect
# You can get "app_key" by following https://www.themuse.com/developers/api/v2/apps
app_key = '<your app key>'  # placeholder: replace with your own key
dc = connect('themuse', _auth={'access_token': app_key})
# Top-level await works in a Jupyter/IPython session
df = await dc.query('jobs', page=1, category='Data Science', location='Vancouver, Canada')
df[['id', 'name', 'company', 'locations', 'levels', 'publication_date']]
Desired behavior
| | id | name | company | locations | levels | publication_date |
|---|---|---|---|---|---|---|
| 0 | 5126286 | Senior Data Scientist | Discord | ['Flexible / Remote'] | ['Senior Level'] | 2021-03-15T11:10:24Z |
| 1 | 5543215 | Data Scientist-AI/ML (Remote) | Dell Technologies | ['Chicago, IL', 'Flexible /...] | ['Mid Level'] | 2021-04-02T11:45:57Z |
| 2 | 4959228 | Senior Data Scientist | Humana | ['Flexible / Remote'] | ['Senior Level'] | 2021-01-05T11:28:23.814281Z |
Current behavior
| | id | name | company | locations | levels | publication_date |
|---|---|---|---|---|---|---|
| 0 | 5126286 | Senior Data Scientist | Discord | [{'name': 'Flexible / Remote'}] | [{'name': 'Senior Level', 'short_name': 'senio... | 2021-03-15T11:10:24Z |
| 1 | 5543215 | Data Scientist-AI/ML (Remote) | Dell Technologies | [{'name': 'Chicago, IL'}, {'name': 'Flexible /... | [{'name': 'Mid Level', 'short_name': 'mid'}] | 2021-04-02T11:45:57Z |
| 2 | 4959228 | Senior Data Scientist | Humana | [{'name': 'Flexible / Remote'}] | [{'name': 'Senior Level', 'short_name': 'senio... | 2021-01-05T11:28:23.814281Z |
Additional context
The same issue was found in the airplanes API.
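The difference between the two tables is that each `locations` and `levels` cell holds a list of dicts where a list of names is wanted. As a workaround sketch on the user side (not the fix to the config file itself), a small helper can reduce such cells to their `name` values. The row below is illustrative data shaped like the cells in the tables above, and `names_only` is a hypothetical helper.

```python
def names_only(cell):
    """Turn [{'name': 'X', ...}, ...] into ['X', ...]; leave other values as-is."""
    if isinstance(cell, list) and all(
        isinstance(item, dict) and "name" in item for item in cell
    ):
        return [item["name"] for item in cell]
    return cell


# Illustrative row, shaped like the "Current behavior" cells.
row = {
    "id": 5543215,
    "locations": [{"name": "Flexible / Remote"}],
    "levels": [{"name": "Mid Level", "short_name": "mid"}],
}
flattened = {key: names_only(value) for key, value in row.items()}
print(flattened)
# {'id': 5543215, 'locations': ['Flexible / Remote'], 'levels': ['Mid Level']}
```

With a pandas DataFrame, the same function could be applied per column, for example `df['locations'].apply(names_only)`.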
Add configurations for the Currents and CoinGecko APIs to DataConnectorConfigs
Website:
https://currentsapi.services/en
Description of the website:
Currents API is an API for news data.
Why support this website:
This website shows the latest news and also supports manual news searches. Data scientists or students who want to do research on news data are likely to use this API.
Endpoints:
Questions:
Website:
https://www.coingecko.com
Description of the website:
CoinGecko provides live pricing, trading volume, stocks, exchanges, historical data and other cryptocurrency data.
Why support this website:
Users can get live and historical cryptocurrency data. Cryptocurrency investors or researchers are likely to use this API.
Endpoints:
Questions:
Website Description
Finds and verifies professional email addresses.
Reason(s) to Support the Website
Businesses can use this API to gather contact and professional information about people to send targeted advertisements to.
Endpoints
account
domain-search
email-count
email-verifier
email-finder
Questions