Comments (3)
For reference: script I used to get datasets with geological info:
import sys
import csv


def check_arguments():
    if len(sys.argv) != 3:
        print('Missing arguments. Usage: python fossil_specimens.py <gbif_download_file.txt> <output_file.tab>')
        sys.exit(-1)
    return sys.argv[1:]


def get_columns(download_file, selected_columns):
    # Keep only the selected columns (1-based positions) of every row
    my_selection = []
    with open(download_file) as f:
        reader = csv.reader(f, delimiter='\t', quoting=csv.QUOTE_NONE)
        # QUOTE_NONE: to solve http://stackoverflow.com/questions/15063936/csv-error-field-larger-than-field-limit-131072
        for row in reader:
            my_selection.append([row[x - 1] for x in selected_columns])
    return my_selection


def get_datasets_with_geological(my_data):
    # Expects a two-dimensional array with datasetKey at index 0
    datasets = {}
    number_of_columns = len(my_data[0][1:])  # Ignoring datasetKey at index 0
    datasets['datasetKey'] = my_data[0][1:]  # Store headers
    for row in my_data[1:]:  # Start after header line
        dataset_key = row[0]
        if dataset_key not in datasets:  # New dataset
            datasets[dataset_key] = [0] * number_of_columns
        for counter, value in enumerate(row[1:]):  # Loop over the remaining columns
            if value != '':  # Compare by value; "is not" tests identity
                datasets[dataset_key][counter] += 1  # Count non-empty value
    return datasets  # A dictionary


def write_datasets_with_geological(my_datasets, output_file):
    with open(output_file, 'w') as f:
        writer = csv.writer(f, delimiter=',', lineterminator='\n')
        for key in my_datasets:
            writer.writerow([key] + my_datasets[key])
    return True


def main():
    gbif_download_file, output_file = check_arguments()
    columns = [197, 83, 130, 85, 132, 86, 133, 84, 131, 82, 129]
    reduced_download_file = get_columns(gbif_download_file, columns)
    geological_datasets = get_datasets_with_geological(reduced_download_file)
    write_datasets_with_geological(geological_datasets, output_file)


if __name__ == '__main__':
    main()
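To illustrate what the script tallies, here is a minimal sketch of the per-dataset counting step, run on made-up in-memory rows instead of a GBIF download. The function name `count_non_empty`, the dataset keys and the column names `term_a` and `term_b` are all hypothetical and only serve the example.

```python
def count_non_empty(rows):
    """Count non-empty values per column for each dataset key.

    `rows` is a list of lists: a header row first, then data rows,
    with the dataset key at index 0 (hypothetical sample layout).
    """
    header = rows[0][1:]
    counts = {}
    for row in rows[1:]:
        key = row[0]
        # Start every new dataset with one zero per counted column
        tally = counts.setdefault(key, [0] * len(header))
        for index, value in enumerate(row[1:]):
            if value != '':
                tally[index] += 1
    return header, counts


# Made-up sample data, not taken from a real GBIF download
sample = [
    ['datasetKey', 'term_a', 'term_b'],
    ['dataset-a', 'x', ''],
    ['dataset-a', '', ''],
    ['dataset-b', 'y', 'z'],
]

header, counts = count_non_empty(sample)
for key in sorted(counts):
    print(key, counts[key])
```

Each dataset ends up with one count per column, so a dataset whose counts are all zero has no geological information in the selected columns.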
from gbif-dataset-metrics.
👍
Won't invest time to implement this. Labelled wontfix.