mebioda's Introduction

MeBioDA

Introduction

This repository contains materials for the MSc course Methods in Biodiversity Analysis, offered through a collaboration between Naturalis, IBL, and CML. The course will take place in weeks 48 through 51 (of 2019, and presumably onwards), mainly in the Sylvius Building (1.5.03) with some practicals in the Van Steenis Building, both on the main (bio)science campus of Leiden.

The goal of the course is to impart data science skills as relevant to biodiversity research. Technical topics to be covered will include research reproducibility (e.g. scripting) and data management (e.g. versioning), while the scientific applications will encompass biodiversity assessment using DNA sequence data (e.g. metabarcoding), geospatial data (e.g. ecological niche modeling) and phenotype/morphology data (e.g. morphometrics).

Preparations

The course will consist of lectures and hands-on computer practicals that require you to have the following:

Schedule

  • Week 1 - Molecular diversity
  • Week 2 - (Geo)spatial diversity
  • Week 3 - Functional diversity
  • Week 4 - Synthesis and exam

mebioda's People

Contributors

a-hooft, acouzens, astridvdburg, bnbastiaans, evacologie, fabiosweet, jelledercksen, jlopdel, katrijndebock, lbussc, lmar116, maartenvtz, mdoeland, meeshoffmanns, melanieflink, mikepawlik, nielsraes, nkyriakopoulou, pipvanaalst, rfrische, rolodv, rpringle15, rvosa, s1676989, s1712675, shadrinafr, smith-jh, sybrenhilgen, unpainteddoor, vpeelen

mebioda's Issues

practical phylogenetic comparative methods

@rvosa The tutorial for the phylogenetic comparative methods practical uses a data set (landplants) that was present in older versions of the ape package. Should the students just download an older version, or is it better to look for another tutorial?

w2d2+d3 combined

Merge the assignment to fetch GBIF data and import it in ArcGIS (+introduction ArcGIS)

Computing capacity for practicals

The PC lab at IBL has Windows 7 PCs with 4 GB of RAM. It is going to be difficult to get anything done on these. An alternative to explore is a cloud instance that students log in to. Make an ICT project to discuss this.

Week 4 guest speakers

In the final week of the course there will not be many lectures (see #12) and no assignments, because the final exam (#11) falls in that week. The final day is therefore not a lecture day, the day before is free for study, and the day before that is for contact hours. This leaves only two days: one for a general wrap-up, and one for an inspiring two-part lecture about "4D data", slated for 19/12 from 9:00 to 9:45, followed by a break, and continuing from 10:00 to 10:45.

The general idea is that we will get to think about how various data acquisition techniques (e.g. LIDAR, photogrammetry, other types of scanning) can be used to construct 3D models (e.g. of a paleontological dig, of geological features, or of vegetation) that change through time. Apart from a lot of eye candy, this lecture would also include information about data integration challenges (e.g. anchoring 3D observations) as well as data management challenges (e.g. volumes).

Speaker(s) to invite:

  • Pim Kaskes

Markdown to HTML

pandoc -t slidy --mathjax -s lecture1.md -o lecture1.html

(This is just a note to myself, to remember how to transform a markdown document to slidy slides.)

Booking rooms

  • Week 49 (GIS): Maarten has provisionally claimed the archaeology room (Mon/Wed/Fri)
  • Other weeks: same room at IBL as before

integrate GIS+SDM with model organisms

Confusion arose because part of the GIS practical operated on data other than the model crops. Hence, not all students had lat/lon data for the crops, and so the ENM assignment also wasn't uniformly done for the crops.

Week 3 assignment functional diversity

In week 3 the general focus is on phylogenetic and functional diversity. If we make week 2's assignment about the niches of domesticated animals (ungulates), then week 3 should be about the functional diversity of crop plants.

Journal club

In addition to preparing a presentation, each student reads someone else's paper and prepares a question.

GIS practical

  • Let students experiment more with their own data earlier in the week (GIS practical, day 2)

export / integration with blackboard

Assuming you want the contents of a markdown document to end up on Blackboard, there is the option of doing this as a SCORM object. This is somewhat attractive because such objects can also contain quizzes and other interactive content, and Blackboard (at least the version used by Leiden University) is able to import these so they can be added to courses. Here are the steps:

Convert Markdown to HTML

There are numerous options for this. One option is to make a single HTML file, for example like this:

pandoc --from markdown --to=html --css=pandoc.css \
--standalone --out=lecture1.html lecture1.md

In this case, the styling can be much improved by using this pandoc.css

Another option is to make slides, e.g. as follows:

pandoc --to=slidy --standalone --out=lecture1.html lecture1.md

In this case, you may achieve more visually appealing results using some of the other options besides slidy, although the other ones require more supporting file scaffolding. See here

  1. Package the HTML (with all assets) into a SCORM archive using libscorm
  2. Upload the archive to Blackboard (Content > Build content > Content Package (SCORM))

Week 2 guest speakers

Given that this is about 2D / tabular / image data, this should include requests to:

  • Maarten van 't Zelfde
  • Leon Marshall
  • Rutger Vos
  • Niels Raes
  • Merlijn van Weerd
  • Joris Timmermans
  • Jeroen Creuwels

Week 3 lectures

This week will be about three-dimensional data. This might be 3D (CT) scanned objects, photogrammetry, or 3D LIDAR data. Focus on 3D scanning and functional diversity.

  • 11/12, I-II, Introduction to 3D data. Data capture techniques, segmentation. Scripting.
  • 12/12, I-II, Lecture on scanning Nepenthes pitchers (?)
  • 13/12, I-II, Lecture on Aidan's scanning
  • 14/12, I-II
  • 15/12, I-II, Hands-on practical 3D data (PC room)

Attention: week 3 reorganization

I have reorganized week 3, moving the materials I definitely want to keep (whose PDFs you have reconstructed, nearly completely anyway) to their respective days. I have also renamed the *.md files according to the same scheme as week 1, and I link to them from the root README.md. You should do a git pull before you start committing things again.

Arrive at 6 EC

  • Week 1: 15 lecture hours
  • Week 2: 15 lecture hours
  • Week 3: 15 lecture hours
  • Week 4: 5 lecture hours

So far, that's a total of 50 lecture hours, i.e. 50/28 = ±1.8EC (1EC=28hrs). The remainder needs to be made up by practicals, assignment report, presentation, and exam.
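The bookkeeping can be checked with a short sketch. It assumes the 6 EC target from the heading and the 28 hours per EC stated above; the variable names are illustrative only.

```python
# Lecture hours per week, as listed above.
lecture_hours = {"Week 1": 15, "Week 2": 15, "Week 3": 15, "Week 4": 5}

HOURS_PER_EC = 28  # 1 EC = 28 hours, as stated above
TARGET_EC = 6      # assumption: course size taken from the heading

total_lecture_hours = sum(lecture_hours.values())
lecture_ec = total_lecture_hours / HOURS_PER_EC
remaining_hours = TARGET_EC * HOURS_PER_EC - total_lecture_hours

print(f"Lecture hours: {total_lecture_hours}")
print(f"Covered by lectures: {lecture_ec:.1f} EC")
print(f"Hours left for practicals, report, presentation and exam: {remaining_hours}")
```

Under these assumptions the lectures cover a bit under 2 EC, so roughly two thirds of the study load has to come from the other components.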

Guest speakers

  • Van der Hoorn / Beentjes (Rutger, available)
  • Merlijn van Weerd (Maarten, asked, agreed in principle)
  • Joris Timmermans (Maarten)
  • Jeroen Creuwels (Maarten)
  • OPTION: Mark Rademaker (Rutger, available, to be scheduled, prefers 11:00, ask about 16/12)
  • Nuno De Mesquita César de Sá (Maarten)
  • Rosaleen March (Maarten)
  • Aidan Couzens (Rutger)
  • Jeremy Miller (Rutger, tentative)
  • Tom van Dooren (Rutger)
  • Benedict King (Rutger)
  • Hans ter Steege (not available)
  • Jorinde Nuytinck (Rutger, asked)

Examples of cropping => translate workflow

The operations can be performed with a variety of tools:

  • in ArcGIS / DIVA-GIS / QGIS (visually)
  • ArcGIS modelbuilder (automated)
  • within R / python / gdal (scripted, in code)
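For the scripted route, the cropping step alone can be sketched without any GIS library. This assumes a north-up raster held as a list of rows, a known upper-left corner, and square cells. In practice one would more likely use GDAL or a raster package in R or Python; the `crop_grid` helper and its parameters are hypothetical.

```python
import math


def crop_grid(grid, x_min, y_max, cell_size, bbox):
    """Crop a north-up grid (list of rows) to bbox = (left, bottom, right, top).

    (x_min, y_max) is the upper-left corner of the grid. Cells that
    overlap the bounding box are kept; the result may be empty.
    """
    left, bottom, right, top = bbox
    n_rows, n_cols = len(grid), len(grid[0])
    # Convert map coordinates to column and row indices, clamped to the grid.
    col_start = max(0, math.floor((left - x_min) / cell_size))
    col_end = max(col_start, min(n_cols, math.ceil((right - x_min) / cell_size)))
    row_start = max(0, math.floor((y_max - top) / cell_size))
    row_end = max(row_start, min(n_rows, math.ceil((y_max - bottom) / cell_size)))
    return [row[col_start:col_end] for row in grid[row_start:row_end]]


# A made-up 4 x 4 grid covering x = 0..4 and y = 0..4 with cells of size 1.
grid = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
cropped = crop_grid(grid, x_min=0, y_max=4, cell_size=1, bbox=(1, 1, 3, 3))
print(cropped)
```

Doing the same operation in code rather than by hand in a GIS makes it repeatable across layers that share the same extent and resolution.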

Week 1: paper presentation / journal club

The general idea is that each student (or team?) briefly presents a publication. To this end, we need to select a list of ±20 candidate papers on the general topic of the usage of DNA sequence data (e.g. barcodes) for biodiversity assessment.

Collect lecture materials for week 3

Steps:

  1. install XMind
  2. open the mindmap for week3
  3. for each lecture (e.g. "w3l1 - TRAIT BASED ECOLOGY & OVERVIEW I") make a folder with the same name inside the week3 folder
  4. follow the hyperlink for the tutorial and search for the PDF lecture notes. Place these in the right lecture folder.
  5. on the tutorial webpage, look for assigned readings. Add them to the Mendeley library (for which you should have received an invite at your Naturalis email address) and tag each reference with the lecture code (e.g. "w3l1") inside Mendeley Desktop

Week 1: practical metabarcoding

As of commit f366c6b there is a set of gzipped FASTQ files that Jozsef Geml has made available for testing purposes. These are supposed to be processed, roughly, according to the steps outlined here. However, these steps are based on Jozsef's (Windows-based) workflow. The general idea is that we translate these to a Linux-based, open source workflow that we can let the students run as a shell script.

What this means, specifically, is the following:

  • The end result is a workflow shell script (with a lot of comments) that we will store in the same git repository folder as the data files. Another likely end result is a python (or other lang) script that does the sequence renaming (see below).
  • Find a workaround in the workflow so we don't need Geneious. Jozsef uses Geneious to filter and trim sequences; we should be able to do this with FASTX-toolkit instead.
  • Geneious is also used to do some sequence renaming, namely to add sample prefixes to the sequence identifiers (so the samples can subsequently be merged). It should be easy to script this in python or something.
  • All the tools that I think we need (so, in addition to FASTX, those are mothur and usearch) are available on the virtual machine. Please do your work with that environment in mind and note if anything is missing.
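The renaming step mentioned above could look roughly like the following sketch. It assumes plain four-line FASTQ records and an underscore between the sample name and the original identifier; the function names and the separator are illustrative choices, not part of Jozsef's workflow.

```python
import gzip
import io


def add_sample_prefix(fastq_in, fastq_out, sample):
    """Copy 4-line FASTQ records, prefixing each identifier with `sample`.

    Returns the number of records written.
    """
    n_records = 0
    while True:
        header = fastq_in.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"Expected a FASTQ header, got: {header!r}")
        sequence = fastq_in.readline()
        separator = fastq_in.readline()
        quality = fastq_in.readline()
        # "@read1 ..." becomes "@<sample>_read1 ..."
        fastq_out.write(f"@{sample}_{header[1:]}")
        fastq_out.write(sequence + separator + quality)
        n_records += 1
    return n_records


def rename_gzipped_file(path_in, path_out, sample):
    """Apply add_sample_prefix to a gzipped FASTQ file (paths are hypothetical)."""
    with gzip.open(path_in, "rt") as fin, gzip.open(path_out, "wt") as fout:
        return add_sample_prefix(fin, fout, sample)


# Small in-memory demonstration with made-up records.
demo_in = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n")
demo_out = io.StringIO()
n_renamed = add_sample_prefix(demo_in, demo_out, "sampleA")
print(demo_out.getvalue(), end="")
```

Once every read carries its sample name, the per-sample files can be concatenated without losing track of where each read came from.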

Preparing the SDM practical

  • check that the code runs
  • consistent explanations
  • mini-presentation at the start
  • all printouts available and in order
  • prepare the data

re-check metabarcoding workflow

Ever since my edits, singletons are no longer filtered out of the OTU table. Make sure that we reproduce Irene's results (her most recent commit), but with the additional syntax coloration.
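As a reference for what the filter should do, here is a minimal sketch. It assumes that a singleton is an OTU whose reads sum to exactly one across all samples, and that the OTU table can be held as a mapping from OTU identifier to per-sample counts. Both the definition and the `drop_singletons` helper are assumptions for illustration, not the workflow's actual code.

```python
def drop_singletons(otu_table, min_total_reads=2):
    """Keep only OTUs with at least `min_total_reads` reads over all samples."""
    kept = {}
    for otu_id, counts in otu_table.items():
        if sum(counts) >= min_total_reads:
            kept[otu_id] = counts
    return kept


# Made-up table: three OTUs counted in three samples.
otu_table = {
    "OTU_1": [12, 0, 3],
    "OTU_2": [0, 1, 0],  # one read in total: a singleton
    "OTU_3": [1, 1, 0],  # two reads in total: kept
}
filtered = drop_singletons(otu_table)
print(sorted(filtered))
```

A check like this on the final OTU table is a quick way to confirm whether singletons have slipped through again.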

Preliminaries / preparation

What do students need, at a minimum, to be able to participate reasonably effectively?

  • RStudio, with git, to have the course available locally (handy for data files)
  • SQLite browser, to try out SQL databases
  • PyCharm, to try out bioinformatics code
  • Java, for MaxEnt and Mesquite (OpenJDK? Oracle?)
  • MaxEnt
  • Mesquite
  • many R packages
  • CYPHER API key for EoL

Week 1 lectures

This will inevitably include an NGS intro, handling sequence data, and basic bioinformatics (e.g. BLAST). Focus on species diversity through metabarcoding.

  • 27/11, I-IV, Introduction to the course, to biodiversity concepts
  • 28/11, I-III, metabarcoding, NGS data, formats, basic analysis steps
  • 29/11, I-II, hands-on introduction to UNIX-like OS (PC room)
  • 30/11, I-III, practical: scripting a metabarcoding analysis, HTS-checker (PC room)
  • 1/12, no lectures

Week 1 practical assignment

This week should demonstrate how to manage and handle sequence data. This should include sequence analysis that is not phylogenetics. An idea for a project might be that the students build a local BLAST database and use it to assess the contents of a sample, i.e. species diversity.
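A rough sketch of the local BLAST idea follows. It assumes the NCBI BLAST+ command line tools (`makeblastdb`, `blastn`), hypothetical file names, and reference sequences whose identifiers are species labels. The commands are only built here, not executed; the part that runs is the tallying of best hits from the 12-column tabular output.

```python
from collections import Counter

# Hypothetical file names. These could be passed to subprocess.run(..., check=True).
make_db_cmd = ["makeblastdb", "-in", "reference.fasta", "-dbtype", "nucl",
               "-out", "reference_db"]
search_cmd = ["blastn", "-query", "sample.fasta", "-db", "reference_db",
              "-outfmt", "6", "-out", "hits.tsv"]


def best_hits(tabular_text):
    """Return {query_id: subject_id} for the highest bit score per query.

    Expects the default 12-column tabular output (-outfmt 6), in which
    column 1 is the query, column 2 the subject and column 12 the bit score.
    """
    best = {}
    for line in tabular_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        query_id, subject_id = fields[0], fields[1]
        bit_score = float(fields[11])
        if query_id not in best or bit_score > best[query_id][1]:
            best[query_id] = (subject_id, bit_score)
    return {query: subject for query, (subject, _) in best.items()}


# Made-up search results for three reads.
example = "\n".join([
    "read1\tspeciesA\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-50\t180",
    "read1\tspeciesB\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-40\t150",
    "read2\tspeciesB\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-48\t170",
    "read3\tspeciesA\t98.5\t100\t2\t0\t1\t100\t1\t100\t1e-49\t175",
])
hits = best_hits(example)
species_counts = Counter(hits.values())
print(species_counts.most_common())
```

Counting best hits per species gives a first, crude picture of what a sample contains; a real assignment would also need thresholds on identity and alignment length.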

What happens when

Send Frans a schedule of the practicals: when, where, and what

Accounts

Need:

  • blackboard account
  • IBL account
  • VUW account (is probably same as IBL account)

Week 3 guest speakers

Will need somebody to talk about 3D objects for the week 3 lectures (#9). People to invite:

  • Aidan Couzens (confirmed)
  • Frederic Lens (declined, try again next year)
  • Vincent Merckx (available)
  • Jeremy Miller (available)
  • Tom van Dooren (available)
  • Tiedo van Kuijk (declined)
  • a visualizer, e.g. from the Aerts lab

Week 2 lectures

In handling 2D image data a number of basic concepts will have to be addressed in lectures: pixels, e.g. color values, bit depths; grid scales, e.g. dpi, spatial grids; coordinate systems, e.g. polar / cartesian; visualization. Focus on geospatial / community diversity. SDM.

  • 4/12, I-II, Introduction to two-dimensional data challenges
  • 5/12, I-II, Image processing
  • 6/12, I-II, Species occurrence data: GBIF, TNRS, Gazetteers and SDM
  • 7/12, I-II, Hands-on: collecting, cleaning occurrences of wild relatives (PC room)
  • 8/12, I-II, Hands-on: MAXENT, git (PC room)
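A few of the concepts listed for these lectures reduce to simple formulas, sketched below: the number of distinct values per channel at a given bit depth, the pixel length of a scan at a given dpi, and the conversion between polar and cartesian coordinates. The function names are illustrative only.

```python
import math


def levels(bit_depth):
    """Number of distinct values a channel can hold at this bit depth."""
    return 2 ** bit_depth


def pixels(length_in_inches, dpi):
    """Pixels along one side of a scan at a given resolution."""
    return round(length_in_inches * dpi)


def polar_to_cartesian(r, theta):
    """Convert radius and angle (in radians) to x, y."""
    return r * math.cos(theta), r * math.sin(theta)


def cartesian_to_polar(x, y):
    """Convert x, y to radius and angle (in radians)."""
    return math.hypot(x, y), math.atan2(y, x)


print(levels(8))        # values per channel in an 8-bit image
print(pixels(4, 300))   # a 4 inch side scanned at 300 dpi
x, y = polar_to_cartesian(2.0, math.pi / 2)
r, theta = cartesian_to_polar(x, y)
print(round(r, 6), round(theta, 6))
```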

Week 4 lectures

  • 18/12, I-II, Lecture: time series, 4D data, and wrapup
  • 19/12, I-II, Guest lecture: space through time
  • 20/12, I, contact hour
  • 21/12, no lecture, exam prep
  • 22/12, exam

Week 4 exam

Presumably there won't be another assignment but simply an exam.
