DataForIrina

The goal of DataForIrina is to document the steps taken to prepare the dataset provided to Irina.

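Load the needed packages. The janitor package is also used below, called as janitor::clean_names(), so it must be installed.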

library(dplyr)
library(data.table)
library(readxl)

We start by reading the dataset delivered to us by the biodiversity council. This dataset was generated on 21 September 2022.

ToClean <- readxl::read_xlsx("2022-09-21.xlsx") 

Filter data

We then filter the dataset to include only plants (kingdom Plantae).

OnlyPlants <- ToClean |> 
  # make all names machine readable
  janitor::clean_names() |>
  # Filter to Kingdom Plantae
  dplyr::filter(rige == "Plantae")

But not all of these have been resolved to species level. The table below shows the number of taxa resolved to each taxonomic rank.

taxonrang n
Art 5333
Slægt 1562
Hybrid 612
Underart 460
Familie 358
Varietet 303
Orden 126
Form 70
Klasse 32
Sektion 25
Række 10
Superart 8
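
The code that produced this table is not shown. A minimal sketch of one way to get such a count, assuming the `taxonrang` column created by `janitor::clean_names()` above and using a small made-up data frame (`toy`, hypothetical values) in place of the real data:

```r
library(dplyr)

# Hypothetical stand-in for OnlyPlants; the values are made up for illustration
toy <- data.frame(
  taxonrang = c("Art", "Art", "Hybrid", "Art", "Underart", "Hybrid")
)

# Count rows per taxonomic rank, most frequent rank first
rank_counts <- toy |>
  dplyr::count(taxonrang, sort = TRUE)

rank_counts
```

Applied to the real data, the same call would be `OnlyPlants |> dplyr::count(taxonrang, sort = TRUE)`.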

We want to keep only some of these, so we include only the taxa resolved to the level of species, form, subspecies, or variety (Art, Form, Underart, Varietet).

OnlyPlants <- OnlyPlants |> 
  # Filter only some resolutions
  dplyr::filter(taxonrang %in% c("Art", "Form", "Underart", "Varietet")) |> 
  # Ensure no duplicates
  dplyr::distinct()

This leaves us with 6,166 valid taxa. This dataset can be downloaded here.

Remove introduced species

We made a second dataset from which introduced species were removed.

OnlyNatives <- OnlyPlants |> 
  dplyr::filter(herkomst != "Introduceret" | is.na(herkomst)) 

This leaves us with 4,891 valid taxa. This dataset can be downloaded here.
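
The `is.na(herkomst)` term in the filter above matters because `dplyr::filter()` drops rows where the condition evaluates to `NA`, so taxa with no recorded origin would otherwise be removed along with the introduced ones. A small sketch with a made-up data frame (`toy`; the taxon labels and the origin value "Native" are placeholders, only "Introduceret" appears in the steps above):

```r
library(dplyr)

# Made-up example; "Native" is a placeholder origin value, not taken from the real data
toy <- data.frame(
  taxon    = c("A", "B", "C"),
  herkomst = c("Native", "Introduceret", NA)
)

# Without the is.na() term, the row with a missing origin is dropped too
strict <- toy |>
  dplyr::filter(herkomst != "Introduceret")

# With the is.na() term, only the introduced taxon is removed
lenient <- toy |>
  dplyr::filter(herkomst != "Introduceret" | is.na(herkomst))

nrow(strict)   # 1
nrow(lenient)  # 2
```

Keeping the `NA` rows treats an unknown origin as "not known to be introduced", which is the behaviour of the filter used for OnlyNatives.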

Download Datasets
