nshiab / simple-data-analysis-benchmarks

Comparing performance of different versions of simple-data-analysis with popular Python and R libraries for data analysis.

Languages: JavaScript 65.90%, R 17.44%, Python 16.67%

Simple-data-analysis benchmarks

To test the performance of simple-data-analysis, we calculated the average temperature per decade and city, using the daily temperatures from the Adjusted and Homogenized Canadian Climate Data.

We ran the same calculations with [email protected] (both NodeJS and Bun), [email protected] (NodeJS), [email protected] (NodeJS), Pandas (Python), and the tidyverse (R).

In each script, we:

  1. Load a CSV file (Importing)
  2. Select four columns, remove rows with a missing temperature, and convert date strings to dates and temperature strings to floats (Cleaning)
  3. Add a new column decade and calculate the decade (Modifying)
  4. Calculate the average temperature per decade and city (Summarizing)
  5. Write the cleaned-up data used to compute the averages to a new CSV file (Writing)
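
For illustration, the five steps can be sketched with nothing but the Python standard library. This is not one of the benchmarked scripts, which use simple-data-analysis, Pandas, and the tidyverse. The column names (`time`, `station_name`, `city`, `tas`) and the missing-value markers are assumptions, since this README does not list the CSV headers.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical column names; the real CSV headers are not given in this README.
KEEP = ["time", "station_name", "city", "tas"]


def clean(rows):
    """Cleaning: keep four columns, drop rows with no temperature, parse types."""
    for row in rows:
        if row["tas"] in ("", "NA"):  # assumed markers for a missing value
            continue
        rec = {key: row[key] for key in KEEP}
        rec["time"] = date.fromisoformat(rec["time"])
        rec["tas"] = float(rec["tas"])
        yield rec


def add_decade(rec):
    """Modifying: 1987 -> 1980, 2003 -> 2000."""
    rec["decade"] = rec["time"].year // 10 * 10
    return rec


def summarize(records):
    """Summarizing: mean temperature per (decade, city)."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["decade"], rec["city"])].append(rec["tas"])
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


def run(source, sink):
    """Run the five steps on file-like objects and return the averages."""
    rows = list(csv.DictReader(source))                 # Importing
    records = [add_decade(r) for r in clean(rows)]      # Cleaning + Modifying
    averages = summarize(records)                       # Summarizing
    writer = csv.DictWriter(sink, fieldnames=KEEP + ["decade"])
    writer.writeheader()                                # Writing
    writer.writerows(records)
    return averages


# Tiny made-up sample: one row has no temperature and is dropped.
sample = io.StringIO(
    "time,station_name,city,tas,extra\n"
    "1987-01-01,A,Montreal,-10.0,x\n"
    "1988-01-01,A,Montreal,-6.0,x\n"
    "1989-01-01,A,Montreal,,x\n"
    "2003-07-01,B,Toronto,22.5,x\n"
)
out = io.StringIO()
averages = run(sample, out)
print(averages)  # {(1980, 'Montreal'): -8.0, (2000, 'Toronto'): 22.5}
```

Passing file-like objects keeps the sketch testable in memory; for real files, open them with `newline=""` as the `csv` module recommends.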

Each script was run ten times on a MacBook Pro (Apple M1 Pro, 16 GB). We averaged the durations and calculated their standard deviation.
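
As a rough sketch of that measurement, the helper below times a task ten times and reports the mean and standard deviation. The `benchmark` function is hypothetical and is not the harness used for these results. It uses the sample standard deviation (n - 1), which is an assumption: the README does not say which variant was computed.

```python
import statistics
import time


def benchmark(task, runs=10):
    """Time `task` `runs` times; return (mean, standard deviation) in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    # Sample standard deviation (n - 1); assumed, not confirmed by the README.
    return statistics.mean(durations), statistics.stdev(durations)


# Stand-in workload; a real run would call one of the benchmark scripts.
mean_s, sd_s = benchmark(lambda: sum(range(100_000)))
print(f"{mean_s * 1000:.2f} ms ± {sd_s * 1000:.2f} ms over 10 runs")
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock meant for measuring short durations.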

The charts displayed below come from this Observable notebook.

Small file

With ahccd-samples.csv:

  • 74.7 MB
  • 19 cities
  • 20 columns
  • 971,804 rows
  • 19,436,080 data points

[email protected] was the slowest, but [email protected] versions are now the fastest.

A chart showing the processing duration of multiple scripts in various languages

Big file

With ahccd.csv:

  • 1.7 GB
  • 773 cities
  • 20 columns
  • 22,051,025 rows
  • 441,020,500 data points
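
The "data points" figures for both files are consistent with rows multiplied by columns. The short check below uses only the numbers listed in this README; the dictionary layout is just for illustration.

```python
# Figures taken from the two file descriptions in this README.
files = {
    "ahccd-samples.csv": {"rows": 971_804, "columns": 20},
    "ahccd.csv": {"rows": 22_051_025, "columns": 20},
}

# Data points = rows x columns.
data_points = {name: f["rows"] * f["columns"] for name, f in files.items()}
assert data_points == {"ahccd-samples.csv": 19_436_080, "ahccd.csv": 441_020_500}

# The big file has roughly 22.7 times as many rows as the small one.
ratio = files["ahccd.csv"]["rows"] / files["ahccd-samples.csv"]["rows"]
print(f"{ratio:.1f}x")
```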

The file was too big for [email protected], so it's not included here.

While [email protected] was already fast, [email protected] shines even more with big files.

A chart showing the processing duration of multiple scripts in various languages
