This project forked from objectpascal-community/1brc-objectpascal

1️⃣🐝🏎️ The 1 billion Row Challenge in Object Pascal

License: MIT License

1️⃣🐝🏎️ The One Billion Row Challenge in Object Pascal

This is the repository that will coordinate the 1 Billion Row Challenge for Object Pascal.

The One Billion Row Challenge (1BRC) is a fun exploration of how far modern Object Pascal can be pushed for aggregating one billion rows from a text file. Grab all your threads, reach out to SIMD, or pull any other trick, and create the fastest implementation for solving this task!

1BRC

The text file contains temperature values for a range of weather stations. Each row is one measurement in the format <string: station name>;<double: measurement>, with the measurement value having exactly one fractional digit. The following shows ten rows as an example:

Hamburg;12.0
Bulawayo;8.9
Palembang;38.8
St. John's;15.2
Cracow;12.6
Bridgetown;26.9
Istanbul;6.2
Roseau;34.4
Conakry;31.2
Istanbul;23.0

The task is to write an Object Pascal program which reads the file, calculates the min, mean, and max temperature value per weather station, and emits the results on STDOUT like this (i.e. sorted alphabetically by station name, and the result values per station in the format <min>/<mean>/<max>, rounded to one fractional digit):

{Abha=-23.0/18.0/59.2, Abidjan=-16.2/26.0/67.3, Abéché=-10.0/29.4/69.0, Accra=-10.1/26.4/66.4, Addis Ababa=-23.7/16.0/67.0, Adelaide=-27.8/17.3/58.5, ...}
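To make the expected behaviour concrete, here is a minimal reference sketch. It is written in Python rather than Object Pascal, so it is not a valid entry and is far too slow for one billion rows. The function names `aggregate` and `format_results` are illustrative only. Two details are assumptions that may differ from the official baseline in edge cases: stations are sorted by Unicode code point, and rounding relies on Python's default float formatting.

```python
import io
from typing import Dict, IO, List


def aggregate(lines: IO[str]) -> Dict[str, List[float]]:
    """Return {station: [min, sum, max, count]} for rows of 'name;value'."""
    stats: Dict[str, List[float]] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Split on the last ';' so the station name is kept whole.
        name, _, value = line.rpartition(";")
        temp = float(value)
        entry = stats.get(name)
        if entry is None:
            stats[name] = [temp, temp, temp, 1]
        else:
            if temp < entry[0]:
                entry[0] = temp
            entry[1] += temp
            if temp > entry[2]:
                entry[2] = temp
            entry[3] += 1
    return stats


def format_results(stats: Dict[str, List[float]]) -> str:
    """Render '{name=min/mean/max, ...}' sorted by station name."""
    parts = []
    for name in sorted(stats):  # assumption: code point order
        low, total, high, count = stats[name]
        parts.append(f"{name}={low:.1f}/{total / count:.1f}/{high:.1f}")
    return "{" + ", ".join(parts) + "}"


sample = io.StringIO("Hamburg;12.0\nIstanbul;6.2\nIstanbul;23.0\n")
print(format_results(aggregate(sample)))
# prints {Hamburg=12.0/12.0/12.0, Istanbul=6.2/14.6/23.0}
```

Keeping a running sum and count per station, instead of storing every measurement, is what keeps memory use proportional to the number of stations rather than the number of rows.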

Entering The Challenge

Submissions are made via a pull request (PR) to this repository.
The challenge will run from the 10th of March until the 10th of May, 2024.

When creating your entry, please do as follows:

  1. Create a folder under entries with your first initial and last name, e.g., for Gustavo Carreno: entries/gcarreno.
  2. If you're worried about anonymity, because the Internet stinks, feel free to use a fictional name: Bruce Wayne, Clark Kent, James Logan, Peter Parker, Diana of Themyscira. Your pick!
  3. Create a README.md with some content about your approach, e.g., entries/gcarreno/README.md.
  4. Put all your code under entries/<your name>/src, e.g., entries/gcarreno/src.
  5. If you need to provide a custom .gitignore for something not present in the main one, please do.

This challenge is mainly to allow us to learn something new. This means that copying code from others will be allowed, under these conditions:

  1. You can only use pure Object Pascal with no calls to any operating system's API or external C/C++ libraries.
  2. The code must have some sort of mention/attribution to the original author, in case you've used someone else's code.
  3. It's not a blatant copy just for the sake of submission.
  4. It adds something of value, not just a different code formatting.
  5. All code should be formatted with the IDE's default formatting tool.

Submit your implementation and become part of the leaderboard!

Generating the measurements.txt

NOTE
We now have both a Lazarus version and a Delphi version of the generator, each for 32-bit and 64-bit.

In order to produce the One Billion Rows of text, we are providing the source code for the official generator, so we all have the same entry data.

Parameter              Description
-h or --help           Writes this help message and exits
-v or --version        Writes the version and exits
-i or --input-file     The file containing the Weather Stations
-o or --output-file    The file that will contain the generated lines
-n or --line-count     The amount of lines to be generated (can use 1_000_000_000)

Verify

You can verify the generated measurements.txt with a SHA256 utility:

Linux

$ sha256sum ./data/measurements.txt

Windows (PowerShell)

Get-FileHash .\data\measurements.txt -Algorithm SHA256

Expected SHA256 hash: ebad17b266ee9f5cb3d118531f197e6f68c9ab988abc5cb9506e6257e1a52ce6
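
If neither utility is available, the same check can be sketched in a few lines of Python. This is an illustrative alternative, not part of the official tooling. The names `sha256_of_file` and `matches_expected` are made up for this example. The file is read in chunks so that a multi-gigabyte measurements.txt never has to fit in memory.

```python
import hashlib

# Expected hash as published in this README.
EXPECTED_SHA256 = "ebad17b266ee9f5cb3d118531f197e6f68c9ab988abc5cb9506e6257e1a52ce6"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading 1 MiB at a time."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str) -> bool:
    """True when the file's digest equals the published hash."""
    return sha256_of_file(path) == EXPECTED_SHA256
```

Calling `matches_expected("./data/measurements.txt")` should return True for a correctly generated file.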

NOTE

I'm still being lazy and I need to do the baseline in order for us to have the same SHA256 value for an official output.

Results

These are the results from running all entries in the challenge on my personal computer:

  • Ubuntu 23.10 64-bit
  • Ryzen 9 5950X, 16 cores
  • 32GB RAM
  • 250GB SSD
  • 1TB HDD

#   Result (m:s.ms): SSD   Result (m:s.ms): HDD   Compiler                 Submitter        Notes              Certificates
1   0:29.212               2:2.504                lazarus-3.0, fpc-3.2.2   Székely Balázs   Using 16 threads

Evaluating Results

Each contender is run 10 times in a row for both SSD and HDD, using hyperfine to measure the time.
The mean value of the 10 runs is the result for that contender and will be added to the results table above.
The exact same measurements.txt file is used for evaluating all contenders.
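
The "mean of 10 runs" idea can be illustrated with a small Python sketch. It is not a replacement for hyperfine, which handles details such as warm-up and outlier detection that this sketch ignores. The helper names `mean_runtime` and `format_duration` are hypothetical, and the m:s.ms formatting is only an assumption inferred from the results table.

```python
import statistics
import subprocess
import sys
import time
from typing import List


def mean_runtime(command: List[str], runs: int = 10) -> float:
    """Run a command `runs` times and return the mean wall time in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the program's output so printing does not clutter the run.
        subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations)


def format_duration(seconds: float) -> str:
    """Format seconds as m:s.ms, e.g. 29.212 -> '0:29.212'."""
    minutes, rest = divmod(seconds, 60)
    return f"{int(minutes)}:{rest:.3f}"


if __name__ == "__main__":
    # Harmless stand-in command; a real run would pass an entry's binary.
    print(format_duration(mean_runtime([sys.executable, "-c", "pass"], runs=3)))
```

In a real evaluation the command list would hold the path to an entry's compiled binary and its arguments, which vary per entry.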

Prize

This is being run for bragging rights only and the fun of such a challenge.

FAQ

Q: Can I copy code from other submissions?
A: Yes, you can. The primary focus of the challenge is learning something new, rather than "winning". When you do so, please give credit to the relevant source submissions. Please don't re-submit other entries with no or only trivial improvements.

Q: What is the encoding of the measurements.txt file?
A: The file is encoded with UTF-8.

Q: Which operating system is used for evaluation?
A: Ubuntu 23.10.

Honourable Mentions

I'd like to thank @paweld for taking us from my miserable 20-minute attempt to a whopping ~25 seconds, beating the Python script by about four and a half minutes.
I'd like to thank @mobius for taking the time to provide the Delphi version of the generator.

Links

The original repository: https://github.com/gunnarmorling/1brc
I found out about it by watching this video about an attempt in Go: https://www.youtube.com/watch?v=cYng524S-MA
The blog post in question: https://www.bytesizego.com/blog/one-billion-row-challenge-go

License

This code base is available under the MIT License.

Code of Conduct

Be excellent to each other!
More than winning, the purpose of this challenge is to have fun and learn something new.

Contributors

corneliusdavid, dtpfl, gcarreno, mobius1qwe, paweld
