
stenway / rsv-challenge


RSV = Rows of String Values

Home Page: https://www.stenway.com

License: Other

Fortran 6.24% TypeScript 15.83% C# 5.54% Java 14.42% Go 6.45% C 6.05% Python 4.19% C++ 3.56% JavaScript 3.27% PHP 2.62% Visual Basic .NET 2.93% MATLAB 3.21% Kotlin 2.96% Pascal 3.97% Swift 2.44% Ruby 2.69% Rust 2.99% Lua 2.96% Nim 3.08% Zig 4.60%
csv dsv rsv tsv

rsv-challenge's People

Contributors

stenway


rsv-challenge's Issues

Why require a field separator after a null value?

Why would you require a field separator (0xFF) after a null value (0xFE)? Why not have 0xFE signal both the null value and the end of the field? That would make reading RSV much easier: on reading a 0xFE, the decoder could emit the null value immediately. It would not need to wait for the 0xFF and then look back in the buffer to check that it holds exactly one byte and that this byte is the null marker. IMHO that would make streaming much easier.
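To make the question concrete, here is a minimal sketch of a byte-at-a-time decoder for the format as it stands today (0xFF ends a value, 0xFE is the null marker, 0xFD ends a row). It is only an illustration: StreamDecoder, DecodeError and push are hypothetical names, not code from the repository. It shows that the null check happens when the 0xFF arrives, at the same place where an ordinary value is finalized.

```rust
/// Incremental RSV decoder sketch: feed it one byte at a time.
/// All names are illustrative and not taken from the repository.
#[derive(Default)]
struct StreamDecoder {
    value: Vec<u8>,           // bytes of the value currently being read
    row: Vec<Option<String>>, // values of the row currently being read
}

#[derive(Debug, PartialEq)]
enum DecodeError {
    InvalidValue,  // value bytes are not valid UTF-8 (this includes a stray 0xFE)
    IncompleteRow, // 0xFD arrived while a value was still unterminated
}

impl StreamDecoder {
    /// Consumes one byte and returns a finished row when `byte` is the row terminator.
    fn push(&mut self, byte: u8) -> Result<Option<Vec<Option<String>>>, DecodeError> {
        match byte {
            0xFF => {
                // Value terminator: only now is it known whether the value was null.
                let bytes = std::mem::take(&mut self.value);
                let value = if bytes == [0xFE] {
                    None
                } else {
                    Some(String::from_utf8(bytes).map_err(|_| DecodeError::InvalidValue)?)
                };
                self.row.push(value);
                Ok(None)
            }
            0xFD => {
                // Row terminator: every value must already be closed by 0xFF.
                if !self.value.is_empty() {
                    return Err(DecodeError::IncompleteRow);
                }
                Ok(Some(std::mem::take(&mut self.row)))
            }
            other => {
                // Ordinary byte (or the null marker): buffer it until 0xFF arrives.
                self.value.push(other);
                Ok(None)
            }
        }
    }
}
```

A caller would invoke push for each incoming byte and collect every Some(row) it returns. Under the change proposed in this issue, the 0xFE case could push None right away instead of being buffered.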

There is a little problem with the C implementation

In the createBytes and createString functions, after the line result->buffer = malloc(result->length); you check whether result is null instead of checking whether result->buffer is null, so a failed buffer allocation goes undetected.

Also, in the deleteBytes function there's no point in zeroing out the bytes->buffer and bytes->length values, as bytes itself is freed.

I learned about the RSV format from your video, and I think it's pretty cool. I'll try to use it wherever possible. Also, this project is awesome.

[Rust impl] A less imperative implementation with iterators?

Hi, I liked your video and wanted to check out the Rust impl, and I felt that it could be made better. Here's what I came up with. It uses thiserror for a proper error type, but a Box<dyn Error> should work fine as well.

#[derive(Debug, PartialEq, Eq, thiserror::Error)]
enum RSVError {
    #[error("incomplete document")]
    IncompleteDocument,
    #[error("incomplete row")]
    IncompleteRow,
    #[error("invalid string")]
    InvalidString(#[from] std::string::FromUtf8Error),
}

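fn decode_rsv(bytes: &[u8]) -> Result<Vec<Vec<Option<String>>>, RSVError> {
    // An empty document holds zero rows.
    if bytes.is_empty() {
        return Ok(Vec::new());
    }
    // Every row ends with 0xFD. Stripping the final terminator first keeps
    // `split` from yielding a spurious empty row after it.
    let rows = bytes
        .strip_suffix(&[0xFDu8])
        .ok_or(RSVError::IncompleteDocument)?;
    rows.split(|b| *b == 0xFD)
        .map(|row| -> Result<Vec<Option<String>>, RSVError> {
            // A row without values consists of the row terminator only.
            if row.is_empty() {
                return Ok(Vec::new());
            }
            // Every value ends with 0xFF; strip the last one for the same reason.
            let values = row.strip_suffix(&[0xFFu8]).ok_or(RSVError::IncompleteRow)?;
            values
                .split(|b| *b == 0xFF)
                .map(|field| -> Result<Option<String>, RSVError> {
                    Ok(match field {
                        [0xFE] => None,
                        bytes => Some(String::from_utf8(bytes.to_vec())?),
                    })
                })
                .collect()
        })
        .collect()
}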

use std::iter::once;

fn encode_rsv(fields: &[&[Option<String>]]) -> Vec<u8> {
    fields
        .iter()
        .flat_map(|&line| {
            line.iter()
                .map(Option::as_ref)
                .flat_map(|field| {
                    field
                        .map_or(&[0xFE][..], |item| item.as_bytes())
                        .iter()
                        .chain(once(&0xFF))
                })
                .chain(once(&0xFD))
        })
        .copied()
        .collect()
}
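As a cross-check of the byte layout such an encoder should produce, here is a plain-loop sketch using only the standard library. The name encode_rows and the &str-based signature are assumptions made for brevity, not part of the proposal above.

```rust
/// Encodes rows of optional strings into RSV bytes (illustrative sketch).
fn encode_rows(rows: &[Vec<Option<&str>>]) -> Vec<u8> {
    let mut out = Vec::new();
    for row in rows {
        for value in row {
            match value {
                Some(text) => out.extend_from_slice(text.as_bytes()),
                None => out.push(0xFE), // null marker
            }
            out.push(0xFF); // value terminator, written after null values too
        }
        out.push(0xFD); // row terminator
    }
    out
}
```

For the single row ["Hello", "🌎", null, ""] this yields the bytes 72 101 108 108 111 255 240 159 140 142 255 254 255 255 253, the same sequence used in the Elixir unit test further down this page.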

I also noticed that functions that already return Result<T, E> don't propagate I/O errors but panic instead, for example:

fn load_rsv(file_path: &str) -> Result<Vec<Vec<Option<String>>>, Box<dyn Error>> {
    let bytes = fs::read(file_path).expect("Could not load RSV file");
    return decode_rsv(bytes);
}

This could just be:

fn load_rsv(file_path: &str) -> Result<Vec<Vec<Option<String>>>, Box<dyn Error>> {
    Ok(decode_rsv(fs::read(file_path)?)?) // Second ? needed assuming decode_rsv is using a concrete error type
}
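A self-contained sketch of the same idea, using only the standard library. The decode_rsv shown here is a simplified stand-in written for this example (it takes a byte slice and reports errors as Box<dyn Error>); it is not the repository's decoder. The point is the single ? in load_rsv, which hands a failed read to the caller instead of panicking.

```rust
use std::error::Error;
use std::fs;

// Stand-in decoder so the example compiles on its own (an assumption,
// not the repository's implementation).
fn decode_rsv(bytes: &[u8]) -> Result<Vec<Vec<Option<String>>>, Box<dyn Error>> {
    let mut rows = Vec::new();
    let mut row = Vec::new();
    let mut start = 0; // index where the current value begins
    for (i, &b) in bytes.iter().enumerate() {
        match b {
            0xFF => {
                let raw = &bytes[start..i];
                row.push(if raw == [0xFE] {
                    None
                } else {
                    Some(String::from_utf8(raw.to_vec())?)
                });
                start = i + 1;
            }
            0xFD => {
                if start != i {
                    return Err("incomplete RSV row".into());
                }
                rows.push(std::mem::take(&mut row));
                start = i + 1;
            }
            _ => {}
        }
    }
    if start != bytes.len() || !row.is_empty() {
        return Err("incomplete RSV document".into());
    }
    Ok(rows)
}

fn load_rsv(file_path: &str) -> Result<Vec<Vec<Option<String>>>, Box<dyn Error>> {
    // `?` converts the io::Error into Box<dyn Error> and returns it to the caller.
    let bytes = fs::read(file_path)?;
    decode_rsv(&bytes)
}
```

With this shape, a missing file surfaces as an Err value that the caller can match on or report, rather than aborting the program through expect.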

[Rust] rsv crate with serde support

I decided to just go ahead and make an RSV encoder/decoder using the serde framework for Rust. It should be compatible with most structs and types that implement the Serialize/Deserialize traits (work in progress).

Probably a bit too big to shove into this challenge but figured I'd link it in case someone else wants to play around with the crate.

Foreign-function-interface solutions for PHP, ...

@reed6514

Do you have foreign-function-interface solutions for any of the slow languages like PHP?

I suspect the C implementation of the json_decode/json_encode functions is much faster than any PHP implementation of RSV decode/encode could be.

RSV in Elixir

Here is a quick attempt at RSV in Elixir:

defmodule RSV do
  @end_of_rec  <<0xFD>>
  @null        <<0xFE>>
  @end_of_item <<0xFF>>

  def encode(list), do: Enum.reduce(list, <<>>, fn row, acc -> acc <> encode_row(row) end)
  defp encode_row(row), do: Enum.reduce(row, <<>>, fn item, acc -> acc <> encode_item(item) <> @end_of_item end) <> @end_of_rec
  defp encode_item(nil), do: @null
  defp encode_item(val) when is_binary(val), do: val

  def decode(data), do: data |> split(@end_of_rec, "document") |> Enum.map(&decode_row/1)
  defp decode_row(row), do: row |> split(@end_of_item, "row") |> Enum.map(&decode_item/1)
  defp decode_item(@null), do: nil
  defp decode_item(val), do: val

  # An empty document has no rows and an empty row has no items.
  defp split(<<>>, _separator, _component), do: []
  defp split(data, separator, component) do
    String.ends_with?(data, separator) || raise "Incomplete RSV #{component}"
    String.split(data, separator) |> Enum.drop(-1)
  end
end

and some unit tests:

defmodule RSVTest do
  use ExUnit.Case
  doctest RSV

  def hello_world_rsv, do: <<72, 101, 108, 108, 111, 255, 240, 159, 140, 142, 255, 254, 255, 255, 253>>

  def hello_world, do: [["Hello", "🌎", nil, ""]]

  test "greets the world" do
    assert RSV.encode(hello_world()) == hello_world_rsv()
  end

  test "decodes the world" do
    assert RSV.decode(hello_world_rsv()) == hello_world()
  end

  test "encodes and decodes" do
    assert RSV.encode(hello_world()) |> RSV.decode() == hello_world()
  end
end
