juliaio / codeclz4.jl

Transcoding codecs for compression and decompression with LZ4

Home Page: https://juliaio.github.io/CodecLz4.jl/

License: MIT License

Julia 100.00%

codeclz4.jl's People

Contributors

dependabot[bot], dm3, github-actions[bot], iamed2, jakobnissen, joaoaparicio, juliatagbot, mattbrzezinski, nhz2, omus, ranocha, robertfeldt, sjkelly, viralbshah, vtjnash

codeclz4.jl's Issues

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Error Transcoding File

Can't seem to get this to work (Windows 10, Julia 1.1.1).

MWE:

using CodecLz4, CodecZlib

function gz2lz4mwe(infile::String, outfile::String)

  @assert isfile(infile)

  indata = open(infile, "r")
  outdata = open(outfile, "w")

  s = GzipDecompressorStream(LZ4CompressorStream(outdata))

  write(s, indata)
  close(s)
end

gz2lz4mwe("data\\CS-A.csv.gz", "data\\CS-A.csv.lz4")

Error:

LoadError: LZ4F_compressUpdate: ERROR_dstMaxSize_tooSmall
in expression starting at C:\[...]\qfactors\QFactorScripts.jl:25
changemode!(::TranscodingStreams.TranscodingStream{LZ4Compressor,IOStream}, ::Symbol) at stream.jl:710
callprocess(::TranscodingStreams.TranscodingStream{LZ4Compressor,IOStream}, ::TranscodingStreams.Buffer, ::TranscodingStreams.Buffer) at stream.jl:641
flushbuffer(::TranscodingStreams.TranscodingStream{LZ4Compressor,IOStream}, ::Bool) at stream.jl:594
flushbufferall at stream.jl:601 [inlined]
writedata!(::TranscodingStreams.TranscodingStream{LZ4Compressor,IOStream}, ::TranscodingStreams.Buffer) at stream.jl:679
flushbuffer(::TranscodingStreams.TranscodingStream{GzipDecompressor,TranscodingStreams.TranscodingStream{LZ4Compressor,IOStream}}, ::Bool) at stream.jl:593
flushbuffer at stream.jl:584 [inlined]
unsafe_write(::TranscodingStreams.TranscodingStream{GzipDecompressor,TranscodingStreams.TranscodingStream{LZ4Compressor,IOStream}}, ::Ptr{UInt8}, ::UInt64) at stream.jl:458
unsafe_write at io.jl:509 [inlined]
macro expansion at gcutils.jl:87 [inlined]
write at io.jl:532 [inlined]
write(::TranscodingStreams.TranscodingStream{GzipDecompressor,TranscodingStreams.TranscodingStream{LZ4Compressor,IOStream}}, ::IOStream) at io.jl:579
gz2lz4mwe(::String, ::String) at QFactorScripts.jl:21
top-level scope at none:0

Problem building on Mac with Julia 1.0

Build problems when trying to install/build CodecLz4 for my MacBook Pro with Julia 1.0:

julia> Pkg.build("CodecLz4")
  Building CodecLz4 → `~/.julia/packages/CodecLz4/HaBdh/deps/build.log`
┌ Error: Error building `CodecLz4`:
│ ERROR: LoadError: UndefVarError: is_apple not defined
│ Stacktrace:
│  [1] getproperty(::Module, ::Symbol) at ./sysimg.jl:13
│  [2] top-level scope at none:0
│  [3] include at ./boot.jl:317 [inlined]
│  [4] include_relative(::Module, ::String) at ./loading.jl:1038
│  [5] include(::Module, ::String) at ./sysimg.jl:29
│  [6] include(::String) at ./client.jl:388
│  [7] top-level scope at none:0
│ in expression starting at /Users/feldt/.julia/packages/CodecLz4/HaBdh/deps/build.jl:17
└ @ Pkg.Operations /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.0/Pkg/src/Operations.jl:1068

Getting error when reading LZ4 compressed files from R

I use R's fst to compress vectors into LZ4.

But I get the error below when I try fnl_res = transcode(LZ4Decompressor, compressed_chunk). Reading the README and this thread (python-lz4/python-lz4#143) indicates that CodecLz4.jl uses the Frame format and does not support the standard LZ4 or LZ4_HC formats.

Is there any possibility of adding support for them?

ERROR: LZ4F_decompress: ERROR_frameType_unknown                       
Stacktrace:                                                           
 [1] transcode(::LZ4Decompressor, ::Array{UInt8,1}) at C:\Users\RTX2080\.julia\packages\TranscodingStreams\MsN8d\src\transcode.jl:121       
 [2] transcode(::Type{LZ4Decompressor}, ::Array{UInt8,1}) at C:\Users\RTX2080\.julia\packages\TranscodingStreams\MsN8d\src\transcode.jl:34  
 [3] uncompress_cfst(::String) at c:\scratch\fst.jl:32                
 [4] top-level scope at REPL[60]:1   

Decompression with known "uncompressed size"

What's the equivalent of the following Python code?

lz4.block.decompress(compressed, uncompressed_size=255)

Currently, decompression fails without this information.

I thought this is what I wanted:

julia> text
"aaaabbbb"

julia> stream = read(LZ4FrameCompressorStream(IOBuffer(text)), String)
"\x04\"M\x18@@\xc0\b\0\0\x80aaaabbbb\0\0\0\0"

julia> lz4_decompress(Vector{UInt8}(stream), length(text))
ERROR: LZ4_decompress_safe: Decompression failed.
Stacktrace:
 [1] check_decompression_error
   @ ~/.julia/packages/CodecLz4/2JFgC/src/headers/lz4.jl:10 [inlined]
 [2] LZ4_decompress_safe

But it does not work as I would expect.
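
One possible reading of the failure above: the compressed bytes begin with 04 22 4D 18, which is the LZ4 frame magic number, so they are a frame rather than a raw block, while lz4_decompress calls LZ4_decompress_safe (per the stack trace), which operates on raw blocks. A minimal sketch, assuming the lz4_compress and lz4_decompress functions exported by CodecLz4 both work on raw blocks, in which the data is compressed as a block before being decompressed with a known size:

```julia
using CodecLz4

text = "aaaabbbb"

# Compress as a raw LZ4 block (no frame header, no stored size).
block = lz4_compress(Vector{UInt8}(text))

# The raw block does not record the original size, so it is passed explicitly,
# in the same spirit as uncompressed_size in the Python call above.
restored = lz4_decompress(block, sizeof(text))

println(String(copy(restored)))
```

Data that was written with LZ4FrameCompressorStream would instead go through the frame decompressor, which does not need the size up front.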

Transfer to JuliaIO org?

Now that all the other CodecX packages have been moved to the JuliaIO org, would it make sense to transfer this one there as well? If Invenia wants to add a little note in the README to indicate that it was developed there, that would be ok (in addition to the LICENSE, which already indicates that).

README's usage example doesn't work

using CodecLz4

# Some text.
text = """
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean sollicitudin
mauris non nisi consectetur, a dapibus urna pretium. Vestibulum non posuere
erat. Donec luctus a turpis eget aliquet. Cras tristique iaculis ex, eu
malesuada sem interdum sed. Vestibulum ante ipsum primis in faucibus orci luctus
et ultrices posuere cubilia Curae; Etiam volutpat, risus nec gravida ultricies,
erat ex bibendum ipsum, sed varius ipsum ipsum vitae dui.
"""

# Streaming API.
stream = LZ4FrameCompressorStream(IOBuffer(text))
for line in eachline(LZ4DecompressorStream(stream))
    println(line)
end
close(stream)

This returns ERROR: UndefVarError: LZ4DecompressorStream not defined

I suspect
for line in eachline(LZ4DecompressorStream(stream))
should be
for line in eachline(LZ4FrameDecompressorStream(stream))?
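
A sketch of the example with that substitution applied, assuming LZ4FrameDecompressorStream is the exported counterpart of LZ4FrameCompressorStream. The text is shortened here; the flow is otherwise the same as in the README snippet:

```julia
using CodecLz4

text = "first line\nsecond line\n"

# Reading from the compressor stream yields LZ4 frame data;
# wrapping it in the frame decompressor stream recovers the text.
stream = LZ4FrameCompressorStream(IOBuffer(text))
lines = collect(eachline(LZ4FrameDecompressorStream(stream)))
close(stream)

foreach(println, lines)
```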

Unable to pass LZ4FrameCompressor instance to transcode

Should this work? It seems to be supported by other codecs:

julia> transcode(LZ4FrameCompressor(; compressionlevel=4), bytes)
ERROR: LZ4F_cctx: Uninitialized compression context
Stacktrace:
 [1] transcode(codec::LZ4FrameCompressor, data::Vector{UInt8})
   @ TranscodingStreams ~/.julia/packages/TranscodingStreams/MsN8d/src/transcode.jl:121
 [2] top-level scope
   @ REPL[93]:1

Is there another (better?) way to compress a Vector{UInt8} at a user-provided compression level?
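
The TranscodingStreams documentation notes that the transcode method taking a codec instance does not initialize or finalize the codec, which may explain the "Uninitialized compression context" error. A minimal sketch under that assumption, initializing and finalizing the codec by hand; compress_with_level is a hypothetical helper name, not part of CodecLz4:

```julia
using CodecLz4

# TranscodingStreams is reached through CodecLz4 here, as in
# CodecLz4.TranscodingStreams.finalize(codec).
const TS = CodecLz4.TranscodingStreams

# Hypothetical helper: compress bytes at a caller-chosen compression level.
function compress_with_level(bytes::Vector{UInt8}, level::Integer)
    codec = LZ4FrameCompressor(; compressionlevel=level)
    TS.initialize(codec)          # allocate the compression context
    try
        return transcode(codec, bytes)
    finally
        TS.finalize(codec)        # free the context even if transcode throws
    end
end

bytes = Vector{UInt8}("hello "^100)
compressed = compress_with_level(bytes, 4)
println(length(compressed), " bytes compressed from ", length(bytes))
```

When many buffers are compressed at the same level, the codec could be initialized once and reused across transcode calls before being finalized.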

Fails to precompile on Apple M1

Running on the sf/apple_silicon_latest branch of Julia, I get this when trying to import:

julia> import CodecLz4
[ Info: Precompiling CodecLz4 [5ba52731-8f18-5e0d-9241-30f10d1ec561]
ERROR: LoadError: UndefVarError: liblz4 not defined
Stacktrace:
 [1] include
   @ ./Base.jl:386 [inlined]
 [2] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt64}}, source::Nothing)
   @ Base ./loading.jl:1213
 [3] top-level scope
   @ none:1
 [4] eval
   @ ./boot.jl:369 [inlined]
 [5] eval(x::Expr)
   @ Base.MainInclude ./client.jl:453
 [6] top-level scope
   @ none:1
in expression starting at /Users/me/.julia/packages/CodecLz4/2JFgC/src/CodecLz4.jl:2
ERROR: Failed to precompile CodecLz4 [5ba52731-8f18-5e0d-9241-30f10d1ec561] to /Users/me/.julia/compiled/v1.7/CodecLz4/jl_iZRXpo.
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:33

I did install the lz4 library through brew, but I don't think that it's finding the library.

Decompression finalize fails after frame error

If an LZ4Decompressor cannot decode the frame header, finalize() gives a malloc error.

julia> using CodecLz4

julia> codec = LZ4Decompressor()
CodecLz4.LZ4Decompressor(Base.RefValue{Ptr{CodecLz4.LZ4F_dctx}}(Ptr{CodecLz4.LZ4F_dctx} @0x0000000000000000))

julia> try
           data = transcode(codec, "not properly formatted")
           println(data)
           
       catch e
           println(e)
       finally
           try
               CodecLz4.check_context_initialized(codec.dctx[])
               println("codec exists")
           catch
               println("empty codec")
           end
           CodecLz4.TranscodingStreams.finalize(codec)
       end
ErrorException("LZ4F_decompress: ERROR_frameType_unknown")
codec exists
julia(25795,0x7fffa6174340) malloc: *** error for object 0x309371a21220309: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug

signal (6): Abort trap: 6

Looking into the LZ4 source code here, it seems that the temporary buffers don't get allocated properly before they are freed.

Various Issues

  • Use Julia types in interfaces. For example, LZ4Compressor requires the keyword blockmode to be a Cuint. The keyword can only be 0 for LZ4F_blockLinked or 1 for LZ4F_blockIndependent. We could make this an Integer, a Bool, or an @enum (an enum is probably best in this case).

    julia> TranscodingStream(LZ4Compressor(blockmode=0), stream)
    ERROR: TypeError: Type: in typeassert, expected UInt32, got Int64
  • Note that on 32-bit systems Int is an Int32. This may relate to your 32-bit issues.

  • I need to read up on this but I think Memory(pointer(""), 0) could cause a crash

  • Update README links away from "morris25" to "invenia"
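
The @enum suggestion in the first item could look roughly like the following. This is only a sketch: BlockMode, block_linked, block_independent and to_blockmode are illustrative names, not existing CodecLz4 identifiers.

```julia
# Hypothetical enum; the names are illustrative and not part of CodecLz4.
@enum BlockMode::Cuint begin
    block_linked = 0       # corresponds to LZ4F_blockLinked
    block_independent = 1  # corresponds to LZ4F_blockIndependent
end

# Accept either the enum or any Integer and return the Cuint the C API expects.
to_blockmode(mode::BlockMode) = Cuint(mode)
# BlockMode(mode) throws an ArgumentError for anything other than 0 or 1.
to_blockmode(mode::Integer) = to_blockmode(BlockMode(mode))

println(to_blockmode(0))
println(to_blockmode(block_independent))
```

With a conversion like this, blockmode=0 would be accepted whether Int is 32 or 64 bits wide, and out-of-range values would be rejected before reaching the C library.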