vigna / dsiutils

The DSI Utilities are a mishmash of classes accumulated during the last twenty years in projects developed at the DSI (Dipartimento di Scienze dell'Informazione, i.e., Information Sciences Department), now DI (Dipartimento di Informatica, i.e., Informatics Department), of the Università degli Studi di Milano.

Home Page: http://dsiutils.di.unimi.it/

License: GNU Lesser General Public License v2.1

Shell 0.10% Makefile 0.10% Java 99.12% HTML 0.16% CSS 0.53%
data-structures pseudorandom-number-generator mutable-strings java

dsiutils's People

Contributors

seirl, vigna

dsiutils's Issues

FrontCodedStringBigList deserialization from file errors when containing >2^31 items

Hello!

After building a FrontCodedStringBigList with >2^31 elements from the CLI, I get an error when I try to deserialize it. I couldn't find the source code of CharArrayFrontCodedList, so I wasn't able to investigate further.

Steps to reproduce

Generating a small file which works fine:

$ seq 1 47483700 | ~/swhgraph.sh it.unimi.dsi.big.util.FrontCodedStringBigList test-underflow.fcl
2021-06-11 21:56:02,043 149 INFO [main] i.u.d.b.u.FrontCodedStringBigList - Reading strings...
2021-06-11 21:56:09,137 7243 INFO [main] i.u.d.b.u.FrontCodedStringBigList - Completed.
2021-06-11 21:56:09,143 7249 INFO [main] i.u.d.b.u.FrontCodedStringBigList - Elapsed: 7s [47,483,700 strings, 6,694,445.23 strings/s, 149.38 ns/string]; used/avail/free/total/max mem: 1.38G/320.74G/1.60G/2.99G/322.12G
2021-06-11 21:56:09,143 7249 INFO [main] i.u.d.b.u.FrontCodedStringBigList - Writing front-coded list to file...
2021-06-11 21:56:10,000 8106 INFO [main] i.u.d.b.u.FrontCodedStringBigList - Completed.

$ java -cp ~/swh-graph-0.4.0.jar -Xmx100G FCLTest.java test-underflow.fcl
<no error>

Generating a file with >2^31 elements, which overflows:

% seq 1 2147483700 | ~/swhgraph.sh it.unimi.dsi.big.util.FrontCodedStringBigList test-overflow.fcl
[...]
2021-06-11 21:53:31,456 294037 INFO [main] i.u.d.b.u.FrontCodedStringBigList - Elapsed: 4m 53s [2,147,483,700 strings, 7,307,224.59 strings/s, 136.85 ns/string]; used/avail/free/total/max mem: 31.36G/290.77G/1.69G/33.05G/322.12G
2021-06-11 21:53:31,456 294037 INFO [main] i.u.d.b.u.FrontCodedStringBigList - Writing front-coded list to file...
2021-06-11 21:54:13,415 335996 INFO [main] i.u.d.b.u.FrontCodedStringBigList - Completed.

% java -cp ~/swh-graph-0.4.0.jar -Xmx100G FCLTest.java test-overflow.fcl
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 95049506 out of bounds for length 95049506
        at it.unimi.dsi.fastutil.BigArrays.get(BigArrays.java:4096)
        at it.unimi.dsi.fastutil.chars.CharArrayFrontCodedList.readInt(CharArrayFrontCodedList.java:180)
        at it.unimi.dsi.fastutil.chars.CharArrayFrontCodedBigList.rebuildPointerArray(CharArrayFrontCodedBigList.java:433)
        at it.unimi.dsi.fastutil.chars.CharArrayFrontCodedBigList.readObject(CharArrayFrontCodedBigList.java:447)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at java.base/java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1175)
        at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2325)
        at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
        at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
        at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2464)
        at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2358)
        at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2196)
        at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1679)
        at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:493)
        at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:451)
        at it.unimi.dsi.fastutil.io.BinIO.loadObject(BinIO.java:95)
        at it.unimi.dsi.fastutil.io.BinIO.loadObject(BinIO.java:106)
        at FCLTest.main(FCLTest.java:9)

The FCLTest.java file only contains this code to deserialize and load the FCL:

import it.unimi.dsi.big.util.FrontCodedStringBigList;
import it.unimi.dsi.fastutil.io.BinIO;

import java.io.IOException;

public class FCLTest {
    public static void main(final String[] argv) throws IOException, ClassNotFoundException {
        // Deserialize the front-coded list from the file given as first argument.
        FrontCodedStringBigList testMap = (FrontCodedStringBigList) BinIO.loadObject(argv[0]);
    }
}

Thanks!
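
For context, the failing run stores 2,147,483,700 strings, which is more than Integer.MAX_VALUE (2^31 - 1), while the working run stores 47,483,700. The sketch below only illustrates what happens to such a count when it is narrowed to an int. It is not a diagnosis of the actual bug, and the class and method names are hypothetical.

```java
public class IntNarrowingDemo {
    // Number of strings in the failing run, taken from the log above.
    static final long ELEMENTS = 2_147_483_700L;

    // Narrowing a long to an int keeps only the low 32 bits, so values
    // above Integer.MAX_VALUE wrap around silently.
    public static int narrow(long value) {
        return (int) value;
    }

    // Math.toIntExact throws ArithmeticException instead of wrapping.
    public static boolean fitsInInt(long value) {
        try {
            Math.toIntExact(value);
            return true;
        } catch (ArithmeticException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(narrow(ELEMENTS));       // prints -2147483596
        System.out.println(fitsInInt(47_483_700L)); // prints true
        System.out.println(fitsInInt(ELEMENTS));    // prints false
    }
}
```

Any code path that keeps a count or an index in an int would see the wrapped value for the second file but not for the first, which is consistent with only the larger file failing.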

Add JMH test to repository.

The package summary for it.unimi.dsi.util says:

The timings were measured on an Intel® Core™ i7-8700B CPU @3.20GHz using JMH microbenchmarks

Where are the JMH tests that were run? How reproducible are they? Which Java version were they run on?

These tests seem to date back to 2016: according to the CHANGES file, they were added in version 2.3.3, which was released on Maven Central in 2016. Are the results outdated?

They should almost certainly be re-run with newer Java versions, as there have been many performance improvements since then.

NoSuchMethodError on ByteBuffer.position calls

Calls to ByteBuffer.position (and possibly to other ByteBuffer methods) result in a java.lang.NoSuchMethodError: java.nio.ByteBuffer.position(I)Ljava/nio/ByteBuffer; error under a Java 8 JRE. The root cause may be the one reported in this issue: compiling with a Java 9 (9+?) JDK with --release 8 results in incorrect bytecode for Java 8.

I specifically ran into the error when using ByteBufferInputStream:

java.lang.NoSuchMethodError: java.nio.ByteBuffer.position(I)Ljava/nio/ByteBuffer;
   at it.unimi.dsi.io.ByteBufferInputStream.map(ByteBufferInputStream.java:127)
   at it.unimi.dsi.io.ByteBufferInputStream.map(ByteBufferInputStream.java:113)
   ...
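
For readers who hit the same error, a workaround often used in code that must run on Java 8 is to make the call through the java.nio.Buffer supertype. As far as I know, ByteBuffer only gained its own position(int) returning ByteBuffer in Java 9, whereas Buffer.position(int) returning Buffer also exists in Java 8, so a call site bound to the latter links on both runtimes. The sketch below illustrates that idea under this assumption. It is not the fix adopted by dsiutils, and the class and helper names are hypothetical.

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;

public class BufferPositionCompat {
    // Hypothetical helper: the cast makes the compiler emit a reference to
    // Buffer.position(int), which is present on Java 8 as well as later versions.
    public static ByteBuffer position(ByteBuffer buffer, int newPosition) {
        ((Buffer) buffer).position(newPosition);
        return buffer;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        position(buffer, 4);
        // Position is now 4, leaving 12 bytes before the limit.
        System.out.println(buffer.position() + " " + buffer.remaining()); // prints "4 12"
    }
}
```

The same cast applies to the other Buffer methods that ByteBuffer overrides with a narrower return type, such as limit(int) and flip().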
