
Comments (9)

GoogleCodeExporter commented on May 21, 2024
Hi,

Interesting. When I run on that file, there is an exception from a bug (which I 
have fixed), but it is not that exception. That stack trace looks an awful lot 
like the caching inside the java builtin Long class is doing funny things -- 
might it have something to do with your ExecJavaMojo calling things through 
reflection?

In any case, I have fixed the bug and am running some tests before I release a 
fix. 1.1.1 should be out by tomorrow.

Original comment by [email protected] on 9 Aug 2012 at 5:31

  • Changed state: Started

from berkeleylm.

GoogleCodeExporter commented on May 21, 2024
Hi,

Thanks for looking into the issue so quickly.

Interesting that you don't see the same exception. I assume that since
berkeleylm is written in Java it should support input encoded in UTF-8. Is
that a fair assumption?

I have tried calling the program through maven (I imported all the source)
and also without using maven at all and see the same exception in both
cases which is a bit odd if it is caused by reflection.

Original comment by [email protected] on 9 Aug 2012 at 5:43


GoogleCodeExporter commented on May 21, 2024
UTF-8 should be fine. Hopefully the fix I've committed will resolve your issue 
in any case.

Original comment by [email protected] on 9 Aug 2012 at 7:33


GoogleCodeExporter commented on May 21, 2024
Apologies, I fell asleep on this fix. Version 1.1.1 has been uploaded. Let me 
know if this doesn't fix your issue. 

Original comment by [email protected] on 13 Aug 2012 at 2:02

  • Changed state: Fixed


GoogleCodeExporter commented on May 21, 2024
I unzipped the new 1.1.1 code but unfortunately am still seeing the same 
ArrayIndexOutOfBoundsException. I have tried on a different input data set in 
case that was the problem (en-test.txt, attached below) but I see the same 
problem on that input.

Here are the steps I took to produce the error:

1. Unzip the code
2. cd to the top level directory, berkeleylm-1.1.1
3. Run ant from the top level directory
4. From the top level directory, run:
java -cp jar/berkeleylm.jar edu.berkeley.nlp.lm.io.MakeKneserNeyArpaFromText 5 test-en.model en-test.txt
5. Output is:
Reading text files [en-test.txt] and writing to file test-en.model {
    Reading in ngrams from raw text {
        On line 0
    } [2s]
    Writing Kneser-Ney probabilities {
        Counting counts for order 0 {
        } [0s]
        Counting counts for order 1 {
        } [0s]
        Counting counts for order 2 {
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 256
    at java.lang.Long.valueOf(Long.java:548)
    at edu.berkeley.nlp.lm.map.ExplicitWordHashMap$KeyIterator.next(ExplicitWordHashMap.java:140)
    at edu.berkeley.nlp.lm.map.ExplicitWordHashMap$KeyIterator.next(ExplicitWordHashMap.java:121)
    at edu.berkeley.nlp.lm.collections.Iterators$Transform.next(Iterators.java:107)
    at edu.berkeley.nlp.lm.io.KneserNeyLmReaderCallback.parse(KneserNeyLmReaderCallback.java:284)
    at edu.berkeley.nlp.lm.io.LmReaders.createKneserNeyLmFromTextFiles(LmReaders.java:299)
    at edu.berkeley.nlp.lm.io.MakeKneserNeyArpaFromText.main(MakeKneserNeyArpaFromText.java:57)

Original comment by [email protected] on 15 Aug 2012 at 11:34

Attachments:


GoogleCodeExporter commented on May 21, 2024
Followed your steps and did not encounter any exceptions. I'm guessing this is 
a bug in your JVM -- the exception is occurring while boxing a long! You can 
try using a different JVM, or even try using -server (which you should do 
anyway, for speed). 
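
For readers following along, here is a minimal sketch of what "boxing a long" involves. It assumes the cache behind Long.valueOf is a 256-entry array indexed by value + 128, which matches the OpenJDK 6 sources as far as I know but is not confirmed anywhere in this thread. The class and method names (LongBoxingDemo, inCacheRange, cacheIndex, sameBoxedInstance) are made up for illustration. Under that assumption, the index 256 in the stack trace corresponds to the value 128, which a correct range check should never let reach the array; that would be consistent with a JVM bug rather than a berkeleylm bug.

```java
// Illustrative sketch only; not code from berkeleylm or the JDK.
final class LongBoxingDemo {
    // Range that Long.valueOf is documented to always cache.
    static final long CACHE_LOW = -128L;
    static final long CACHE_HIGH = 127L;

    // True if the value falls inside the documented cache range.
    static boolean inCacheRange(long value) {
        return value >= CACHE_LOW && value <= CACHE_HIGH;
    }

    // ASSUMPTION: the cache is an array indexed by value + 128,
    // so valid indices run from 0 to 255.
    static int cacheIndex(long value) {
        return (int) (value - CACHE_LOW);
    }

    // Boxing the same in-range value twice yields the same cached object.
    static boolean sameBoxedInstance(long value) {
        return Long.valueOf(value) == Long.valueOf(value);
    }

    public static void main(String[] args) {
        long[] samples = {-128L, 0L, 127L, 128L};
        for (long v : samples) {
            System.out.println(v
                    + " inCacheRange=" + inCacheRange(v)
                    + " assumedIndex=" + cacheIndex(v)
                    + " sameInstance=" + sameBoxedInstance(v));
        }
    }
}
```

Running it on the affected JVM with and without -server might help show whether plain boxing misbehaves outside of berkeleylm.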



Original comment by [email protected] on 15 Aug 2012 at 5:10


GoogleCodeExporter commented on May 21, 2024
Thanks again for testing this out. It is quite odd that the error comes from 
boxing a long. I ran both with and without -server but saw the exception in 
both cases. I'm going to try a different JVM. Would you mind posting the output 
you get from running "java -version" so that I can start with that 
implementation? I'm using HotSpot 64 bit:

$ java -version
java version "1.6.0_10"
Java(TM) SE Runtime Environment (build 1.6.0_10-b33)
Java HotSpot(TM) 64-Bit Server VM (build 11.0-b15, mixed mode)

Thanks for the help.

Original comment by [email protected] on 15 Aug 2012 at 5:28


GoogleCodeExporter commented on May 21, 2024
$ java -version
java version "1.6.0_33"
Java(TM) SE Runtime Environment (build 1.6.0_33-b03-424-10M3720)
Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03-424, mixed mode)
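
If it is useful for comparing environments, roughly the same details can be read from inside the running JVM through standard system properties. This is a small sketch; the class name JvmInfo and the describe helper are hypothetical, and the exact strings will differ between JVM vendors and builds.

```java
// Illustrative sketch only; prints which JVM is actually executing the code.
final class JvmInfo {
    // Builds a one-line summary from standard system properties.
    static String describe() {
        return "java.version=" + System.getProperty("java.version")
                + ", java.vm.name=" + System.getProperty("java.vm.name")
                + ", java.vm.version=" + System.getProperty("java.vm.version");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

This can help confirm that the JVM launched by ant or maven is the same one that "java -version" reports on the command line.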

Original comment by [email protected] on 15 Aug 2012 at 5:56


GoogleCodeExporter commented on May 21, 2024
I updated my java-6-sun JVM to 1.6.0_34; I had been using a version from 2008. I no 
longer see the exception. Looks like Oracle has been hard at work fixing 
autoboxing issues in the last few years. :)


Original comment by [email protected] on 15 Aug 2012 at 8:58
