
Comments (7)

saffsd avatar saffsd commented on August 29, 2024

Thanks for getting in touch! Judging from the error message, this is an OS-level issue; I've only tested the software on Linux. My guess is that on Windows file descriptors can't be passed between processes, which breaks the technique I use to parallelize the learning.

from langid.py.

 avatar commented on August 29, 2024

Thanks for the reply. Can it be fixed on Windows?


saffsd avatar saffsd commented on August 29, 2024

I have changed the way the processes communicate in the parallelization, and the new method works for me under Win7. You can find the implementation in the windows-train branch. Please give it a try and let me know if it works for you. I still need to test it a bit to make sure it works on Linux-based systems and produces the same output.
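The thread does not say what actually changed in the windows-train branch. Purely as an illustration of the kind of portable communication that avoids sharing open file handles, here is a minimal sketch in which each worker receives a file path (a plain string) and opens the file itself. The names `count_terms` and `merge_counts` are hypothetical and are not taken from langid.py.

```python
from collections import Counter


def count_terms(path):
    """Hypothetical worker: receives a path and opens the file itself,
    so no open file handle has to cross a process boundary."""
    with open(path, encoding="utf-8") as handle:
        return Counter(handle.read().split())


def merge_counts(partial_counts):
    """Combine the per-file counters returned by the workers."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total
```

With a worker pool, this would be used as `merge_counts(pool.map(count_terms, paths))`. Paths and `Counter` objects can both be pickled, so the same code should behave the same way whether worker processes are forked (the Linux default at the time) or spawned (Windows).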


 avatar commented on August 29, 2024

I tested the new code. I was able to generate a features file. I did however get an error at the end:

...
processed chunk (60/64) [723 terms]
processed chunk (61/64) [709 terms]
processed chunk (62/64) [710 terms]
processed chunk (63/64) [692 terms]
processed chunk (64/64) [716 terms]
selected 300 features
wrote features to "features"
Exception RuntimeError: RuntimeError('cannot join current thread',) in <Finalize object, dead> ignored


saffsd avatar saffsd commented on August 29, 2024

The RuntimeError is a quirk of multiprocessing and the way it shuts down its worker pools. It shouldn't affect the result of the computation, but it's rather annoying, so I've added calls to join() to ensure the pools are fully shut down before the program terminates.
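A minimal sketch of that shutdown pattern, assuming a standard `multiprocessing.Pool`: call `close()` so no further tasks are accepted, then `join()` to wait for the workers to exit. The function names and the `pool_factory` parameter are hypothetical and only illustrate the idea; they are not the code in langid.py.

```python
import multiprocessing


def chunk_size(chunk):
    """Hypothetical stand-in for the real per-chunk work."""
    return len(chunk)


def process_chunks(chunks, workers=2, pool_factory=multiprocessing.Pool):
    """Map chunk_size over the chunks, then shut the pool down explicitly."""
    pool = pool_factory(workers)
    try:
        return pool.map(chunk_size, chunks)
    finally:
        pool.close()  # no more tasks will be submitted
        pool.join()   # block until every worker has exited
```

On Windows, `process_chunks` should be called from under an `if __name__ == "__main__":` guard, because worker processes there are started by re-importing the main module. `join()` must come after `close()` (or `terminate()`), otherwise it raises an error.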


 avatar commented on August 29, 2024

Now it works without any errors. Thanks!


saffsd avatar saffsd commented on August 29, 2024

I did a quick check and the updated training code appears to produce the same output as the old code on Linux systems, so I have merged it back into master and am closing this issue. Cheers!

