Comments (7)

grantjenks avatar grantjenks commented on July 16, 2024

I'm not sure. Where did you see that "it cannot be used in any commercial application as the license of the data only allows it to be used for educational/research purposes"?

from python-wordsegment.

kootenpv avatar kootenpv commented on July 16, 2024

I've wanted to use the trillion corpus before... so I remember it from then.
You link to: https://catalog.ldc.upenn.edu/LDC2006T13 which under license contains a link to: https://catalog.ldc.upenn.edu/license/web-1t-5-gram-version-1.pdf
Then look at section 1.1 and 1.2 (can't copy paste it).

grantjenks avatar grantjenks commented on July 16, 2024

Hmm, that looks conclusive. I don't think I got it from LDC though. I thought I got it from the book publisher but I can't find that data now. And likely the publisher would have inherited the same restriction.

Perhaps I should make that clearer on the landing page.

How do you want to use it?

grantjenks avatar grantjenks commented on July 16, 2024

There's also corpus data at http://storage.googleapis.com/books/ngrams/books/datasetsv2.html

kootenpv avatar kootenpv commented on July 16, 2024

That one also mentions Creative Commons, which I believe is not fully compatible with Apache (w.r.t. commercial use). It's fun for experimentation though.

grantjenks avatar grantjenks commented on July 16, 2024

When I click the link it says: "You are free to: Adapt — remix, transform, and build upon the material
for any purpose, even commercially."

kootenpv avatar kootenpv commented on July 16, 2024

Oh my bad! I see that this is a different version. Great! I will have a go myself :)
