Comments (6)

jeffThompson commented on May 26, 2024

Hmm. What line is the error showing up on? The code is intended for Python 2.7, so Python 3 might be the culprit.

from word2vecandtsne.

arthurbr140896 commented on May 26, 2024

Thank you so much for your response, it is much appreciated.

So I have installed Python 2.7.0.
It seems to have fixed the Tuple Parameter Error that I was getting, but I can't be sure yet.
I have run the code again and this is the traceback that I am getting:

Warning (from warnings module):
  File "C:\Python27\lib\site-packages\gensim\utils.py", line 860
    warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
UserWarning: detected Windows; aliasing chunkize to chunkize_serial
building vocabulary...
training model...

Traceback (most recent call last):
  File "C:\Users\arthu\Desktop\Arthur\Documents\UCI\Fall 2017\Independent Study with LaFarge\Slime Mold\Word2VecAndTsne-master (1)\Word2VecAndTsne-master\TrainModel.py", line 38, in <module>
    model.train(sentences)
  File "C:\Python27\lib\site-packages\gensim\models\word2vec.py", line 940, in train
    "You must specify either total_examples or total_words, for proper alpha and progress calculations. "
ValueError: You must specify either total_examples or total_words, for proper alpha and progress calculations. The usual value is total_examples=model.corpus_count.

Again thank you so much for your help.

Arthur Rodrigues

jeffThompson commented on May 26, 2024

You can just ignore the warning. See this post for how to suppress it, if you want to.
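
For reference, here is one way that particular warning could be silenced with Python's standard warnings module. This is only a minimal sketch and not necessarily what the linked post does. The helper names suppress_chunkize_warning and emit_gensim_style_warning are made up for illustration, and the second one is just a stand-in so the snippet runs without Gensim installed.

```python
import warnings


def suppress_chunkize_warning():
    # Ignore only the UserWarning whose text starts with this message.
    # The filter has to be installed BEFORE "import gensim", because the
    # warning is raised while Gensim is being imported.
    warnings.filterwarnings(
        "ignore",
        message="detected Windows; aliasing chunkize",
        category=UserWarning,
    )


def emit_gensim_style_warning():
    # Hypothetical stand-in that raises the same warning text as in the log above.
    warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")


suppress_chunkize_warning()
emit_gensim_style_warning()  # prints nothing once the filter is in place
```

In TrainModel.py the call to warnings.filterwarnings would go at the very top of the file, above the Gensim import.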

The error is, I think, due to a new version of Gensim. It looks like train() now needs two extra arguments. Can you try modifying this section of the TrainModel.py file to look like this:

# train model
print 'training model...'
if skip_gram:
	model.train(sentences, total_examples=self.corpus_count, epochs=self.iter, sg=1)
else:
	model.train(sentences, total_examples=self.corpus_count, epochs=self.iter)

And let me know if it works?

arthurbr140896 commented on May 26, 2024

Alright I tested it out and this is the traceback that I got:

Warning (from warnings module):
  File "C:\Python27\lib\site-packages\gensim\utils.py", line 860
    warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
UserWarning: detected Windows; aliasing chunkize to chunkize_serial
building vocabulary...
training model...

Traceback (most recent call last):
  File "C:\Users\arthu\Desktop\Arthur\Documents\UCI\Fall 2017\Independent Study with LaFarge\Slime Mold\Word2VecAndTsne-master (1)\Word2VecAndTsne-master\TrainModel.py", line 37, in <module>
    model.train(sentences, total_examples=self.corpus_count, epochs=self.iter)
NameError: name 'self' is not defined

Thank you,
Arthur Rodrigues

arthurbr140896 commented on May 26, 2024

I think I was able to fix the issue:

print 'training model...'
if skip_gram:
	model.train(sentences, total_examples=model.corpus_count, epochs=model.iter)
else:
	model.train(sentences, total_examples=model.corpus_count, epochs=model.iter)

However, now I have to run the TrainModel script with the new version of Gensim and the rest of
the scripts with the older version of Gensim.

Also, a question: how do I get it to take the CSV file and output a PNG? The blog post
tells us to use "The included Processing script", but I'm not sure which one that is.

Again thank you so much.

One more quick question: in the models I'm getting, words with similar meanings aren't really being clustered together. That is probably because I haven't trained the algorithm on a large enough dataset, right? (I've only trained it on Michel Foucault's The Order of Things.)

Thank you

Arthur Rodrigues

jeffThompson commented on May 26, 2024

Glad you fixed it!

Re older version:
I'll have to take a look when I have some free time. For now, if you have the older version installed, you could use that through the whole process. Or, if you find fixes for any other problems with the new version, let me know and I'll update everything.

Re image file:
That's the Processing sketch in the VisualizeSpace folder. It will create a visual representation of your vector space.

Re proximity:
Yes, you probably need a larger dataset. A single book probably isn't enough to get accurate results for every word, but it should still produce clear clusters. The weak clustering is probably due to your t-SNE parameters, so try tweaking them for better results.
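
As a rough illustration of what tweaking looks like, the sketch below runs t-SNE with two different perplexity values so the layouts can be compared. It assumes scikit-learn's TSNE class, which may not be what the repo's own t-SNE script uses, and it runs on random stand-in vectors rather than a trained model. The helper name reduce_to_2d is made up.

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in data: 60 random 50-dimensional "word vectors".
# In practice these would come from the trained Word2Vec model.
rng = np.random.RandomState(0)
vectors = rng.rand(60, 50)


def reduce_to_2d(vectors, perplexity):
    # Perplexity roughly controls how many neighbours each point attends to;
    # it must be smaller than the number of vectors.
    tsne = TSNE(n_components=2, perplexity=perplexity, init="random", random_state=0)
    return tsne.fit_transform(vectors)


# Try a couple of settings and compare the resulting 2D layouts.
layouts = {p: reduce_to_2d(vectors, p) for p in (5, 20)}
for p, xy in sorted(layouts.items()):
    print("perplexity=%d -> %d points in %d dimensions" % (p, xy.shape[0], xy.shape[1]))
```

Each layout can then be saved and visualized separately to see which setting separates related words best.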
