
Comments (6)

dschaehi avatar dschaehi commented on August 27, 2024 1

Thank you for the suggestions!

No worries if you don't have the script anymore. Keeping such a script isn't actually common practice, but in my opinion it should be: it lets the authors show that they chose (a range of) hyperparameters that work best on the validation set, not on the test set.

from knowledgegraphembedding.

Edward-Sun avatar Edward-Sun commented on August 27, 2024

Hi, as described in the paper, the ranges of the hyperparameters for the grid search are
set as follows: embedding dimension k ∈ {125, 250, 500, 1000}, batch size b ∈ {512, 1024, 2048},
self-adversarial sampling temperature α ∈ {0.5, 1.0}, and fixed margin γ ∈ {3, 6, 9, 12, 18, 24, 30}.

I used a nested for loop in a bash script. The syntax is

for VAR1 in var1 var2 var3
do
    for VAR2 in var4 var5 var6
    do
        bash something.sh "$VAR1" "$VAR2"
    done
done

In practice, I first train the model for max_steps / 10 to search the hyperparameters roughly, then select 3 or 4 good candidates and run them for the full max_steps.
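This coarse-to-fine search can be sketched as follows. The grid is the one from the paper; the script name run.sh and its flags are hypothetical placeholders, so the training call is left as a comment and the loop just counts configurations:

```shell
# Grid from the paper: 4 dims x 3 batch sizes x 2 temperatures x 7 margins = 168 configs.
count=0
for k in 125 250 500 1000; do
  for b in 512 1024 2048; do
    for a in 0.5 1.0; do
      for g in 3 6 9 12 18 24 30; do
        # Coarse pass at 1/10 of the step budget (hypothetical script and flags):
        # bash run.sh -d "$k" -b "$b" -a "$a" -g "$g" --max_steps "$((MAX_STEPS / 10))"
        count=$((count + 1))
      done
    done
  done
done
echo "$count configurations"   # 168 configurations
```

After the coarse pass, only the 3 or 4 best configurations are rerun with the full step budget.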

I hope this information helps you reproduce our results :)


dschaehi avatar dschaehi commented on August 27, 2024

Thank you for your answer.
But the ranges for the following parameters are not specified in the paper:

  • NEGATIVE_SAMPLE_SIZE,
  • LEARNING_RATE,
  • MAX_STEPS

In best_config.sh it seems that you use different values for these three parameters. Can you say more about this? Perhaps it would be easiest if you simply uploaded the script you used for the hyperparameter search to the repository.


Edward-Sun avatar Edward-Sun commented on August 27, 2024

Sorry, I have left Mila, so I cannot find my script for searching hyperparameters.
@KiddoZhu is still working on the KGE project in our group. Maybe he will have some insights.

As for my personal experience, I have the following suggestions:

For NEGATIVE_SAMPLE_SIZE, the larger the better (more accurate), so just set it as large as GPU memory allows. With all other hyperparameters fixed, there is a trade-off between HIDDEN_DIM, NEGATIVE_SAMPLE_SIZE, and BATCH_SIZE for GPU memory.

You can grid search LEARNING_RATE as well.

As for MAX_STEPS, I haven't observed any overfitting in our implementation of several popular KGE models, so you can train for as many steps as you wish. An easy way is to plot the loss/step curve and find where the loss stops dropping.
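That plateau check can also be automated. A minimal sketch, assuming a hypothetical loss.log with one "step loss" pair per line, flags the first step where the relative improvement drops below 1%:

```shell
# Hypothetical training log: one "step loss" pair per line.
printf '%s\n' "1000 0.52" "2000 0.31" "3000 0.25" "4000 0.249" > loss.log

# Report the first step whose loss improved by less than 1% over the previous entry.
awk 'NR > 1 && prev - $2 < 0.01 * prev { print "plateau at step", $1; exit } { prev = $2 }' loss.log
# prints: plateau at step 4000
```

The 1% threshold is an arbitrary choice; in practice you would pick it based on how noisy your loss curve is.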

Thank you for your interest in our work!


KiddoZhu avatar KiddoZhu commented on August 27, 2024

According to my results, the gap between the validation and test sets on all datasets is small compared to the gaps between different methods. I am not sure about the original experiments in the paper, but the results shouldn't differ much whichever set Zhiqing used.

If you want to apply the RotatE model to your own datasets, you need to search LEARNING_RATE and GAMMA. It would be better if you also tuned NEGATIVE_SAMPLE_SIZE, but the default value is usually good. For MAX_STEPS, some value proportional to |E| generally works.
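A minimal sketch of the |E|-proportional heuristic, assuming entities are listed one per line in an entities.dict file (the file name and the scaling constant here are illustrative assumptions, not values from the paper):

```shell
# Toy entity list, one entity per line.
printf '%s\n' e1 e2 e3 > entities.dict

# Hypothetical heuristic: MAX_STEPS proportional to |E|; the factor 10 is a placeholder.
NUM_ENTITIES=$(wc -l < entities.dict)
MAX_STEPS=$((NUM_ENTITIES * 10))
echo "$MAX_STEPS"
```

On a real dataset you would calibrate the constant on the validation set rather than hard-coding it.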


dschaehi avatar dschaehi commented on August 27, 2024

Thanks for the tips @KiddoZhu!

