Hyper-parameters like below: Number of search threads


Can hyper-parameters of MCTS be altered in Go benchmark under CLOSE division?

jgong5 commented on June 18, 2024

Can hyper-parameters of MCTS be altered in Go benchmark under CLOSE division?

from training.

Comments (6)

petermattson commented on June 18, 2024

Under the present closed division rules, the RL environment cannot be altered, including most of those hyper parameters. We should make this distinction clearer. It can, however, be reimplemented to run faster as long as the underlying semantics don't change. So, more threads, depending on your use, might be allowed -- If it does the same work faster, you can do it. If it changes the result, you can't.
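As an illustration of "does the same work faster", here is a minimal sketch, not taken from the reference implementation. The `evaluate` function and its constants are hypothetical stand-ins for a deterministic per-position evaluation. The point is only that adding threads leaves the results unchanged.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate(position):
    """Hypothetical deterministic evaluation of one position (a stand-in)."""
    return (position * 2654435761) % 1000003


def evaluate_serial(positions):
    """Single-threaded baseline: evaluate positions one by one, in order."""
    return [evaluate(p) for p in positions]


def evaluate_threaded(positions, num_threads=4):
    """Same work spread over threads.

    Executor.map yields results in input order, so the returned list is
    identical to the serial one regardless of num_threads.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(evaluate, positions))
```

Comparing `evaluate_serial(positions)` with `evaluate_threaded(positions, 8)` for equality is one simple way to check that the thread count affects only speed, not the result.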


jgong5 commented on June 18, 2024

@petermattson Will you publish a document with the detailed rules, e.g. which parameters may be changed and which may not? My understanding is that re-implementing this on another framework like Caffe is allowed, right? I find it difficult to re-implement simply by reading the reference implementation, since there are many variables...


bitfort commented on June 18, 2024

These are good questions; I can provide some clarification.

Basically, for the closed division we do not want the amount of work done to be adjusted by changing parameters, for example by reducing the number of games or the fan-out in the tree. However, if the particular values we provide in the reference implementation are clearly subpar, we can propose changes (which everyone will follow).

If you have results to suggest that there are clearly better parameters, please do share and I'm happy to propose changes to the reference to the larger committee.

As far as "Algorithm used to determine end-of-game winner" -- you can change such algorithms as long as they produce the same output. For example, if you have a better way to enumerate legal moves, feel free to enumerate more efficiently. The guiding wisdom is that better implementations are fine, as long as they produce the same output.

I'm happy to dig into the details if you have additional questions.
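The "same output" criterion for something like legal-move enumeration can be sketched as a small equivalence check. This is an illustrative example only: the board encoding (0 for empty, 1 and 2 for stones) and the toy notion of legality (any empty point, ignoring ko and suicide) are assumptions, and both function names are hypothetical.

```python
import random


def legal_moves_reference(board):
    """Baseline: scan every point and collect empty ones in row-major order."""
    size = len(board)
    moves = []
    for r in range(size):
        for c in range(size):
            if board[r][c] == 0:
                moves.append((r, c))
    return moves


def legal_moves_fast(board):
    """Reimplementation that must return exactly the same list."""
    return [(r, c)
            for r, row in enumerate(board)
            for c, value in enumerate(row)
            if value == 0]


def random_board(size, rng):
    """Random toy board: 0 = empty, 1 and 2 = stones of either color."""
    return [[rng.choice((0, 1, 2)) for _ in range(size)] for _ in range(size)]


def outputs_match(trials=100, size=9, seed=0):
    """Compare both enumerations on many seeded random boards."""
    rng = random.Random(seed)
    boards = (random_board(size, rng) for _ in range(trials))
    return all(legal_moves_reference(b) == legal_moves_fast(b) for b in boards)
```

A check like `outputs_match()` does not prove equivalence, but a single mismatch would show that the faster version changes the output.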


jgong5 commented on June 18, 2024

@bitfort Thanks for the clarification. I understand your explanation of the parameter settings. But I feel it would be clearer to list all the relevant parameters with their values in one document instead of embedding them throughout the reference implementation. That would make a re-implementation much easier for participants to follow.

"The guiding wisdom is that better implementations are fine, as long as they produce the same output."

I feel the wording "better implementations" here is ambiguous. What kind of "algorithm" can be tuned in the CLOSED division? Can the CNN-based policy/value network be considered part of the "algorithm"?
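The single list of parameters requested above might look roughly like the sketch below. Everything in it is hypothetical: the field names are loosely derived from items mentioned in this thread (search threads, number of games, tree fan-out), and the class is not part of the reference implementation. Which differences are acceptable is a matter for the rules, not for this helper.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SearchParams:
    """Hypothetical central record of search-related hyper-parameters."""
    num_search_threads: int
    num_games: int
    tree_fan_out: int


def changed_fields(reference, submission):
    """Return the names of parameters whose values differ from the reference."""
    return [f.name for f in fields(reference)
            if getattr(reference, f.name) != getattr(submission, f.name)]
```

With placeholder values, `changed_fields(SearchParams(1, 10, 5), SearchParams(8, 10, 5))` reports only `num_search_threads`, which makes any deviation from the reference settings explicit and easy to review.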


petermattson commented on June 18, 2024

Hi Jiong,

In closed you can't change the model. Nor can you change the reinforcement learning algorithm, though as Victor says, you can reimplement the RL algorithm more efficiently as long as its output is unchanged (Go has no floating point issues so, precisely, for the same random seeds it should evaluate the same positions and feed the same inputs to the models).

Hope that helps.

Best, Peter

On Fri, May 25, 2018 at 5:49 PM, Jiong Gong wrote:

@bitfort Thanks for the clarification. I understand your explanation on the parameter settings. But I feel it would be more clear to list all the relevant parameters with their values in a document instead of embed them in all the places of the reference implementation. That can make participants much easier to follow for a re-implementation.

The guiding wisdom is that better implementations are fine, as long as they produce the same output.

I feel the wording "better implementations" here is ambiguous. What kind of "algorithm" can be tuned in the CLOSED division? Can I say CNN-based policy/value network part of the "algorithm"?
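The "same random seeds, same positions" criterion from the reply above can be checked by fingerprinting the sequence of positions a search evaluates. The sketch below uses a toy seeded walk rather than real MCTS. The integer position encoding, the `expand_*` functions, and the constants are all hypothetical and serve only to show the comparison.

```python
import hashlib
import random


def run_search(seed, num_steps, expand):
    """Toy seeded search: return the ordered list of positions it evaluated."""
    rng = random.Random(seed)
    position = 0
    evaluated = []
    for _ in range(num_steps):
        children = expand(position)
        position = rng.choice(children)
        evaluated.append(position)
    return evaluated


def expand_reference(position):
    """Baseline child generation, written as an explicit loop."""
    children = []
    for delta in (1, 2, 3):
        children.append((position * 3 + delta) % 1009)
    return children


def expand_fast(position):
    """Reimplementation that must produce the same children in the same order."""
    base = position * 3
    return [(base + 1) % 1009, (base + 2) % 1009, (base + 3) % 1009]


def fingerprint(positions):
    """SHA-256 digest over the evaluated positions, in order."""
    digest = hashlib.sha256()
    for p in positions:
        digest.update(p.to_bytes(4, "big"))
    return digest.hexdigest()
```

If `fingerprint(run_search(seed, n, expand_reference))` equals `fingerprint(run_search(seed, n, expand_fast))` for every seed tried, the two versions visited the same positions in the same order for those seeds.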


jgong5 commented on June 18, 2024

@petermattson @bitfort Thank you both for the kind answers. I will close the issue now, in the hope that you can publish a document with a more detailed list of the hyper-parameters that are currently embedded in the reference code.
