achiabodo / compintelligence

Generic repo for the 23/24 course of Computational Intelligence @ Polito

License: MIT License

Languages: Jupyter Notebook 94.51%, Python 5.44%, TeX 0.05%

compintelligence's People

Contributors: achiabodo

compintelligence's Issues

Lab2-NIM: Review by Giovanni Bordero s313010

Overall

The code is well written, and the creation of a class for each player makes the code well organized.

Code implementation

  • The mutation doesn't apply a floating-point perturbation, so it is not strictly an ES (see the sketch after this list).
  • The training phase is computationally expensive but provides good results.
  • The idea of saving the trained player is very useful to avoid wasting time on retraining.
  • The K constraint is not always enforced.
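
For context, a minimal sketch of the kind of floating-point (Gaussian) mutation a canonical ES applies to a real-valued genome, with log-normal self-adaptation of the step size; the function and parameter names here are illustrative and not taken from the repository.

import numpy as np

def es_mutate(parent, sigma, tau=None):
    # Gaussian (floating-point) mutation with self-adaptive step size,
    # as in a canonical (1, lambda)-ES. `parent` is a real-valued genome.
    if tau is None:
        tau = 1.0 / np.sqrt(len(parent))
    # First mutate the step size (log-normal self-adaptation) ...
    new_sigma = sigma * np.exp(tau * np.random.normal())
    # ... then perturb every gene with a Gaussian of that step size.
    child = np.asarray(parent, dtype=float) + new_sigma * np.random.normal(size=len(parent))
    return child, new_sigma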

Lab 10 pair review by Andrea Panuccio s294603

Hi Alessandro,
here is my review of your code; I hope you find it helpful:

  • It's a merge of Q-Learning and Monte Carlo that uses a single reward for the whole episode while still taking discounting into account (a rough sketch of this kind of update follows the list).
  • You took care of hyperparameters and the training schedule and, looking at your agent's performance curve, it really seems to pay off.
  • The choice of handling the two players as the two sides of the spectrum by using the minus sign was clever and easy to understand (if I understood it correctly).
  • Maybe the only effective way to improve it would be to use a canonical Q-Learning approach, but I think yours avoids lots of problems while achieving quite good performance, so it's really effective and I appreciate it.
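
As a rough, purely illustrative sketch of the update described in the first point (the names `q`, `trajectory`, `alpha`, and `gamma` are assumptions, not taken from the code):

def update_from_episode(q, trajectory, final_reward, alpha=0.1, gamma=0.9):
    # Monte-Carlo-style update: propagate the episode's single final reward
    # back through every visited (state, action) pair, discounted by its
    # distance from the end of the episode.
    g = final_reward
    for state, action in reversed(trajectory):
        old = q.get((state, action), 0.0)
        q[(state, action)] = old + alpha * (g - old)
        g *= gamma  # earlier moves receive a more heavily discounted return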

Best regards,
Andrea

Lab2 Review by Beatrice Occhiena s314971

Overall Feedback

The code is well-structured, and the evolutionary algorithm for Nim players appears solid. Well done!

Annotations

K Parameter Consideration

Ensure K is considered consistently (a minimal move-filtering sketch follows this list). Check:

  • The winning move when only 1 column remains.
  • How cases where no optimal move is found are handled.
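
A minimal, illustrative way to keep the K constraint consistent everywhere is to generate only K-respecting moves in one place; the helper name and the `rows` representation below are assumptions, not the repository's actual API.

def legal_moves(rows, k=None):
    # rows: number of objects left in each row/column
    # k: optional upper bound on how many objects can be removed per move
    moves = []
    for row, count in enumerate(rows):
        max_take = count if k is None else min(count, k)
        for take in range(1, max_take + 1):
            moves.append((row, take))
    return moves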

SimulativePlayer Optimization

Your number_wins method involves creating deep copies of the game state and repeatedly playing random moves until the game ends. Repeatedly simulating matches can be very computationally expensive, particularly when it is used inside the evolutionary process for fitness evaluation (the general pattern, with a capped number of rollouts, is sketched below).
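
For illustration only, this is the general shape of such a simulation-based estimate with the number of rollouts capped to bound the cost; the game API used here (`is_finished`, `legal_moves`, `apply`, `winner_is`) is hypothetical and not the repository's actual interface.

import copy, random

def estimate_win_rate(state, my_turn, n_simulations=50):
    wins = 0
    for _ in range(n_simulations):
        sim = copy.deepcopy(state)          # one deep copy per rollout: the expensive part
        while not sim.is_finished():        # play random moves until the game ends
            sim.apply(random.choice(sim.legal_moves()))
        if sim.winner_is(my_turn):
            wins += 1
    return wins / n_simulations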

Lab 9 Peer Review

  • The code shows a complete analysis of the given problem, comparing different paradigms.
  • In particular, I found the implementation of the PatternBasedAgent very interesting; the idea of dividing the 1000-loci genome into smaller repeating subsequences is, in my opinion, simply brilliant.
  • It is interesting to note the various implementations of the crossover operation. During my tests I observed that, for this problem, performing the crossover among multiple Individuals (Agents) in most cases helped to maintain and improve diversity. As a suggestion and a further development of your work, you could try adopting this crossover operation and evaluate whether it brings any improvement.
    - A possible implementation of the suggested crossover operation could be:
from random import random, shuffle

# W: list of genomes (vectors), p: per-locus crossover probability
def uniform_crossover_multiple_ind(W, p):
    k = len(W)       # number of parents
    l = len(W[0])    # genome length
    # Crossover operation
    for i in range(l):
        # With probability p, mix the i-th locus across all parents
        if random() < p:
            # Collect the i-th element from each genome in W
            v = [W[j][i] for j in range(k)]
            # Randomly shuffle the collected values
            shuffle(v)
            # Put the shuffled values back into W
            for j in range(k):
                W[j][i] = v[j]
    # Return the modified set of genomes W
    return W
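
For instance, called on a small population it shuffles each selected locus across all the parents in place (the genomes below are made up for illustration):

parents = [[0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
offspring = uniform_crossover_multiple_ind(parents, p=0.5)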

Aside from that, this is very well executed work that shows a complete mastery of the topic and of the Python language.

Lab2-Nim: review by Edoardo Franco s310228

My suggestions and advice:

  • In the SimulativePlayer play() method, in order to evaluate whether a certain state will be a good one, a better choice could be to use two players that know exactly the best move to perform (for example, players using the nim-sum rule). That is because, in that case, the starting state will be crucial in determining the final result. Moreover, by comparing the results with “best_sum > tmp_sum” you’re getting the worst state, not the best one.

  • During the training phase you’re choosing parents randomly, which works for Evolution Strategies. But since you're also doing crossover, it is no longer a pure ES. Instead, consider using tournament or roulette-wheel selection based on fitness, as in GA and GP (a minimal tournament-selection sketch follows this list). Otherwise, to stick to ES, just use tweaking (mutation), although some modern ES variants do also mix in recombination.
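
A minimal sketch of the tournament selection suggested above (the names and the tournament size are illustrative, not taken from the code):

from random import sample

def tournament_selection(population, fitness, tournament_size=3):
    # Pick a few random individuals and return the fittest of them as a parent
    contestants = sample(population, tournament_size)
    return max(contestants, key=fitness)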

Appreciations:

  • The code is well written and well organized.

  • I appreciate the approach in which you associated each possible state of the game with a specific action, and how you performed the mutation and the crossover inside the agents.

Lab 10 Peer Review

The code is good: it implements a Tic-Tac-Toe game with different kinds of players, such as random ones and a smart one that learns by playing. The training part looks nice, where the smart player learns and gets better over time. However, in some places simpler wording or more comments would make it easier to understand; for example, in the ReinforcedPlayer class, explaining a bit more about why the Q-values are updated would be helpful (a generic example of such an update is sketched below). Also, clearer variable names in the training loop would make it easier to follow. Overall, it's nice code for a game and a learning player.
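
As a generic, purely illustrative example of the kind of comment the review asks for, a standard tabular Q-learning update looks like this (the names, and the `available_actions` helper, are assumptions, not taken from the ReinforcedPlayer implementation):

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    # Move Q(s, a) toward the observed reward plus the discounted value
    # of the best action available from the next state.
    best_next = max((q.get((next_state, a), 0.0) for a in available_actions(next_state)),
                    default=0.0)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)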
