
scoleri-mr / computational_intelligence_2022_301841


Exercises and projects for the Computational Intelligence course @PoliTO 2022-2023

Jupyter Notebook 98.86% Python 1.14%

computational_intelligence_2022_301841's Issues

Lab 3 Peer Review

Peer Review

Repo/File link

At First Sight: Presentation and Readability

The README is concise and explains the idea behind the code and the algorithm perfectly.
It also reports the results in an intuitive way.
The notebook is full of explanations that complement the introduction given in the README.


Task 1: Algorithmic Approach and Code Analysis

Given that an optimal strategy exists for the Nim game, every other fixed-rule strategy will be suboptimal, so I am going to focus on the other tasks.


Task 2: Algorithmic Approach and Code Analysis

The Evolvable Algorithm is instead more interesting.

The genome is a set of 1/0 flags used to select some rules from a predefined set.

An individual is a random choice of these rules, obtained through a sort of binary encoding.

Each individual is then evaluated with a fitness function that tests the individual (and its rules) against the suboptimal strategy.

Choosing the right fitness function is crucial for this GA.
In this case, evaluating the strategy against the sub-optimal strategy seems computationally heavy and probably not ideal: since that opponent involves 25% randomness, the resulting fitness function is itself somewhat noisy.

Still, you get good results against both the random and the sub-optimal strategy.

I tried a similar approach too, but then I realised that finding a good fitness function was harder than I thought.

I can suggest trying a different approach for the definition of the genome, and consequently for the definition of the fitness function.
One idea could be to build the genome as a random set of moves taken from the possible moves, and to use the nim-sum as the fitness function.
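
Just to make the suggestion concrete, here is a minimal sketch of that idea (all names are hypothetical and not taken from the reviewed code): the genome is a random pool of legal moves and the fitness counts how many of them leave a position with nim-sum zero, so no games have to be simulated.

# Hypothetical sketch: nim-sum-based fitness, independent of any opponent.
import random
from functools import reduce
from operator import xor

def nim_sum(rows):
    return reduce(xor, rows, 0)

def random_genome(rows, size=10):
    """A genome is a random pool of legal moves (row, objects to remove)."""
    moves = [(r, k) for r, n in enumerate(rows) for k in range(1, n + 1)]
    return random.sample(moves, min(size, len(moves)))

def fitness(genome, rows):
    """Count the moves in the genome that leave the opponent with nim-sum 0."""
    score = 0
    for row, k in genome:
        after = list(rows)
        after[row] -= k
        if nim_sum(after) == 0:
            score += 1
    return score

rows = [1, 3, 5, 7]                                  # a Nim(4) starting position
pool = [random_genome(rows) for _ in range(20)]
best = max(pool, key=lambda g: fitness(g, rows))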


Task 3: Algorithmic Approach and Code Analysis

The MinMax strategy is quite classic in its idea, but really interesting in its implementation.

The idea of using a cache to lighten the computational load is really smart and seems to work quite well. It also makes it possible to avoid a limited search depth for small Nim dimensions.
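
As a reference for the discussion, here is a minimal sketch of what memoised MinMax on Nim can look like (assuming normal play, i.e. whoever takes the last object wins; this is my own illustration, not the reviewed implementation). Sorting the rows before caching also merges equivalent states and makes the cache more effective.

# Minimal memoised MinMax for Nim (normal play). Illustrative only.
from functools import lru_cache

@lru_cache(maxsize=None)
def minmax(rows, my_turn=True):
    """Return +1 if the maximising player eventually wins, -1 otherwise.
    `rows` must be a tuple so that it is hashable for the cache."""
    if sum(rows) == 0:
        # The previous player took the last object and won.
        return -1 if my_turn else +1
    outcomes = []
    for r, n in enumerate(rows):
        for k in range(1, n + 1):
            child = tuple(sorted(rows[:r] + (n - k,) + rows[r + 1:]))
            outcomes.append(minmax(child, not my_turn))
    return max(outcomes) if my_turn else min(outcomes)

print(minmax((1, 3, 5, 7)))   # -1: nim-sum is 0, the first player loses with perfect play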


Task 4: Algorithmic Approach and Code Analysis

The RL algorithm assigns a G value to each new state reachable from the current one by trying the different actions in the set of possible actions.

Considering your results, the RL agent is not particularly "strong".

The first problem could be the training time.
Have you tried more than 10k epochs, or was that too computationally expensive?

A second problem could be the absence of randomization in the environment. You decided to use only nim(5), but you could also use nim(3) or nim(7) to see whether that helps the agent generalise.
Another possibility could be randomizing the opponent strategy during training as well.

Finally, you could use a discounted reward in the learning process; this could lead the agent to "forecast" a bit and try a long-term strategy.
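
To make the last suggestion concrete, here is a minimal sketch of a discounted, Monte Carlo style update for the G table (names and structure are purely illustrative, not your code):

# Hypothetical discounted update applied backwards over one finished game.
GAMMA = 0.9      # discount factor
ALPHA = 0.1      # learning rate

def update_values(G, episode, final_reward):
    """episode: list of visited states, last one terminal.
    Later states get the reward almost untouched, earlier ones a
    discounted fraction of it, so the agent can 'look ahead'."""
    g = final_reward
    for state in reversed(episode):
        G[state] = G.get(state, 0.0) + ALPHA * (g - G.get(state, 0.0))
        g *= GAMMA    # states further from the end see a smaller return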

Lab3 Peer Review

Peer Review by Giuseppe Atanasio (s300733)

Task 3.0 - Hardcoded

The hardcoded strategy rules that you have implemented are outstanding; nothing to adjust or suggest. Nice choices!

Task 3.1 - Fixed rules based on nim-sum

The strategy is simple but straightforward. Also in this case, nothing to suggest; good job.

Task 3.2 - Evolved rules

Your results are pretty similar to ours against pure_random, but it's a different story against optimal: if you always lose, you can try to modify some values. For instance, you can play with the number of generations, the population size and the offspring size (check our code; our parameters are reported below):

NUM_MATCHES = 100
NUM_GENERATIONS = 30
NIM_SIZE = 10
POPULATION_SIZE = 50

Task 3.3 - MinMax

More or less the same thoughts as before, and I appreciated your knowledge of Python in using the pickle library. Furthermore, you have reached impressive runtimes: I think we're going to take inspiration from your cache implementation in the future ;)

Task 3.4 - Reinforcement Learning

Now let's move on to the most interesting part of this lab. Your strategy is reasonably well structured, but I strongly suggest you consider implementing model-free Q-learning: it keeps memory of previous games by continuously updating its values, instead of only memorizing the maximum value per row as in your code.

If you check our code, you can see that there is a Q-table with state-action tuples as keys, sort of a dict of dicts. We exploited Bblais' Game library to handle the table, but you can also use shelve, as the professor suggested.
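
For reference, here is a minimal plain-dict sketch of the tabular Q-learning update we are referring to (illustrative only, not the Bblais Game-library implementation):

# Dict-of-dicts Q-table and the standard Q-learning backup.
from collections import defaultdict
import random

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
Q = defaultdict(dict)          # Q[state][action] -> value

def choose_action(state, actions):
    """Epsilon-greedy choice among the legal actions of `state`."""
    if random.random() < EPSILON or not Q[state]:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[state].get(a, 0.0))

def q_update(state, action, reward, next_state, next_actions):
    """Bootstrap on the best action available in the next state."""
    best_next = max((Q[next_state].get(a, 0.0) for a in next_actions), default=0.0)
    old = Q[state].get(action, 0.0)
    Q[state][action] = old + ALPHA * (reward + GAMMA * best_next - old)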

Try to dedicate some time to testing it, since it could be useful for your final project!

Lab3 - Peer review

Nothing to say about the hardcoded rules and the evolved ones.

I appreciate the minmax implementation.

The reinforcement learning strategy is well written (I really like the generation of the possible states at runtime instead of a complete enumeration), but encoding only the state probably doesn't give good results. I made the same mistake; later, moving to a score associated with the (state, action) pair, I was able to achieve good results.

It was also a good idea to do some hyperparameter tuning and to show the analysis you made.

Review of lab1

REVIEW BY MAGNALDI MATTEO, STUDENT ID 296852

The code is written in a clear way and it's easy to understand.
The solution that uses combinations also considers nodes that need not be considered, because we already know they would not lead to an optimal solution; one could therefore think of some kind of pruning. It also uses a lot of memory for the different variables.
The second solution, instead, is very good: the number of nodes that are considered is small and it produces an optimal solution. The class "State" is very useful and greatly simplifies the management of the different states inside the dictionary.

In the greedy and combination solutions, to compute the elements inside a state it might be more efficient to use itertools.chain() instead of two nested for loops (see the sketch below).
In the graph solution, visited_states is initialized to 1, but it should start at zero.
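
A hypothetical example of the itertools.chain() suggestion (not the reviewed code):

# Flattening the lists covered by a state in one pass instead of two nested loops.
from itertools import chain

state_lists = [[0, 1], [1, 3, 4], [2, 4]]

# Two nested loops:
covered = set()
for lst in state_lists:
    for e in lst:
        covered.add(e)

# Equivalent with itertools.chain, shorter and usually a bit faster:
covered = set(chain(*state_lists))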

Lab 3 Review by Sofer Fabio

Code Structure

The README file is very extensive and helps the reader prepare for the code analysis. There are a couple of small typos and inaccuracies, such as:
"We manage to win more than 55% of the games against the optimal strategy, around 1% against the optimal and around 20% against the semi-optimal." and "Evaluating this strategy we see that it wins all the matches against both the random and the optimal_strategy." (talking about nim-sum, which is the optimal strategy).
Citing the sources is always a plus, especially with working hyperlinks.

3.0 Hardcoded rules

Working as expected, but I would rename the internal function

def evolvable(state: Nim) -> Nimply:
	.....
	return evolvable

and change the use of genome["p"], as both are confusing from a naming standpoint when talking about hard-coded rules.

3.1 Expert Agent

Everything is clear and concise.

3.2 Evolved rules Agent

It is good to see multi-rule agents performing well, and the semi-optimal opponent is a good idea that will help with the reinforcement learning procedures as well.

3.3 MinMax Agent

Cleanly implemented from the source material.

3.4 RL Agent

While the code works and produces reasonable results, the glaring issue is the specificity of the agent: it only works on one specific size of Nim game. This could be fixed (and it would help the MinMax as well) by smarter use of caching, through a class-specific hash function implementation that maps the same board states to the same index, no matter the size.
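
A minimal sketch of the size-independent hashing idea (illustrative class and names, not the reviewed code): dropping empty rows and sorting the rest makes equivalent positions from boards of different sizes collide on purpose.

# Size-independent hashing of Nim board states.
class NimState:
    def __init__(self, rows):
        self.rows = tuple(rows)

    def _canonical(self):
        """Order-independent, size-independent view of the board."""
        return tuple(sorted(r for r in self.rows if r > 0))

    def __hash__(self):
        return hash(self._canonical())

    def __eq__(self, other):
        return self._canonical() == other._canonical()

# NimState([0, 3, 5]) and NimState([5, 3]) now hash (and compare) the same,
# so a dict-based cache or Q-table can be shared across Nim sizes.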

Lab 2 - EA set covering review

Lab 2 review

First of all, thanks a lot for the presentation in class, it was incredibly useful and insightful. I have read the code trying to review it and I found it very complete and exhaustive. I have some small remarks, but they are mostly prompts for discussion, which I will write here.

Strategy remarks

  • The namedtuple is a very effective way of recording the genome and the fitness values; very nice idea.
  • Wouldn't it be a very effective way to kickstart the search to have an initial population composed only of valid solutions?
  • I am not sure whether the is_valid function is used anywhere, and if it is not, does the evaluate function also verify the completeness of the individual as a solution after evaluating its fitness? I am not familiar with Counter in Python, so I may be missing how it works there.
  • I remember that you talked about the ineffectiveness of the fusion crossover despite its heaviness in computational terms (see the sketch after this list). The fitness ratio is a factor that favours better individuals, making the mutation more "hill-climbing"-like than random; especially in the case of a very fit individual against a very unfit one, you would get a very small number of changed genes, basically reducing it to a single-element random mutation. Nevertheless, I think you are the only ones who tried to implement a method found in a scientific paper, and this is already much more than I did, so kudos 👍
  • The self-adapting mutation probability is greatly implemented.
  • I think the plateau check is a bit "greedy", especially as a fixed factor: with higher N, it is likely not to improve for 30 straight generations. Maybe you could adapt it to the original problem size, something like (N/20) + 20. Just an idea.
  • The variable population size, for example, is a good use of a parameter that varies based on the original problem size.
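
Here is the sketch mentioned above of the fusion crossover as I understand it (assuming a fitness to be maximised; this is my own reading of the idea, not your code):

# Fitness-ratio ("fusion") crossover sketch. Where the parents agree the child
# copies the shared gene; where they disagree, each parent's gene is inherited
# with probability proportional to that parent's fitness.
import random

def fusion_crossover(g1, f1, g2, f2):
    p1 = f1 / (f1 + f2) if (f1 + f2) else 0.5   # probability of inheriting from parent 1
    child = []
    for a, b in zip(g1, g2):
        if a == b:
            child.append(a)                      # identical genes pass through unchanged
        else:
            child.append(a if random.random() < p1 else b)
    return tuple(child)

When f1 is much larger than f2 the child is almost a copy of parent 1, which is exactly the "hill-climbing" behaviour described above.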

Formal remarks

  • In the imports at the beginning, reduce from functools is never accessed and can be removed.
  • The is_valid function doesn't make use of the parameter P, which can be removed from the function declaration.
  • In the initialize_population function, I suppose the genome could be initialized directly as a tuple and have elements added to it.
  • The parameter n in the evaluate function is never accessed or used.
  • Isn't the parameter len_P in the crossover function the same as len(genome)? Really a small thing, but the function would be more self-explanatory with just the genomes as parameters.

In conclusion, thanks a lot for the complete, clear and well-organized solution, with the graphs too! It was challenging to understand it all, but congratulations on the huge work.
