strangedev / neat_pygenetics
An implementation of neuro-evolution of augmented topologies
A list of possible networking errors should be assembled, and appropriate countermeasures should be put in place.
We should decide on some naming conventions.
Endpoints of Nodes:
source - target vs head - tail
tbd
The Genome selector selects two genomes from the Genome-Repository for breeding.
The selection is based on cluster membership and other criteria that are still to be decided.
The Genome Selector will be a main part of a continuous evolutionary process.
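A minimal sketch of one way such a selector could work, assuming that both parents come from the same cluster and that fitness values are non-negative. The name `select_pair`, the `(genome_id, fitness)` tuples and the fitness weighting are hypothetical; the actual selection criteria are not decided yet.

```python
import random
from typing import Dict, List, Tuple

Member = Tuple[str, float]  # (genome id, fitness), an assumed representation


def select_pair(clusters: Dict[int, List[Member]], rng: random.Random) -> Tuple[str, str]:
    """Pick one cluster, then two distinct genomes from it, weighted by fitness."""
    eligible = {cid: ms for cid, ms in clusters.items() if len(ms) >= 2}
    if not eligible:
        raise ValueError("no cluster holds at least two genomes")

    cluster_ids = list(eligible)
    # Weight clusters by mean fitness; the epsilon keeps zero-fitness clusters selectable.
    weights = [sum(f for _, f in eligible[c]) / len(eligible[c]) + 1e-9 for c in cluster_ids]
    members = eligible[rng.choices(cluster_ids, weights=weights, k=1)[0]]

    first = rng.choices(members, weights=[f + 1e-9 for _, f in members], k=1)[0]
    rest = [m for m in members if m is not first]
    second = rng.choices(rest, weights=[f + 1e-9 for _, f in rest], k=1)[0]
    return first[0], second[0]


clusters = {0: [("a", 1.0), ("b", 2.0)], 1: [("c", 5.0)]}
pair = select_pair(clusters, random.Random(0))
```

Cluster 1 holds a single genome here, so the pair always comes from cluster 0.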
We are currently experimenting with MongoDB.
MongoDB stores objects in JSON representation.
In the module DatabaseConnector we have custom JSON encoders and decoders, which are used to encode StorageGenomes and their AnalysisResults into JSON and decode them back.
I want to build the repositories on top of that and then test performance with many (100,000+) objects.
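As an illustration of the encode/decode round trip, here is a sketch using only the standard `json` module. The `StorageGenome` fields, the `__type__` marker and the names `GenomeEncoder` and `genome_hook` are assumptions for this example, not the actual DatabaseConnector code.

```python
import json
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StorageGenome:
    genome_id: str
    genes: Dict[int, float] = field(default_factory=dict)  # gene id -> weight (assumed)


class GenomeEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, StorageGenome):
            # JSON object keys must be strings, so gene ids are stringified.
            return {
                "__type__": "StorageGenome",
                "genome_id": o.genome_id,
                "genes": {str(k): v for k, v in o.genes.items()},
            }
        return super().default(o)


def genome_hook(d):
    """object_hook for json.loads: rebuild tagged dicts, pass others through."""
    if d.get("__type__") == "StorageGenome":
        return StorageGenome(d["genome_id"], {int(k): v for k, v in d["genes"].items()})
    return d


original = StorageGenome("g1", {1: 0.5, 7: -1.25})
text = json.dumps(original, cls=GenomeEncoder)
restored = json.loads(text, object_hook=genome_hook)
```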
In order to solve more complex problems, different node types should be added.
Different types should be able to specify their own transport functions, as well as their minimum and maximum number of inputs and outputs.
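One possible shape for such node types, as a sketch. It models the transport function as a plain callable over the incoming values and only checks input counts; output limits would follow the same pattern. `NodeType`, `accepts`, `fire` and the two example types are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class NodeType:
    name: str
    transport: Callable[[Sequence[float]], float]
    min_inputs: int = 1
    max_inputs: Optional[int] = None  # None means "no upper bound"

    def accepts(self, n_inputs: int) -> bool:
        upper_ok = self.max_inputs is None or n_inputs <= self.max_inputs
        return n_inputs >= self.min_inputs and upper_ok

    def fire(self, inputs: Sequence[float]) -> float:
        if not self.accepts(len(inputs)):
            raise ValueError(f"{self.name} node cannot take {len(inputs)} inputs")
        return self.transport(inputs)


SIGMOID = NodeType("sigmoid", lambda xs: 1.0 / (1.0 + math.exp(-sum(xs))))
PRODUCT = NodeType("product", lambda xs: math.prod(xs), min_inputs=2)
```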
The simulator uses an arbitrary simulation (in strangedev's case, his prg-1 Python project; I want to use 2048) to receive inputs for a genome and to test the outputs of the genome.
The simulation then has to provide a fitness for the tested genome.
For this we need a really abstract structure.
A Simulator should have this interface:
Very abstract ;D
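Purely as an illustration of how abstract this can be, here is a hypothetical sketch built on `abc`. It is not the interface from the proposal: the method name `evaluate`, the `Network` callable type and the toy `SumSimulator` are all assumptions.

```python
from abc import ABC, abstractmethod
from typing import Callable, List, Sequence, Tuple

# Assumption: a genome under test is exposed as a callable mapping inputs to outputs.
Network = Callable[[Sequence[float]], Sequence[float]]


class Simulator(ABC):
    """Hypothetical interface: feed inputs to a network and score its outputs."""

    @abstractmethod
    def evaluate(self, network: Network) -> float:
        """Run the simulation and return a fitness (higher is better)."""


class SumSimulator(Simulator):
    """Toy simulation: the network should output the sum of its inputs."""

    def __init__(self, cases: List[Tuple[float, ...]]):
        self.cases = cases

    def evaluate(self, network: Network) -> float:
        error = 0.0
        for inputs in self.cases:
            outputs = network(inputs)
            error += abs(outputs[0] - sum(inputs))
        return -error  # zero error is the best possible fitness


simulator = SumSimulator([(1.0, 2.0), (0.5, 0.25)])
perfect = simulator.evaluate(lambda xs: [sum(xs)])
constant = simulator.evaluate(lambda xs: [0.0])
```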
Since we need to store a bit of information about clusters, namely
We need a dedicated structure for the analysis of genomes.
The structure should be lightweight, but easily traversable and easy to check for cycles etc.
The currently best idea for this is an adjacency list.
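To show why an adjacency list makes cycle checks easy, here is a small sketch using depth-first search with three node states. The dict-of-lists layout and the name `has_cycle` are assumptions, not the project's actual AnalysisStructure.

```python
from typing import Dict, Hashable, List

Adjacency = Dict[Hashable, List[Hashable]]


def has_cycle(adjacency: Adjacency) -> bool:
    """Return True if the directed graph contains at least one cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color: Dict[Hashable, int] = {}

    def visit(node) -> bool:
        color[node] = GREY
        for successor in adjacency.get(node, ()):
            state = color.get(successor, WHITE)
            if state == GREY:  # edge back into the current path
                return True
            if state == WHITE and visit(successor):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(adjacency))
```

The recursion is fine for small genomes; a very deep graph would need an explicit stack instead.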
The Genome Analyzer performs different tasks on Genomes in the AnalysisStructure.
The currently planned analysis tasks are:
The Genome Analyzer should return a diff object on request. This diff object should contain information about what has to change in the graph according to the given rules. If sources or sinks are found, for example, they should be listed in the diff object as "to be removed", so that they can be removed from the StorageStructure afterwards.
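A sketch of what such a diff object might look like for the source/sink case. It assumes that input nodes may legitimately lack incoming edges and output nodes may lack outgoing ones; `GenomeDiff` and `find_dead_ends` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class GenomeDiff:
    nodes_to_remove: Set[int] = field(default_factory=set)


def find_dead_ends(adjacency: Dict[int, List[int]],
                   input_nodes: Set[int],
                   output_nodes: Set[int]) -> GenomeDiff:
    """Collect non-input sources and non-output sinks in a single pass."""
    has_incoming = {s for successors in adjacency.values() for s in successors}
    nodes = set(adjacency) | has_incoming

    diff = GenomeDiff()
    for node in nodes:
        is_source = node not in has_incoming and node not in input_nodes
        is_sink = not adjacency.get(node) and node not in output_nodes
        if is_source or is_sink:
            diff.nodes_to_remove.add(node)
    return diff


graph = {0: [2], 2: [3, 4], 5: [3], 3: [], 4: []}
diff = find_dead_ends(graph, input_nodes={0}, output_nodes={3})
```

Removing the reported nodes can expose new sources or sinks, so a caller would repeat the analysis until the diff comes back empty.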
The Gene Repository collects and distributes Genes.
How these are stored exactly is not yet decided, but it should be persistent.
Genes should be accessible via getGene(geneID) or something like that. The repository should also make it possible to search for Genes starting or ending in a specific node (for reuse).
I recommend sqlite for testing and prototyping, then switching to MySQL later on.
I haven't toyed with NoSQL databases yet; maybe they are a viable option in this case.
Thoughts and opinions?
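For the sqlite prototyping route, a minimal sketch with the standard `sqlite3` module. The table layout, the `source`/`target` column names (the endpoint naming is still undecided) and all method names are assumptions.

```python
import sqlite3
from typing import List, Tuple


class GeneRepository:
    def __init__(self, path: str = ":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS genes ("
            "gene_id INTEGER PRIMARY KEY, "
            "source INTEGER NOT NULL, "
            "target INTEGER NOT NULL, "
            "UNIQUE (source, target))"
        )

    def add_gene(self, source: int, target: int) -> int:
        """Return the id of the gene for this edge, creating it if needed."""
        row = self._db.execute(
            "SELECT gene_id FROM genes WHERE source = ? AND target = ?",
            (source, target),
        ).fetchone()
        if row is not None:
            return row[0]
        cursor = self._db.execute(
            "INSERT INTO genes (source, target) VALUES (?, ?)", (source, target)
        )
        self._db.commit()
        return cursor.lastrowid

    def get_gene(self, gene_id: int) -> Tuple[int, int]:
        row = self._db.execute(
            "SELECT source, target FROM genes WHERE gene_id = ?", (gene_id,)
        ).fetchone()
        if row is None:
            raise KeyError(gene_id)
        return row

    def genes_starting_in(self, node: int) -> List[int]:
        rows = self._db.execute(
            "SELECT gene_id FROM genes WHERE source = ? ORDER BY gene_id", (node,)
        )
        return [r[0] for r in rows]

    def genes_ending_in(self, node: int) -> List[int]:
        rows = self._db.execute(
            "SELECT gene_id FROM genes WHERE target = ? ORDER BY gene_id", (node,)
        )
        return [r[0] for r in rows]


repo = GeneRepository()
first = repo.add_gene(1, 2)
second = repo.add_gene(1, 3)
```

Passing a file path instead of `":memory:"` makes the store persistent.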
The MainDirector class is called from main and does everything in an automated run.
It creates databases as necessary and does everything the program flow contains.
It currently contains pseudocode-like notes detailing how the program should work. See #10
It outsources a lot of its work: there are plans for objects that encapsulate selection or decision making.
It is by no means perfect; currently it is a proposal for how things should work once everything is implemented.
Because I like to think that this is going to be a massive success, I am thinking a bit about licensing.
Something open, of course, with free access and usage in my opinion.
But I feel the need to extend whatever license we use (and I don't know many) with a clause protecting this project from any military use, which is a potential threat with AI.
The Genome Repository is basically the container for the current population. Old Genomes should not be thrown away, but marked as old and out of use.
NEAT should be able to outsource the calculation of graph outputs over the network.
This would allow for the integration of other programs which use graph structures for their calculations. (NEAT would then only perform graph optimization)
Also, NEAT should be able to distribute calculation nodes over the network when output calculation isn't outsourced.
I suggest building a web frontend. Mainly for statistics and status tracking, but maybe also as a control panel.
Python can do lots for that. Especially for statistics, things like matplotlib and scipy can go a long way for us.
This is a long-term goal, as it is only useful once the application is done/usable.
The AnalysisResult currently contains an adjacency list of cycle closing edges and an adjacency list of the rest of the edges.
This will be changed into a single dictionary mapping the gene_ids of the StorageGenome to true if they close a cycle and to false if they don't.
This way it is necessary for creating SimulationGenomes to use a GeneRepository, but it does not depend on the AnalysisGenome anymore.
Also the set of disabled nodes is to be removed, since genes are not disabled during the analysis.
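A sketch of how such a dictionary could be computed. It assumes genes are processed in ascending id order and that a gene closes a cycle when its target can already reach its source through previously accepted genes; the function name and the `(source, target)` tuples are hypothetical.

```python
from typing import Dict, List, Tuple


def classify_genes(genes: Dict[int, Tuple[int, int]]) -> Dict[int, bool]:
    """Map each gene id to True if the gene closes a cycle, else False."""
    forward: Dict[int, List[int]] = {}  # adjacency list of accepted, non-closing edges

    def reaches(start: int, goal: int) -> bool:
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(forward.get(node, ()))
        return False

    result: Dict[int, bool] = {}
    for gene_id in sorted(genes):
        source, target = genes[gene_id]
        closes = reaches(target, source)
        result[gene_id] = closes
        if not closes:
            forward.setdefault(source, []).append(target)
    return result


flags = classify_genes({1: (0, 1), 2: (1, 2), 3: (2, 0), 4: (0, 2)})
```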
We should have a module dedicated to several necessary Exceptions.
Things like SuccessorAlreadyExistsInNode etc.
These Exceptions should be configurable and verbose, i.e. they should be parameterized with information about the incident.
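A parameterized exception of this kind might look as follows. The base class `NeatError` and the attribute names are assumptions; only the name `SuccessorAlreadyExistsInNode` comes from the text above.

```python
class NeatError(Exception):
    """Hypothetical common base class for all project-specific errors."""


class SuccessorAlreadyExistsInNode(NeatError):
    def __init__(self, node_id: int, successor_id: int):
        # Keep the details as attributes so handlers can inspect them.
        self.node_id = node_id
        self.successor_id = successor_id
        super().__init__(f"node {node_id} already has successor {successor_id}")


try:
    raise SuccessorAlreadyExistsInNode(node_id=3, successor_id=7)
except NeatError as error:
    message = str(error)
    caught = error
```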
At the moment, doxygen is used for creating doc pages.
For some reason, doxygen is missing some of our docstrings. This should be fixed.
At the moment, only NEATServer uses a message queue for communication.
For better (and more robust) communication, NEATClient should also implement a threaded message queue.
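A minimal sketch of a threaded message queue built on the standard `queue` and `threading` modules. It is not NEATServer's implementation; the class name, the handler callback and the stop sentinel are assumptions.

```python
import logging
import queue
import threading
from typing import Any, Callable

_STOP = object()  # sentinel that tells the worker thread to exit


class MessageQueue:
    def __init__(self, handler: Callable[[Any], None]):
        self._queue: "queue.Queue[Any]" = queue.Queue()
        self._handler = handler
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def put(self, message: Any) -> None:
        """Enqueue a message; returns immediately."""
        self._queue.put(message)

    def _run(self) -> None:
        while True:
            message = self._queue.get()
            try:
                if message is _STOP:
                    return
                self._handler(message)
            except Exception:
                # Log and keep going so one bad message does not kill the worker.
                logging.exception("handler failed for %r", message)
            finally:
                self._queue.task_done()

    def close(self) -> None:
        """Process everything queued so far, then stop the worker."""
        self._queue.put(_STOP)
        self._worker.join()


received = []
mq = MessageQueue(received.append)
mq.put("hello")
mq.put("world")
mq.close()
```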
Problem:
When following the program flow, there has to be a main class which ties all three graph representations together, in order to be able to access all the required methods and information at the right time.
Analysis of the situation:
For most steps, like clustering, breeding, selecting and mutating, the graph has to be examined in order to pick a spot for mutations to happen, compare two genomes, etc.
The AnalysisStructure (AnalysisGenome) is the most appropriate class for these operations, because of the intuitive graph representation (adjacency list).
The simulation structure is only needed for a single step, which is computing the output of the NN.
The storage structure is also only needed when storing the genomes in persistent storage.
(The storage structure could potentially handle innovation numbers / ids, but it shouldn't)
Original proposal:
The Simulation structure should be the main structure tying together all three different data structures.
Objection:
However, the SimulationStructure (SimulationGenome) does not store information on genes (edges) but on neurons (nodes), which makes the simulation structure unfit for performing breeding/mutation/analysis. This is not a design flaw, because the simulation structure ought to be a specialized class, only carrying the information it needs.
The simulation structure is therefore unfit for the purpose of being our main data structure and should only be used for computation of the NN.
New proposal:
The analysis structure could potentially carry all of the needed information, but should remain a specialized class and not take the role of the main, combining class for the three structures for design purposes.
Instead, we should wrap the three structures into a Genome class, which will perform graph operations using the analysis structure and create the simulation and storage structures lazily as needed, because each of them loses either information or suitability for computation once converted.
This will also clear out responsibility issues and make the program design more structured.
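The lazy wrapper idea can be sketched with `functools.cached_property` (Python 3.8+). The internal layouts used here (a nested dict for the analysis view, a sorted id list for storage, a predecessor map for simulation) are placeholder assumptions, not the real structures.

```python
from functools import cached_property
from typing import Dict, List


class Genome:
    def __init__(self, analysis: Dict[int, Dict[int, int]]):
        # Analysis view: source node -> {target node: gene id}.
        self.analysis = {s: dict(t) for s, t in analysis.items()}

    @cached_property
    def storage(self) -> List[int]:
        """Gene ids only; the graph structure itself lives elsewhere."""
        return sorted(g for targets in self.analysis.values() for g in targets.values())

    @cached_property
    def simulation(self) -> Dict[int, List[int]]:
        """Node -> predecessor nodes; gene ids are dropped."""
        predecessors: Dict[int, List[int]] = {}
        for source, targets in self.analysis.items():
            for target in targets:
                predecessors.setdefault(target, []).append(source)
        return predecessors

    def add_edge(self, source: int, target: int, gene_id: int) -> None:
        self.analysis.setdefault(source, {})[target] = gene_id
        # Drop the cached views so they are rebuilt on next access.
        self.__dict__.pop("storage", None)
        self.__dict__.pop("simulation", None)


genome = Genome({0: {2: 10}, 1: {2: 11}})
built_before_access = "simulation" in vars(genome)
first_simulation = genome.simulation
genome.add_edge(2, 3, 12)
```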
At the moment, no authentication is required for communication between server and client.
An appropriate method for authentication and session management should be added.
We want to cluster our population into species to encourage inbreeding.
Inbreeding is used to prevent "good ideas" from being killed before they can develop. If a change in a genome that would eventually lead to greatness initially decreases the fitness, it would otherwise be discarded.
Because of that, we take similar genomes and cluster them, leaving the genomes in clusters alive even if their fitness is low. Clusters with a low overall fitness produce fewer offspring, so we can focus our resources on better clusters, but they continue to exist, so they may improve their "ideas".
Therefore, we need a way to cluster our genomes. After breeding for a while inside these clusters, we breed two whole clusters together, thus (probably) creating a new, third, cluster. Afterwards, we recluster everything and keep on inbreeding.
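The "fewer offspring, but never extinct" rule can be sketched as a proportional allocation with a guaranteed minimum per cluster. This assumes non-negative fitness values and uses largest-remainder rounding; the function name and the `minimum` parameter are hypothetical.

```python
from typing import Dict, List


def allocate_offspring(cluster_fitness: Dict[str, List[float]],
                       total: int, minimum: int = 1) -> Dict[str, int]:
    """Split `total` offspring across clusters in proportion to mean fitness."""
    means = {c: sum(fs) / len(fs) for c, fs in cluster_fitness.items() if fs}
    if not means:
        return {}
    if total < minimum * len(means):
        raise ValueError("total is too small to give every cluster its minimum")

    quota = {c: minimum for c in means}
    remaining = total - minimum * len(means)
    fitness_sum = sum(means.values())
    if fitness_sum > 0:
        shares = {c: remaining * m / fitness_sum for c, m in means.items()}
    else:
        shares = {c: remaining / len(means) for c in means}

    for c, share in shares.items():
        quota[c] += int(share)
    # Hand out what rounding left over, largest fractional part first.
    leftover = total - sum(quota.values())
    by_fraction = sorted(shares, key=lambda c: shares[c] - int(shares[c]), reverse=True)
    for c in by_fraction[:leftover]:
        quota[c] += 1
    return quota


quota = allocate_offspring({"a": [3.0, 3.0], "b": [1.0]}, total=10)
```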
We need a dedicated structure for storing genomes.
This structure should be storage efficient and easy to use for mutation and breeding.
The current concept is to use a list/set of Genes, where the Genes are unique, stored in a separate repository and referenced via id. (Flyweight pattern)
Genes basically are graph edges; they contain a head, a tail and an innovation value. Additional Gene information that is needed in the Genome is stored there, in combination with the gene id.
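A small in-memory sketch of this flyweight layout. It assumes the per-genome information is a weight and that the innovation value equals creation order; `GenePool`, `get_or_create` and `weights` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Gene:
    gene_id: int
    head: int
    tail: int
    innovation: int


class GenePool:
    """Hands out one shared, immutable Gene object per (head, tail) pair."""

    def __init__(self):
        self._by_endpoints: Dict[Tuple[int, int], Gene] = {}

    def get_or_create(self, head: int, tail: int) -> Gene:
        key = (head, tail)
        if key not in self._by_endpoints:
            new_id = len(self._by_endpoints) + 1
            # Assumption: the innovation value is simply the creation order.
            self._by_endpoints[key] = Gene(new_id, head, tail, innovation=new_id)
        return self._by_endpoints[key]


@dataclass
class StorageGenome:
    weights: Dict[int, float] = field(default_factory=dict)  # gene id -> weight

    def add(self, gene: Gene, weight: float) -> None:
        self.weights[gene.gene_id] = weight


pool = GenePool()
parent_a, parent_b = StorageGenome(), StorageGenome()
parent_a.add(pool.get_or_create(0, 2), 0.5)
parent_b.add(pool.get_or_create(0, 2), -1.0)
parent_b.add(pool.get_or_create(1, 2), 0.3)
shared = parent_a.weights.keys() & parent_b.weights.keys()
```

Because genomes only hold ids, comparing two parents for breeding reduces to set operations on their keys.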