Balance data for latest version (pygta5 issue, closed)

sentdex commented on July 24, 2024
Balance data for latest version

from pygta5.

Comments (5)

Nixellion commented on July 24, 2024

Oh, OK. It seems the actual problem is that collect_data only records two states: the default (all zeroes) and w, so it just 'balances' the data down to the smallest array, which is empty. Huh.
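
To illustrate the failure mode described above, here is a minimal sketch, not the project's actual balance_data.py. It assumes integer class labels in place of the one-hot vectors, and `balance_to_smallest` is a hypothetical helper. Truncating every class to the size of the smallest one yields nothing as soon as one expected class has no samples.

```python
def balance_to_smallest(samples, n_classes):
    """Truncate every class to the size of the smallest class (hypothetical helper)."""
    buckets = [[] for _ in range(n_classes)]
    for features, label in samples:
        buckets[label].append((features, label))
    # If any class was never recorded, its bucket is empty and smallest is 0.
    smallest = min(len(bucket) for bucket in buckets)
    return [item for bucket in buckets for item in bucket[:smallest]]


# Only two of the three expected classes were recorded (assumed toy data).
samples = [("frame", 0)] * 50 + [("frame", 2)] * 30
balanced = balance_to_smallest(samples, n_classes=3)
print(len(balanced))  # 0, because class 1 has no samples
```

With all three classes present, the same helper would keep as many samples per class as the rarest class has.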

frossaren commented on July 24, 2024

Ahh, so that's probably why I've had huge problems getting it to work. By the way, do we know whether this project is dead? Sentdex hasn't approved any PRs or added anything in a long time.

Nixellion commented on July 24, 2024

I don't know, but the stream with this bot runs pretty much 24/7.

kymckay commented on July 24, 2024

For anyone looking at this in the future, I have slightly rewritten balance_data.py to handle an arbitrary number of choices and to repack files that fall below a specified threshold of training data. Gist here.
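
The gist itself is not reproduced in this thread, so the following is only a rough sketch of what the description could mean, not kymckay's code. It assumes samples are (features, one_hot) pairs; `balance` and `repack` are hypothetical names, and merging consecutive files until a pack reaches the threshold is one possible reading of "repack".

```python
import random


def balance(samples, seed=None):
    """Group samples by one-hot label and truncate each group to the rarest observed label."""
    rng = random.Random(seed)
    buckets = {}
    for features, one_hot in samples:
        buckets.setdefault(tuple(one_hot), []).append((features, one_hot))
    if not buckets:
        return []
    smallest = min(len(bucket) for bucket in buckets.values())
    balanced = []
    for bucket in buckets.values():
        rng.shuffle(bucket)
        balanced.extend(bucket[:smallest])
    rng.shuffle(balanced)
    return balanced


def repack(files, threshold):
    """Merge consecutive per-file sample lists until each pack holds at least `threshold` samples."""
    packs, current = [], []
    for samples in files:
        current.extend(samples)
        if len(current) >= threshold:
            packs.append(current)
            current = []
    if current:
        # Leftovers below the threshold join the last pack, or form the only pack.
        if packs:
            packs[-1].extend(current)
        else:
            packs.append(current)
    return packs


demo = [("img", [1, 0, 0])] * 5 + [("img", [0, 1, 0])] * 2
print(len(balance(demo, seed=0)))  # 4: two samples for each observed label
```

Keying the buckets on the labels that actually occur means the number of choices never has to be hard-coded, and a choice that was never recorded cannot shrink the result to zero. The trade-off is that a missing class goes unnoticed unless the label counts are checked separately.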

Phillyclause89 commented on July 24, 2024

This is how I went about modifying balance_data.py to balance across the 9 possible choices:

import random
from collections import Counter

import numpy as np
import pandas as pd

FILE_I_END = 7  # index of the last training_data-N.npy file
OFFSET = 10     # added to the file index when naming the balanced output
N_CHOICES = 9   # one-hot order: w, s, a, d, wa, wd, sa, sd, nk (no key)

data_order = list(range(1, FILE_I_END + 1))
random.shuffle(data_order)
for i in data_order:
    try:
        file_name = 'training_data-{}.npy'.format(i)
        # Each row is [image, one-hot choice]; object arrays need allow_pickle.
        train_data = np.load(file_name, allow_pickle=True)
        print(file_name, len(train_data))
        df = pd.DataFrame(train_data)
        print(df.head())
        print(Counter(df[1].apply(str)))

        # One bucket per choice, indexed by the position of the 1 in the one-hot vector.
        buckets = [[] for _ in range(N_CHOICES)]
        for data in train_data:
            img = data[0]
            choice = list(data[1])  # also works if the choice was stored as an array
            if len(choice) == N_CHOICES and sorted(choice) == [0] * (N_CHOICES - 1) + [1]:
                buckets[choice.index(1)].append([img, choice])
            else:
                print('no matches')

        # Shuffle each bucket once, then truncate every bucket to the smallest one.
        # If any choice has no samples, the balanced set is empty.
        smallest = min(len(bucket) for bucket in buckets)
        final_data = []
        for bucket in buckets:
            random.shuffle(bucket)
            final_data.extend(bucket[:smallest])
        random.shuffle(final_data)

        # Fill an object array explicitly so recent NumPy versions accept the ragged rows.
        balanced = np.empty((len(final_data), 2), dtype=object)
        for row, (img, choice) in enumerate(final_data):
            balanced[row, 0] = img
            balanced[row, 1] = choice
        np.save('balanced_training_data-{}.npy'.format(i + OFFSET), balanced)

    except Exception as e:
        print(str(e))
