ml-jku / hopular

Hopular: Modern Hopfield Networks for Tabular Data

Home Page: https://ml-jku.github.io/hopular/

License: MIT License

Python 100.00%

hopular's People

Contributors

bschaefl

hopular's Issues

Bug in dataset splits?

I'm reading the dataset code and there is probably a bug here:

self.__splits = (split_training, split_validation, split_test)
# Sort dataset according to splits.
self.__data = np.concatenate((
    self.__data[split_training], self.__data[split_validation], self.__data[split_test]
), axis=0)

After that, the old indices are still used to index into the reordered (concatenated) array, so the resulting splits differ from the intended ones (they are no longer stratified, for example):

def split_train(self) -> torch.Tensor:
    return self.__splits[0]
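
The concern can be reproduced in isolation with a toy array. This is a minimal sketch and not the repository's code: the index values and the names `stale_train` and `fixed_splits` are made up for illustration, and whether the actual dataset class behaves this way is exactly what the issue asks.

```python
import numpy as np

# Toy data: row i holds the value 10 * i, so rows are easy to recognise.
data = np.arange(10) * 10

# Hypothetical split indices into the ORIGINAL row order.
split_training = np.array([7, 2, 9, 4])
split_validation = np.array([0, 5, 3])
split_test = np.array([8, 1, 6])

# Reorder the data so that each split occupies a contiguous block.
reordered = np.concatenate(
    (data[split_training], data[split_validation], data[split_test]), axis=0
)

# Reusing the old indices on the reordered array selects the wrong rows.
stale_train = reordered[split_training]
intended_train = data[split_training]

# One possible fix: after reordering, the splits are contiguous ranges.
sizes = [len(split_training), len(split_validation), len(split_test)]
fixed_splits = np.split(np.arange(len(reordered)), np.cumsum(sizes)[:-1])
fixed_train = reordered[fixed_splits[0]]

print("intended:", intended_train)
print("stale:   ", stale_train)
print("fixed:   ", fixed_train)
```

If the reordering is kept, replacing the stored indices with contiguous ranges is one way to keep data and splits consistent; dropping the reordering and keeping the original indices would be another.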

Issue adding dataset

Hi ml-jku,
I am trying to run Hopular on my own data sets. There appear to be two formats for doing this:

  • One, used for blastchar, just involves adding a CSV file and describing it in auxiliary/data.py. I have done this for the "adult census" data and it all seems to work (i.e. no errors).

  • The other seems to involve about four files. For statlog_heart, these are:

    • folds_py.dat
    • labels_py.dat
    • statlog_heart_py.dat
    • validation_folds_py.dat

I can't make sense of the second format. According to both data.py and the UCI website, the heart data should have 6 numeric and 7 discrete features. However, the file seems to contain 13 numeric features:

> head -2  resources/statlog_heart/statlog_heart_py.dat 
1.70892,0.688222,0.869313,-0.0752701,1.39961,-0.416256,0.979844,-1.75595,-0.699923,1.17882,0.675165,2.4681,-0.874083
1.37958,-1.44764,-0.183219,-0.91506,6.08171,-0.416256,0.979844,0.445582,-0.699923,0.480261,0.675165,-0.710216,1.18707
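
One way to check whether some of the 13 columns are discrete features in disguise is to count distinct values per column: a column with only a few distinct values is probably discrete even if its values are no longer integers. The sketch below applies that idea to the two rows quoted above. That the file stores rescaled values is an assumption, not something confirmed in this thread, and two rows are only weak evidence.

```python
# The two rows quoted from resources/statlog_heart/statlog_heart_py.dat.
sample = """\
1.70892,0.688222,0.869313,-0.0752701,1.39961,-0.416256,0.979844,-1.75595,-0.699923,1.17882,0.675165,2.4681,-0.874083
1.37958,-1.44764,-0.183219,-0.91506,6.08171,-0.416256,0.979844,0.445582,-0.699923,0.480261,0.675165,-0.710216,1.18707
"""

# Parse the comma-separated rows into floats.
rows = [[float(value) for value in line.split(",")] for line in sample.strip().splitlines()]

# Transpose so that each entry is one feature column.
columns = list(zip(*rows))

# Count distinct values per column; few distinct values hint at a discrete feature.
distinct_counts = [len(set(column)) for column in columns]

# Columns in which the two sample rows share the same value (0-based indices).
repeated = [index for index, count in enumerate(distinct_counts) if count < len(rows)]

print("number of columns:", len(columns))
print("columns with a repeated value:", repeated)
```

On these two rows, columns 5, 6, 8 and 10 (0-based) repeat the same non-integer value, which fits the reading that discrete features were rescaled along with the numeric ones. Running the same count over the full file would give a much clearer picture.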

Also, I can't figure out what folds_py.dat and validation_folds_py.dat are for:

> head -2  resources/statlog_heart/folds_py.dat 
0,1,0,0
1,0,0,0
> head -2  resources/statlog_heart/validation_folds_py.dat 
1,0,1,0
0,0,0,0

I am not so interested in the heart data as in understanding how to add my own data, use a test set or cross-validation, and then test on an independent data set.
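
The evaluation protocol described here (hold out an independent test set, then cross-validate on the rest) can be sketched with plain index lists. This is a generic illustration under stated assumptions: `holdout_and_folds` is a hypothetical helper, not part of Hopular, and it makes no claim about how folds_py.dat or validation_folds_py.dat encode their folds.

```python
import random


def holdout_and_folds(n_samples, n_folds=5, test_fraction=0.2, seed=0):
    """Return (test_indices, folds), where folds is a list of (train, validation) index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Reserve an independent test set that is never used during cross-validation.
    n_test = int(round(n_samples * test_fraction))
    test = indices[:n_test]
    development = indices[n_test:]

    folds = []
    for k in range(n_folds):
        # Every n_folds-th development sample, starting at k, validates fold k.
        validation = development[k::n_folds]
        held_out = set(validation)
        train = [i for i in development if i not in held_out]
        folds.append((train, validation))
    return test, folds


test_indices, folds = holdout_and_folds(n_samples=100)
for fold_id, (train, validation) in enumerate(folds):
    print(f"fold {fold_id}: {len(train)} train, {len(validation)} validation")
print(f"independent test set: {len(test_indices)} samples")
```

The split here is random rather than stratified; for classification data with rare classes, a stratified variant would be the safer choice.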

Bye

Request for improvement

Hello,
LightGBM, CatBoost, and XGBoost can work with non-rescaled features and with plenty of NaN values in the dataset. This is especially true when these tools are invoked through FLAML, which greatly simplifies their usage.
Do you plan to allow NaN in any cell of any column? What is the impact of not rescaling the data?
Thanks
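
Whether Hopular accepts NaN or unscaled inputs is not answered in this thread. A generic workaround that does not depend on any model is to impute missing values and standardize each column before training. The helper below is a hypothetical NumPy sketch of that preprocessing, not Hopular's API, and median imputation is only one of several reasonable choices.

```python
import numpy as np


def impute_and_standardize(x):
    """Fill NaN with the column median, then scale each column to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)

    # Column medians computed while ignoring NaN entries.
    medians = np.nanmedian(x, axis=0)
    filled = np.where(np.isnan(x), medians, x)

    mean = filled.mean(axis=0)
    std = filled.std(axis=0)
    # Avoid division by zero for constant columns.
    std[std == 0] = 1.0
    return (filled - mean) / std


features = [[1.0, np.nan], [3.0, 4.0], [np.nan, 8.0]]
prepared = impute_and_standardize(features)
print(prepared)
```

In a real experiment the medians, means, and standard deviations would be computed on the training split only and then reused for validation and test data, to avoid leaking information.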

How to perform tests with other datasets that are not listed?

I looked into the repository and did not find an easy way to use your code on my own datasets. I would expect an interface similar to the regression estimators in scikit-learn. That would make it much easier for other researchers and students to use this new network. Is there any script, change, or material available that helps to use this repository this way?

E.g. hopular.fit(x,y)
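
The requested interface could look roughly like the sketch below. Everything in it is hypothetical: `TabularEstimator` only describes the fit/predict shape being asked for, and `MeanBaseline` is a trivial stand-in model, since the thread does not show how Hopular itself is trained or called.

```python
from typing import List, Protocol, Sequence


class TabularEstimator(Protocol):
    """The scikit-learn-like shape requested in the issue (hypothetical)."""

    def fit(self, x: Sequence[Sequence[float]], y: Sequence[float]) -> "TabularEstimator": ...

    def predict(self, x: Sequence[Sequence[float]]) -> List[float]: ...


class MeanBaseline:
    """Stand-in model that always predicts the training mean; NOT Hopular."""

    def __init__(self) -> None:
        self.mean_ = None

    def fit(self, x, y):
        if len(x) != len(y):
            raise ValueError("x and y must have the same number of rows")
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, x):
        if self.mean_ is None:
            raise RuntimeError("call fit before predict")
        return [self.mean_ for _ in x]


def fit_and_predict(model: TabularEstimator, x_train, y_train, x_new) -> List[float]:
    """Works with any object that offers fit and predict."""
    return model.fit(x_train, y_train).predict(x_new)


predictions = fit_and_predict(
    MeanBaseline(), [[1.0], [2.0], [3.0]], [10.0, 20.0, 30.0], [[4.0], [5.0]]
)
print(predictions)
```

A wrapper with this shape would let users swap the model into existing scikit-learn-style evaluation code without touching dataset-specific files.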

Thanks!
