ml-jku / hopular



Hopular: Modern Hopfield Networks for Tabular Data

Home Page: https://ml-jku.github.io/hopular/

License: MIT License

Python 100.00%

hopular's Issues

Bug in dataset splits?

I'm reading the dataset code and there is probably a bug here:

hopular/hopular/auxiliary/data.py

Lines 597 to 602 in 3e0c39f

    self.__splits = (split_training, split_validation, split_test)

    # Sort dataset according to splits.
    self.__data = np.concatenate((
        self.__data[split_training], self.__data[split_validation], self.__data[split_test]
    ), axis=0)

After that, you are using the old indices to index into the shuffled/concatenated arrays. So the splits are different (not stratified, for example):

hopular/hopular/auxiliary/data.py

Lines 637 to 638 in 3e0c39f

    def split_train(self) -> torch.Tensor:
        return self.__splits[0]
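The concern can be reproduced in isolation. The sketch below is not the repository's code: plain Python lists stand in for the NumPy arrays, and `reorder_by_splits` is a hypothetical helper. It shows that once rows are concatenated in train/validation/test order, the original index lists no longer select the same rows, whereas contiguous ranges over the reordered data do.

```python
def reorder_by_splits(data, split_training, split_validation, split_test):
    """Reorder rows as train | validation | test and return matching index ranges.

    Hypothetical helper for illustration; it mirrors the concatenation
    quoted above but also remaps the split indices.
    """
    order = list(split_training) + list(split_validation) + list(split_test)
    reordered = [data[i] for i in order]
    n_train, n_val = len(split_training), len(split_validation)
    new_splits = (
        list(range(0, n_train)),                    # training rows now come first
        list(range(n_train, n_train + n_val)),      # then validation rows
        list(range(n_train + n_val, len(order))),   # then test rows
    )
    return reordered, new_splits


data = ["a", "b", "c", "d", "e", "f"]
old_train, old_val, old_test = [4, 0, 2], [5], [1, 3]

reordered, (new_train, new_val, new_test) = reorder_by_splits(
    data, old_train, old_val, old_test
)

intended_train = [data[i] for i in old_train]     # rows meant for training
stale_train = [reordered[i] for i in old_train]   # old indices applied to reordered data
fixed_train = [reordered[i] for i in new_train]   # remapped indices
```

Here `stale_train` contains a row that was assigned to the test split, while `fixed_train` matches `intended_train`. Whether the repository actually applies the stored indices to the reordered array elsewhere would need to be checked against the full source.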

pytorch_lightning.utilities.exceptions.MisconfigurationException: `.test(ckpt_path="best")` is set but `ModelCheckpoint` is not configured to save the best model.

It repeatedly says that the ModelCheckpoint is not configured to save the best model. Please help me fix this issue.
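This error usually indicates that the trainer has no checkpoint callback tracking a best model when `.test(ckpt_path="best")` is called. A minimal sketch of one possible configuration follows. The metric name `val_loss` and both helper functions are assumptions made for illustration, not taken from the Hopular code; the monitored key must match a metric the model actually logs during validation.

```python
def checkpoint_kwargs(metric="val_loss"):
    """Arguments that ask ModelCheckpoint to keep the single best model.

    `metric` is an assumed name; replace it with a key your model logs.
    """
    return {"monitor": metric, "mode": "min", "save_top_k": 1}


def build_trainer(max_epochs=10):
    """Hypothetical helper: a Trainer with a best-model checkpoint callback."""
    # Imported lazily so the module loads even without PyTorch Lightning.
    import pytorch_lightning as pl
    from pytorch_lightning.callbacks import ModelCheckpoint

    checkpoint = ModelCheckpoint(**checkpoint_kwargs())
    return pl.Trainer(max_epochs=max_epochs, callbacks=[checkpoint])


# Typical usage (model and datamodule are your own objects):
#   trainer = build_trainer()
#   trainer.fit(model, datamodule=datamodule)
#   trainer.test(model, datamodule=datamodule, ckpt_path="best")
```

The same error can also appear if no validation step ran before testing, since no best checkpoint would have been written in that case.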

issue adding dataset

Hi ml-jku,
I am trying to run Hopular on my own datasets. It looks like there are two formats for doing this:

One, used for blastchar, just involves adding a CSV file and describing it in auxiliary/data.py. I have done this for the "adult census" data and it all seems to work (i.e. no errors).
The other seems to involve about four files, e.g. for statlog_heart:
- folds_py.dat
- labels_py.dat
- statlog_heart_py.dat
- validation_folds_py.dat
I can't make any sense of this. The heart data should have 6 numeric and 7 discrete features, according to both data.py and the UCI website. However, it seems to have 13 numeric features:

> head -2  resources/statlog_heart/statlog_heart_py.dat 
1.70892,0.688222,0.869313,-0.0752701,1.39961,-0.416256,0.979844,-1.75595,-0.699923,1.17882,0.675165,2.4681,-0.874083
1.37958,-1.44764,-0.183219,-0.91506,6.08171,-0.416256,0.979844,0.445582,-0.699923,0.480261,0.675165,-0.710216,1.18707

Also, I can't figure out what folds_py.dat and validation_folds_py.dat are doing.

> head -2  resources/statlog_heart/folds_py.dat 
0,1,0,0
1,0,0,0

> head -2  resources/statlog_heart/validation_folds_py.dat 
1,0,1,0
0,0,0,0

I am not so interested in the heart data as in understanding how to add my own data, use a test set or cross-validation, and then test on an independent dataset.

Bye
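One plausible reading of the two fold files, stated here as an assumption and not confirmed by the repository, is that each row corresponds to one sample and each column to one cross-validation fold, with a 1 marking membership in that fold's test set (folds_py.dat) or validation set (validation_folds_py.dat). Under that assumption, the rows quoted above can be turned into per-fold index lists as sketched below; `parse_indicator_rows` and `split_for_fold` are hypothetical helpers.

```python
def parse_indicator_rows(lines):
    """Parse comma-separated 0/1 rows into lists of ints (one row per sample)."""
    return [[int(value) for value in line.split(",")] for line in lines]


def split_for_fold(test_folds, validation_folds, fold):
    """Return (train, validation, test) sample indices for one fold.

    Assumes a 1 in column `fold` marks test or validation membership and
    that every remaining sample is used for training.
    """
    train, validation, test = [], [], []
    for sample, (test_row, val_row) in enumerate(zip(test_folds, validation_folds)):
        if test_row[fold] == 1:
            test.append(sample)
        elif val_row[fold] == 1:
            validation.append(sample)
        else:
            train.append(sample)
    return train, validation, test


# The first two rows of each file, as quoted in the post above.
folds = parse_indicator_rows(["0,1,0,0", "1,0,0,0"])
validation_folds = parse_indicator_rows(["1,0,1,0", "0,0,0,0"])

fold0 = split_for_fold(folds, validation_folds, fold=0)
fold1 = split_for_fold(folds, validation_folds, fold=1)
```

With this reading, sample 0 is a validation row in fold 0 and a test row in fold 1. The feature file also appears to hold standardized values, which could explain why the discrete columns look numeric, though that too is a guess.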

Answer refutation

@bschaefl Hi, Hopfield networks are indeed very interesting and understudied. However, can you please respond to this?
https://medium.com/@tunguz/trouble-with-hopular-6649f22fa2d3

Request for improvement

Hello,
LightGBM/CatBoost/XGBoost can work with unscaled features and with datasets containing many NaN values. This is especially true when these tools are invoked through FLAML, which greatly simplifies their usage.
Do you plan to allow NaN in any cell of any column? What is the impact of not rescaling the data?
Thanks
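Whether Hopular handles missing values natively is not stated in this thread. As a workaround that does not depend on the answer, a column can be imputed and standardized before training. The sketch below uses mean imputation and z-scoring in plain Python; `impute_and_standardize` is a hypothetical helper and not part of Hopular.

```python
import math


def impute_and_standardize(column):
    """Replace NaN with the column mean, then scale to zero mean and unit variance.

    Assumes the column contains at least one observed (non-NaN) value.
    """
    observed = [value for value in column if not math.isnan(value)]
    mean = sum(observed) / len(observed)
    filled = [mean if math.isnan(value) else value for value in column]
    variance = sum((value - mean) ** 2 for value in filled) / len(filled)
    std = math.sqrt(variance) or 1.0  # avoid dividing by zero for constant columns
    return [(value - mean) / std for value in filled]


ages = [20.0, float("nan"), 40.0]
scaled = impute_and_standardize(ages)
```

Mean imputation is only one option; tree-based libraries treat NaN differently, so results obtained this way are not directly comparable to their native handling.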

How to perform tests with other datasets that are not listed?

I looked into the repository and did not find an easy way to use your code on my own datasets. Something I would expect is an interface similar to the regression functions in scikit-learn. This would be amazing for other researchers and students who want to use this new network. Is there any script, change, or material available to help use this repository this way?

E.g. hopular.fit(x,y)

Thanks!
