grafos-ml / test.fm

Testing framework for Collaborative Filtering

License: Other

Python 100.00%

test.fm's Introduction

Introduction

Test.fm is (yet another) testing framework for Collaborative Filtering models. It uses pandas as its default data manipulation library and offers an easy way to investigate how well your models perform and why. You can build a model using okapi and then check how it performs on the testing data, or, if you have only a small dataset, use Test.fm directly.

Example of using the Test.fm framework

	import pandas as pd
	import testfm
	from testfm.models.baseline_model import Popularity, RandomModel
	from testfm.models.tensorcofi import TensorCoFi
	from testfm.evaluation.evaluator import Evaluator
	
	evaluator = Evaluator()

	# Prepare the data
	df = pd.read_csv(..., names=["user", "item", "rating", "date", "title"])
	training, testing = testfm.split.holdoutByRandom(df, 0.9)

	# Choose the models to evaluate
	models = [
	    RandomModel(),
	    Popularity(),
	    TensorCoFi()
	    ]
	
	# Evaluate
	items = training.item.unique()
	for m in models:
		m.fit(training)
		print m.getName().ljust(50),
		print evaluator.evaluate_model(m, testing, all_items=items)

See other examples here...
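
For readers unfamiliar with holdout evaluation, here is a minimal sketch of what a random holdout split does conceptually. It is not test.fm's implementation of `holdoutByRandom`. The function name `holdout_by_random`, the `seed` parameter, and the toy rows are assumptions made for illustration, and the sketch works on plain lists rather than a pandas DataFrame.

```python
import random

def holdout_by_random(rows, train_fraction, seed=None):
    """Shuffle rows and split them into (training, testing) lists.

    Hypothetical helper, not part of test.fm.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Toy (user, item, rating) rows, invented for this example.
rows = [(u, i, 1) for u in range(10) for i in range(10)]
training, testing = holdout_by_random(rows, 0.9, seed=42)
print(len(training), len(testing))
```

Fixing the seed makes the split repeatable, which helps when comparing several models on the same testing data.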

Installation

You can check the official documentation here.

Download and extract the sources.
Check the dependencies in conf/requirements.txt.
Run sudo python setup.py install.
If you are a developer of test.fm, run python setup.py develop instead.
Enjoy and contribute.
Check travis for the latest builds...
Check yaml for the build script.

Nosetests

$ nosetests -w src/ -vv --with-cover --cover-tests --cover-erase --cover-html --cover-package=testfm --with-doctest --doctest-tests tests testfm/evaluation testfm/models testfm/fmio testfm/splitter

Build Documentation

$ sphinx-build -b html source_folder doc_folder

Similar Projects

mrec from Mendeley. Good at building models (python, ?).
okapi from Telefonica Research. Good at distributed model building using Apache Giraph (java, giraph, apache2).
graphlab from CMU. Probably the richest library of modern algorithms (c++, apache2).
mymedialite from Uni Hildesheim. Has ranking implementations (c#, GPL).
mahout from Apache. Uses hadoop to build the models (java, hadoop, apache2).
lenskit from GroupLens (java, LGPL2.1).

test.fm's People

kilburn xuanhan863 joaonrb mindis constantineg1 mathkann markusyatina jennifer421 lomascolo mohammadyousuf caiobelfort minghao2016

test.fm's Issues

The C structs are not being serialized

The structs in the cython models (float_matrix) are not serialized, so they cannot be passed through the standard multiprocessing interface.
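
One common way to make an object that wraps a raw buffer picklable is to implement `__reduce__` so that the buffer is copied into bytes. The sketch below illustrates the idea in pure Python. `FloatMatrix` here is a hypothetical stand-in (a `memoryview` plays the role of the C struct), not the `float_matrix` type from test.fm, and whether this approach suits the Cython classes is an assumption.

```python
import pickle

def _rebuild(rows, cols, payload):
    """Recreate a FloatMatrix from its pickled pieces."""
    return FloatMatrix(rows, cols, payload)

class FloatMatrix(object):
    """Hypothetical stand-in for a C-backed matrix of 32-bit floats."""

    def __init__(self, rows, cols, payload=None):
        self.rows, self.cols = rows, cols
        if payload is not None:
            raw = bytearray(payload)
        else:
            raw = bytearray(4 * rows * cols)
        # A memoryview cannot be pickled, much like a raw C pointer.
        self._view = memoryview(raw).cast("f")

    def get(self, r, c):
        return self._view[r * self.cols + c]

    def set(self, r, c, value):
        self._view[r * self.cols + c] = value

    def __reduce__(self):
        # Copy the buffer into bytes so pickle only sees plain Python data.
        return _rebuild, (self.rows, self.cols, self._view.tobytes())

m = FloatMatrix(2, 2)
m.set(1, 0, 3.5)
clone = pickle.loads(pickle.dumps(m))
print(clone.get(1, 0))
```

Because `multiprocessing` relies on pickle to move objects between processes, an object that survives this round trip can in principle be sent to a worker.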

SVDpp Minor Error

AttributeError: 'SVDpp' object has no attribute '_lamb'

Apparently this model cannot print its string form because it references a non-existent attribute.

Refitting the same "FactorModel" is producing Segmentation Fault

Refitting the same "FactorModel" produces a segmentation fault, as shown in the tuning example.

install fail , reason is "Blas library is not detected in the system"

Python 2.7.13 works, but it runs into other problems.
ERROR: Command errored out with exit status 1:
command: 'c:\python36\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\lenovo\AppData\Local\Temp\pip-req-build-cvsi0be3\setup.py'"'"'; __file__='"'"'C:\Users\lenovo\AppData\Local\Temp\pip-req-build-cvsi0be3\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\lenovo\AppData\Local\Temp\pip-req-build-cvsi0be3\pip-egg-info'
cwd: C:\Users\lenovo\AppData\Local\Temp\pip-req-build-cvsi0be3
Complete output (7 lines):
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\lenovo\AppData\Local\Temp\pip-req-build-cvsi0be3\setup.py", line 11, in <module>
from compile import ext_modules, build_ext
File "C:\Users\lenovo\AppData\Local\Temp\pip-req-build-cvsi0be3\compile.py", line 59, in <module>
raise EnvironmentError("Blas library is not detected in the system")
OSError: Blas library is not detected in the system

ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output
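
The traceback shows `compile.py` raising the error when it cannot locate BLAS. How that script performs the check is not shown here. As a rough diagnostic, the standard library's `ctypes.util.find_library` can report whether a shared library with a given name is visible to the loader. The candidate names below are assumptions, and a hit does not guarantee that the build will find the same library.

```python
from ctypes.util import find_library

# Candidate library names are guesses; adjust them for your platform.
CANDIDATES = ["blas", "openblas", "cblas", "atlas"]

def visible_blas_libraries(names=CANDIDATES):
    """Return {name: resolved library} for every candidate the loader can see."""
    found = {}
    for name in names:
        resolved = find_library(name)
        if resolved is not None:
            found[name] = resolved
    return found

libs = visible_blas_libraries()
print(libs if libs else "no BLAS-like library found")
```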

Example in ReadMe doesn't work

The example in the readme throws errors.

C implementation of TensorCoFi

The C implementation of TensorCoFi occasionally builds a model whose results differ slightly from those of the standard implementation. The results are still far better than Random.

PyTensorCoFi parameter bug

When setting parameters on PyTensorCoFi, the internal structures don't change, which crashes the system.

remove graphchi files

Each graphchi training run currently creates a lot of files in /tmp. We need to clean them up, or use pipes directly.
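
One possible way to handle the cleanup is to give each run its own scratch directory and remove it in a `finally` block. This is a sketch under assumptions: `run_with_scratch_dir` and `fake_training` are hypothetical names, and how graphchi chooses its output location is not covered here.

```python
import os
import shutil
import tempfile

def run_with_scratch_dir(work):
    """Call work(path) with a fresh scratch directory and always remove it."""
    scratch = tempfile.mkdtemp(prefix="graphchi_")
    try:
        return work(scratch)
    finally:
        # Runs even if work() raises, so no files are left behind.
        shutil.rmtree(scratch, ignore_errors=True)

def fake_training(path):
    # Stand-in for a training run that writes intermediate files.
    with open(os.path.join(path, "ratings.tmp"), "w") as handle:
        handle.write("1 2 5\n")
    return path

used = run_with_scratch_dir(fake_training)
print(os.path.exists(used))
```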

Issue with setting the arguments in the model

I made an example for tuning the method, but it fails.
Could you check what the problem is? I believe it's with **kwargs.
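
The failing example is not included in the report, so the cause cannot be confirmed. As an illustration of one way to accept parameters through `**kwargs` while catching misspelled names early, consider the sketch below. `TunableModel` and `set_params` are hypothetical and do not reflect test.fm's model interface.

```python
class TunableModel(object):
    """Hypothetical model, not a test.fm class."""

    def __init__(self, dim=10, n_iter=15):
        self.dim = dim
        self.n_iter = n_iter

    def set_params(self, **kwargs):
        # Reject unknown names instead of silently ignoring them.
        for name, value in kwargs.items():
            if name not in vars(self):
                raise ValueError("unknown parameter: %s" % name)
            setattr(self, name, value)
        return self

model = TunableModel().set_params(dim=20)
print(model.dim, model.n_iter)
```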

error 'ValueError: sample larger than population' running the models

Hi all,

I've run libfm on Ubuntu using the dataset detailed below.
The Random model runs OK.
However, when I try the rest of the models in my list (I followed the models_example.py example provided here), i.e. BPR, TFIDFModel, Popularity, TensorCoFi, ..., every model triggers the error "ValueError: sample larger than population".

Does anyone know what could be the source of this problem? Any suggestions?
There are many posts on the Internet about this error in Python, but I don't think the answers and potential causes they describe apply to this case, so it is still unclear to me.
By the way, the dataset has more than 5K rows.

Thanks in advance,
regards,
R.
------------------ test:

python modeltest2.py
   user  item  rating       time     title
0  1123     0       2  838985046  NameFilm
1  1107     0       1  838985046  NameFilm
2  1107     0       1  838985046  NameFilm
3  1107     0       2  838985046  NameFilm
4  1107     1       1  838985046  NameFilm
0:00:00.083082 Random [0.262394934911661]
0:00:09.887563 BPR (dim=10,iter=15,reg=0.0001,eta=0.001)

Traceback (most recent call last):
File "modeltest2.py", line 57, in <module>
print evaluator.evaluate_model(m, testing, all_items=items,)
File "build/bdist.linux-x86_64/egg/testfm/evaluation/evaluator.py", line 83, in evaluate_model
File "build/bdist.linux-x86_64/egg/testfm/evaluation/evaluator.py", line 30, in partial_measure
File "/usr/lib/python2.7/random.py", line 321, in sample
raise ValueError("sample larger than population")
ValueError: sample larger than population
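
The traceback ends in `random.sample`, which raises this error whenever it is asked for more elements than the population contains. How `partial_measure` chooses its sample size is not shown here, so the sketch below only reproduces the error in isolation and shows one possible guard. `safe_sample` is a hypothetical helper.

```python
import random

def safe_sample(population, k, rng=random):
    """Sample at most k items, never more than the population holds."""
    population = list(population)
    return rng.sample(population, min(k, len(population)))

items = ["a", "b", "c"]

try:
    random.sample(items, 5)  # asks for 5 items out of 3
except ValueError as error:
    print("random.sample failed:", error)

print(len(safe_sample(items, 5)))
```

A plausible trigger, though unconfirmed by the report, is a catalogue with fewer distinct items than the evaluator tries to sample, so counting the unique values in the item column is a reasonable first check.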

BPR doesn't implement param_details

The BPR model cannot be tuned because it doesn't implement param_details.

Installation on Ubuntu12.04

$> sudo pip install test.fm-1.0.tar.gz
.....(output omitted)....
error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/python2.7 -c src/testfm/evaluation/cutil/measures.c -o build/temp.linux-x86_64-2.7/src/testfm/evaluation/cutil/measures.o" failed with exit status 1

Command /usr/bin/python -c "import setuptools;__file__='/tmp/pip-LXq7fY-build/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --single-version-externally-managed --record /tmp/pip-q437OF-record/install-record.txt failed with error code 1

$> sudo python setup.py install
.....(output omitted)....
cythoning src/testfm/evaluation/cutil/measures.pyx to src/testfm/evaluation/cutil/measures.c

Error compiling Cython file:

...
cdef int i
for i in range(list_size):
#printf(">>>%f %f\n", ranked_list[i], ranked_list[i+1])
if ranked_list[i*2] == 1.:
relevant += 1.
map_measure += (relevant / (i+1.))

^

src/testfm/evaluation/cutil/measures.pyx:48:41: Pythonic division not allowed without gil, consider using cython.cdivision(True)

Error compiling Cython file:

...
for i in range(list_size):
#printf(">>>%f %f\n", ranked_list[i], ranked_list[i+1])
if ranked_list[i*2] == 1.:
relevant += 1.
map_measure += (relevant / (i+1.))
return 0.0 if relevant == 0. else (map_measure/relevant)

^

src/testfm/evaluation/cutil/measures.pyx:49:54: Pythonic division not allowed without gil, consider using cython.cdivision(True)
building 'testfm.evaluation.cutil.measures' extension
C compiler: gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC

compile options: '-I/usr/include/python2.7 -c'
gcc: src/testfm/evaluation/cutil/measures.c
src/testfm/evaluation/cutil/measures.c:1:2: error: #error Do not use this file, it is the result of a failed Cython compilation.
src/testfm/evaluation/cutil/measures.c:1:2: error: #error Do not use this file, it is the result of a failed Cython compilation.
error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/include/python2.7 -c src/testfm/evaluation/cutil/measures.c -o build/temp.linux-x86_64-2.7/src/testfm/evaluation/cutil/measures.o" failed with exit status 1
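
For reference, the loop quoted in the compiler output accumulates the precision at each relevant position and divides by the number of relevant items, which is the usual average precision computation. A plain Python sketch of the same arithmetic follows. It assumes the input is a simple list of relevance flags, whereas the Cython code reads the flag from `ranked_list[i*2]`, and the function name is illustrative.

```python
def average_precision(relevance_flags):
    """Average precision of a ranked list given as 1/0 relevance flags."""
    relevant = 0.0
    score = 0.0
    for i, flag in enumerate(relevance_flags):
        if flag == 1:
            relevant += 1.0
            # Precision at rank i + 1: relevant items seen so far / rank.
            score += relevant / (i + 1.0)
    return 0.0 if relevant == 0.0 else score / relevant

# Relevant items at ranks 1 and 3: (1/1 + 2/3) / 2
print(average_precision([1, 0, 1]))
```

The two divisions in this function correspond to the two lines the compiler rejects. Its message suggests `cython.cdivision(True)` as the way to permit them without the GIL.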

Setup doesn't work

The setup no longer installs. It complains about the requirements. Special characters in the URL requirement may be corrupting the string.

Pyjnius

This module causes too many problems. Remove it.

update readme and dependencies

We need to update the readme and dependencies.txt to reflect the recent changes.

Error: undefined symbol: clapack_sgesv

Hi @ALL,
I have installed and compiled LAPACK + ATLAS (atlas3.10.3) + test.fm-1.0
under:

Environment: Linux dev-host-01 4.2.0-27-generic #32~14.04.1-Ubuntu SMP
Python 2.7.6

Running one of the available tests in src now gives:

sysadmin@dev-host-01:~/testfm/test.fm-1.0/src/tests$ python test_models.py

	     Traceback (most recent call last):
	       File "test_models.py", line 10, in <module>
	         from testfm.models.graphchi_models import SVDpp
	       File "build/bdist.linux-x86_64/egg/testfm/models/graphchi_models.py", line 9, in <module>
	       File "build/bdist.linux-x86_64/egg/testfm/models/cutil/interface.py", line 7, in <module>
	       File "build/bdist.linux-x86_64/egg/testfm/models/cutil/interface.py", line 6, in __bootstrap__
	       File "src/testfm/models/cutil/float_matrix.pxd", line 10, in init testfm.models.cutil.interface (src/testfm/models/cutil/interface.c:8464)
	       File "build/bdist.linux-x86_64/egg/testfm/models/cutil/float_matrix.py", line 7, in <module>
	       File "build/bdist.linux-x86_64/egg/testfm/models/cutil/float_matrix.py", line 6, in __bootstrap__
	     ImportError: /home/sysadmin/.cache/Python-Eggs/testfm-1.0-py2.7-linux-x86_64.egg-tmp/testfm/models/cutil/float_matrix.so: undefined symbol: clapack_sgesv

Any ideas? I have tried many options and read a lot of forum threads about this error, but none of them has fixed the problem.
Thanks in advance for your help,
regards,
@rheras
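
An "undefined symbol" error at import time means the dynamic loader found no library exporting that name. One way to narrow this down is to ask `ctypes` whether a given shared library exports `clapack_sgesv`. The sketch below is a diagnostic only. The library names in the loop are guesses, `exports_symbol` is a hypothetical helper, and which library the extension is meant to link against is not stated in the report.

```python
import ctypes
from ctypes.util import find_library

def exports_symbol(library_name, symbol):
    """Return True if the named shared library can be loaded and exports symbol."""
    path = find_library(library_name)
    if path is None:
        return False
    try:
        library = ctypes.CDLL(path)
    except OSError:
        return False
    # Attribute lookup on a CDLL resolves the symbol and fails if it is absent.
    return hasattr(library, symbol)

# Library names below are assumptions; substitute the ones on your system.
for name in ["lapack", "lapack_atlas", "atlas"]:
    print(name, exports_symbol(name, "clapack_sgesv"))
```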

New evaluator broke with online TensorCoFi

When using the online method of TensorCoFi, if the data passed to the evaluator contains users or items that were not used in the model, it throws a segmentation fault.
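
Until the crash is fixed, one possible workaround is to drop testing rows whose user or item never appeared in training before handing the data to the evaluator. The sketch below shows the filtering step on plain tuples. `keep_known` and the toy rows are invented for illustration, and whether this avoids the segmentation fault is an assumption.

```python
def keep_known(rows, known_users, known_items):
    """Keep only (user, item, rating) rows whose user and item were seen in training."""
    return [row for row in rows if row[0] in known_users and row[1] in known_items]

# Toy data: user 3 and item 12 never appear in training.
train_rows = [(1, 10, 5), (2, 11, 3)]
test_rows = [(1, 11, 4), (3, 10, 2), (2, 12, 1)]

known_users = {user for user, _, _ in train_rows}
known_items = {item for _, item, _ in train_rows}
filtered = keep_known(test_rows, known_users, known_items)
print(filtered)
```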

Okapi example is crashing when the data is split.

The example in the Okapi connector crashes when the data is sliced into training and testing. The result files are not found in hadoop fs.

AverageModel Crash With Non Numeric Data.

AverageModel crashes when a column contains non-numeric values.

build error on mac

If I build on mac, I get:
$python setup.py develop
building 'testfm.evaluation.cutil.measures' extension
/usr/bin/clang -fno-strict-aliasing -fno-common -dynamic -pipe -Os -fwrapv -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/opt/local/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c src/testfm/evaluation/cutil/measures.c -o build/temp.macosx-10.9-x86_64-2.7/src/testfm/evaluation/cutil/measures.o
clang: error: no such file or directory: 'src/testfm/evaluation/cutil/measures.c'
clang: error: no input files
error: command '/usr/bin/clang' failed with exit status 1

Can it be that measures.c is missing?

	# prepare the data
	df = pd.read_csv(DATAPATH+"/1M_movielens/ratings.dat",
	                 sep=" ", header=None, names=["user", "item", "rating", "date"])
	print df.head()
	training, testing = testfm.split.holdoutByRandom(df, 0.8)
