shingling
- k-shingles generation
- minhashing
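
As an illustration of the k-shingles step listed above, here is a minimal sketch. It assumes character-level shingles; the function name `k_shingles` is hypothetical and not necessarily the one used in this repo.

```python
def k_shingles(text, k):
    """Return the set of all length-k character substrings of text."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A text shorter than k yields an empty set.
    return {text[i:i + k] for i in range(len(text) - k + 1)}
```

For example, `k_shingles("abcab", 2)` yields the three distinct shingles `ab`, `bc` and `ca`.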
jaccard similarity
- jaccard similarity calculation
- jaccard distance calculation
- jaccard conditional comparison
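
The similarity and distance calculations listed above can be sketched as follows. This is a minimal version under assumptions: the function names are hypothetical, and treating two empty sets as identical (similarity 1.0) is a convention chosen here, not something this repo states.

```python
def jaccard_similarity(a, b):
    """Size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumption: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)


def jaccard_distance(a, b):
    """Jaccard distance is one minus the Jaccard similarity."""
    return 1.0 - jaccard_similarity(a, b)
```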
adwords problem
- greedy_adwords
- balance_adwords
- generalized_balance_adwords
frequency problem
- items frequency
- the algorithm of Savasere, Omiecinski and Navathe (SON)
graph problem
- graph construction
- shortest_path
- longest path
- centrality
- independent graphs detection
- clustering_coef
- dijkstra
- dijkstra with heap
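
A minimal sketch of the heap-based Dijkstra variant listed above. It assumes the graph is a dict mapping each node to a dict of `{neighbor: weight}` with non-negative weights, and that nodes are mutually comparable (needed for tie-breaking in the heap). The representation used in this repo may differ.

```python
import heapq


def dijkstra(graph, source):
    """Return a dict of shortest distances from source to each reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist
```

Pushing duplicate entries and skipping stale ones on pop avoids the need for a decrease-key operation, which `heapq` does not provide.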
recommendation problem
- hamming distance
- euclidean distance
- pearson correlation
- tanimoto score
- euclidean similarity
- pearson similarity
- tanimoto similarity
- top similars
- top similar with map reduce
- user-filtered recommendation
- item-filtered recommendation
Radix tree
- insert
- remove
- search
- longest prefix
Decision tree
- Divide data
- Gini impurity
- Entropy
- Variance
- Build tree
- Prune
- Classify
- Draw tree
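
The Gini impurity and entropy measures listed above can be sketched as below. This assumes each function receives a flat list of class labels; the function names and input shape are hypothetical and may not match this repo.

```python
from collections import Counter
from math import log2


def gini_impurity(labels):
    """Probability that two labels drawn at random (with replacement) differ."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the label distribution, in bits."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())
```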
Page Rank
A very simple version/implementation of the page rank algorithm.
- Page rank
- Advanced version of page rank, topic sensitive
- spam farms
- trust rank
- Hyperlink-induced topic search (HITS)
- Map reduce to efficiently calculate the page rank
- Jaccard similarity, to be found in the data analysis repo
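
To illustrate the simple page rank version mentioned above, here is a minimal power-iteration sketch. It assumes the graph is a dict mapping every page to a list of pages it links to (each target also appears as a key), a damping factor of 0.85, and that dangling pages spread their rank evenly over all pages. These choices are assumptions for illustration, not necessarily those made in this repo.

```python
def page_rank(graph, damping=0.85, iterations=50):
    """Return a dict of page -> rank after a fixed number of iterations."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page starts with the teleportation share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph[node]
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: distribute its rank over all pages.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

On a three-page cycle (`a -> b -> c -> a`) every page ends up with rank 1/3, and the ranks always sum to 1.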
Map-Reduce
Implementation of map reduce, and some examples.
- Map Reduce class
- Estimation of the number pi
- Calculation of the frequency of items from multiple files
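
As a rough idea of how the pi estimation can be phrased as a map-reduce job, here is a minimal single-process sketch. The `map_reduce` helper, `estimate_pi`, and their parameters are hypothetical stand-ins for illustration and do not reflect the actual Map Reduce class in this repo.

```python
import random
from collections import defaultdict


def map_reduce(inputs, mapper, reducer):
    """Apply mapper to each input, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for item in inputs:
        for key, value in mapper(item):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def estimate_pi(num_tasks=8, samples_per_task=10000, seed=0):
    """Monte Carlo estimate of pi: fraction of random points inside the unit quarter circle."""

    def mapper(task_seed):
        rng = random.Random(task_seed)  # one independent generator per task
        inside = sum(
            1
            for _ in range(samples_per_task)
            if rng.random() ** 2 + rng.random() ** 2 <= 1.0
        )
        yield "inside", inside

    def reducer(key, values):
        return sum(values)

    counts = map_reduce(range(seed, seed + num_tasks), mapper, reducer)
    return 4.0 * counts["inside"] / (num_tasks * samples_per_task)
```

The same helper covers the item-frequency case: a mapper that yields `(word, 1)` for each word and a reducer that sums the values gives a count per item.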