
Rank Aggregation (RA) methods

22 unsupervised RA methods, 7 supervised RA methods, and 1 semi-supervised RA method were tested on our preprocessed datasets. These datasets cover person re-identification (re-ID), recommendation systems, bioinformatics, and social choice. The tested methods include both classical and state-of-the-art RA methods. To test other datasets, please follow the instructions in the code comments for dataset preprocessing and the necessary code modifications.

Unsupervised

  • $\textrm{CombMIN}$ [1]
  • $\textrm{CombMAX}$ [1]
  • $\textrm{CombSUM}$ [1]
  • $\textrm{CombANZ}$ [1]
  • $\textrm{CombMNZ}$ [1]
  • $\textrm{MC1}$ [2]
  • $\textrm{MC2}$ [2]
  • $\textrm{MC3}$ [2]
  • $\textrm{MC4}$ [2]
  • $\textrm{Borda count}$ [3]
  • $\textrm{Dowdall}$ [4]
  • $\textrm{Median}$ [5]
  • $\textrm{RRF}$ [6]
  • $\textrm{iRANK}$ [7]
  • $\textrm{Mean}$ [8]
  • $\textrm{HPA}$ [9]
  • $\textrm{PostNDCG}$ [9]
  • $\textrm{ER}$ [10]
  • $\textrm{Mork-H}$ [40]
  • $\textrm{CG}$ [11]
  • $\textrm{DIBRA}$ [12]
  • $\textrm{Borda-Score}$ [41]

Supervised

  • $\textrm{wBorda}$ [13]
  • $\textrm{CRF}$ [14]
  • $\textrm{CSRA}$ [15]
  • $\textrm{AggRankDE}$ [16]
  • $\textrm{IRA}_\textrm{R}$ [17]
  • $\textrm{IRA}_\textrm{S}$ [17]
  • $\textrm{QI-IRA}$ [18]

Semi-supervised

  • $\textrm{SSRA}$ [19]

Directory Structure

│  README.md
│  
├─datasets
│  ├─FLAGR
│  ├─ice-cream
│  ├─MovieLens 1M
│  ├─MQ2008-agg
│  ├─NSCLC
│  ├─Re-ID
│  └─World University Ranking 2022
│      
├─unsupervised
│   ├─matlab
│   │      BordaCount.m
│   │      CG.m
│   │      CombANZ.m
│   │      CombMAX.m
│   │      CombMED.m
│   │      CombMIN.m
│   │      CombMNZ.m
│   │      CombSUM.m
│   │      Condorcet.m
│   │      DIBRA.m
│   │      Dowdall.m
│   │      EnsembleRanking.m
│   │      ER.m
│   │      HPA.m
│   │      hpa_func.m
│   │      ice-cream.mat
│   │      iRank.m
│   │      Matrix-ice-cream.mat
│   │      Mean.m
│   │      Median.m
│   │      PostNDCG.m
│   │      RRF.m
│   │      unsupervised RA methods.ipynb
│   │      
│   └─python
│           BordaCount.py
│           CG.py
│           CombANZ.py
│           CombMAX.py
│           CombMED.py
│           CombMIN.py
│           CombMNZ.py
│           CombSUM.py
│           Comb_Family.py
│           Dowdall.py
│           evaluate.py
│           MarkovChain.py
│           Mean.py
│           Medium.py
│           preprocess.py
│           RRF.py
│           run_algorithm.py
│           scorefunc.py
│           unsupervised RA methods.ipynb
│
├─supervised
│  ├─matlab
│  │  │  compute_AP.m
│  │  │  evaluation.m
│  │  │  m_QT_IRA.m
│  │  │  m_Rank_based_IRA.m
│  │  │  m_Score_based_IRA.m
│  │  │  QT_IRA.m
│  │  │  Rank_based_IRA.m
│  │  │  Score_based_IRA.m
│  │  │  
│  │  └─CSRA
│  │          
│  └─python
│          AggRankDE.py
│          CRF.py
│          Evaluation.py
│          WeightedBorda.py
│          
└─semi-supervised
       SSRA.py

Get Started

The input to each method is a CSV file whose columns are organized as follows:

Query, Voter, Item Code, Item Rank

where

  • Query is the topic for which the preference list is submitted.
  • Voter is the name of the ranker who submitted a preference list for a particular Query.
  • Item Code is a unique identifier that identifies each element of the preference lists.
  • Item Rank is the preference rank assigned to an item by a Voter.

If you want to test our supervised or semi-supervised methods, relevance judgments are also required for the elements of the preference lists in the primary input file. For each query, these judgments are provided in a second CSV file organized as follows:

Query, 0, Item Code, Relevance

where

  • Query is the topic for which the preference list is submitted.
  • 0 is an unused placeholder; this value must always be 0.
  • Item Code is a unique identifier that identifies each element of the preference lists.
  • Relevance is an integer value representing the relevance of the item with respect to the mentioned Query. Typically, zero values represent irrelevant or incorrect elements, and positive values represent relevant, correct and informative elements.

Similarly, we define the final output of the methods as a CSV file organized as follows: Query, Item Code, Item Rank.
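
As an illustration, here is a minimal Python sketch (not the library's own code; the file names, the use of pandas, and the plain Borda count are our assumptions) that reads an input file in the format above, aggregates each query's preference lists, and writes the output format:

```python
import pandas as pd

cols = ["Query", "Voter", "Item Code", "Item Rank"]
df = pd.read_csv("input.csv", header=None, names=cols)

rows = []
for query, group in df.groupby("Query"):
    n = group["Item Code"].nunique()
    # Borda: an item ranked r by a voter earns n - r + 1 points.
    points = n - group["Item Rank"] + 1
    scores = points.groupby(group["Item Code"]).sum()
    # Rank items by total points, best (highest score) first.
    ranking = scores.rank(ascending=False, method="first").astype(int)
    rows += [(query, item, rank) for item, rank in ranking.items()]

out = pd.DataFrame(rows, columns=["Query", "Item Code", "Item Rank"])
out.to_csv("output.csv", index=False, header=False)
```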

We also provide a partially processed dataset in our dataset.zip file. You are welcome to use our code and test it here!

Follow-up Plan

We will be updating and adding more RA methods for shared use.

Experiments

Re-identification

On the re-identification (re-ID) datasets, we choose six feature extraction methods (BDB [20], BOT [21], Top-DB-Net-RK [22], LightMBN [23], FPB [24], LUPerson [25]) to extract features from both query and gallery images. We then use the Euclidean distance between query and gallery features to score every gallery image under each query, and finally, for each query, we obtain the six basic rankings by sorting these scores in descending order. We evaluate the RA methods on four image re-ID datasets: Market1501 [26], DukeMTMC-reID [27], and the detected and labeled versions of CUHK03 [28].
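
For illustration, a minimal NumPy sketch (our own, not the evaluated methods' code) of how one basic ranking could be formed for a single query from extracted features:

```python
import numpy as np

def basic_ranking(query_feat, gallery_feats):
    """Return gallery indices sorted from most to least similar."""
    # Euclidean distance of every gallery feature to the query feature.
    dists = np.linalg.norm(gallery_feats - query_feat, axis=1)
    # Negate distances so that sorting scores in descending order
    # places the closest gallery images first.
    scores = -dists
    return np.argsort(-scores)
```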

All experiments are conducted on an Intel Xeon Silver 4215 (2.50 GHz) CPU and four Nvidia RTX A6000 GPUs. Note that the MC1-4 methods are prohibitively expensive to test on the full Market1501 and DukeMTMC-reID datasets, requiring more than 40,000 hours in our experimental environment. We therefore apply a cut-off operation to the basic rankings of these two datasets: we take the top-K items of all basic rankings to form a new itemset; for each basic ranking, the items of this itemset that are missing from its top-K are appended after its $k$-th item, yielding a new basic ranking. We then use the MC1-4 methods to aggregate the new basic rankings into $R_{\tau}$. After aggregation, the items outside the itemset are appended to the end of $R_{\tau}$ in random order; the result is denoted MC1-4 (top-K).
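
The cut-off operation can be sketched as follows (a non-authoritative illustration: `mc_aggregate` is a placeholder for any MC1-4 implementation, and each basic ranking is assumed to be a full list of item identifiers, best first):

```python
import random

def mc_topk(rankings, k, mc_aggregate):
    # Pool the top-K items of every basic ranking into one itemset.
    itemset = set()
    for r in rankings:
        itemset.update(r[:k])
    # Rebuild each ranking: its own top-K, then the pooled items it is
    # missing there, appended after the k-th position in original order.
    shortened = []
    for r in rankings:
        head = list(r[:k])
        tail = [i for i in r[k:] if i in itemset]
        shortened.append(head + tail)
    # Aggregate the shortened rankings into R_tau.
    r_tau = mc_aggregate(shortened)
    # Items outside the itemset are appended in random order (each full
    # basic ranking is assumed to contain every item).
    leftovers = [i for i in rankings[0] if i not in itemset]
    random.shuffle(leftovers)
    return r_tau + leftovers
```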

The results of the basic rankings (BDB, BOT, Top-DB-Net-RK, LightMBN, FPB, LUPerson) are shown in Table 1.

image

Table 1: Rank@1 (%) and mAP (%) [29] results for selected feature extraction methods on re-ID datasets.

We use the official training sets to train the basic re-ID methods and the fully-supervised RA methods, and the official test sets to evaluate all RA methods. Table 2 presents the parameters of the semi-supervised and supervised methods, along with their types and the values set during the re-ID experiments. Note that a parameter setting of default means that, for each query in the training set, the value is equal to the total number of relevant labels.

SupPara

Table 2: The parameters of the supervised rank aggregation methods.

Table 3 shows the results of the experiment conducted on the four re-ID datasets, representing the quality of all rank aggregation methods.

image

Table 3: Rank@1 (%) and mAP (%) [29] results for rank aggregation methods on re-ID datasets.

Recommendation System

On the recommendation-system dataset (MovieLens 1M [31]), we perform a two-step pre-processing before giving the ratings as input to the recommendation algorithms: (i) binarization of the ratings, since items are treated only as relevant or irrelevant in the top-N recommendation task; an item is relevant to a user if its rating is greater than the median of the ratings given by that user. (ii) Removal of users and items that do not reach a predefined rating-frequency threshold, i.e., we remove infrequent items and users that rated very few items: items rated by fewer than 10 users are removed, together with users that rated fewer than 10 items [30].
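
A minimal pandas sketch of these two steps (file and column names are placeholders, not the actual MovieLens schema):

```python
import pandas as pd

ratings = pd.read_csv("ratings.csv")  # columns: user, item, rating

# (i) Binarize: an item is relevant to a user if its rating exceeds the
#     median of that user's ratings.
user_median = ratings.groupby("user")["rating"].transform("median")
ratings["relevant"] = (ratings["rating"] > user_median).astype(int)

# (ii) Filter: drop items rated by fewer than 10 users, then users that
#      rated fewer than 10 items.
ratings = ratings[ratings.groupby("item")["user"].transform("nunique") >= 10]
ratings = ratings[ratings.groupby("user")["item"].transform("nunique") >= 10]
```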

On the MovieLens 1M dataset, we divide the movies rated by each user into matrices $M_{train}$ and $M_{test}$. Specifically, we use 60% of each user's ratings for $M_{train}$ and the remaining 40% for $M_{test}$; the two matrices therefore share the same set of users but contain different movies recommended to these users. We use two criteria, Rank@1 and mAP@10, to evaluate performance.
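
Reusing the `ratings` frame from the preprocessing sketch above, the per-user split could look like this (a sketch; the fixed seed is only for reproducibility):

```python
# Sample 60% of each user's ratings for M_train; the rest form M_test.
train = ratings.groupby("user", group_keys=False).apply(
    lambda g: g.sample(frac=0.6, random_state=0))
test = ratings.drop(train.index)
```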

We select six recommendation algorithms [32] (UserUser, BiasedMF, FunkSVD, ImplicitMF, ItemItem, MostPopular) in the experimental phase. All of them are available in the LensKit library. Table 4 presents the parameters of the recommendation algorithms, along with their type and the value that was set during our experiment. The names of the parameters used are consistent with the parameter names available in the LensKit library. We represent the quality of recommendations generated by six recommendation algorithms in Table 5.

RecPara

Table 4: The parameters of the recommendation algorithms.

Rec-ini

Table 5: Rank@1 (%) and mAP@10 (%) [30] results for selected recommendation algorithms on MovieLens 1M dataset.

Different rank aggregation methods combine the six recommendation lists into a consensus one. We show their performance in Table 6.

image

Table 6: Rank@1 (%) and mAP@10 (%) [30] results for rank aggregation methods on the MovieLens 1M dataset.

Bioinformatics

In bioinformatics, we select a real cancer dataset (NSCLC [33]) to conduct our experiment. Because the NSCLC dataset contains no labeled data, we do not measure supervised or semi-supervised RA methods on it. The dataset consists of four basic rankings of lengths 2270, 275, 543 and 3501, taken from [34], [35], [36] and [37], respectively. We measure recall in the aggregated list based on the top 400 and top 800 genes. The results of all unsupervised RA methods are shown in Table 7.
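
The recall criterion can be stated compactly (a sketch; the function and argument names are ours):

```python
def recall_at_k(aggregated, relevant, k):
    # aggregated: list of gene ids, best first; relevant: set of gene ids.
    # Fraction of relevant genes recovered in the top k of the consensus.
    return len(set(aggregated[:k]) & set(relevant)) / len(relevant)

# e.g. recall_at_k(consensus_ranking, reference_genes, 400)
```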

NSCLC

Table 7: Recall@400 (%) and Recall@800 (%) [38] results for unsupervised RA methods on the NSCLC dataset.

Social Choices

We select five popular world university rankings, [ARWU], [QS], [THE], [US-NEW] and [URAP], whose top-200 lists only partially overlap. Because some universities appear in one ranking but not in another, we process the five rankings as follows. We first collect the set of all universities appearing in the five top-200 lists. Universities that appear in a basic ranking keep their original rank there; any university from the combined set that is missing from a basic ranking is assigned rank 201 in that ranking. This is repeated until all five rankings are processed, so that we eventually obtain, and aggregate, five rankings over the same set of universities. We measure normality and overall impartiality [39] to represent the quality of a consensus ranking. The results of all unsupervised RA methods are shown in Table 8.
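
The padding step can be sketched as follows (the dict-of-ranks representation and all names are our assumptions):

```python
def pad_rankings(rankings, pad_rank=201):
    # rankings: list of dicts mapping university -> rank (1..200).
    universe = set().union(*(r.keys() for r in rankings))
    padded = []
    for r in rankings:
        full = dict(r)  # universities present keep their original rank
        for u in universe - set(r):
            full[u] = pad_rank  # absent universities get rank 201
        padded.append(full)
    return padded
```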

SC

Table 8: Normality and overall impartiality results for unsupervised RA methods on the World University Rankings.

References

[1] Fox, E., & Shaw, J. (1994). Combination of multiple searches. NIST special publication SP, 243-243.

[2] Dwork, C., Kumar, R., Naor, M., & Sivakumar, D. (2001, April). Rank aggregation methods for the web. In Proceedings of the 10th international conference on World Wide Web (pp. 613-622).

[3] Aslam, J. A., & Montague, M. (2001, September). Models for metasearch. In Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval (pp. 276-284).

[4] Reilly, B. (2002). Social choice in the south seas: Electoral innovation and the borda count in the pacific island countries. International Political Science Review, 23(4), 355-372.

[5] Fagin, R., Kumar, R., & Sivakumar, D. (2003, June). Efficient similarity search and classification via rank aggregation. In Proceedings of the 2003 ACM SIGMOD international conference on Management of data (pp. 301-312).

[6] Cormack, G. V., Clarke, C. L., & Buettcher, S. (2009, July). Reciprocal rank fusion outperforms condorcet and individual rank learning methods. In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval (pp. 758-759).

[7] Wei, F., Li, W., & Liu, S. (2010). iRANK: A rank‐learn‐combine framework for unsupervised ensemble ranking. Journal of the American Society for Information Science and Technology, 61(6), 1232-1243.

[8] Burges, C., Svore, K., Bennett, P., Pastusiak, A., & Wu, Q. (2011, January). Learning to rank using an ensemble of lambda-gradient models. In Proceedings of the learning to rank Challenge (pp. 25-35). PMLR.

[9] Fujita, S., Kobayashi, H., & Okumura, M. (2020). Unsupervised Ensemble of Ranking Models for News Comments Using Pseudo Answers. In Advances in Information Retrieval: 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14–17, 2020, Proceedings, Part II 42 (pp. 133-140). Springer International Publishing.

[10] Mohammadi, M., & Rezaei, J. (2020). Ensemble ranking: Aggregation of rankings produced by different multi-criteria decision-making methods. Omega, 96, 102254.

[11] Xiao, Y., Deng, H. Z., Lu, X., & Wu, J. (2021). Graph-based rank aggregation method for high-dimensional and partial rankings. Journal of the Operational Research Society, 72(1), 227-236.

[12] Akritidis, L., Fevgas, A., Bozanis, P., & Manolopoulos, Y. (2022). An unsupervised distance-based model for weighted rank aggregation with list pruning. Expert Systems with Applications, 202, 117435.

[13] Pujari, M., & Kanawati, R. (2012, November). Link prediction in complex networks by supervised rank aggregation. In 2012 IEEE 24th International Conference on Tools with Artificial Intelligence (Vol. 1, pp. 782-789). IEEE.

[14] Volkovs, M. N., & Zemel, R. S. (2014). New learning methods for supervised and unsupervised preference aggregation. The Journal of Machine Learning Research, 15(1), 1135-1176.

[15] Yu, Y., Liang, C., Ruan, W., & Jiang, L. (2020, May). Crowdsourcing-Based Ranking Aggregation for Person Re-Identification. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1933-1937). IEEE.

[16] Bałchanowski, M., & Boryczka, U. (2022). Aggregation of Rankings Using Metaheuristics in Recommendation Systems. Electronics, 11(3), 369.

[17] Huang, J., Liang, C., Zhang, Y., Wang, Z., & Zhang, C. (2022). Ranking Aggregation with Interactive Feedback for Collaborative Person Re-identification.

[18] Hu, C., Zhang, H., Liang, C., & Huang, H. (2024). QI-IRA: Quantum-inspired interactive ranking aggregation for person re-identification. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 38, pp. 1-9).

[19] Chen, S., Wang, F., Song, Y., & Zhang, C. (2008, October). Semi-supervised ranking aggregation. In Proceedings of the 17th ACM conference on Information and knowledge management (pp. 1427-1428).

[20] Dai, Z., Chen, M., Gu, X., Zhu, S., & Tan, P. (2019). Batch dropblock network for person re-identification and beyond. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 3691-3701).

[21] Luo, H., Gu, Y., Liao, X., Lai, S., & Jiang, W. (2019). Bag of tricks and a strong baseline for deep person re-identification. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops (pp. 0-0).

[22] Quispe, R., & Pedrini, H. (2021, January). Top-db-net: Top dropblock for activation enhancement in person re-identification. In 2020 25th International conference on pattern recognition (ICPR) (pp. 2980-2987). IEEE.

[23] Herzog, F., Ji, X., Teepe, T., Hörmann, S., Gilg, J., & Rigoll, G. (2021, September). Lightweight multi-branch network for person re-identification. In 2021 IEEE International Conference on Image Processing (ICIP) (pp. 1129-1133). IEEE.

[24] Zhang, S., Yin, Z., Wu, X., Wang, K., Zhou, Q., & Kang, B. (2021). FPB: feature pyramid branch for person re-identification. arXiv preprint arXiv:2108.01901.

[25] Fu, D., Chen, D., Bao, J., Yang, H., Yuan, L., Zhang, L., ... & Chen, D. (2021). Unsupervised pre-training for person re-identification. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 14750-14759).

[26] Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., & Tian, Q. (2015). Scalable person re-identification: A benchmark. In Proceedings of the IEEE international conference on computer vision (pp. 1116-1124).

[27] Ristani, E., Solera, F., Zou, R., Cucchiara, R., & Tomasi, C. (2016, October). Performance measures and a data set for multi-target, multi-camera tracking. In European conference on computer vision (pp. 17-35). Cham: Springer International Publishing.

[28] Li, W., Zhao, R., Xiao, T., & Wang, X. (2014). Deepreid: Deep filter pairing neural network for person re-identification. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 152-159).

[29] Ye, M., Shen, J., Lin, G., Xiang, T., Shao, L., & Hoi, S. C. (2021). Deep learning for person re-identification: A survey and outlook. IEEE transactions on pattern analysis and machine intelligence, 44(6), 2872-2893.

[30] Oliveira, S. E., Diniz, V., Lacerda, A., Merschmanm, L., & Pappa, G. L. (2020). Is rank aggregation effective in recommender systems? an experimental analysis. ACM Transactions on Intelligent Systems and Technology (TIST), 11(2), 1-26.

[31] Harper, F. M., & Konstan, J. A. (2015). The movielens datasets: History and context. Acm transactions on interactive intelligent systems (tiis), 5(4), 1-19.

[32] Ekstrand, M. D. (2020, October). Lenskit for python: Next-generation software for recommender systems experiments. In Proceedings of the 29th ACM international conference on information & knowledge management (pp. 2999-3006).

[33] Wang, B., Law, A., Regan, T., Parkinson, N., Cole, J., Russell, C. D., ... & Baillie, J. K. (2022). Systematic comparison of ranking aggregation methods for gene lists in experimental results. Bioinformatics, 38(21), 4927-4933.

[34] Kerkentzes, K., Lagani, V., Tsamardinos, I., Vyberg, M., & Røe, O. D. (2014). Hidden treasures in “ancient” microarrays: gene-expression portrays biology and potential resistance pathways of major lung cancer subtypes and normal tissue. Frontiers in oncology, 4, 251.

[35] Li, Y., Xiao, X., Ji, X., Liu, B., & Amos, C. I. (2015). RNA-seq analysis of lung adenocarcinomas reveals different gene expression profiles between smoking and nonsmoking patients. Tumor Biology, 36, 8993-9003.

[36] Zhou, Y., Frings, O., Branca, R. M., Boekel, J., le Sage, C., Fredlund, E., ... & Orre, L. M. (2017). microRNAs with AAGUGC seed motif constitute an integral part of an oncogenic signaling network. Oncogene, 36(6), 731-745.

[37] Kim, S. C., Jung, Y., Park, J., Cho, S., Seo, C., Kim, J., ... & Lee, S. (2013). A high-dimensional, deep-sequencing study of lung adenocarcinoma in female never-smokers. PloS one, 8(2), e55596.

[38] Wang, B., Law, A., Regan, T., Parkinson, N., Cole, J., Russell, C. D., ... & Baillie, J. K. (2022). Systematic comparison of ranking aggregation methods for gene lists in experimental results. Bioinformatics, 38(21), 4927-4933.

[39] Feng, S., Deng, Q., Wang, S., Song, L., & Liang, C. (2023, November). An Experimental Study of Unsupervised Rank Aggregation Methods in World University Rankings. In 2023 International Conference on Intelligent Education and Intelligent Research (IEIR) (pp. 1-8). IEEE.

[40] Azzini, I., & Munda, G. (2020). A new approach for identifying the Kemeny median ranking. European Journal of Operational Research, 281(2), 388-401.

[41] Boehmer, N., Bredereck, R., & Peters, D. (2023). Rank aggregation using scoring rules. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 37, No. 5).

Contacts

If you encounter any problems, you can contact us by email at [email protected].
