
iESBM Benchmark

https://github.com/nju-websoft/iESBM

Last update: 2020-05-20

License: ODC Attribution License (ODC-By)

iESBM: an interpretative Entity Summarization Benchmark on Multiple Datasets.

Download

  • iESBM benchmark: the data of iESBM, including datasets, features, and FER results; see iESBM.zip;
  • Evaluator (Python): code for generating evaluation results (FSR results); see code/;
  • Runs: output files generated by selected entity summarizers and their FSR results; see runs.zip.

Datasets

The benchmark includes three datasets (ESBM-D, ESBM-L, and FED); see in_ds_raw/.

Guidelines

Quick Start

Suppose you want to evaluate your algorithm, named 'youralgo', and its summaries generated for entities from the three datasets are in the directory 'data/algosumm/youralgo/'. Run the following command:

python code/iesbm_eval.py -mode FSR -algo_name youralgo

Evaluation results will be written to the directory 'data/out/out_youralgo/'. See the next section for details.

Evaluate Your Results

The evaluator can be used to evaluate any general-purpose entity summarizer with the following steps:

Installation

Environment Requirements

  • Python 3.x (tested on Python 3.6)
  • NumPy
  • SciPy

Installation

Install the evaluator by first downloading the project and then installing the required packages with the following commands:

git clone git@github.com:nju-websoft/iESBM.git iESBM
cd iESBM
pip install -e .

Required Input Format

To evaluate your algorithm, please generate summaries for entities from the three datasets and organize the summary directory as follows (see youralgo as an example; a minimal sketch that produces this layout is shown after the list below):

├── ${algo_name}
│   ├── ${ds_name}
│   │   ├── ${eid}
│   │   │   ├── ${eid}_top5.nt
│   │   │   ├── ${eid}_top10.nt

where

  • ${algo_name} is the name of your entity summarization algorithm, e.g. 'relin', 'diversum', 'youralgo';
  • ${ds_name} is the alias of the dataset: 'dbpedia' for ESBM-D, 'lmdb' for ESBM-L, 'dsfaces' for FED;
  • ${eid} is an integer that uniquely identifies each entity; see the elist.txt file in each dataset.
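As a convenience, here is a minimal Python sketch that writes ranked triples into this layout. The function name write_summaries and the ranked_triples argument (a list of N-Triples lines, best triple first) are hypothetical and not part of iESBM; only the paths and file names come from this section:

import os

def write_summaries(algo_name, ds_name, eid, ranked_triples, root='data/algosumm'):
    # ranked_triples: list of N-Triples lines for one entity, best triple first (assumption)
    out_dir = os.path.join(root, algo_name, ds_name, str(eid))
    os.makedirs(out_dir, exist_ok=True)
    for k in (5, 10):
        # file names follow the required layout: ${eid}_top5.nt and ${eid}_top10.nt
        with open(os.path.join(out_dir, '%d_top%d.nt' % (eid, k)), 'w', encoding='utf-8') as f:
            f.write('\n'.join(ranked_triples[:k]) + '\n')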

Run the Evaluator

Please put the folder ${algo_name}/ under the directory 'data/algosumm/', and run iesbm_eval.py with the following command:

python code/iesbm_eval.py -algo_name ${algo_name} [-feature_name ${feature_name} -ds_name ${ds_name} -topk ${topk} -mode ${mode}]

where -algo_name is required whenever you want FSR results for an algorithm. The optional parameters are:

  • -feature_name accepts the values 'LFoP', 'GFoP', 'GFoV', 'IoPV', 'DoP' and 'DoV';
  • -ds_name accepts the values 'dbpedia', 'lmdb' and 'dsfaces' (the dataset aliases listed above);
  • -topk accepts two values: 'top5' for k=5 summaries and 'top10' for k=10 summaries;
  • -mode accepts three values: 'FER' to output only FER results, 'FSR' to output only FSR results, and 'all' to output both.

An example invocation is shown after this list.
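For example, assuming your summaries are under data/algosumm/youralgo/, the following call evaluates only the GFoP feature on ESBM-D for k=5 and outputs FSR results (all parameter values are taken from the lists above):

python code/iesbm_eval.py -algo_name youralgo -feature_name GFoP -ds_name dbpedia -topk top5 -mode FSR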

Output

For each setting (dataset, topk, feature), the evaluator will:

(0) Generate parsed files:

During preprocessing of the summary files, triples in summaries are converted to triple ids, and these ids are written to files in the directory out_${algo_name}/algo_parsed/.

(1) Generate an output file:

The evaluator will write the evaluation results for summaries to the file out_${algo_name}/algo_metrics/FSR_${feature_name}_${ds_name}_${topk}.txt. Each line in the file includes the following items (items are separated by a tab; see FSR_GFoP_dbpedia_top5.txt as an example):

${eid}, ${FSR_of_e}
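As an illustration, a minimal Python sketch for loading such a file and summarizing it. It assumes the tab-separated format above; the path below is assembled from the Quick Start output directory and the example file name, and may differ in your setup:

import numpy as np

def load_fsr(path):
    # read '<eid>\t<FSR_of_e>' pairs into a dict: eid -> FSR
    scores = {}
    with open(path, encoding='utf-8') as f:
        for line in f:
            eid, fsr = line.strip().split('\t')
            scores[int(eid)] = float(fsr)
    return scores

fsr = load_fsr('data/out/out_youralgo/algo_metrics/FSR_GFoP_dbpedia_top5.txt')
print(np.mean(list(fsr.values())), np.std(list(fsr.values())))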

(2) Print statistical results:

Statistical information of the evaluation results will be printed to the console, including the following items:

${feature_name}, ${ds_name}, ${topk}, ${mean_FSR}, ${std_FSR}, ${significance_with_FER}

where ${significance_with_FER} consists of two values: the t-statistic and the p-value of the t-test. These results will also be written to the file out_${algo_name}/result_statics_FSR.txt; see out_youralgo/result_statics_FSR.txt as an example.
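As an illustration only (the evaluator's own test may differ in detail), a significance check between per-entity FSR and FER values for one setting could be computed with SciPy roughly as follows, assuming the two score lists are aligned by entity:

from scipy import stats

fsr_scores = [0.91, 0.88, 1.02]   # placeholder per-entity FSR values
fer_scores = [0.95, 0.90, 1.01]   # placeholder per-entity FER values, same entity order (assumption)
t_stat, p_value = stats.ttest_rel(fsr_scores, fer_scores)
print(t_stat, p_value)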

Add New Feature

You can add customized features to the evaluator by following the steps below.

Add Triple-level Feature

First, compute a feature score for each triple in dataset ${ds_name} and write this information to a file named '${feature_name}_${ds_name}.txt' (where ${feature_name} is the name of your new feature, e.g. 'GFoP'). Each line of this file contains the following items (separated by a tab; see GFoP_dbpedia.txt as an example):

${tid}, ${tscore}

Put this file into the directory in/in_ds_feature/ (a minimal writer sketch follows).
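The sketch below writes such a feature file; the function name and the scores dict are hypothetical, and only the output path and line format are taken from this section:

def write_feature_file(feature_name, ds_name, scores, out_dir='in/in_ds_feature'):
    # scores: dict mapping triple id (${tid}) -> feature score (${tscore})
    with open('%s/%s_%s.txt' % (out_dir, feature_name, ds_name), 'w', encoding='utf-8') as f:
        for tid, tscore in sorted(scores.items()):
            f.write('%s\t%s\n' % (tid, tscore))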

Open f_imp.py and add a new elif branch to the function get_feature_by_name():

elif fname == '${feature_name}':
    return Feature(ds_name, fname, FType.F_Triple, fpath)

Run iesbm_gen.py to generate FER files for this new feature:

python code/iesbm_gen.py ${feature_name}

Each line of the FER file contains the following items (separated by a tab; see FER_GFoP_dbpedia_top5.txt as an example):

${eid}, ${FER_of_e}, ${average_score_of_golds}, ${score_of_desc}

Finally, this new feature can be used by passing '-feature_name ${feature_name}' when running iesbm_eval.py.

Add Summary-level Feature

First, implement a new subclass of f_base.Feature and name it 'F_${feature_name}' (see the classes F_DoP and F_DoV in f_imp.py as examples). In this class, implement the method self._get_score_by_sscore(), which computes the feature score for an entity; a heavily hedged sketch follows.
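The sketch below is illustrative only: the class name F_MyFeat, the method argument, and the scoring logic are assumptions, and the actual interface of f_base.Feature should be taken from F_DoP and F_DoV in f_imp.py:

from f_base import Feature  # base class provided by the evaluator

class F_MyFeat(Feature):
    # constructor inherited from f_base.Feature; see F_DoP / F_DoV for the real pattern

    def _get_score_by_sscore(self, summary):
        # placeholder logic: score a summary by its number of triples
        # (the argument type is an assumption for this sketch)
        return float(len(summary))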

Then, open f_imp.py and add a new elif branch to get_feature_by_name() that returns an object of the newly defined class:

elif fname == '${feature_name}':
    return F_${feature_name}(ds_name, fpath=fpath)

Run iesbm_gen.py to generate FER files for this new feature:

python code/iesbm_gen.py ${feature_name}

Each line of the FER file contains the following items (separated by a tab; see FER_DoP_dbpedia_top5.txt as an example):

${eid}, ${FER_of_e}, ${average_score_of_golds}, ${score_of_desc}

Finally, this new feature can be used by passing '-feature_name ${feature_name}' when running iesbm_eval.py.

Evaluation Results

The effectiveness of existing features (FER) and the evaluation results of several selected entity summarizers (FSR) are presented in the following tables.

You are encouraged to submit the results of your entity summarizer by contacting us. We will add your results to the following tables. Your submission should contain:

  • Summary files: summaries generated by your entity summarizer;
  • Evaluation results: the evaluation results output by our evaluator;
  • Notes: brief description of your entity summarizer (e.g., name of the summarizer, citation information, parameter settings).

FER Results

FER results from the ground-truth summaries are presented in Table 1 and Table 2. Detailed FER results are available in in_ds_fer/.

Table 1. FER on each dataset for k=5.

| Dataset | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
| --- | --- | --- | --- | --- | --- | --- |
| ESBM-D | 0.561(±0.165)↓ | 0.913(±0.052)↓ | 0.759(±0.125)↓ | 1.275(±0.175)↑ | 2.478(±0.747)↑ | 1.016(±0.054)↑ |
| ESBM-L | 0.581(±0.125)↓ | 0.998(±0.029) | 1.349(±0.188)↑ | 0.864(±0.057)↓ | 3.093(±2.394)↑ | 1.061(±0.068)↑ |
| FED | 0.821(±0.205)↓ | 1.012(±0.066) | 1.148(±0.153)↑ | 0.958(±0.044)↓ | 1.699(±0.480)↑ | 1.016(±0.046) |

Table 2. FER on each dataset for k=10.

| Dataset | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
| --- | --- | --- | --- | --- | --- | --- |
| ESBM-D | 0.569(±0.170)↓ | 0.902(±0.048)↓ | 0.753(±0.113)↓ | 1.267(±0.158)↑ | 2.080(±0.555)↑ | 1.002(±0.038) |
| ESBM-L | 0.757(±0.131)↓ | 0.983(±0.025)↓ | 1.203(±0.152)↑ | 0.917(±0.054)↓ | 2.092(±1.298)↑ | 1.048(±0.068)↑ |
| FED | 0.862(±0.154)↓ | 0.993(±0.041) | 1.065(±0.098)↑ | 0.981(±0.028)↓ | 1.601(±0.423)↑ | 1.018(±0.029)↑ |

FSR Results

FSR results for several selected entity summarizers are presented in the following tables. Their output files are also available (runs.zip).

Table 3. FSR of selected entity summarizers on ESBM-D for k=5.

| Summarizer | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
| --- | --- | --- | --- | --- | --- | --- |
| RELIN | 0.280(±0.228) | 0.869(±0.075) | 0.277(±0.098) | 1.801(±0.297) | 2.351(±0.791)• | 0.749(±0.253) |
| DIVERSUM | 0.649(±0.192) | 0.910(±0.048)• | 0.854(±0.167) | 1.175(±0.198) | 2.753(±0.925) | 1.037(±0.086) |
| FACES-E | 0.623(±0.281)• | 0.914(±0.079)• | 0.911(±0.208) | 1.142(±0.241) | 2.494(±0.881)• | 0.972(±0.118) |
| CD | 0.334(±0.181) | 0.863(±0.062) | 0.414(±0.136) | 1.620(±0.250) | 2.742(±0.886) | 1.061(±0.051) |
| BAFREC | 0.585(±0.169)• | 0.954(±0.056) | 0.908(±0.177) | 1.117(±0.198) | 2.586(±0.755) | 0.980(±0.107) |
| KAFCA | 0.361(±0.248) | 0.850(±0.083) | 0.646(±0.244) | 1.377(±0.294) | 2.505(±0.829)• | 0.993(±0.116)• |
| MPSUM | 0.434(±0.201) | 0.876(±0.072) | 0.730(±0.250)• | 1.304(±0.291)• | 2.742(±0.886) | 0.891(±0.187) |
| ESA | 0.266(±0.212) | 0.848(±0.084) | 0.529(±0.179) | 1.535(±0.282) | 2.303(±0.827) | 0.930(±0.160) |
| DeepLENS | 0.302(±0.219) | 0.854(±0.076) | 0.656(±0.190) | 1.407(±0.267) | 2.415(±0.801)• | 0.957(±0.115) |

Table 4. FSR of selected entity summarizers on ESBM-L for k=5.

| Summarizer | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
| --- | --- | --- | --- | --- | --- | --- |
| RELIN | 0.688(±0.432)• | 0.991(±0.047)• | 0.600(±0.137) | 1.154(±0.050) | 2.775(±2.012)• | 0.967(±0.189) |
| DIVERSUM | 0.870(±0.381) | 0.993(±0.038)• | 1.006(±0.247) | 0.991(±0.077) | 3.869(±3.357) | 1.091(±0.075) |
| FACES-E | 0.536(±0.163)• | 0.962(±0.079) | 1.341(±0.296)• | 0.872(±0.092)• | 3.848(±3.352) | 1.059(±0.103)• |
| CD | 0.470(±0.199) | 0.996(±0.073)• | 1.009(±0.212) | 0.959(±0.079) | 3.869(±3.357) | 1.102(±0.071) |
| BAFREC | 0.562(±0.201)• | 1.020(±0.053) | 1.598(±0.491) | 0.781(±0.144) | 3.485(±3.228) | 1.007(±0.088) |
| KAFCA | 0.234(±0.200) | 0.954(±0.056) | 1.309(±0.395)• | 0.884(±0.108)• | 3.869(±3.357) | 1.104(±0.102) |
| MPSUM | 0.568(±0.201)• | 0.979(±0.046) | 1.249(±0.428)• | 0.908(±0.131)• | 3.869(±3.357) | 1.083(±0.104)• |
| ESA | 0.514(±0.235)• | 1.029(±0.037) | 1.241(±0.352)• | 0.892(±0.116)• | 3.125(±2.613)• | 1.013(±0.154)• |
| DeepLENS | 0.361(±0.163) | 1.004(±0.037)• | 1.412(±0.409)• | 0.840(±0.129)• | 3.496(±2.343) | 1.000(±0.081) |

Table 5. FSR of selected entity summarizers on FED for k=5.

| Summarizer | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
| --- | --- | --- | --- | --- | --- | --- |
| RELIN | 0.911(±0.481)• | 1.028(±0.157)• | 0.652(±0.329) | 1.123(±0.097) | 1.473(±0.579) | 0.761(±0.209) |
| DIVERSUM | 1.339(±0.220) | 0.962(±0.056) | 1.043(±0.191) | 0.989(±0.069)• | 1.783(±0.517) | 0.981(±0.097)• |
| FACES | 0.860(±0.314)• | 0.936(±0.081) | 1.489(±0.245) | 0.886(±0.084) | 1.714(±0.514)• | 1.019(±0.124)• |
| FACES-E | 0.860(±0.314)• | 0.936(±0.081) | 1.489(±0.245) | 0.886(±0.084) | 1.714(±0.514)• | 1.019(±0.124)• |
| CD | 0.799(±0.206)• | 1.042(±0.075) | 0.699(±0.226) | 1.118(±0.076) | 1.783(±0.517) | 1.060(±0.066) |
| LinkSUM | 0.976(±0.353) | 0.987(±0.080)• | 1.656(±0.250) | 0.797(±0.089) | 1.460(±0.474) | 1.062(±0.074) |
| BAFREC | 0.928(±0.273) | 0.949(±0.078) | 1.658(±0.304) | 0.811(±0.089) | 1.768(±0.516) | 0.975(±0.119)• |
| KAFCA | 0.636(±0.248) | 0.999(±0.116)• | 0.864(±0.363) | 1.024(±0.092) | 1.699(±0.518)• | 0.909(±0.134) |
| MPSUM | 0.878(±0.245)• | 0.918(±0.067) | 1.344(±0.289) | 0.949(±0.095)• | 1.783(±0.517) | 0.821(±0.225) |
| ESA | 0.842(±0.323)• | 1.090(±0.113) | 0.813(±0.232) | 1.039(±0.075) | 1.378(±0.408) | 0.875(±0.136) |
| DeepLENS | 0.823(±0.476)• | 1.056(±0.095) | 1.166(±0.375)• | 0.926(±0.124)• | 1.481(±0.486) | 0.863(±0.131) |

Table 6. FSR of selected entity summarizers on ESBM-D for k=10.

| Summarizer | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
| --- | --- | --- | --- | --- | --- | --- |
| RELIN | 0.392(±0.217) | 0.879(±0.063) | 0.374(±0.112) | 1.655(±0.228) | 2.015(±0.627)• | 0.872(±0.155) |
| DIVERSUM | 0.413(±0.135) | 0.861(±0.048) | 0.745(±0.164)• | 1.299(±0.230)• | 2.753(±0.925) | 1.013(±0.056)• |
| FACES-E | 0.516(±0.182) | 0.897(±0.053)• | 0.770(±0.146)• | 1.270(±0.210)• | 2.453(±0.842) | 0.985(±0.062) |
| CD | 0.393(±0.155) | 0.871(±0.045) | 0.555(±0.145) | 1.467(±0.219) | 2.538(±0.741) | 1.026(±0.045) |
| BAFREC | 0.629(±0.191) | 0.945(±0.049) | 0.850(±0.148) | 1.171(±0.181) | 1.926(±0.543) | 0.968(±0.071) |
| KAFCA | 0.443(±0.223) | 0.883(±0.069) | 0.661(±0.195) | 1.359(±0.257) | 2.199(±0.721) | 0.972(±0.065) |
| MPSUM | 0.405(±0.162) | 0.880(±0.060) | 0.686(±0.158) | 1.349(±0.210) | 2.612(±0.765) | 0.971(±0.066) |
| ESA | 0.309(±0.222) | 0.839(±0.061) | 0.606(±0.149) | 1.442(±0.222) | 2.088(±0.610)• | 0.965(±0.082) |
| DeepLENS | 0.334(±0.209) | 0.827(±0.066) | 0.674(±0.150) | 1.367(±0.207) | 2.070(±0.593)• | 0.994(±0.058)• |

Table 7. FSR of selected entity summarizers on ESBM-L for k=10.

| Summarizer | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
| --- | --- | --- | --- | --- | --- | --- |
| RELIN | 0.865(±0.260) | 1.000(±0.040)• | 0.634(±0.113) | 1.143(±0.045) | 1.962(±1.646)• | 0.949(±0.111) |
| DIVERSUM | 0.570(±0.266) | 0.965(±0.040) | 1.221(±0.296)• | 0.922(±0.082)• | 3.869(±3.357) | 1.083(±0.071) |
| FACES-E | 0.470(±0.184) | 0.953(±0.051) | 1.398(±0.212) | 0.867(±0.055) | 3.856(±3.353) | 1.057(±0.083)• |
| CD | 0.560(±0.171) | 0.992(±0.043)• | 1.302(±0.317)• | 0.885(±0.100)• | 2.904(±2.170) | 1.077(±0.040) |
| BAFREC | 0.714(±0.146) | 1.005(±0.057) | 1.360(±0.253) | 0.861(±0.074) | 2.235(±1.506)• | 0.937(±0.075) |
| KAFCA | 0.407(±0.142) | 0.969(±0.043) | 1.336(±0.337)• | 0.874(±0.099)• | 3.119(±2.576) | 1.069(±0.065) |
| MPSUM | 0.564(±0.192) | 0.977(±0.031)• | 1.261(±0.280)• | 0.909(±0.082)• | 3.428(±2.898) | 1.079(±0.065) |
| ESA | 0.662(±0.232) | 0.993(±0.037)• | 1.187(±0.207)• | 0.919(±0.071)• | 2.257(±1.559)• | 1.020(±0.082)• |
| DeepLENS | 0.643(±0.188) | 0.974(±0.036)• | 1.210(±0.267)• | 0.910(±0.087)• | 2.284(±0.968)• | 1.061(±0.075)• |

Table 8. FSR of selected entity summarizers on FED for k=10.

| Summarizer | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
| --- | --- | --- | --- | --- | --- | --- |
| RELIN | 0.883(±0.345)• | 1.042(±0.089) | 0.545(±0.152) | 1.151(±0.054) | 1.495(±0.505)• | 0.889(±0.082) |
| DIVERSUM | 1.021(±0.207) | 0.943(±0.050) | 1.115(±0.157)• | 0.978(±0.050)• | 1.783(±0.517) | 1.011(±0.054)• |
| FACES | 0.905(±0.235)• | 0.928(±0.060) | 1.315(±0.219) | 0.933(±0.063) | 1.584(±0.464)• | 1.012(±0.055)• |
| FACES-E | 0.905(±0.235)• | 0.928(±0.060) | 1.315(±0.219) | 0.933(±0.063) | 1.584(±0.464)• | 1.012(±0.055)• |
| CD | 0.735(±0.175) | 1.022(±0.060) | 0.840(±0.199) | 1.050(±0.063) | 1.783(±0.517) | 1.055(±0.047) |
| LinkSUM | 1.028(±0.224) | 0.964(±0.061) | 1.366(±0.186) | 0.893(±0.054) | 1.301(±0.337) | 1.052(±0.049) |
| BAFREC | 0.870(±0.181)• | 0.926(±0.046) | 1.433(±0.234) | 0.890(±0.064) | 1.634(±0.463)• | 0.998(±0.057)• |
| KAFCA | 0.680(±0.223) | 0.984(±0.081)• | 0.972(±0.246)• | 0.996(±0.070)• | 1.624(±0.510)• | 0.975(±0.078) |
| MPSUM | 0.804(±0.174) | 0.909(±0.051) | 1.256(±0.168) | 0.954(±0.052) | 1.783(±0.517) | 0.958(±0.090) |
| ESA | 0.832(±0.290)• | 1.047(±0.080) | 0.896(±0.183) | 1.020(±0.060) | 1.292(±0.365) | 0.926(±0.080) |
| DeepLENS | 0.863(±0.377)• | 0.999(±0.092)• | 1.116(±0.252)• | 0.955(±0.077)• | 1.334(±0.491) | 0.908(±0.108) |

References

[1] Gong Cheng, Thanh Tran, Yuzhong Qu: RELIN: Relatedness and Informativeness-Based Centrality for Entity Summarization. International Semantic Web Conference (1) 2011: 114-129.
[2] Marcin Sydow, Mariusz Pikula, Ralf Schenkel: The notion of diversity in graphical entity summarisation on semantic knowledge graphs. J. Intell. Inf. Syst. 41(2): 109-149 (2013).
[3] Kalpa Gunaratna, Krishnaprasad Thirunarayan, Amit P. Sheth: FACES: Diversity-Aware Entity Summarization Using Incremental Hierarchical Conceptual Clustering. AAAI 2015: 116-122.
[4] Kalpa Gunaratna, Krishnaprasad Thirunarayan, Amit P. Sheth, Gong Cheng: Gleaning Types for Literals in RDF Triples with Application to Entity Summarization. ESWC 2016: 85-100.
[5] Danyun Xu, Liang Zheng, Yuzhong Qu: CD at ENSEC 2016: Generating Characteristic and Diverse Entity Summaries. SumPre@ESWC 2016.
[6] Andreas Thalhammer, Nelia Lasierra, Achim Rettinger: LinkSUM: Using Link Analysis to Summarize Entity Data. ICWE 2016: 244-261.
[7] Hermann Kroll, Denis Nagel and Wolf-Tilo Balke: BAFREC: Balancing Frequency and Rarity for Entity Characterization in Linked Open Data. EYRE 2018.
[8] Eun-Kyung Kim and Key-Sun Choi: Entity Summarization Based on Formal Concept Analysis. EYRE 2018.
[9] Dongjun Wei, Shiyuan Gao, Yaxin Liu, Zhibing Liu and Longtao Huang: MPSUM: Entity Summarization with Predicate-based Matching. EYRE 2018.
[10] Dongjun Wei, Yaxin Liu, Fuqing Zhu, Liangjun Zhang, Wei Zhou, Jizhong Han and Songlin Hu: ESA: Entity Summarization with Attention. EYRE 2019.
[11] Qingxia Liu, Gong Cheng and Yuzhong Qu: DeepLENS: Deep Learning for Entity Summarization. arXiv preprint 2020. arXiv:2003.03736.

Contact

If you have any questions or suggestions, please feel free to contact Qingxia Liu and Gong Cheng.
