
chandanksahu / reqlist_reqnet_reqsim


ReqList, ReqNet and ReqSim datasets for the publication "A Network and Semantic Similarity Dataset of Requirements from the Tree Structure of System Requirement Specifications"

dataset graph llm network-analysis nlp requirements systems-engineering tree


ReqList, ReqNet and ReqSim


Dataset for the publication:

ReqNet and ReqSim: A Network and Semantic Similarity Dataset of Requirements from the Tree Structure of System Requirement Specifications

Abstract

Systems are developed as a solution to the problem space defined by their requirements. The requirements are acquired during the elicitation process. The creative nature of the elicitation process, the proprietary nature of requirements, the need for extensive preprocessing, and the diverse techniques for analysis restrict the development of a requirement dataset. There exists no formal or informal method to create a requirement dataset. Thus, we devise a semi-formal method to create a multi-purpose requirement dataset that harnesses human knowledge in the system requirement specification documents (SyRSDs) to facilitate the deployment of modern computing algorithms. Our dataset has three forms.

  1. ReqList, a list of requirements from 86 distinct systems with their document structure in pure text form. The 12701 requirements are ready to leverage natural language processing techniques and unsupervised machine learning techniques.
  2. ReqNet, a large network of requirements consisting of 17375 nodes to deploy graph-theoretic algorithms for requirement engineering. ReqNet portrays small-world network characteristics with an average distance of approximately 9.5619 links.
  3. ReqSim, a dataset consisting of 10933 pairs of requirements annotated with their similarity scores. ReqSim enables sentence-level supervised learning tasks to exploit the semantics of requirements. The similarity scores are coherent with human knowledge.
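
The "average distance" reported for ReqNet is the mean shortest-path length over all pairs of nodes. As a minimal sketch of how such a figure can be computed, the snippet below runs a breadth-first search from every node of a small, made-up edge list. The edge list and the function name `average_distance` are illustrative assumptions; this README does not describe the file format of ReqNet, so loading the real network is left out.

```python
from collections import defaultdict, deque


def average_distance(edges):
    """Mean shortest-path length (in links) over all reachable node pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    total = 0  # sum of distances over ordered pairs
    pairs = 0  # number of ordered pairs counted
    for source in list(adjacency):
        # Breadth-first search gives shortest paths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1

    return total / pairs if pairs else 0.0


# Hypothetical toy network, not taken from the dataset.
toy_edges = [("R", "A"), ("R", "B"), ("A", "A1")]
print(average_distance(toy_edges))
```

Running one search per node is quadratic in the worst case, which is workable for a network of this size but slow in pure Python; a graph library would be the more practical choice on the full data.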

Our dataset is theoretically grounded by the tree structure of SyRSDs. We devise a method to extract a network from the SyRSDs and mathematically prove that the extracted network is a tree. The tree structure resonates with the hierarchical nature of the requirement allocation process.
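
To illustrate why a document hierarchy yields a tree, here is a minimal sketch that links each section to its parent and then checks the standard tree criterion: the graph is connected and has exactly one fewer edge than nodes. It is not the extraction method from the publication. The dotted section numbering, the root label `DOC`, and the function names are assumptions made for the example.

```python
from collections import defaultdict, deque


def build_edges(section_ids, root="DOC"):
    """Link each dotted section ID to its parent; top-level IDs attach to the root."""
    edges = []
    for sid in section_ids:
        parent = sid.rsplit(".", 1)[0] if "." in sid else root
        edges.append((parent, sid))
    return edges


def is_tree(edges):
    """An undirected graph is a tree iff it is connected and has |V| - 1 edges."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    nodes = set(adjacency)
    if len(edges) != len(nodes) - 1:
        return False

    # Breadth-first search from an arbitrary node to test connectivity.
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == nodes


# Hypothetical section numbering, not taken from the dataset.
sections = ["1", "1.1", "1.2", "1.2.1", "2"]
edges = build_edges(sections)
print(is_tree(edges))                   # hierarchy only
print(is_tree(edges + [("1.1", "2")]))  # an extra cross-link creates a cycle
```

Because every section has exactly one parent, the hierarchy alone passes the check, while any added cross-reference between sections breaks it.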

Graphical Abstract

[Image: graphical abstract]

Citation Information

If you find this dataset useful, please cite:

@article{10.1115/1.4065786,
    author = {Sahu, Chandan Kumar and Rai, Rahul and Wiecek, Margaret and Gorsich, David},
    title = "{ReqNet and ReqSim: A Network and Semantic Similarity Dataset of Requirements from the Tree Structure of System Requirement Specifications}",
    journal = {Journal of Computing and Information Science in Engineering},
    pages = {1-15},
    year = {2024},
    month = {06},
    issn = {1530-9827},
    doi = {10.1115/1.4065786},
    url = {https://doi.org/10.1115/1.4065786},
}

Acknowledgement

We thank the authors of PURE: A Dataset of Public Requirements Documents for making their dataset publicly available.
