
The Model Zoo of Cognitive Diagnosis Models, including the classic Item Response Theory (IRT), Multidimensional Item Response Theory (MIRT), and Deterministic Input, Noisy "And" (DINA) models, as well as the more advanced Fuzzy Cognitive Diagnosis Framework (FuzzyCDF), Neural Cognitive Diagnosis Model (NCDM), and Item Response Ranking framework (IRR).

License: Apache License 2.0


educdm's Introduction

EduCDM


The Model Zoo of Cognitive Diagnosis Models, including the classic Item Response Theory (IRT), Multidimensional Item Response Theory (MIRT), and Deterministic Input, Noisy "And" (DINA) models, as well as the more advanced Fuzzy Cognitive Diagnosis Framework (FuzzyCDF), Neural Cognitive Diagnosis Model (NCDM), Item Response Ranking framework (IRR), Incremental Cognitive Diagnosis (ICD), and Knowledge-association based extension of NeuralCD (KaNCD).

Brief introduction to CDM

Cognitive diagnosis model (CDM) for intelligent educational systems is a type of model that infers students' knowledge states from their learning behaviors (especially exercise response logs).

Typically, the input of a CDM consists of the students' response logs on items (i.e., exercises/questions) and the Q-matrix, which denotes the correlation between items and knowledge concepts (skills). The output is the diagnosed student knowledge states, such as students' abilities and their proficiency on each knowledge concept.
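As a rough illustration of these inputs, the sketch below builds a toy response log and Q-matrix and computes a naive per-concept correct rate. All data, the function name concept_correct_rate, and the array layout are hypothetical and chosen for illustration; this is a descriptive baseline, not one of the CDMs in this repository and not EduCDM's data format.

```python
import numpy as np

# Hypothetical toy data: 3 students, 4 items, 2 knowledge concepts.
# Each response log row is (student_id, item_id, score), where score 1 = correct.
response_logs = [
    (0, 0, 1), (0, 1, 0), (0, 3, 1),
    (1, 0, 1), (1, 2, 1),
    (2, 1, 0), (2, 2, 0), (2, 3, 1),
]

# Q-matrix: q_matrix[j, k] == 1 means item j involves knowledge concept k.
q_matrix = np.array([
    [1, 0],
    [0, 1],
    [1, 1],
    [1, 0],
])


def concept_correct_rate(logs, q, n_students):
    """Naive per-student, per-concept correct rate (NaN where never attempted)."""
    n_concepts = q.shape[1]
    correct = np.zeros((n_students, n_concepts))
    attempts = np.zeros((n_students, n_concepts))
    for student, item, score in logs:
        correct[student] += score * q[item]
        attempts[student] += q[item]
    # Divide only where a concept was attempted at least once.
    return np.divide(
        correct, attempts,
        out=np.full_like(correct, np.nan),
        where=attempts > 0,
    )


rates = concept_correct_rate(response_logs, q_matrix, n_students=3)
print(rates)
```

A real CDM replaces this simple ratio with a model of how latent proficiency produces each response, but the shapes of the inputs and output are the same idea: logs plus Q-matrix in, a student-by-concept matrix out.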

Traditional CDMs include:

  • IRT: item response theory, a continuous unidimensional CDM with logistic-like item response function.
  • MIRT: Multidimensional item response theory, a continuous multidimensional CDM with logistic-like item response function. Mostly extended from unidimensional IRT.
  • DINA: deterministic input, noisy "and" model, a discrete multidimensional CDM. Q-matrix is used to model the effect of knowledge concepts in the cognitive process, as well as guessing and slipping factors.

etc.
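To make "logistic-like item response function" concrete, here is a minimal sketch of the standard three-parameter logistic (3PL) form used in unidimensional IRT. The function name irf_3pl and the parameter values are illustrative assumptions; D=1.702 is the conventional scaling constant that brings the logistic curve close to the normal ogive.

```python
import numpy as np


def irf_3pl(theta, a, b, c, D=1.702):
    """Probability of a correct response under the 3PL model.

    theta: student ability, a: discrimination, b: difficulty, c: guessing.
    """
    return c + (1 - c) / (1 + np.exp(-D * a * (theta - b)))


# Illustrative values only: one item answered by three students.
theta = np.array([-2.0, 0.0, 2.0])
p = irf_3pl(theta, a=1.0, b=0.0, c=0.25)
print(p)
```

When ability equals difficulty (theta == b), the probability is c + (1 - c) / 2, which is 0.625 for c = 0.25; it approaches c for very weak students and 1 for very strong ones.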

More recent research on CDMs:

  • FuzzyCDF: fuzzy cognitive diagnosis framework, a continuous multidimensional CDM for students' cognitive modeling with both objective and subjective items.
  • NeuralCD: neural cognitive diagnosis framework, a general neural-network-based cognitive diagnosis framework. This repository provides its basic implementation, NCDM.
  • IRR: item response ranking framework, a pairwise cognitive diagnosis framework. This repository provides several implementations covering most CDMs.
  • ICD: Incremental Cognitive Diagnosis, a framework that tailors cognitive diagnosis to the online scenario of intelligent education. This repository provides several implementations covering most CDMs.
  • KaNCD: an extension of the NeuralCD framework that uses high-order latent traits of students, exercises, and knowledge concepts to capture latent associations among knowledge concepts.

List of models

Installation

Clone with Git and install with pip:

git clone https://github.com/bigdata-ustc/EduCDM.git
cd path/to/code
pip install .

Or install directly from PyPI:

pip install EduCDM

Contribute

EduCDM is still under development. More algorithms and features are going to be added and we always welcome contributions to help make EduCDM better. If you would like to contribute, please follow this guideline.

Citation

If this repository is helpful to you, please cite our work:

@misc{bigdata2021educdm,
  title={EduCDM},
  author={bigdata-ustc},
  publisher = {GitHub},
  journal = {GitHub repository},
  year = {2021},
  howpublished = {\url{https://github.com/bigdata-ustc/EduCDM}},
}

Reference

[1] Liu Q, Wu R, Chen E, et al. Fuzzy cognitive diagnosis for modelling examinee performance[J]. ACM Transactions on Intelligent Systems and Technology (TIST), 2018, 9(4): 1-26.

[2] Wang F, Liu Q, Chen E, et al. Neural cognitive diagnosis for intelligent education systems[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2020, 34(04): 6153-6161.

[3] Tong S, Liu Q, Yu R, et al. Item response ranking for cognitive diagnosis[C]. IJCAI, 2021.

[4] Wang F, Liu Q, Chen E, et al. NeuralCD: A General Framework for Cognitive Diagnosis. IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE), accepted, 2022.

educdm's People

Contributors

fannazya, legionking, ljyustc, randolphvi, tswsxk, vivihong200709


educdm's Issues

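Problems encountered when running the code

How do I run the EduCDM code? The README is too brief and does not explain how to run the project (for example, the IRR models).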

Unstable DINA coverage test

🐛 Description

(A clear and concise description of what the bug is.)

Error Message

The coverage report of DINA is not stable @Ljyustc

To Reproduce

See lines 74-75: the result depends on the initialization. In some cases, lines 74-75 are skipped.

Steps to reproduce

Rerun pytest several times.

Environment

Environment Information

Operating System: Windows 8

Python Version: python3.8

Additional context

[FEATURE] add RCD model

Description

(A clear and concise description of what the feature is.)

  • If the proposal is about an algorithm or a model, provide mock examples if possible. In addition, you may need to carefully follow the guidance

References

[1]

Logical bug in IRTNet.forward

🐛 Description

I was browsing this repo for IRT implementations and found (I think) a theoretical bug in the implementation of IRTNet.

IRTNet.forward is defined here
https://github.com/bigdata-ustc/EduCDM/blob/main/EduCDM/IRT/GD/IRT.py#L30

    def forward(self, user, item):
        theta = torch.squeeze(self.theta(user), dim=-1)
        a = torch.squeeze(self.a(item), dim=-1)
        b = torch.squeeze(self.b(item), dim=-1)
        c = torch.squeeze(self.c(item), dim=-1)
        return torch.sigmoid(self.irf(theta, a, b, c, **self.irf_kwargs))

And the logic is that the output of irf is passed through the sigmoid function. This is fine if the output of irf itself is a "logit".

The IRF function is defined here:
https://github.com/bigdata-ustc/EduCDM/blob/main/EduCDM/IRT/irt.py#L10

def irf(theta, a, b, c, D=1.702, *, F=np):
    return c + (1 - c) / (1 + F.exp(-D * a * (theta - b)))

If you look at this you can see that it is already depicting sigmoid behaviour (assuming, of course, that 0 <= c <= 1). In other words, irf is returning probabilities, and not logits. As a result, the forward function above is actually doing this:

1 / (1 + exp(-(c + (1 - c) / (1 + F.exp(-D * a * (theta - b))))))

which I think is probably a bug.

If I haven't misunderstood, I have two recommendations:

  • Simply remove the torch.sigmoid call from forward
  • (optional) it may be worth passing c through a sigmoid function to ensure it doesn't go negative or above 1. (Perhaps selectable in irf_kwargs?)

i.e.

    def forward(self, user, item):
        theta = torch.squeeze(self.theta(user), dim=-1)
        a = torch.squeeze(self.a(item), dim=-1)
        b = torch.squeeze(self.b(item), dim=-1)
        c = torch.squeeze(self.c(item), dim=-1)
        if self.irf_kwargs.get("squash_c", True):
            c = torch.sigmoid(c)
        return self.irf(theta, a, b, c, **self.irf_kwargs)  # May want to clip values if c not constrained

Edit: I noticed that this torch.sigmoid(irf(...)) pattern also appears in MIRT, and possibly elsewhere too.
Edit 2: I also realise that because sigmoid is monotonic, it doesn't really change the optimal solution. However, it does seem unnecessary to differentiate through sigmoid twice.
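A quick numerical check of the composition described in this issue, as a sketch: assuming 0 <= c <= 1 so that irf returns values in [0, 1], applying a second sigmoid confines the output to roughly the interval from sigmoid(0) = 0.5 to sigmoid(1) ≈ 0.731. The sigmoid helper and the parameter values are illustrative, and NumPy stands in for torch; irf copies the formula quoted in the issue.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def irf(theta, a, b, c, D=1.702):
    # Same formula as the irf quoted in the issue, with NumPy as the backend.
    return c + (1 - c) / (1 + np.exp(-D * a * (theta - b)))


# Illustrative parameters: a single item with no guessing.
theta = np.linspace(-6, 6, 121)
p = irf(theta, a=1.0, b=0.0, c=0.0)   # already a probability in (0, 1)
double = sigmoid(p)                   # what sigmoid(irf(...)) would return

print("irf range:          ", p.min(), p.max())
print("sigmoid(irf) range: ", double.min(), double.max())
```

Under these assumptions the composed output can never fall below 0.5, whatever the ability and difficulty values are.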


Error Message

NA

To Reproduce

NA

Environment

Environment Information

Operating System: NA

Python Version: NA

Additional context

About the preprocess of datasets

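Hello,
Thank you very much for your excellent work!
I have a question about a dataset:
Was the 'a0910' dataset in EduData processed from the '2009-2010 ASSISTment Skill Builder Data'?
The data I obtained after processing differs slightly from this dataset, and it also differs from the datasets used by NCD and RCD.
Would it be possible to share the data preprocessing script? Thank you.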

IRT and Random Selection

I ran IRT on the provided dataset twice and each time took the top 10% most difficult questions (by the difficulty parameter b). The overlap between the two resulting sets was less than 5%. This suggests that IRT behaves like random selection here. Why is this?
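For reference, the overlap described above could be measured along the following lines. This is only a sketch: the function name top_k_overlap and the synthetic difficulty values are hypothetical, and the two input arrays stand for the estimated b parameters from two separate training runs.

```python
import numpy as np


def top_k_overlap(b_run1, b_run2, fraction=0.1):
    """Share of items ranked in the top `fraction` by difficulty in both runs."""
    b_run1 = np.asarray(b_run1)
    b_run2 = np.asarray(b_run2)
    k = max(1, int(len(b_run1) * fraction))
    # argsort is ascending, so the last k indices are the most difficult items.
    top1 = set(np.argsort(b_run1)[-k:].tolist())
    top2 = set(np.argsort(b_run2)[-k:].tolist())
    return len(top1 & top2) / k


# Synthetic stand-in for estimated difficulties of 1000 items.
rng = np.random.default_rng(0)
b_a = rng.normal(size=1000)

same = top_k_overlap(b_a, b_a)            # identical runs
scaled = top_k_overlap(b_a, 2.0 * b_a)    # same ranking on a different scale
print(same, scaled)
```

Because only the ranking matters, a run whose parameters differ by a positive rescaling still gives full overlap. For two unrelated rankings the expected overlap equals the fraction itself, about 10% here, which is a useful baseline when judging an observed value.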

Q: Definition of the parameters of EM-IRT

I tried running the EM-IRT code as a baseline model, but the definitions of some parameters confused me.
What are the type and meaning of the parameters R, skip_value, and D (which is initialized to 1.702)?
Thank you!

Error when running the ICD code

When running the examples, the line from baize import config_logging raises the following error: ImportError: cannot import name 'config_logging' from 'baize' (E:\anaconda3\envs\torch\lib\site-packages\baize_init_.cp39-win_amd64.pyd)

Failure to run examples/FuzzyCDF/prepare_dataset.ipynb on macOS and Linux servers

🐛 Description

On macOS and Linux, examples/FuzzyCDF/prepare_dataset.ipynb fails to run if rar and unrar are not installed.

Error Message

TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

TypeError Traceback (most recent call last)
/tmp/ipykernel_32072/1971457670.py in
2 from EduData import get_data
3
----> 4 get_data("math2015", "../../data")

~/miniconda3/lib/python3.9/site-packages/EduData/DataSet/download_data/download_data.py in get_data(dataset, data_dir, override, url_dict)
221
222 try:
--> 223 return download_data(url, data_dir, override)
224 except FileExistsError:
225 logger.info("file existed, skipped")

~/miniconda3/lib/python3.9/site-packages/EduData/DataSet/download_data/download_data.py in download_data(url, data_dir, override, bloom_filter)
188 os.makedirs(data_dir, exist_ok=True)
189 save_path = path_append(data_dir, url.split('/')[-1], to_str=True)
--> 190 _data_dir = download_file(url, save_path, override)
191 bloom_filter.add(url)
192 return _data_dir

~/miniconda3/lib/python3.9/site-packages/EduData/DataSet/download_data/download_data.py in download_file(url, save_path, override, chunksize)
127
128 mode = 'wb+'
--> 129 content_len = int(res.headers.get('content-length'))
130 # Check if server supports range feature, and works as expected.
131 if res.status_code == 206:

TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
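The traceback shows int() being called on a missing content-length header. A defensive way to read that header might look like the sketch below. The helper parse_content_length is hypothetical and is not part of EduData; a plain mapping stands in for the response headers, and unlike the case-insensitive headers of an HTTP client library, a plain dict needs the exact key.

```python
from typing import Mapping, Optional


def parse_content_length(headers: Mapping[str, str]) -> Optional[int]:
    """Return Content-Length as an int, or None if it is absent or malformed."""
    raw = headers.get("content-length")
    if raw is None:
        # Servers may omit the header, e.g. for chunked transfer encoding.
        return None
    try:
        return int(raw)
    except ValueError:
        return None


# Hypothetical header sets for illustration.
print(parse_content_length({"content-length": "1024"}))
print(parse_content_length({}))
```

A caller would then need to handle the None case explicitly, for example by downloading without a known total size.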

To Reproduce

(If you developed your own code, please provide a short script that reproduces the error. For existing examples, please provide link.)

Steps to reproduce

(Paste the commands you ran that produced the error.)

1. Run examples/FuzzyCDF/prepare_dataset.ipynb

What have you tried to solve it?

1. Installed rar and unrar on macOS, after which it works well on my local machine.
2. Could not install rar and unrar on the lab server because of permissions.

Environment

Environment Information

Operating System:
MacOS

Python Version: (e.g., python3.6, anaconda/python3.7, venv/python3.8)
python 3.9.5

Additional context

C-Dina code

Description

C-Dina code, modified from Dina

References

A Cognitive Diagnosis Model for Continuous Response
