A collection of Knowledge Tracing model implementations with PyTorch

License: MIT License

Python 100.00%

deep-learning dkt dkt-plus dkvmn gkt knowledge-tracing kqn pytorch sakt

knowledge-tracing-collection-pytorch's Issues

RuntimeError: Expected a 'cuda:0' generator device but found 'cpu'

File "E:/git/python/knowledge-tracing-collection-pytorch/train.py", line 85, in main
dataset, [train_size, test_size], generator=torch.Generator(device='cpu')
File "D:\conda\envs\study-ml\lib\site-packages\torch\utils\data\dataset.py", line 353, in random_split
indices = randperm(sum(lengths), generator=generator).tolist()
RuntimeError: Expected a 'cuda:0' generator device but found 'cpu'

Why does this error keep appearing on my machine? Whether I use the default generator, a generator on the CPU, or a generator on the GPU, I get an error.
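
One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the script makes CUDA tensors the global default (for example through `torch.set_default_tensor_type`), `randperm` inside `random_split` can end up on a different device from the generator it is given. A minimal sketch of a workaround is to shuffle the indices explicitly on the CPU and wrap them in `Subset`. The helper `cpu_random_split` is hypothetical and not part of the repository.

```python
import torch
from torch.utils.data import Subset, TensorDataset


def cpu_random_split(dataset, lengths, seed=0):
    """Split `dataset` into non-overlapping Subsets using a CPU-only shuffle."""
    if sum(lengths) != len(dataset):
        raise ValueError("lengths must sum to len(dataset)")

    # Generator and permutation are both pinned to the CPU, so the global
    # default tensor type or device does not affect this call.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    indices = torch.randperm(
        sum(lengths), generator=generator, device="cpu"
    ).tolist()

    subsets, offset = [], 0
    for length in lengths:
        subsets.append(Subset(dataset, indices[offset:offset + length]))
        offset += length
    return subsets


# Toy dataset standing in for the real one.
toy_dataset = TensorDataset(torch.arange(10))
train_dataset, test_dataset = cpu_random_split(toy_dataset, [7, 3], seed=1)
```

The two subsets together cover every index exactly once, which is the same guarantee `random_split` gives.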

Some questions regarding implementation

Hey hcnoh,

Thank you very much for your implementation. It's a well-organized and clear one.
I would like to clarify some doubts, if possible.

1 - From assist2009.py, in preprocess() you perform the following operation:
df = df.drop_duplicates(subset=["order_id", "skill_name"])
Why do you drop these duplicates? We can have multiple interactions with repeated order_id and skill_name/skill_id attributes, right?

2 - From what I've seen in previous KT implementations, the input data (x) consists of a merge between the skill_id and the correct attributes (this would create a new synthetic feature). In turn, the data for prediction (y) will just be the correct labels. When analyzing your implementation, I can't understand how (or when) you make that merge (for both skill_id and correct) in the input data x. Can you clarify this for me?

3 - There is a specific part of the code that I am having some difficulty understanding. These are two lines from dkt.py:

y = self(q.long(), r.long())
y = (y * one_hot(qshft.long(), self.num_q)).sum(-1)

Can you explain what is done here?
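
As a hedged reading of those two lines: the first produces, for every time step, one predicted probability per skill, and the second keeps only the probability for the skill that is actually asked at the next step. The sketch below uses a hand-written tensor in place of the real model output (the values and `num_q = 4` are made up) and shows that the one-hot mask followed by a sum is equivalent to a `gather`.

```python
import torch
from torch.nn.functional import one_hot

num_q = 4  # hypothetical number of skills

# Stand-in for y = self(q.long(), r.long()):
# shape (batch=1, seq_len=3, num_q), one probability per skill per step.
y = torch.tensor([[[0.1, 0.2, 0.3, 0.4],
                   [0.5, 0.6, 0.7, 0.8],
                   [0.9, 0.8, 0.7, 0.6]]])

# Skill index of the question asked at the following step.
qshft = torch.tensor([[2, 0, 3]])

# The one-hot mask zeroes every skill except the next one; the sum over the
# last axis then collapses (batch, seq_len, num_q) to (batch, seq_len).
masked = (y * one_hot(qshft.long(), num_q)).sum(-1)

# Same selection written as an index lookup.
gathered = y.gather(-1, qshft.long().unsqueeze(-1)).squeeze(-1)
```

Here `masked` holds 0.3, 0.5 and 0.6, which are the entries of `y` at skill indices 2, 0 and 3 for the three steps.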

Thanks in advance!
Regards,
Bernardo
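
On the second question, a common way to merge skill and correctness into one input in DKT-style models is to map each (skill, response) pair to a single integer index and feed that to an embedding layer. Whether this repository does exactly this is an assumption here, and the function names below are hypothetical.

```python
def encode_interactions(q_seq, r_seq, num_q):
    """Map each (skill, response) pair to one index in [0, 2 * num_q)."""
    if len(q_seq) != len(r_seq):
        raise ValueError("q_seq and r_seq must have the same length")
    # Incorrect answers keep the skill index; correct answers are offset
    # by num_q, so the two cases never collide.
    return [q + num_q * r for q, r in zip(q_seq, r_seq)]


def decode_interaction(x, num_q):
    """Recover (skill, response) from a combined index."""
    return x % num_q, x // num_q


num_q = 3
x_seq = encode_interactions([0, 2, 1], [1, 0, 1], num_q)
```

With `num_q = 3`, the pairs (0, 1), (2, 0) and (1, 1) become 3, 2 and 4, so an embedding table with `2 * num_q` rows covers every combination.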

Question regarding predictions

I have a question regarding your implementation. I'm doing some introspection in order to get some meaningful results regarding prediction in DKT. A modified version of the function main in the file train.py gives me a trained model, which I (originally) named model. To get my predictions, I use the following code:

model.eval()
output = model(q_seq, a_seq), where the inputs represent a sequence of $N$ questions and answers respectively

I struggle to fully understand the output, which is a 2D array of shape $N \times M$, where $M$ is the number of skills. My guess is that for each step $i \in \{1, \dots, N\}$ we get the probability of correctly answering a question on each of the $M$ skills at the next interaction $i+1$, given the $i$-th and all earlier interactions with the system. Is this correct? If so, I noticed that these probabilities hover around 0.5 (chance level), even though I get a good AUC in training.

With the implementation as it currently stands, is it possible to generate a sequence of user answers given only a sequence of questions? I guess the answer is no, so I wrote a function that does this myself, in case you want to add it. Assuming my interpretation of the output above is correct, I think my function works. The likelihood of the generated sequences is low, mainly because the probabilities hover around 0.5, as mentioned.
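
The poster's function is not shown, so the following is only a sketch of one way such generation could work under the interpretation above: sample each answer from the predicted probability for the asked skill, then feed the sampled answer back in as history. `predict_next` is a hypothetical callable standing in for the trained model, and the 0.5 prior for the first answer is an assumption.

```python
import random


def generate_responses(q_seq, predict_next, rng=None, first_p=0.5):
    """Sample one response per question, feeding earlier samples back in.

    predict_next(q_hist, r_hist) is assumed to return a list with one
    probability of a correct answer per skill for the next step.
    """
    rng = rng or random.Random()
    r_seq = []
    for t, q in enumerate(q_seq):
        if t == 0:
            p = first_p  # no history yet, so fall back to a fixed prior
        else:
            p = predict_next(q_seq[:t], r_seq)[q]
        r_seq.append(1 if rng.random() < p else 0)
    return r_seq


# Toy predictors over two skills, used only to exercise the loop.
def always_correct(q_hist, r_hist):
    return [1.0, 1.0]


def never_correct(q_hist, r_hist):
    return [0.0, 0.0]


all_right = generate_responses([0, 1, 0], always_correct, random.Random(0), first_p=1.0)
all_wrong = generate_responses([0, 1, 0], never_correct, random.Random(0), first_p=0.0)
```

If the predicted probabilities really do sit near 0.5, sampling this way yields close to coin-flip sequences, which matches the low likelihood described above.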

About data=> q_seq, qshft_seqs

for q_seq, r_seq in batch:
    q_seqs.append(FloatTensor(q_seq[:-1]))
    r_seqs.append(FloatTensor(r_seq[:-1]))
    qshft_seqs.append(FloatTensor(q_seq[1:]))
    rshft_seqs.append(FloatTensor(r_seq[1:]))

Regarding the code above:
When building q_seqs and qshft_seqs, q_seqs should drop the last element, and qshft_seqs should contain the element that was dropped from q_seqs, so that qshft_seqs holds the query values we want. Is that right?

In the code, are you building qshft_seqs from q_seqs with the ends cut off?

Am I wrong?

Please answer me if you have a chance!
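
A small worked example may help here. It uses plain Python lists instead of `FloatTensor`, with made-up values, and applies the same two slices as the quoted loop. Both slices are taken from the original full sequence, not one from the other.

```python
def shift_pair(seq):
    """Return (inputs, targets): all but the last item, and all but the first."""
    return seq[:-1], seq[1:]


q_seq = [3, 1, 4, 1, 5]  # skill indices in time order
r_seq = [1, 0, 1, 1, 0]  # matching responses

q, qshft = shift_pair(q_seq)
r, rshft = shift_pair(r_seq)

# q     == [3, 1, 4, 1]
# qshft == [1, 4, 1, 5]
# At position t, (q[t], r[t]) is the observed interaction, and
# (qshft[t], rshft[t]) is the next question and the answer to predict.
```

So `qshft` does include the final element that `q` drops, and position `t` in `qshft` is always the question that follows position `t` in `q`.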

RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'

Hello, bro. Great work!
When I tried to run the code, I got the following:

python train.py --model_name dkt --dataset_name ASSIST2009

Traceback (most recent call last):
  File "train.py", line 163, in <module>
    main(args.model_name, args.dataset_name)
  File "train.py", line 92, in main
    train_dataset, test_dataset = random_split(
  File "/home/yuwei/Studio/env_ervin/lib/python3.8/site-packages/torch/utils/data/dataset.py", line 386, in random_split
    indices = randperm(sum(lengths), generator=generator).tolist()
RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'

I have torch "1.10.0+cu113" on 2080Ti and Ubuntu20.
Appreciate your help, bro.

Passing skills/concepts instead of question indices

I believe that for the Assistments2009 dataset (I haven't checked the others), instead of passing questions with the same concept as separate questions, the code simply uses the concept as the input.

A major point of the original paper seemed to be using "unlabeled" data (no labeling concepts), for example from Section 5, Simulated Data:

To understand how the different models can incorporate unlabelled data, we do not provide models with the hidden concept labels (instead the input is simply the exercise index and whether or not the exercise was answered correctly)

From what I could tell this part here was directly using concept labels. So I'm thinking the results for this dataset might be misleading (correct but solving a simpler problem).

def preprocess(self):
    df = pd.read_csv(self.dataset_path).dropna(subset=["skill_name"])\
        .drop_duplicates(subset=["order_id", "skill_name"])\
        .sort_values(by=["order_id"])

    u_list = np.unique(df["user_id"].values)
    q_list = np.unique(df["skill_name"].values)

    u2idx = {u: idx for idx, u in enumerate(u_list)}
    q2idx = {q: idx for idx, q in enumerate(q_list)}

    q_seqs = []
    r_seqs = []

    for u in u_list:
        df_u = df[df["user_id"] == u]

        q_seq = np.array([q2idx[q] for q in df_u["skill_name"]])
        r_seq = df_u["correct"].values

        q_seqs.append(q_seq)
        r_seqs.append(r_seq)

Please let me know if my suspicions are correct. Anyway thanks for putting up this repo, your code is some of the cleanest I've found :)
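
To make the difference concrete, here is a minimal sketch that builds the index map and per-user sequences from either column. It works on a hand-made list of rows, not the real CSV, and the `problem_id` column name is an assumption about the dataset. `build_sequences` is a hypothetical helper.

```python
def build_sequences(rows, item_key):
    """Index rows by `item_key` and collect (q_seq, r_seq) per user.

    `rows` is assumed to be sorted in time order already.
    """
    items = sorted({row[item_key] for row in rows})
    item2idx = {item: idx for idx, item in enumerate(items)}

    seqs = {}
    for row in rows:
        q_seq, r_seq = seqs.setdefault(row["user_id"], ([], []))
        q_seq.append(item2idx[row[item_key]])
        r_seq.append(row["correct"])
    return item2idx, seqs


# Made-up rows: two distinct problems that share one skill.
rows = [
    {"user_id": 1, "problem_id": 101, "skill_name": "Addition", "correct": 1},
    {"user_id": 1, "problem_id": 102, "skill_name": "Addition", "correct": 0},
    {"user_id": 2, "problem_id": 101, "skill_name": "Addition", "correct": 1},
]

skill2idx, by_skill = build_sequences(rows, "skill_name")
problem2idx, by_problem = build_sequences(rows, "problem_id")
```

Indexed by skill, user 1's two attempts collapse onto the same input index. Indexed by problem, they stay distinct, which is closer to the unlabeled setting quoted from the paper.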

About overfitting during model training

Hi, thank you very much for your work! I found that for most of the models the AUC reaches its maximum within the first 20% of the epochs and then begins to decline, while the training loss keeps decreasing throughout. Does this mean the model is overfitting during training? I hope you can answer this question.
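
A held-out AUC that peaks early while the training loss keeps falling is the usual signature of overfitting. One common response is early stopping on the held-out metric. The sketch below is not taken from the repository: `early_stop` is a hypothetical helper, and the AUC values are invented for illustration.

```python
def early_stop(held_out_aucs, patience=3):
    """Return (best_epoch, stop_epoch) for a list of per-epoch AUC values.

    Training stops once `patience` epochs have passed without a new best.
    """
    best_auc, best_epoch = float("-inf"), -1
    for epoch, auc in enumerate(held_out_aucs):
        if auc > best_auc:
            best_auc, best_epoch = auc, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    # Patience never ran out: training used every epoch.
    return best_epoch, len(held_out_aucs) - 1


# Invented curve: AUC peaks at epoch 2, then drifts down.
aucs = [0.70, 0.78, 0.81, 0.80, 0.79, 0.78, 0.77]
best_epoch, stop_epoch = early_stop(aucs, patience=3)
```

For this curve the helper reports the best epoch as 2 and stops at epoch 5. In a real training loop one would also save the model weights at the best epoch and restore them after stopping.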

hcnoh / knowledge-tracing-collection-pytorch
