Comments (1)
Oh, good catch. I 100% remember running the script, and it worked fine ... not sure what happened to it afterwards. But it works again and should be fixed now. Many thanks.
from llms-from-scratch.
Related Issues (20)
- Missing encoder.json and vocab.bpe for running bpe_openai_gpt2 (02_bonus_bytepair-encoder/compare-bpe-tiktoken.ipynb) HOT 2
- stride value caused skipping one word HOT 2
- Make it clear in README.md what this repository is for HOT 1
- requirements.txt HOT 3
- Incorrect code output in the book (2.2 Tokenizing text) HOT 4
- Encoding/decoding transformation of the text (2.3 Converting tokens into token IDs) HOT 1
- Solution of Exercise 2.1 is included in both main code and solution notebooks (2.5 Byte pair encoding) HOT 1
- Several package requirements from bonus material are not specified in requirements.txt (Tokenizers comparison) HOT 1
- Question about number of tokens in ChatGPT (2.5 Byte pair encoding) HOT 1
- Inconsistencies between the code in the book and the notebooks (2.6 Data sampling with a sliding window) HOT 7
- Output of the cell without variable specified (Embedding Layers and Linear Layers) HOT 1
- Wrong number of token ids specified in the notebook (2.7 Creating token embeddings) HOT 1
- Incorrect description of function torch.arange() (2.8 Encoding word positions) HOT 1
- Inconsistencies in output for dropout section (3.5.2 Masking additional attention weights with dropout) HOT 1
- Probably a typo in multi-head attention description (3.6.1 Stacking multiple single-head attention layers) HOT 1
- Solution for Exercise 3.2 is included in the notebook with main code (3.6.1 Stacking multiple single-head attention layers) HOT 1
- Question about implementation of CausalAttention class (3.5.3 Implementing a compact causal self-attention class) HOT 6
- Inconsistencies in unsqueeze operation description in the book and in notebook and its necessity (3.6.2 Implementing multi-head attention with weight splits) HOT 4
- Solution for Exercise 3.3 is included in the notebook with main code (3.6.2 Implementing multi-head attention with weight splits) HOT 1
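Several of the issues above concern the sliding-window data sampling from section 2.6 (e.g. "stride value caused skipping one word"). As a rough illustration of what the stride parameter controls, here is a minimal sketch of sliding-window input/target pairing; the function name and variables are hypothetical for illustration, not the repository's actual code:

```python
# Hypothetical sketch of sliding-window sampling (book section 2.6).
# Not the repository's actual implementation.
token_ids = list(range(20))  # stand-in for a tokenized text

def sliding_windows(ids, context_length, stride):
    # Each input chunk is paired with a target chunk shifted right by one
    # token; `stride` controls how far the window advances between chunks.
    pairs = []
    for i in range(0, len(ids) - context_length, stride):
        x = ids[i : i + context_length]
        y = ids[i + 1 : i + context_length + 1]
        pairs.append((x, y))
    return pairs

pairs = sliding_windows(token_ids, context_length=4, stride=4)
print(pairs[0])  # ([0, 1, 2, 3], [1, 2, 3, 4])
print(pairs[1])  # ([4, 5, 6, 7], [5, 6, 7, 8])
```

With `stride` equal to `context_length`, consecutive input chunks tile the text without overlap; a stride larger than the context length would leave tokens out entirely, which is presumably the kind of skipping the stride issue above describes.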