botisan-ai / gpt3-tokenizer
Isomorphic JavaScript/TypeScript Tokenizer for GPT-3 and Codex Models by OpenAI.
License: MIT License
I am getting this error after implementation
TypeError: gpt3_tokenizer__WEBPACK_IMPORTED_MODULE_3__.GPT3Tokenizer is not a constructor
I started receiving errors saying this.cache.hasOwnProperty is not a function. Digging into the code, it looks like tokenizer.ts uses a bit of unsafe code, considering this.cache is a map that allows any passed-in value to be used as a token:
if (this.cache.hasOwnProperty(token)) {
return this.cache[token];
}
Instead, this should be:
if (Object.prototype.hasOwnProperty.call(this.cache, token)) {
return this.cache[token];
}
const tokenizer = new GPT3Tokenizer({ type: 'gpt3' });
tokenizer.encode('toString')
will fail. This is because tokenizer.bpe('toString') returns the JavaScript function toString() instead of an actual string representing the token.
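The two reports above can be reproduced outside the library. The following is a minimal sketch, not the library's actual code: it assumes the cache is a plain object keyed by token, and the names cache, unsafeLookup, safeLookup, directCallThrows and mapCache are hypothetical. It shows one plausible mechanism for both failures: a bare index lookup finds members inherited from Object.prototype, and a cached key named hasOwnProperty shadows the method of the same name.

```typescript
// Hypothetical stand-in for the tokenizer's cache; not the library's actual code.
const cache: Record<string, string> = {};

// Unsafe: a bare index lookup also finds members inherited from Object.prototype.
function unsafeLookup(token: string): unknown {
  return cache[token];
}

// Safer: only keys stored on the object itself count as cache hits.
function safeLookup(token: string): string | undefined {
  return Object.prototype.hasOwnProperty.call(cache, token)
    ? cache[token]
    : undefined;
}

// Caching a token named 'hasOwnProperty' shadows the inherited method,
// so calling it directly on the object throws a TypeError.
function directCallThrows(): boolean {
  const shadowed: any = { hasOwnProperty: 'cached value' };
  try {
    shadowed.hasOwnProperty('anything');
    return false;
  } catch (err) {
    return err instanceof TypeError;
  }
}

// Alternative: a Map keeps its keys separate from inherited properties.
const mapCache = new Map<string, string>();
mapCache.set('hello', 'hello');
```

Whether the library should switch to a Map or keep a plain object with the guarded check is a design choice for the maintainers; either avoids treating inherited members as cached tokens.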
The types of the library are wrong: https://arethetypeswrong.github.io/?p=gpt3-tokenizer%401.1.3
Because of this issue, the example in the documentation does not work in some environments. At the moment you need to use
const tokenizer = new GPT3Tokenizer.default({ type: 'gpt3' });
in those environments instead.
This is a common issue with TypeScript projects, so common that someone had to make a website to detect it. If the types are hard to fix (the website above makes fixing it look simple), just add a note to the documentation. It's a bit frustrating to test out a new library, find that the first example in the documentation does not work, and have to debug why. Then you find the solution and ask yourself why you are a JavaScript programmer and how these kinds of issues still exist even with TypeScript...
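Until the types are fixed, code that has to run in both kinds of environment could unwrap the import defensively. This is only a sketch under stated assumptions: unwrapDefault and Ctor are hypothetical names that are not part of gpt3-tokenizer, and the helper assumes the constructor sits at most two .default levels deep.

```typescript
// Hypothetical helper, not part of gpt3-tokenizer.
type Ctor<T> = new (...args: any[]) => T;

// Returns the constructor whether the import is the class itself,
// { default: Class }, or { default: { default: Class } }.
function unwrapDefault<T>(mod: unknown): Ctor<T> {
  let current: any = mod;
  for (let depth = 0; depth < 2 && typeof current !== 'function'; depth++) {
    if (current === null || typeof current !== 'object' || !('default' in current)) {
      break;
    }
    current = current.default;
  }
  if (typeof current !== 'function') {
    throw new TypeError('No constructor found on the imported module');
  }
  return current as Ctor<T>;
}

// Intended usage (assumes the package is installed, so it is left as a comment):
//   import GPT3TokenizerImport from 'gpt3-tokenizer';
//   const GPT3Tokenizer = unwrapDefault(GPT3TokenizerImport);
//   const tokenizer = new GPT3Tokenizer({ type: 'gpt3' });
```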
Hi, I get an error when trying to instantiate GPT3Tokenizer.
const tokenizer = new GPT3Tokenizer({ type: 'gpt3' });
^
TypeError: GPT3Tokenizer is not a constructor
gpt3-tokenizer v1.1.4
Node.js v18.12.1
No matching export in "browser-external:util" for import "TextEncoder"
in Vite (SvelteKit).
> node_modules/gpt3-tokenizer/dist/gpt3-tokenizer.esm.js:1:9: error: No matching export in "browser-external:util" for import "TextEncoder"
    1 │ import { TextEncoder, TextDecoder } from 'util';
      ╵          ~~~~~~~~~~~
> node_modules/gpt3-tokenizer/dist/gpt3-tokenizer.esm.js:1:22: error: No matching export in "browser-external:util" for import "TextDecoder"
    1 │ import { TextEncoder, TextDecoder } from 'util';
      ╵                       ~~~~~~~~~~~
12:57:34 AM [vite] error while updating dependencies:
Error: Build failed with 2 errors:
node_modules/gpt3-tokenizer/dist/gpt3-tokenizer.esm.js:1:9: error: No matching export in "browser-external:util" for import "TextEncoder"
node_modules/gpt3-tokenizer/dist/gpt3-tokenizer.esm.js:1:22: error: No matching export in "browser-external:util" for import "TextDecoder"
at failureErrorWithLog (/src/tymek-cz/node_modules/esbuild/lib/main.js:1493:15)
at /src/tymek-cz/node_modules/esbuild/lib/main.js:1151:28
at runOnEndCallbacks (/src/tymek-cz/node_modules/esbuild/lib/main.js:941:63)
at buildResponseToResult (/src/tymek-cz/node_modules/esbuild/lib/main.js:1149:7)
at /src/tymek-cz/node_modules/esbuild/lib/main.js:1258:14
Edit: this might be related: vitejs/vite#6493
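For context, TextEncoder and TextDecoder are available as globals in browsers and in recent Node.js versions, so isomorphic code does not strictly need to import them from 'util'. The sketch below shows that pattern in isolation. It is not a patch for the library, and utf8Bytes and utf8Text are hypothetical helper names.

```typescript
// Read the globals instead of importing from 'util', which bundlers
// externalize (and leave empty) for browser builds.
// The `any` cast only avoids depending on DOM or Node type definitions.
const g = globalThis as any;

// Hypothetical helper: encode a string to its UTF-8 byte values.
function utf8Bytes(text: string): number[] {
  const encoder = new g.TextEncoder();
  return Array.from(encoder.encode(text) as Uint8Array);
}

// Hypothetical helper: decode UTF-8 byte values back to a string.
function utf8Text(bytes: number[]): string {
  const decoder = new g.TextDecoder('utf-8');
  return decoder.decode(Uint8Array.from(bytes));
}
```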
The token count is not accurate when compared with GPT-3's own token count.
Any help would be appreciated.
Thanks