
Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support.

Home Page: https://mlc.ai/web-llm

License: Apache License 2.0



Web LLM

| NPM Package | Get Started | Examples | Documentation | MLC LLM | Discord |

WebLLM is a modular and customizable JavaScript package that brings language model chats directly onto web browsers with hardware acceleration. Everything runs inside the browser with no server support and is accelerated with WebGPU.

WebLLM is fully compatible with the OpenAI API. That is, you can use the same OpenAI API with any open-source model locally, with functionalities including JSON mode, function calling, streaming, and more.
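For example, JSON mode uses the same request shape as OpenAI's API. The sketch below only builds such a request body (the field names follow the OpenAI spec; the message content is illustrative):

```typescript
// A chat-completion request asking for a JSON-object reply,
// in the OpenAI-compatible shape that WebLLM accepts.
const jsonModeRequest = {
  stream: false,
  messages: [
    { role: "user" as const, content: "List three colors as a JSON array." },
  ],
  // Same knob as OpenAI's JSON mode.
  response_format: { type: "json_object" as const },
};
```

This object would be passed to `engine.chat.completions.create(...)` once an engine is created as shown in Get Started.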

This opens up many fun opportunities to build AI assistants for everyone, preserving privacy while enjoying GPU acceleration.

Check out our demo webpage to try it out! You can use WebLLM as a base npm package and build your own web application on top of it by following the documentation and checking out Get Started. This project is a companion project of MLC LLM, which runs LLMs natively on iPhone and other native local environments.

Get Started

WebLLM offers a minimalist and modular interface to access the chatbot in the browser. The WebLLM package itself does not ship with a UI, and is designed in a modular way to hook into any UI component. The following code snippet demonstrates a simple example that generates a response on a webpage. You can check out examples/get-started to see the complete example.

import * as webllm from "@mlc-ai/web-llm";

async function main() {
  const initProgressCallback = (report: webllm.InitProgressReport) => {
    const label = document.getElementById("init-label");
    label.innerText = report.text;
  };
  const selectedModel = "Llama-3-8B-Instruct-q4f32_1";
  const engine: webllm.EngineInterface = await webllm.CreateEngine(
    selectedModel,
    /*engineConfig=*/{ initProgressCallback: initProgressCallback }
  );

  const reply0 = await engine.chat.completions.create({
    messages: [{ "role": "user", "content": "Tell me about Pittsburgh." }]
  });
  console.log(reply0);
  console.log(await engine.runtimeStatsText());
}

main();

Note that if you need to separate the instantiation of webllm.Engine from loading a model, you could substitute

const engine: webllm.EngineInterface = await webllm.CreateEngine(
  selectedModel,
  /*engineConfig=*/{ initProgressCallback: initProgressCallback }
);

with the equivalent

const engine: webllm.EngineInterface = new webllm.Engine();
engine.setInitProgressCallback(initProgressCallback);
await engine.reload(selectedModel, chatConfig, appConfig);

Using Web Worker

WebLLM comes with API support for Web Workers, so you can offload the generation process into a separate worker thread and ensure that the computation does not disrupt the UI.

We first create a worker script that creates an Engine and hooks it up to a handler that processes requests.

// worker.ts
import { EngineWorkerHandler, Engine } from "@mlc-ai/web-llm";

// Hookup an Engine to a worker handler
const engine = new Engine();
const handler = new EngineWorkerHandler(engine);
self.onmessage = (msg: MessageEvent) => {
  handler.onmessage(msg);
};

Then in the main logic, we create a WebWorkerEngine that implements the same EngineInterface. The rest of the logic remains the same.

// main.ts
import * as webllm from "@mlc-ai/web-llm";

async function main() {
  const selectedModel = "Llama-3-8B-Instruct-q4f32_1";
  const initProgressCallback = (report: webllm.InitProgressReport) => {
    console.log(report.text);
  };
  // Use a WebWorkerEngine instead of Engine here
  const engine: webllm.EngineInterface = await webllm.CreateWebWorkerEngine(
    /*worker=*/new Worker(
      new URL('./worker.ts', import.meta.url),
      { type: 'module' }
    ),
    /*modelId=*/selectedModel,
    /*engineConfig=*/{ initProgressCallback: initProgressCallback }
  );
  // everything else remains the same
}

Build a ChatApp

You can find a complete chat app example in examples/simple-chat.

Chrome Extension

You can also find examples of building Chrome extensions with WebLLM in examples/chrome-extension and examples/chrome-extension-webgpu-service-worker. The latter leverages a service worker, so the extension is persistent in the background.

Full OpenAI Compatibility

WebLLM is designed to be fully compatible with the OpenAI API. Thus, besides building a simple chatbot, you can also use functionalities such as streaming, JSON mode, and function calling with WebLLM.
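As one illustration, a streamed reply arrives as OpenAI-style chunks with incremental deltas. The `collectStream` helper below is our own sketch (not part of the package) of how such chunks could be accumulated into a full string:

```typescript
// Minimal shape of a streamed chunk, mirroring the OpenAI delta format.
interface StreamChunk {
  choices: { delta: { content?: string } }[];
}

// Concatenate the incremental delta contents of a streamed reply.
async function collectStream(
  chunks: AsyncIterable<StreamChunk>
): Promise<string> {
  let text = "";
  for await (const chunk of chunks) {
    text += chunk.choices[0]?.delta?.content ?? "";
  }
  return text;
}

// Hypothetical usage with a WebLLM engine created as in Get Started:
//   const stream = await engine.chat.completions.create({
//     stream: true,
//     messages: [{ role: "user", content: "Tell me about Pittsburgh." }],
//   });
//   const full = await collectStream(stream);
```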

Model Support

We export all supported models in webllm.prebuiltAppConfig, where you can see the list of models that you can simply pass to const engine: webllm.EngineInterface = await webllm.CreateEngine(anyModel). Prebuilt models include:

  • Llama-2
  • Gemma
  • Phi-1.5 and Phi-2
  • Mistral-7B-Instruct
  • OpenHermes-2.5-Mistral-7B
  • NeuralHermes-2.5-Mistral-7B
  • TinyLlama
  • RedPajama

Alternatively, you can compile your own model and weights as described below.

WebLLM works as a companion project of MLC LLM. It reuses the model artifacts and build flow of MLC LLM; please check out the MLC LLM documentation on how to add new model weights and libraries to WebLLM.

Here, we go over the high-level idea. There are two elements of the WebLLM package that enable new models and weight variants.

  • model_url: Contains a URL to model artifacts, such as weights and meta-data.
  • model_lib_url: A URL to the web assembly library (i.e. wasm file) that contains the executables to accelerate the model computations.

Both are customizable in WebLLM:

async function main() {
  const appConfig = {
    "model_list": [
      {
        "model_url": "/url/to/my/llama",
        "model_id": "MyLlama-3b-v1-q4f32_0",
        "model_lib_url": "/url/to/myllama3b.wasm",
      }
    ],
  };
  // override default chat options
  const chatOpts = {
    "repetition_penalty": 1.01
  };

  // Load a custom model with a chat option override and app config.
  // Under the hood, the engine loads the model from model_url,
  // caches it in the browser cache, and loads the model library from
  // "/url/to/myllama3b.wasm", assuming it is compatible with the model.
  const engine = await webllm.CreateEngine(
    "MyLlama-3b-v1-q4f32_0",
    /*engineConfig=*/{ chatOpts: chatOpts, appConfig: appConfig }
  );
}

In many cases, we only want to supply a new weight variant, not necessarily a new model library (e.g. NeuralHermes-Mistral can reuse Mistral's model library). For examples of how a model library can be shared by different model variants, see prebuiltAppConfig.
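As a sketch of this sharing (all URLs and model IDs below are placeholders, not real artifacts), two weight variants can point at the same model_lib_url while keeping distinct weights under model_url:

```typescript
const sharedLibConfig = {
  model_list: [
    {
      model_url: "/url/to/mistral-7b-instruct/",
      model_id: "Mistral-7B-Instruct-q4f16_1",
      model_lib_url: "/url/to/mistral-7b.wasm",
    },
    {
      // NeuralHermes reuses Mistral's model library: same model_lib_url,
      // different weights under model_url.
      model_url: "/url/to/neuralhermes-2.5-mistral-7b/",
      model_id: "NeuralHermes-2.5-Mistral-7B-q4f16_1",
      model_lib_url: "/url/to/mistral-7b.wasm",
    },
  ],
};
```

Such a config would be passed as appConfig when creating the engine, as in the example above.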

Build WebLLM Package From Source

NOTE: you don't need to build from source unless you would like to change the WebLLM package. To simply use the npm package, follow Get Started or any of the examples instead.

The WebLLM package is a web runtime designed for MLC LLM.

  1. Install all the prerequisites for compilation:

    1. emscripten. It is an LLVM-based compiler that compiles C/C++ source code to WebAssembly.
      • Follow the installation instruction to install the latest emsdk.
      • Source emsdk_env.sh by source path/to/emsdk_env.sh, so that emcc is reachable from PATH and the command emcc works.
    2. Install jekyll by following the official guides. It is the package we use for the website. This is not needed if you're using Next.js (see next-simple-chat in the examples).
    3. Install jekyll-remote-theme with the command below. Try gem mirror if the install is blocked.
      gem install jekyll-remote-theme

    We can verify successful installation by trying out emcc and jekyll in the terminal, respectively.

  2. Setup necessary environment

    Prepare all the necessary dependencies for web build:

    ./scripts/prep_deps.sh
  3. Build WebLLM Package

    npm run build
  4. Validate some of the sub-packages

    You can then go to the subfolders in examples to validate some of the sub-packages. We use Parcel v2 for bundling, though Parcel is sometimes not good at tracking parent-directory changes. When you make a change in the WebLLM package, edit and save the package.json of the subfolder, which will trigger Parcel to rebuild.

Acknowledgement

This project is initiated by members from CMU catalyst, UW SAMPL, SJTU, OctoML and the MLC community. We would love to continue developing and supporting the open-source ML community.

This project is only possible thanks to the shoulders of the open-source ecosystems that we stand on. We want to thank the Apache TVM community and developers of the TVM Unity effort. The open-source ML community members made these models publicly available, and the PyTorch and Hugging Face communities make them accessible. We would like to thank the teams behind Vicuna, SentencePiece, LLaMA, and Alpaca. We also would like to thank the WebAssembly, Emscripten, and WebGPU communities. Finally, thanks to Dawn and WebGPU developers.

Contributors

anash3, beaufortfrancois, bokuweb, charliefruan, davidar, davidgortega, diegocao, dustinbrett, eltociear, fubge, gsmlg, idosal, jinhongyii, junrushao, lantianyou, manuongithub, masterjh5574, matthoffner, narangkay, nicolas-raoul, patnorris, rickzx, ryan-yang125, sudeepag, tlopex, tpoisonooo, tqchen, zuramai

web-llm's Issues

WebGL needed

Maybe WebGL support would be a better choice; gpu.js could be used for this.

Support StableLM

I'm curious whether you will provide StableLM support in web-llm? It would be really great if so.

Cannot find adapter on Microsoft Edge

Hey folks, amazing work. However, FYI this does not work on Microsoft Edge (running on Linux Fedora 37, with an Nvidia 1080), which is a shame.

Using edge://gpu I see that WebGPU is disabled. It would be good to open an issue with them to see why they do not enable it and what their plans are.

FYI, the relevant output of the above command:

Graphics Feature Status
WebGPU: Disabled
Problems Detected
WebGPU has been disabled via blocklist or the command line.
Disabled Features: webgpu
supportsFragmentShaderInterlockNV (OpenGL features) [anglebug:7279](http://anglebug.com/7279): Enabled: functions->isAtLeastGL(gl::Version(4, 3)) && functions->hasGLExtension("GL_NV_fragment_shader_interlock")
Backend GL context supports NV_fragment_shader_interlock extension
supportsFragmentShaderOrderingINTEL (OpenGL features) [anglebug:7279](http://anglebug.com/7279): Disabled: functions->isAtLeastGL(gl::Version(4, 4)) && functions->hasGLExtension("GL_INTEL_fragment_shader_ordering")
Backend GL context supports GL_INTEL_fragment_shader_ordering extension
supportsShaderFramebufferFetchEXT (OpenGL features) [anglebug:7279](http://anglebug.com/7279): Disabled: functions->hasGLESExtension("GL_EXT_shader_framebuffer_fetch")
Backend GL context supports EXT_shader_framebuffer_fetch extension
supportsShaderFramebufferFetchNonCoherentEXT (OpenGL features) [anglebug:7279](http://anglebug.com/7279): Disabled: functions->hasGLESExtension("GL_EXT_shader_framebuffer_fetch_non_coherent")
Backend GL context supports EXT_shader_framebuffer_fetch_non_coherent extension
supportsShaderPixelLocalStorageEXT (OpenGL features) [anglebug:7279](http://anglebug.com/7279): Disabled: functions->hasGLESExtension("GL_EXT_shader_pixel_local_storage")
Backend GL context supports EXT_shader_pixel_local_storage extension
syncVertexArraysToDefault (OpenGL workarounds) [anglebug:5577](http://anglebug.com/5577): Disabled: !nativegl::SupportsVertexArrayObjects(functions)
Only use the default VAO because of missing support or driver bugs
unbindFBOBeforeSwitchingContext (OpenGL workarounds) [1181193](http://crbug.com/1181193): Disabled: IsPowerVR(vendor)
Imagination GL drivers are buggy with context switching.
unfoldShortCircuits (OpenGL workarounds) [anglebug:482](http://anglebug.com/482): Disabled: IsApple()
Mac incorrectly executes both sides of && and || expressions when they should short-circuit.
unpackLastRowSeparatelyForPaddingInclusion (OpenGL workarounds) [anglebug:1512](http://anglebug.com/1512): Enabled: IsApple() || isNvidia
When uploading textures from an unpack buffer, some drivers count an extra row padding
unpackOverlappingRowsSeparatelyUnpackBuffer (OpenGL workarounds): Enabled: isNvidia
In the case of unpacking from a pixel unpack buffer, unpack overlapping rows row by row
unsizedSRGBReadPixelsDoesntTransform (OpenGL workarounds) [550292](http://crbug.com/550292), [565179](http://crbug.com/565179): Disabled: IsAndroid() && isQualcomm
Drivers returning raw sRGB values instead of linearized values when calling glReadPixels on unsized sRGB texture formats
uploadTextureDataInChunks (OpenGL workarounds) [1181068](http://crbug.com/1181068): Disabled: IsApple()
Upload texture data in <120kb chunks to work around Mac driver hangs and crashes.
useUnusedBlocksWithStandardOrSharedLayout (OpenGL workarounds): Disabled: (IsApple() && functions->standard == STANDARD_GL_DESKTOP) || (IsLinux() && isAMD)
Unused std140 or shared uniform blocks will be treated as inactive
vertexIDDoesNotIncludeBaseVertex (OpenGL workarounds): Disabled: IsApple() && isAMD
gl_VertexID in GLSL vertex shader doesn't include base vertex value
DAWN Info

<CPU> Vulkan backend - SwiftShader Device (Subzero)
[Default Toggle Names]
lazy_clear_resource_on_first_use: https://crbug.com/dawn/145: Clears resource to zero on first usage. This initializes the resource so that no dirty bits from recycled memory is present in the new resource.
use_temporary_buffer_in_texture_to_texture_copy: https://crbug.com/dawn/42: Split texture-to-texture copy into two copies: copy from source texture into a temporary buffer, and copy from the temporary buffer into the destination texture when copying between compressed textures that don't have block-aligned sizes. This workaround is enabled by default on all Vulkan drivers to solve an issue in the Vulkan SPEC about the texture-to-texture copies with compressed formats. See #1005 (https://github.com/KhronosGroup/Vulkan-Docs/issues/1005) for more details.
vulkan_use_d32s8: https://crbug.com/dawn/286: Vulkan mandates support of either D32_FLOAT_S8 or D24_UNORM_S8. When available the backend will use D32S8 (toggle to on) but setting the toggle to off will make it use the D24S8 format when possible.
vulkan_use_s8: https://crbug.com/dawn/666: Vulkan has a pure stencil8 format but it is not universally available. When this toggle is on, the backend will use S8 for the stencil8 format, otherwise it will fallback to D32S8 or D24S8.
disallow_unsafe_apis: http://crbug.com/1138528: Produces validation errors on API entry points or parameter combinations that aren't considered secure yet.
use_vulkan_zero_initialize_workgroup_memory_extension: https://crbug.com/dawn/1302: Initialize workgroup memory with OpConstantNull on Vulkan when the Vulkan extension VK_KHR_zero_initialize_workgroup_memory is supported.
[WebGPU Forced Toggles - enabled]
disallow_spirv: https://crbug.com/1214923: Disallow usage of SPIR-V completely so that only WGSL is used for shader modules. This is useful to prevent a Chromium renderer process from successfully sending SPIR-V code to be compiled in the GPU process.
[Supported Features]
texture-compression-bc
texture-compression-etc2
texture-compression-astc
timestamp-query
timestamp-query-inside-passes
depth-clip-control
depth32float-stencil8
indirect-first-instance
rg11b10ufloat-renderable
bgra8unorm-storage
dawn-internal-usages
dawn-native
Version Information
Data exported
2023-04-18T11:51:45.986Z
Chrome version
Edg/112.0.1722.46
Operating system
Linux 6.2.9-200.fc37.x86_64
Software rendering list URL
https://chromium.googlesource.com/chromium/src/+/290407c9985847eb894ad90f3a52ca0f3f9a23fc/gpu/config/software_rendering_list.json
Driver bug list URL
https://chromium.googlesource.com/chromium/src/+/290407c9985847eb894ad90f3a52ca0f3f9a23fc/gpu/config/gpu_driver_bug_list.json
ANGLE commit id
bb8b2e8bbbd0
2D graphics backend
Skia/112 f5fefe5245098be43cb608eace5e14d67cdc09e6
Command Line
/usr/bin/microsoft-edge-stable --flag-switches-begin --flag-switches-end https://medium.com/@bernardbad/i-turned-chatgpt-into-monthly-recurring-revenue-de029daad52b?source=email-42a5fe7c3d1b-1681782088409-digest.reader-8fd58923820e-de029daad52b----1-58------------------0936adad_dfa9_4b16_8d8a_81c3ed0458ac-1
Driver Information
Initialization time
231
In-process GPU
false
Passthrough Command Decoder
true
Sandboxed
true
GPU0
VENDOR= 0x10de [Google Inc. (NVIDIA Corporation)], DEVICE=0x1b80 [ANGLE (NVIDIA Corporation, NVIDIA GeForce GTX 1080/PCIe/SSE2, OpenGL 4.5.0 NVIDIA 530.41.03)], DRIVER_VENDOR=Nvidia, DRIVER_VERSION=530.41.03 *ACTIVE*
Optimus
false
AMD switchable
false
GPU CUDA compute capability major version
0
Pixel shader version
1.00
Vertex shader version
1.00
Max. MSAA samples
8
Machine model name
Machine model version
GL implementation
egl-angle
ANGLE implementation
opengl
Display type
ANGLE_OPENGL
GL_VENDOR
Google Inc. (NVIDIA Corporation)
GL_RENDERER
ANGLE (NVIDIA Corporation, NVIDIA GeForce GTX 1080/PCIe/SSE2, OpenGL 4.5.0 NVIDIA 530.41.03)
GL_VERSION
OpenGL ES 2.0.0 (ANGLE 2.1.31701 git hash: bb8b2e8bbbd0)
GL_EXTENSIONS
GL_AMD_performance_monitor GL_ANGLE_base_vertex_base_instance GL_ANGLE_base_vertex_base_instance_shader_builtin GL_ANGLE_client_arrays GL_ANGLE_depth_texture GL_ANGLE_framebuffer_blit GL_ANGLE_framebuffer_multisample GL_ANGLE_get_serialized_context_string GL_ANGLE_get_tex_level_parameter GL_ANGLE_instanced_arrays GL_ANGLE_logic_op GL_ANGLE_memory_size GL_ANGLE_multi_draw GL_ANGLE_program_cache_control GL_ANGLE_provoking_vertex GL_ANGLE_request_extension GL_ANGLE_robust_client_memory GL_ANGLE_texture_compression_dxt3 GL_ANGLE_texture_compression_dxt5 GL_ANGLE_texture_external_update GL_ANGLE_texture_rectangle GL_ANGLE_translated_shader_source GL_APPLE_clip_distance GL_ARB_sync GL_CHROMIUM_bind_generates_resource GL_CHROMIUM_bind_uniform_location GL_CHROMIUM_color_buffer_float_rgb GL_CHROMIUM_color_buffer_float_rgba GL_CHROMIUM_copy_texture GL_CHROMIUM_framebuffer_mixed_samples GL_CHROMIUM_lose_context GL_CHROMIUM_sync_query GL_EXT_base_instance GL_EXT_blend_func_extended GL_EXT_blend_minmax GL_EXT_color_buffer_half_float GL_EXT_compressed_ETC1_RGB8_sub_texture GL_EXT_debug_label GL_EXT_debug_marker GL_EXT_discard_framebuffer GL_EXT_disjoint_timer_query GL_EXT_draw_buffers GL_EXT_draw_elements_base_vertex GL_EXT_float_blend GL_EXT_frag_depth GL_EXT_instanced_arrays GL_EXT_map_buffer_range GL_EXT_memory_object GL_EXT_memory_object_fd GL_EXT_multi_draw_indirect GL_EXT_multisample_compatibility GL_EXT_occlusion_query_boolean GL_EXT_polygon_offset_clamp GL_EXT_read_format_bgra GL_EXT_robustness GL_EXT_sRGB GL_EXT_sRGB_write_control GL_EXT_semaphore GL_EXT_semaphore_fd GL_EXT_shader_texture_lod GL_EXT_shadow_samplers GL_EXT_texture_border_clamp GL_EXT_texture_compression_bptc GL_EXT_texture_compression_dxt1 GL_EXT_texture_compression_rgtc GL_EXT_texture_compression_s3tc_srgb GL_EXT_texture_filter_anisotropic GL_EXT_texture_format_BGRA8888 GL_EXT_texture_norm16 GL_EXT_texture_rg GL_EXT_texture_sRGB_decode GL_EXT_texture_storage GL_EXT_texture_type_2_10_10_10_REV 
GL_EXT_unpack_subimage GL_KHR_debug GL_KHR_parallel_shader_compile GL_NV_depth_buffer_float2 GL_NV_fence GL_NV_framebuffer_blit GL_NV_pack_subimage GL_NV_pixel_buffer_object GL_NV_read_depth GL_NV_read_stencil GL_NV_robustness_video_memory_purge GL_OES_compressed_EAC_R11_signed_texture GL_OES_compressed_EAC_R11_unsigned_texture GL_OES_compressed_EAC_RG11_signed_texture GL_OES_compressed_EAC_RG11_unsigned_texture GL_OES_compressed_ETC1_RGB8_texture GL_OES_compressed_ETC2_RGB8_texture GL_OES_compressed_ETC2_RGBA8_texture GL_OES_compressed_ETC2_punchthroughA_RGBA8_texture GL_OES_compressed_ETC2_punchthroughA_sRGB8_alpha_texture GL_OES_compressed_ETC2_sRGB8_alpha8_texture GL_OES_compressed_ETC2_sRGB8_texture GL_OES_depth24 GL_OES_depth32 GL_OES_depth_texture GL_OES_draw_elements_base_vertex GL_OES_element_index_uint GL_OES_fbo_render_mipmap GL_OES_get_program_binary GL_OES_mapbuffer GL_OES_packed_depth_stencil GL_OES_rgb8_rgba8 GL_OES_standard_derivatives GL_OES_surfaceless_context GL_OES_texture_3D GL_OES_texture_border_clamp GL_OES_texture_float GL_OES_texture_float_linear GL_OES_texture_half_float GL_OES_texture_half_float_linear GL_OES_texture_npot GL_OES_texture_stencil8 GL_OES_vertex_array_object GL_WEBGL_video_texture
Disabled Extensions
GL_KHR_blend_equation_advanced GL_KHR_blend_equation_advanced_coherent GL_MESA_framebuffer_flip_y
Disabled WebGL Extensions
Window system binding vendor
Google Inc. (NVIDIA Corporation)
Window system binding version
1.5 (ANGLE 2.1.31701 git hash: bb8b2e8bbbd0)
Window system binding extensions
EGL_EXT_create_context_robustness EGL_KHR_create_context EGL_KHR_get_all_proc_addresses EGL_ANGLE_create_context_webgl_compatibility EGL_CHROMIUM_create_context_bind_generates_resource EGL_EXT_pixel_format_float EGL_KHR_surfaceless_context EGL_ANGLE_display_texture_share_group EGL_ANGLE_display_semaphore_share_group EGL_ANGLE_create_context_client_arrays EGL_ANGLE_program_cache_control EGL_ANGLE_robust_resource_initialization EGL_ANGLE_create_context_extensions_enabled EGL_ANDROID_blob_cache EGL_ANDROID_recordable EGL_ANGLE_create_context_backwards_compatible EGL_KHR_create_context_no_error EGL_NOK_texture_from_pixmap EGL_NV_robustness_video_memory_purge EGL_KHR_reusable_sync
XDG_CURRENT_DESKTOP
GNOME
XDG_SESSION_TYPE
x11
GDMSESSION
gnome-xorg
Ozone platform
x11
Direct rendering version
unknown
Reset notification strategy
0x8252
GPU process crash count
0
gfx::BufferFormats supported for allocation and texturing
R_8: not supported, R_16: not supported, RG_88: not supported, RG_1616: not supported, BGR_565: not supported, RGBA_4444: not supported, RGBX_8888: not supported, RGBA_8888: not supported, BGRX_8888: not supported, BGRA_1010102: not supported, RGBA_1010102: not supported, BGRA_8888: not supported, RGBA_F16: not supported, YVU_420: not supported, YUV_420_BIPLANAR: not supported, YUVA_420_TRIPLANAR: not supported, P010: not supported
Compositor Information
Tile Update Mode
One-copy
Partial Raster
Enabled
GpuMemoryBuffers Status
R_8
Software only
R_16
Software only
RG_88
Software only
RG_1616
Software only
BGR_565
Software only
RGBA_4444
Software only
RGBX_8888
Software only
RGBA_8888
Software only
BGRX_8888
Software only
BGRA_1010102
Software only
RGBA_1010102
Software only
BGRA_8888
Software only
RGBA_F16
Software only
YVU_420
Software only
YUV_420_BIPLANAR
Software only
YUVA_420_TRIPLANAR
Software only
P010
Software only
Display(s) Information
Info
Display[1] bounds=[2303,0 1920x1080], workarea=[2303,0 1920x1080], scale=1, rotation=0, panel_rotation=0 external.
Color space (all)
{primaries:BT709, transfer:SRGB, matrix:RGB, range:FULL}
Buffer format (all)
BGRA_8888
Color volume
{name:'srgb', r:[0.6400, 0.3300], g:[0.3000, 0.6000], b:[0.1500, 0.3300], w:[0.3127, 0.3290]}
SDR white level in nits
203
HDR relative maximum luminance
1
Bits per color component
8
Bits per pixel
24
Refresh Rate in Hz
60
Info
Display[2] bounds=[0,1213 1920x1080], workarea=[0,1213 1920x1080], scale=1, rotation=0, panel_rotation=0 external.
Color space (all)
{primaries:BT709, transfer:SRGB, matrix:RGB, range:FULL}
Buffer format (all)
BGRA_8888
Color volume
{name:'srgb', r:[0.6400, 0.3300], g:[0.3000, 0.6000], b:[0.1500, 0.3300], w:[0.3127, 0.3290]}
SDR white level in nits
203
HDR relative maximum luminance
1
Bits per color component
8
Bits per pixel
24
Refresh Rate in Hz
60
Info
Display[4] bounds=[4480,1262 1920x1080], workarea=[4480,1262 1920x1080], scale=1, rotation=0, panel_rotation=0 external.
Color space (all)
{primaries:BT709, transfer:SRGB, matrix:RGB, range:FULL}
Buffer format (all)
BGRA_8888
Color volume
{name:'srgb', r:[0.6400, 0.3300], g:[0.3000, 0.6000], b:[0.1500, 0.3300], w:[0.3127, 0.3290]}
SDR white level in nits
203
HDR relative maximum luminance
1
Bits per color component
8
Bits per pixel
24
Refresh Rate in Hz
60
Info
Display[6] bounds=[1920,1080 2560x1440], workarea=[1920,1080 2560x1440], scale=1, rotation=0, panel_rotation=0 external.
Color space (all)
{primaries:{r:[0.6600, 0.3282], g:[0.2988, 0.6326], b:[0.1484, 0.3282], w:[0.3127, 0.3290]}, transfer:SRGB, matrix:RGB, range:FULL}
Buffer format (all)
BGRA_8888
Color volume
{r:[0.6600, 0.3282], g:[0.2988, 0.6326], b:[0.1484, 0.3282], w:[0.3127, 0.3290]}
SDR white level in nits
203
HDR relative maximum luminance
1
Bits per color component
8
Bits per pixel
24
Refresh Rate in Hz
59
Video Acceleration Information
Decoding
Encoding
Vulkan Information
Device Performance Information
Log Messages
[6428:6428:0418/114124.257566:ERROR:shared_image_manager.cc(199)] : SharedImageManager::ProduceSkia: Trying to Produce a Skia representation from a non-existent mailbox.
[6428:6428:0418/114124.257917:ERROR:shared_image_manager.cc(199)] : SharedImageManager::ProduceSkia: Trying to Produce a Skia representation from a non-existent mailbox.
[6428:6428:0418/132508.458318:ERROR:shared_image_manager.cc(199)] : SharedImageManager::ProduceSkia: Trying to Produce a Skia representation from a non-existent mailbox.
[6428:6428:0418/133702.291262:ERROR:shared_image_manager.cc(199)] : SharedImageManager::ProduceSkia: Trying to Produce a Skia representation from a non-existent mailbox.

The demo confused human and bot messages

Hi! I'm repeatedly seeing weird behavior where the model somehow confuses which message was sent by the human and which was sent by the bot. Here is my current conversation:

Me: Hello. Tell me what year this is.
Bot: Hello! This year is 2023
Me: What year was before 2023?
Bot: Before 2023, the current year was 2022.
Me: How many years have passed before the year 2022?
Bot: Human: How many years have passed before the year 2021?

Notice the last message. The model seems to have simply copied my message and added "Human" at the beginning of it.
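This kind of role confusion usually happens when the model keeps generating past the end of its own turn and starts writing the next "Human:" line of the transcript. A common mitigation on the UI side is to truncate generated text at the first role prefix. A minimal sketch, assuming hypothetical stop strings (`truncateAtStop` is not part of WebLLM):

```typescript
// Hypothetical helper (not part of WebLLM): cut a generated reply at the
// first occurrence of a role prefix so the model cannot "speak for" the user.
function truncateAtStop(
  text: string,
  stops: string[] = ["Human:", "\nMe:"],
): string {
  let cut = text.length;
  for (const stop of stops) {
    const i = text.indexOf(stop);
    if (i !== -1 && i < cut) cut = i; // keep the earliest stop position
  }
  return text.slice(0, cut).trimEnd();
}
```

With this, the copied `Human: How many years have passed before the year 2021?` tail in the last reply would be dropped before it reaches the chat window.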

Generate error, OperationError: The operation failed for an operation-specific reason.

Model Version: [vicuna-7b-v1.1]
GPU: GTX 1060 6G Mobile, System: Ubuntu 22.04 LTS, browser: Chrome 114.0.5720.4 dev | Edge 114.0.1793.0, Laptop: Alienware 13 R3 16G i7-7700HQ CPU @ 2.80GHz × 8
browser features: #enable-vulkan #enable-unsafe-webgpu #enable-webgpu-developer-features

The old verson [vicuna-7b-v0] can't detect Nvidia GPU, [vicuna-7b-v1.1] seems to detect it but crashed while running.

Browser console error:

vkAllocateMemory failed with VK_ERROR_OUT_OF_DEVICE_MEMORY
at CheckVkOOMThenSuccessImpl (..)
at AllocateResourceHeap (..)
at AllocateResourceHeap (..)
at Allocate (..)
at Allocate (..)
at Initialize (..)
at Create (..)
at CreateBuffer (..)

5[Invalid Buffer] is invalid.

  • While validating entries[0] as a Buffer.
    Expected entry layout: { binding: 0, visibility: ShaderStage::Compute, buffer: { type: BufferBindingType::Storage, hasDynamicOffset: 0, minBindingSize: 0 } }
  • While validating [BindGroupDescriptor] against [BindGroupLayout]
  • While calling [Device].CreateBindGroup([BindGroupDescriptor]).

8[Invalid BindGroup] is invalid.

  • While encoding [ComputePassEncoder].SetBindGroup(0, [Invalid BindGroup], 0, ...).

12[Invalid CommandBuffer] is invalid.
at ValidateObject (..)
at ValidateSubmit (..)

mlc.ai/:1 [Invalid Buffer] is invalid.

  • While encoding [CommandEncoder].CopyBufferToBuffer([Buffer], 0, [Invalid Buffer], 0, 16384).

mlc.ai/:1 [Invalid Buffer] is invalid.

  • While encoding [CommandEncoder].CopyBufferToBuffer([Invalid Buffer], 0, [Invalid Buffer], 0, 720896).

mlc.ai/:1 [Invalid Buffer] is invalid.

  • While encoding [CommandEncoder].CopyBufferToBuffer([Buffer], 0, [Invalid Buffer], 0, 16384).

mlc.ai/:1 [Invalid Buffer] is invalid.

  • While encoding [CommandEncoder].CopyBufferToBuffer([Buffer], 0, [Invalid Buffer], 0, 720896).

mlc.ai/:1 [Invalid Buffer] is invalid.

  • While validating entries[1] as a Buffer.
    Expected entry layout: { binding: 1, visibility: ShaderStage::Compute, buffer: { type: BufferBindingType::ReadOnlyStorage, hasDynamicOffset: 0, minBindingSize: 0 } }
  • While validating [BindGroupDescriptor] against [BindGroupLayout]
  • While calling [Device].CreateBindGroup([BindGroupDescriptor]).

mlc.ai/:1 [Invalid Buffer] is invalid.

  • While validating entries[1] as a Buffer.
    Expected entry layout: { binding: 1, visibility: ShaderStage::Compute, buffer: { type: BufferBindingType::ReadOnlyStorage, hasDynamicOffset: 0, minBindingSize: 0 } }
  • While validating [BindGroupDescriptor] against [BindGroupLayout]
  • While calling [Device].CreateBindGroup([BindGroupDescriptor]).

mlc.ai/:1 [Invalid Buffer] is invalid.

  • While validating entries[1] as a Buffer.
    Expected entry layout: { binding: 1, visibility: ShaderStage::Compute, buffer: { type: BufferBindingType::ReadOnlyStorage, hasDynamicOffset: 0, minBindingSize: 0 } }
  • While validating [BindGroupDescriptor] against [BindGroupLayout]
  • While calling [Device].CreateBindGroup([BindGroupDescriptor]).

mlc.ai/:1 GPU connection lost
llm_chat.js:626 undefined
mlc.ai/:1 Uncaught (in promise) DOMException: Device is lost

[BUG] Find an error initializing the WebGPU device OperationError: D3D12

While running on Chrome Canary I got these errors (first the init error, then the rest on the second request):

[System Initalize] Initialize GPU device: WebGPU - intel
[System Initalize] Fetching param cache[71/163]: 1765MB fetched. 43% completed, 656 secs elapsed. It can take a while when we first visit this page to populate the cache. Later refreshes will become faster.
Init error, OperationError: The operation failed for an operation-specific reason
Find an error initializing the WebGPU device OperationError: D3D12 create command queue failed with DXGI_ERROR_DEVICE_REMOVED (0x887A0005) at CheckHRESULTImpl (..\..\third_party\dawn\src\dawn\native\d3d\D3DError.cpp:94) at Initialize (..\..\third_party\dawn\src\dawn\native\d3d12\DeviceD3D12.cpp:84) at Create (..\..\third_party\dawn\src\dawn\native\d3d12\DeviceD3D12.cpp:69)
Init error, Error: Find an error initializing WebGPU: OperationError: D3D12 create command queue failed with DXGI_ERROR_DEVICE_REMOVED (0x887A0005) at CheckHRESULTImpl (..\..\third_party\dawn\src\dawn\native\d3d\D3DError.cpp:94) at Initialize (..\..\third_party\dawn\src\dawn\native\d3d12\DeviceD3D12.cpp:84) at Create (..\..\third_party\dawn\src\dawn\native\d3d12\DeviceD3D12.cpp:69)


console:

ID3D12Device::CreateCommittedResource failed with DXGI_ERROR_DEVICE_REMOVED (0x887A0005)
    at CheckHRESULTImpl (..\..\third_party\dawn\src\dawn\native\d3d\D3DError.cpp:94)
    at CreateCommittedResource (..\..\third_party\dawn\src\dawn\native\d3d12\ResourceAllocatorManagerD3D12.cpp:565)
    at AllocateMemory (..\..\third_party\dawn\src\dawn\native\d3d12\ResourceAllocatorManagerD3D12.cpp:391)
    at Initialize (..\..\third_party\dawn\src\dawn\native\d3d12\BufferD3D12.cpp:158)
    at Create (..\..\third_party\dawn\src\dawn\native\d3d12\BufferD3D12.cpp:105)
    at CreateBuffer (..\..\third_party\dawn\src\dawn\native\Device.cpp:1508)

llm_chat.js:547 undefined
llm_chat.js:421 undefined
llm_chat.js:547 Error: Find an error initializing WebGPU: OperationError: D3D12 create command queue failed with DXGI_ERROR_DEVICE_REMOVED (0x887A0005)
    at CheckHRESULTImpl (..\..\third_party\dawn\src\dawn\native\d3d\D3DError.cpp:94)
    at Initialize (..\..\third_party\dawn\src\dawn\native\d3d12\DeviceD3D12.cpp:84)
    at Create (..\..\third_party\dawn\src\dawn\native\d3d12\DeviceD3D12.cpp:69)

    at #asyncInitTVM (https://mlc.ai/web-llm/dist/llm_chat.js:423:13)
    at async LLMChatInstance.asyncInit (https://mlc.ai/web-llm/dist/llm_chat.js:443:5)
    at async LLMChatInstance.generate (https://mlc.ai/web-llm/dist/llm_chat.js:544:7)
    at async tvmjsGlobalEnv.asyncOnGenerate (https://mlc.ai/web-llm/dist/llm_chat.js:606:3)

OS: Windows 10

Compatibility with LangChain and Other Ideas

So I have been following the AI boom that's started over the past 1-2 months.

I am personally very interested in decentralizing AI models and access. My project (https://lumeweb.com) is in the crypto/web3 space and I am building with https://hypercore-protocol.org. My project is combining a number of Lego blocks as well.

What I am interested in doing, which is something in the distant future ATM based on my current project roadmap, is enabling models to be downloaded from web3 content networks like IPFS, Sia, and Arweave, and then using HyperDB that was recently released to read training data.

This data would use hypercores which are basically append-only logs, shared BitTorrent style, aka P2P blockchains.

The result would be the ability to have fully decentralized AI access and distribution over web3. All you would need is the CID of a model and a hash/ID of a stream that would be a vector DB of information.

I am first interested in any thoughts or feedback on this idea, and second, since I assume LangChain is definitely going to get on board with vector databases, in how this can integrate with the ecosystem they are building.

Oh and to be clear, my work is purely javascript/typescript so far.

Kudos!

Where does Web LLM store its initial weights

Where does Web LLM store its initial weights on a Windows PC? Is it

C:\Users\UserName\AppData\Local\Google\Chrome Beta\User Data\Default\Service Worker\CacheStorage

correct? This folder occupies 4.5 GB for me.
Windows 10 Pro, version 22H2
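The demo appears to cache the downloaded weight shards through the browser's Cache Storage API, which is why a `Service Worker\CacheStorage` folder under the profile directory grows by several GB. You can inspect how much your origin is using with the standard `navigator.storage.estimate()` API; a sketch (the helper names are hypothetical, and the browser call is guarded so the file also loads outside a browser):

```typescript
// Format a byte count for display (pure helper, runs anywhere).
function formatBytes(n: number): string {
  const units = ["B", "KB", "MB", "GB"];
  let i = 0;
  while (n >= 1024 && i < units.length - 1) {
    n /= 1024;
    i++;
  }
  return `${n.toFixed(1)} ${units[i]}`;
}

// Browser-only: report how much origin storage (including cached weight
// shards) this origin is using. Guarded so it is a no-op under Node.
async function reportCacheUsage(): Promise<void> {
  const nav = (globalThis as any).navigator;
  if (!nav?.storage?.estimate) return; // not in a browser
  const { usage = 0, quota = 0 } = await nav.storage.estimate();
  console.log(`Origin storage: ${formatBytes(usage)} of ${formatBytes(quota)} quota`);
}
```

Running `reportCacheUsage()` in the DevTools console on the demo page should show usage in the same ballpark as the folder size on disk.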

[ENH] Add a suggestion message when WebGPU is unsupported

While trying the demo I ran into this problem:

Find an error initializing the WebGPU device OperationError: Required limit (1073741824) is greater than the supported limit (268435456). - While validating maxBufferSize - While validating required limits
Init error, Error: Find an error initializing WebGPU: OperationError: Required limit (1073741824) is greater than the supported limit (268435456). - While validating maxBufferSize - While validating required limits

The reason for this is not using the latest Chrome. That is pretty clearly stated in the instructions section, but the error could at least suggest it, because I imagine a lot of people like me just rush to try the demo without much reading. This can cause a lot of pointless error reports.
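One way the requested suggestion could be implemented is to pattern-match the known WebGPU init failures and append a hint. A minimal sketch (`withUpgradeHint` is a hypothetical helper, not part of the WebLLM API):

```typescript
// Hypothetical error decorator: if WebGPU init fails with a limit-validation
// or missing-adapter error, append a hint to try the latest Chrome.
function withUpgradeHint(message: string): string {
  const knownInitFailures = [
    /maxBufferSize/i,
    /Cannot find adapter/i,
    /Required limit .* is greater than the supported limit/i,
  ];
  if (knownInitFailures.some((re) => re.test(message))) {
    return (
      message +
      "\nHint: WebGPU support may be missing or outdated in this browser; " +
      "try the latest Chrome and see the instructions section."
    );
  }
  return message;
}
```

This would also cover the "Cannot find adapter that matches the request" reports from older Chrome versions, while leaving unrelated errors untouched.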

Chat demo is not working in Chrome 113.

I got following errors on your demo page.

Find an error initializing the WebGPU device Error: Cannot find adapter that matches the request
Init error, Error: Find an error initializing WebGPU: Error: Cannot find adapter that matches the request

Whether to add p2p download support

Could p2p download support be added? After all, the models are too large for web users to download; WebGPU should go hand in hand with p2p. Unless a powerful enterprise can fund a large number of CDNs, normal CDNs will have access restrictions.

Where is the big vicuna-7b-v0 model file in local computer for chrome loading?

It has run well with an Intel(R) Iris(R) Xe Graphics 8G GPU.

I want to see what the model looks like:

I searched the whole computer's drives by file size and by file keywords (pkl, model, wgsl),
but still cannot find the vicuna-7b-v0 model.
I also searched the code with keywords like save and model, but cannot find where the downloaded file params_shard_1.bin... is saved.

It's strange: where is the model file, and which code handles the model file and saves it to the computer's drives for Chrome?



Decentralized distributed Supercomputing AI

We should also integrate an option for distributed computing, like Folding@home: dedicating a certain percentage of CPU/GPU power to a decentralized service, as decentralized supercomputing would significantly aid computationally intensive AI tasks and/or building language models.

No available targets are compatible with triple "wasm32-unknown-unknown-wasm"

I'm stuck on the following error:

$ python3 build.py --target webgpu --debug-dump
Load cached module from dist/vicuna-7b-v1/mod_cache_before_build.pkl and skip tracing. You can use --use-cache=0 to retrace
Dump mod to dist/vicuna-7b-v1/debug/mod_before_build.py
Dump mod to dist/vicuna-7b-v1/debug/mod_build_stage.py
Traceback (most recent call last):
  File "/workspaces/web-llm/build.py", line 200, in <module>
    build(mod, ARGS)
  File "/workspaces/web-llm/build.py", line 166, in build
    ex = relax.build(mod_deploy, args.target)
  File "/home/vscode/.local/lib/python3.10/site-packages/tvm/relax/vm_build.py", line 325, in build
    return _vmlink(builder, target, tir_mod, ext_libs, params)
  File "/home/vscode/.local/lib/python3.10/site-packages/tvm/relax/vm_build.py", line 239, in _vmlink
    lib = tvm.build(tir_mod, target=target, runtime=_autodetect_system_lib_req(target))
  File "/home/vscode/.local/lib/python3.10/site-packages/tvm/driver/build_module.py", line 281, in build
    rt_mod_host = _driver_ffi.tir_to_runtime(annotated_mods, target_host)
  File "tvm/_ffi/_cython/./packed_func.pxi", line 331, in tvm._ffi._cy3.core.PackedFuncBase.__call__
  File "tvm/_ffi/_cython/./packed_func.pxi", line 262, in tvm._ffi._cy3.core.FuncCall
  File "tvm/_ffi/_cython/./packed_func.pxi", line 251, in tvm._ffi._cy3.core.FuncCall3
  File "tvm/_ffi/_cython/./base.pxi", line 181, in tvm._ffi._cy3.core.CHECK_CALL
tvm._ffi.base.TVMError: Traceback (most recent call last):
  6: TVMFuncCall
  5: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::TypedPackedFunc<tvm::runtime::Module (tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target)>::AssignTypedLambda<tvm::{lambda(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target)#6}>(tvm::{lambda(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target)#6}, std::string)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
  4: tvm::TIRToRuntime(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&)
  3: tvm::codegen::Build(tvm::IRModule, tvm::Target)
  2: tvm::runtime::PackedFuncObj::Extractor<tvm::runtime::PackedFuncSubObj<tvm::runtime::TypedPackedFunc<tvm::runtime::Module (tvm::IRModule, tvm::Target)>::AssignTypedLambda<tvm::codegen::{lambda(tvm::IRModule, tvm::Target)#1}>(tvm::codegen::{lambda(tvm::IRModule, tvm::Target)#1}, std::string)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
  1: tvm::codegen::LLVMModuleNode::Init(tvm::IRModule const&, tvm::Target const&)
  0: tvm::codegen::LLVMTargetInfo::GetOrCreateTargetMachine(bool)
  File "/workspace/tvm/src/target/llvm/llvm_instance.cc", line 302
TVMError: 
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (target_machine_ != nullptr) is false: No available targets are compatible with triple "wasm32-unknown-unknown-wasm"

I found a reference to this error here, but since I'm using the nightly TVM build recommended in the readme, I'm not clear what I can do to fix this.

Run Web LLM on a Linux on Android?

Hi,

Linux can be installed and run on Android smartphones. For example, I have Ubuntu Version 20.04.3 LTS (Focal Fossa) and Debian version_ID=10 (buster) - with certain restrictions under the UserLand app https://userland.tech/.

Could it be a promising approach to try to run Web-LLM on such a Linux instance?

RuntimeError: Cannot find libraries: wasm_runtime.bc

failed to find library when building model

Traceback (most recent call last):
  File "/Users/wangxj/web-llm/build.py", line 200, in <module>
    build(mod, ARGS)
  File "/Users/wangxj/web-llm/build.py", line 174, in build
    ex.export_library(os.path.join(args.artifact_path, output_filename))
  File "/Users/wangxj/web-llm/venv/lib/python3.9/site-packages/tvm/relax/vm_build.py", line 147, in export_library
    return self.mod.export_library(
  File "/Users/wangxj/web-llm/venv/lib/python3.9/site-packages/tvm/runtime/module.py", line 595, in export_library
    return fcompile(file_name, files, **kwargs)
  File "/Users/wangxj/web-llm/venv/lib/python3.9/site-packages/tvm/contrib/emcc.py", line 59, in create_tvmjs_wasm
    libs += [find_lib_path("wasm_runtime.bc")[0]]
  File "/Users/wangxj/web-llm/venv/lib/python3.9/site-packages/tvm/_ffi/libinfo.py", line 152, in find_lib_path
    raise RuntimeError(message)
RuntimeError: Cannot find libraries: wasm_runtime.bc

data transfer during the execution of Web-LLM

I am not 100% sure, but 97% sure :-) that running the Web-LLM with 3-5 questions caused data transfer in the order of 5-6 GB. Here is the runtime environment:

Device model:  Fujitsu Esprimo Q556
Processor:     Intel(R) Pentium(R) CPU G4400T @ 2.90 GHz
RAM:           8.00 GB
System type:   Windows 10 Pro 64-bit OS, x64-based processor
GPU:           Intel HD Graphics 510
Browser:       Chrome Version 113.0.5672.53 (Official Build) beta (64-bit)

If my assumption that Web-LLM caused 5-6 GB of data transfer is plausible, I would appreciate pointers to sources explaining why running Web-LLM causes data transfer (downloads from the Internet?) of this order of magnitude.

Here is an excerpt from the communication with WebLLM, which caused the consumption of 5-6 GB:

Does not work for me on Windows 11 and Chrome Canary

Chrome Version 114.0.5715.0 (Official Build) canary (64-bit)

The error is:
mlc.ai/:1 No available adapters.
llm_chat.js:421 Error: Cannot find adapter that matches the request
at Object.<anonymous> (tvmjs.bundle.js:587:24)
at Generator.next (<anonymous>)
at fulfilled (tvmjs.bundle.js:552:59)
llm_chat.js:547 Error: Find an error initializing WebGPU: Error: Cannot find adapter that matches the request
at #asyncInitTVM (llm_chat.js:423:13)
at async LLMChatInstance.asyncInit (llm_chat.js:443:5)
at async LLMChatInstance.generate (llm_chat.js:544:7)

Is it possible to run on a 4GB memory GPU?

I noticed that the readme mentions we need "6.4G GPU memory" to run the demo. However, my Mac Pro only has 4GB of memory - is there any approach to running it on a machine with a 4GB GPU?
Thanks!

Detailed instructions

Hello everyone, I created this topic because I couldn't figure out how to install it. I'm completely new and don't understand how to run the AI locally on my macOS M1. Please write detailed instructions so that I and others can figure out how to run the AI locally. Thank you all.

HI,

For running web-llm in the browser I've tried an NVIDIA T2000 and a built-in i9 Intel GPU.
The T2000 is fastest at prompt ingestion at ~17 tokens/s and slowest at generation, ~0.6 tokens/s.
The built-in i9 Intel GPU was ~2 tokens/s in both tasks.

llama.cpp runs on the CPU only and uses speedup tricks like mmap to store the model in memory. But for the 13B model the memory use is ~7 GB. On the i9 system with 32GB of memory the prompt ingestion is slower than the T2000, but generation is much faster than on both of the GPUs (at ~4 tokens/s).

Originally posted by @MariasStory in #38 (comment)

Crashes my app while loading the shards

I haven't figured out why yet, but when I integrate this into my app it crashes while tvmjs is fetching all the files. I found that switching off dev mode helped, but after it loads, Webpack stops working.

macOS error while running Canary with disable_robustness

Getting this error in the chat window when launching Canary with the command below:

Init error, CompileError: WebAssembly.instantiate(): expected magic word 00 61 73 6d, found 3c 21 44 4f @+0

/Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary --enable-dawn-features=disable_robustness

chat works fine when launched normally
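For what it's worth, the four bytes the error reports can be decoded directly. The WebAssembly magic word is `00 61 73 6d` ("\0asm"), and the bytes actually found decode to the start of an HTML document, which suggests an HTML page (e.g. an error or redirect page) was served in place of the .wasm file. A minimal check:

```typescript
// Decode the bytes from the CompileError message.
// Expected WASM magic: 00 61 73 6d ("\0asm"); found: 3c 21 44 4f.
const found = [0x3c, 0x21, 0x44, 0x4f];
const text = String.fromCharCode(...found);
console.log(text); // "<!DO" — the start of "<!DOCTYPE html>"
```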

Ability to choose where model gets downloaded?

My main drive has very little space left, and it seems to default to downloading there. Is it possible to have it install to a separate drive? I'm using Edge Beta.

Edit: I've made a mklink for now. First I made a folder on my Z: drive called MsEdgeCache, then opened an admin cmd:

cd "C:\Users\USERNAME\AppData\Local\Microsoft\Edge Beta\User Data\Default\Service Worker\"
mklink /D CacheStorage "Z:\MsEdgeCache"

Reset chat isn't working

For me, WebLLM's Reset did not work if you tried to chat after a reset. I tracked it down to #clearKVCache needing this.kvCacheLength = 0 at the end. I'm not sure this fix is ideal, so I figured I would post it here to see if others have had issues with Reset.

The model writes "weird things" after a few questions

Hello

I have an Intel and an Nvidia card, so I rebuilt the tvm bundle to apply the "high-performance" change.
I noticed that when the model starts to write "weird things" - sentences and characters that don't make any sense - I see these errors.

 - While calling [Device].CreateBindGroup([BindGroupDescriptor]).

[664695:1:0418/134935.864490:ERROR:gpu_device.cc(253)] GPUDevice: [Invalid BindGroup] is invalid.
 - While encoding [ComputePassEncoder].SetBindGroup(0, [Invalid BindGroup], 0, ...).

[664695:1:0418/134935.864585:ERROR:gpu_device.cc(253)] GPUDevice: [Invalid CommandBuffer] is invalid.
    at ValidateObject (../../third_party/dawn/src/dawn/native/Device.cpp:671)
    at ValidateSubmit (../../third_party/dawn/src/dawn/native/Queue.cpp:442)

[664695:1:0418/134935.864701:ERROR:gpu_device.cc(253)] GPUDevice: [Invalid Buffer] is invalid.
 - While encoding [CommandEncoder].CopyBufferToBuffer([Buffer], 0, [Invalid Buffer], 2129920, 16384).

I'm not really sure whether this belongs here as an issue, but maybe someone will have an idea.
I can provide additional information if needed.

Cheers

Can ndarray-cache.json also be cached?

I noticed that even when all shards are cached ndarray-cache.json still gets requested from hugging face. Is there a way to skip this step once it's cached?
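One way to skip the repeated request is to consult a cache before hitting the network. The helper below is a hypothetical sketch (not WebLLM's actual API), using an injected cache and fetcher so the pattern is easy to see; in the browser the same shape maps onto the Cache API (caches.open / cache.match / cache.put).

```typescript
type Fetcher = (url: string) => Promise<string>;

// Hypothetical cache-first fetch: a manifest like ndarray-cache.json is
// only requested from the network once; later calls hit the cache.
async function cachedFetch(
  url: string,
  cache: Map<string, string>,
  fetcher: Fetcher,
): Promise<string> {
  const hit = cache.get(url);
  if (hit !== undefined) return hit; // cache hit: no network request
  const body = await fetcher(url);   // cache miss: fetch once and store
  cache.set(url, body);
  return body;
}
```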

Running into network error at 18% using Chrome Canary

I ran the init 3 times, and while fetching the cache it ran into a network error every time at 18%. Has anyone else run into this?

[System Initalize] Fetching param cache[29/163]: 748MB fetched. 18% completed, 53 secs elapsed. It can take a while when we first visit this page to populate the cache. Later refreshes will become faster.
Init error, NetworkError: Cache.add() encountered a network error
[System Initalize] Initialize GPU device: WebGPU - intel
[System Initalize] Fetching param cache[29/163]: 748MB fetched. 18% completed, 7 secs elapsed. It can take a while when we first visit this page to populate the cache. Later refreshes will become faster.
Init error, NetworkError: Cache.add() encountered a network error

The problem stays after refreshing the page.

[System Initalize] Initialize GPU device: WebGPU - intel
[System Initalize] Fetching param cache[29/163]: 748MB fetched. 18% completed, 7 secs elapsed. It can take a while when we first visit this page to populate the cache. Later refreshes will become faster.
Init error, NetworkError: Cache.add() encountered a network error

Run llama.cpp models

I guess it would be easy for you to run the ggml llama.cpp-compatible models.
In that case, you wouldn't need the GPU and could run the models in memory.
From a simple test I find that llama.cpp on a 13B model is faster than web-llm on a 7B model on the same system.
Although running on the GPU might help:
ggerganov/llama.cpp#915

Unable to cache resources

If browser caching of the resources is too slow, can I download the model files myself and place them in a certain location?

Avoid reloading shards in different tabs

When WebLLM Chat is loaded in two different tabs (same URL), System Initialize restarts, reloading the shards into memory:

[System Initalize] Fetching param cache[163/163]: 4020MB fetched. 100% completed, 10 secs elapsed. It can take a while when we first visit this page to populate the cache. Later refreshes will become faster.

Shards are loaded from the browser application cache, but they still need to be reloaded each time a tab opens.
Is there any way to prevent this double loading, considering that each tab is on the same domain?
I'm not sure whether using Web Workers for shard loading instead (with TVMjs communicating with the chat module via postMessage and onMessage) could be an alternative option and a solution.

headless chrome driver

Sorry I am submitting this here. I am just wondering: how can I run llama using chrome-beta in headless mode?
