Comments (11)
Please install LLamaSharp.Backend.Cuda11
or LLamaSharp.Backend.Cuda12
at first, then set n_gpu_layers
to a positive number.
Generally, if nothing goes wrong, there will be an output like this "offload 20 layers to gpu"
from llamasharp.
I installed the cuda package on to the Llama.Examples project and set the n_gpu_layers to 100. Still it is not using GPU.
from llamasharp.
Could you please provide the outputs when you run the program? (the output when loading the model) Besides, it'e better if your system info is available here.
from llamasharp.
Sure.
0: Run a chat session.
1: Run a LLamaModel to chat.
2: Quantize a model.
3: Get the embeddings of a message.
4: Run a LLamaModel with instruct mode.
5: Load and save state of LLamaModel.
Your choice: 0
Please input your model path: C:\Users\rahul\Downloads\wizardLM-7B.ggmlv3.q4_0.bin
llama.cpp: loading model from C:\Users\rahul\Downloads\wizardLM-7B.ggmlv3.q4_0.bin
llama_model_load_internal: format = ggjt v3 (latest)
llama_model_load_internal: n_vocab = 32001
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 0.07 MB
llama_model_load_internal: mem required = 5407.72 MB (+ 1026.00 MB per state)
.
llama_init_from_file: kv self size = 256.00 MB
---------------
Display Devices
---------------
Card name: NVIDIA GeForce GTX 1650
Manufacturer: NVIDIA
Chip type: GeForce GTX 1650
DAC type: Integrated RAMDAC
Device Type: Full Device (POST)
Device Key: Enum\PCI\VEN_10DE&DEV_1F91&SUBSYS_3FFB17AA&REV_A1
Device Status: 0180200A [DN_DRIVER_LOADED|DN_STARTED|DN_DISABLEABLE|DN_NT_ENUMERATOR|DN_NT_DRIVER]
Device Problem Code: No Problem
Driver Problem Code: Unknown
Display Memory: 12109 MB
Dedicated Memory: 3962 MB
Shared Memory: 8147 MB
Current Mode: 1920 x 1080 (32 bit) (60Hz)
HDR Support: Not Supported
Display Topology: Internal
Display Color Space: DXGI_COLOR_SPACE_RGB_FULL_G22_NONE_P709
Color Primaries: Red(0.589844,0.370117), Green(0.349609,0.554688), Blue(0.155273,0.110352), White Point(0.313477,0.329102)
Display Luminance: Min Luminance = 0.500000, Max Luminance = 270.000000, MaxFullFrameLuminance = 270.000000
Monitor Name: Generic PnP Monitor
Monitor Model: unknown
Monitor Id: LGD05E5
Native Mode: 1920 x 1080(p) (59.977Hz)
Output Type: Displayport Embedded
Monitor Capabilities: HDR Not Supported
Display Pixel Format: DISPLAYCONFIG_PIXELFORMAT_32BPP
Advanced Color: Not Supported
Driver Name: C:\WINDOWS\System32\DriverStore\FileRepository\nvlt.inf_amd64_5adc6075318430cf\nvldumdx.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvlt.inf_amd64_5adc6075318430cf\nvldumdx.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvlt.inf_amd64_5adc6075318430cf\nvldumdx.dll,C:\WINDOWS\System32\DriverStore\FileRepository\nvlt.inf_amd64_5adc6075318430cf\nvldumdx.dll
Driver File Version: 27.21.0014.6230 (English)
Driver Version: 27.21.14.6230
DDI Version: 12
Feature Levels: 12_1,12_0,11_1,11_0,10_1,10_0,9_3,9_2,9_1
Driver Model: WDDM 2.7
Hardware Scheduling: Supported:True Enabled:False
Graphics Preemption: Pixel
Compute Preemption: Dispatch
Miracast: Not Supported
Detachable GPU: No
Hybrid Graphics GPU: Not Supported
Power P-states: Not Supported
Virtualization: Paravirtualization
Block List: No Blocks
Catalog Attributes: Universal:False Declarative:True
Driver Attributes: Final Retail
Driver Date/Size: 05-04-2021 05:30:00 AM, 1049064 bytes
WHQL Logo'd: n/a
WHQL Date Stamp: n/a
Device Identifier: {D7B71E3E-5CD1-11CF-4C6C-F51F1BC2D635}
Vendor ID: 0x10DE
Device ID: 0x1F91
SubSys ID: 0x3FFB17AA
Revision ID: 0x00A1
Driver Strong Name: oem7.inf:0f066de306961689:Section171:27.21.14.6230:pci\ven_10de&dev_1f91&subsys_3ffb17aa
Rank Of Driver: 00CF0001
Video Accel:
DXVA2 Modes: {86695F12-340E-4F04-9FD3-9253DD327460} DXVA2_ModeMPEG2_VLD {6F3EC719-3735-42CC-8063-65CC3CB36616} DXVA2_ModeVC1_D2010 DXVA2_ModeVC1_VLD {32FCFE3F-DE46-4A49-861B-AC71110649D5} DXVA2_ModeH264_VLD_Stereo_Progressive_NoFGT DXVA2_ModeH264_VLD_Stereo_NoFGT DXVA2_ModeH264_VLD_NoFGT DXVA2_ModeHEVC_VLD_Main DXVA2_ModeHEVC_VLD_Main10 {20BB8B0A-97AA-4571-8E99-64E60606C1A6} {15DF9B21-06C4-47F1-841E-A67C97D7F312} DXVA2_ModeMPEG4pt2_VLD_Simple DXVA2_ModeMPEG4pt2_VLD_AdvSimple_NoGMC {9947EC6F-689B-11DC-A320-0019DBBC4184} {33FCFE41-DE46-4A49-861B-AC71110649D5} DXVA2_ModeVP9_VLD_Profile0 DXVA2_ModeVP9_VLD_10bit_Profile2 {DDA19DC7-93B5-49F5-A9B3-2BDA28A2CE6E} {6AFFD11E-1D96-42B1-A215-93A31F09A53D} {914C84A3-4078-4FA9-984C-E2F262CB5C9C}
Deinterlace Caps: {6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(YUY2,YUY2) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_PixelAdaptive
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(YUY2,YUY2) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(YUY2,YUY2) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(YUY2,YUY2) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_BOBVerticalStretch
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(UYVY,UYVY) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_PixelAdaptive
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(UYVY,UYVY) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(UYVY,UYVY) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(UYVY,UYVY) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_BOBVerticalStretch
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(YV12,0x32315659) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_PixelAdaptive
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(YV12,0x32315659) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(YV12,0x32315659) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(YV12,0x32315659) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_BOBVerticalStretch
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(NV12,0x3231564e) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_PixelAdaptive
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(NV12,0x3231564e) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(NV12,0x3231564e) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(NV12,0x3231564e) Frames(Prev/Fwd/Back)=(0,0,0) Caps=VideoProcess_YUV2RGB VideoProcess_StretchX VideoProcess_StretchY DeinterlaceTech_BOBVerticalStretch
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(IMC1,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(IMC1,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(IMC1,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(IMC1,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(IMC2,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(IMC2,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(IMC2,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(IMC2,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(IMC3,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(IMC3,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(IMC3,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(IMC3,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(IMC4,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(IMC4,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(IMC4,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(IMC4,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(S340,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(S340,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(S340,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(S340,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{6CB69578-7617-4637-91E5-1C02DB810285}: Format(In/Out)=(S342,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{F9F19DA5-3B09-4B2F-9D89-C64753E3EAAB}: Format(In/Out)=(S342,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{5A54A0C9-C7EC-4BD9-8EDE-F3C75DC4393B}: Format(In/Out)=(S342,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
{335AA36E-7884-43A4-9C91-7F87FAF3E37E}: Format(In/Out)=(S342,UNKNOWN) Frames(Prev/Fwd/Back)=(0,0,0) Caps=
D3D9 Overlay: Supported
DXVA-HD: Supported
DDraw Status: Enabled
D3D Status: Enabled
AGP Status: Enabled
MPO MaxPlanes: 4
MPO Caps: RGB,YUV,BILINEAR,HIGH_FILTER,STRETCH_YUV,STRETCH_RGB,IMMEDIATE,HDR (MPO3)
MPO Stretch: 10.000X - 0.500X
MPO Media Hints: resizing, colorspace Conversion
MPO Formats: NV12,R16G16B16A16_FLOAT,R10G10B10A2_UNORM,R8G8B8A8_UNORM,B8G8R8A8_UNORM
PanelFitter Caps: RGB,YUV,BILINEAR,HIGH_FILTER,STRETCH_YUV,STRETCH_RGB,IMMEDIATE,HDR (MPO3)
PanelFitter Stretch: 10.000X - 0.500X
from llamasharp.
What's your cuda version? Since you installed two packages at the same time, one of them will cover the other. Therefore the cuda version of your system and that of LLamaSharp.Backend
may be mismatched.
from llamasharp.
I have cuda 12 installed. Removed Cuda11 package and tested again. No luck.
from llamasharp.
My GPU's Cuda Compute Capability is 7.5 and I think it will only work with Cuda version 10. (https://stackoverflow.com/a/28933055)
from llamasharp.
Would you like to build from source (llama.cpp) to have a try? I could help you with that. I think if it could be built from source, then it should work in your computer.
from llamasharp.
Thank you. It was running on my CPU fine and I wanted to know how it switches to GPU and works. I saw the cuda dlls in the runtime folder but the code was
LLamaSharp/LLama/Native/NativeApi.cs
Line 29 in e603a09
Shouldn't it be private const string libraryName = "libllama-cuda11";
to load the GPU version?
I changed it so and ran the example and that is when I got an interop exception which led me to believe there might be something wrong with my cuda version. I thought of compiling it myself but haven't got enough time to try it yet.
from llamasharp.
In LLamaSharp.Backend
package they are all renamed to libllama.dll
. When using master branch you need to change the filename to choose which dll you want to use. What does the interop exception look like?
from llamasharp.
I'll close this issue since it's been inactive for so long, but feel free to comment here if it's still an issue and I'll reopen it.
from llamasharp.
Related Issues (20)
- about NVidia GPU use example HOT 4
- SemanticKernel ChatCompletion is Stateless HOT 5
- Segmentation fault on Docker HOT 2
- [LLamaSharp.Backend.Cuda12 v0.10.0] Unable to load the model onto multiple GPUs HOT 3
- Unable to use lora in llamasharp but can use it in llama.cpp HOT 8
- AMD GPUs Support
- Exception error when sending any prompts. HOT 7
- Parallel Inferencing? HOT 27
- Information on new important updates in llama.cpp HOT 1
- LLava_shared.dll in LLamaSharp.Backend.Cuda12 is for CPU only HOT 3
- [llava] How to clear imagepaths ? HOT 9
- [Native Lib] Support specifying LLaVA native library path
- Can not intercept all the log messages with llama_log_set HOT 2
- [Feature] Support GritLM to get embeddings
- Add a best practice example for RAG HOT 4
- Clip model from LLaVA is not using GPU, but LLaVA model is using GPU in the same piece of code. HOT 4
- Whisper support HOT 7
- AccessViolationException HOT 4
- LLamaSharp.Backend.OpenCL 0.11.2: GGML_ASSERT llama.cpp:14093: hparams.n_embd_head_v % ggml_blck_size(type_v) == 0 HOT 4
- [Feature Request] Support using embeddings as input HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from llamasharp.