Comments (8)
Yes, there is. I have only tested half-precision (fp16) on Intel GPUs with Beignet OpenCL. So far, I couldn't make the kernel work properly with Beignet under Linux. Thus, I haven't been able to run the tuners. As a result there are no entries for fp16 GEMM in the database.
I don't have access to other hardware supporting fp16, so I can't say with 100% certainty that it is a Beignet-related issue; it could of course also be a bug in CLBlast. Until I have other fp16 hardware available (or someone else runs the tuners), fp16 GEMM isn't fully tested.
On Mali-T628 (OpenCL r10), the kernel runs, but there are test failures (not all: 85 passed / 278 skipped / 149 failed). Is this what you mean by "not working properly"?
And if I swap the default kernel with the one written by ARM, then the numbers are 87/278/147. Couldn't it be an issue with precision?
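To illustrate why precision alone can produce mismatches, here is a minimal sketch in plain Python that emulates fp16 rounding with the standard `struct` module's `"e"` (half-precision) format. The helper names `to_half` and `dot_half` and the input data are made up for illustration; this is not CLBlast's test code and it does not diagnose the Mali failures above.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def dot_half(a, b) -> float:
    """Dot product where every product and every partial sum is rounded to fp16."""
    acc = 0.0
    for x, y in zip(a, b):
        product = to_half(to_half(x) * to_half(y))
        acc = to_half(acc + product)
    return acc


# Hypothetical inputs: 1000 terms of 0.1 * 1.0, so the exact answer is 100.
a = [0.1] * 1000
b = [1.0] * 1000

reference = sum(x * y for x, y in zip(a, b))  # double-precision reference
half_result = dot_half(a, b)

print("reference:", reference)
print("fp16 accumulation:", half_result)
print("absolute error:", abs(half_result - reference))
```

With only 11 significant bits, the running sum drifts visibly from the reference once it grows large relative to each added term, so a tolerance chosen for 32-bit results would flag such an output as a failure.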
With Beignet I haven't managed to get the tuner working for HGEMM.
I haven't tuned for Mali FP16, which is why the parameters are not included in the database. Feel free to do so and upload the results to #1. Once that's done, the next step would be to look at correctness.
Beignet
BTW, how does it support half-precision? Here (HD4600, Beignet from git master) CLBlast says -2045. That is, I don't have cl_khr_fp16 in extensions.
At least the following 2 devices support cl_khr_fp16 with the latest Beignet:
- HD Graphics 5500 BroadWell U-Processor GT2
- HD Graphics Skylake ULT GT2
Could very well be that your hardware doesn't support FP16.
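A quick way to check this is to look for `cl_khr_fp16` in the device's extension string, which OpenCL reports as a space-separated list (for example via `clGetDeviceInfo` with `CL_DEVICE_EXTENSIONS`, or in the output of `clinfo`). The sketch below only parses such a string; the function name `supports_fp16` and the two sample strings are hypothetical and not taken from any real device.

```python
def supports_fp16(extensions: str) -> bool:
    """Return True if the space-separated extension string lists cl_khr_fp16."""
    # Split on whitespace so that a longer name merely containing the
    # substring "cl_khr_fp16" is not counted as a match.
    return "cl_khr_fp16" in extensions.split()


# Hypothetical extension strings, for illustration only.
with_fp16 = "cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_icd"
without_fp16 = "cl_khr_byte_addressable_store cl_khr_icd"

print(supports_fp16(with_fp16))
print(supports_fp16(without_fp16))
```

If the extension is missing, half-precision routines cannot be expected to run on that device.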
Ah, that matches the situation with proprietary drivers. Looks like I'm out of luck to verify this.
I changed the database script so that it now generates a default parameter set based on 32-bit precision (e.g. SGEMM) when there is no entry yet for the requested precision (e.g. HGEMM). This fixes your issue.
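The fallback idea can be sketched as follows. This is a simplified illustration under assumed data structures, not the actual database script: the dictionary layout, the function name `lookup_parameters`, the precision labels and the parameter values are all hypothetical.

```python
# Hypothetical tuning database keyed by (kernel family, precision label).
# The parameter values below are placeholders, not real tuning results.
DATABASE = {
    ("gemm", "32"): {"MWG": 64, "NWG": 64, "KWG": 16},
    ("gemm", "64"): {"MWG": 32, "NWG": 32, "KWG": 16},
}


def lookup_parameters(database, kernel, precision, fallback_precision="32"):
    """Return the entry for (kernel, precision), else fall back to 32-bit."""
    entry = database.get((kernel, precision))
    if entry is not None:
        return dict(entry), precision
    fallback = database.get((kernel, fallback_precision))
    if fallback is None:
        raise KeyError(f"no parameters for {kernel!r} at any usable precision")
    # Return a copy so that callers cannot modify the 32-bit defaults.
    return dict(fallback), fallback_precision


params, source = lookup_parameters(DATABASE, "gemm", "16")
print(f"fp16 GEMM uses parameters from precision {source}: {params}")
```

Parameters borrowed this way only guarantee that a kernel can be configured at all; they are not tuned for half precision.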