7900xtx's People
Forkers
hamzamoudnib dagelf gnif trocker throwoutofcoffeeexception ductringuyen passw tanmaydhobale aromate drukpa1455 krenax junaidloonat jozef-javorsky-dodo smurfd villos sanowl vicky2618 samgh1230 zhikunhuo inarikami btc-cloud ppetroskevicius
7900xtx's Issues
Claude OPUS - POC - AMD driver + ROCT + PM4Queue / packets
I forgot about this code. I got claude.ai to spit it out a few weeks back in about 5 minutes.
I don't know if it works or even compiles, so it may be garbage. I was captivated by the 10+ hours of YouTube videos, and frankly I'm not sure this is exactly what you wanted.
Regardless, I beseech you to look at Claude Opus (not GPT-4) as a route to getting hacking results.
It seems well abreast of the AMD firmware, the drivers, and the related GitHub projects (including tinygrad).
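#include <cassert>     // assert
#include <cstdint>     // uint32_t, uint64_t
#include <fcntl.h>     // open, O_RDWR
#include <sys/mman.h>  // mmap, munmap
#include "helpers.h"
#include "nouveau.h"
#include "ROCT-Thunk-Interface.h"
#include "PM4Queue.hpp"
#include "PM4Packet.hpp"
#define ROCHSA_PM4_QUEUE_SIZE (64*1024) // 64 KB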
uint64_t trivial[] = {
// Trivial compute shader, same as original
0x00005a00ff057624, 0x000fe200078e00ff,
0x0000580000027a02, 0x000fe20000000f00,
0x0000590000037a02, 0x000fca0000000f00,
0x0000000502007986, 0x000fe2000c101904,
0x000000000000794d, 0x000fea0003800000,
};
void gpu_setup(PM4Queue* pQueue) {
// Initialize the PM4 queue
pQueue->Init();
}
void gpu_memcpy(PM4Queue* pQueue, uint64_t dst, const uint32_t *src, int len) {
assert(len % 4 == 0);
// Use PM4 DMA packet to do the memcpy
pQueue->PlaceAndSubmitPacket(PM4DmaDataPacket(dst, src, len));
}
void gpu_compute(PM4Queue* pQueue, uint64_t shader_addr, uint64_t cb_addr, int cb_len) {
// Set up registers
const unsigned int COMPUTE_PGM_VALUES[] = {
static_cast<uint32_t>(shader_addr), // PGM_LO
static_cast<uint32_t>(shader_addr >> 32) // PGM_HI
};
const unsigned int COMPUTE_PGM_RSRC1[] = { 0x000c0084 }; // Same as original
const unsigned int COMPUTE_DISPATCH_DIMENSIONS[] = {
1, 1, 1, // THREADS_X/Y/Z
1, 1, 1, // GROUPS_X/Y/Z
0, 0 // PIPELINESTAT/PERFCOUNT
};
const unsigned int COMPUTE_USER_DATA[] = {
static_cast<uint32_t>(cb_addr), // CB1_BASE_LO
static_cast<uint32_t>(cb_addr >> 32), // CB1_BASE_HI
cb_len, // CB1_SIZE
1 // CB1_VALID
};
// Configure shader registers
pQueue->PlaceAndSubmitPacket(
PM4SetShaderRegPacket(mmCOMPUTE_PGM_LO, COMPUTE_PGM_VALUES,
sizeof(COMPUTE_PGM_VALUES)/sizeof(COMPUTE_PGM_VALUES[0])));
pQueue->PlaceAndSubmitPacket(
PM4SetShaderRegPacket(mmCOMPUTE_PGM_RSRC1, COMPUTE_PGM_RSRC1,
sizeof(COMPUTE_PGM_RSRC1)/sizeof(COMPUTE_PGM_RSRC1[0])));
pQueue->PlaceAndSubmitPacket(
PM4SetShaderRegPacket(mmCOMPUTE_NUM_THREAD_X, COMPUTE_DISPATCH_DIMENSIONS,
sizeof(COMPUTE_DISPATCH_DIMENSIONS)/sizeof(COMPUTE_DISPATCH_DIMENSIONS[0])));
pQueue->PlaceAndSubmitPacket(
PM4SetShaderRegPacket(mmCOMPUTE_USER_DATA_0, COMPUTE_USER_DATA,
sizeof(COMPUTE_USER_DATA)/sizeof(COMPUTE_USER_DATA[0])));
// Dispatch the compute shader
pQueue->PlaceAndSubmitPacket(PM4DispatchDirectPacket(1, 1, 1));
// Wait for shader completion
pQueue->PlaceAndSubmitPacket(PM4ReleaseMemoryPacket(true, cb_addr, 0xC0FFEE));
pQueue->Wait4PacketConsumption();
}
int main() {
PM4Queue queue;
HsaMemoryBuffer isaBuf(trivial, sizeof(trivial), PAGE_SIZE, false);
// Map and initialize GPU resources
void* gpu_mmio_ptr = mmap(NULL, PAGE_SIZE, PROT_READ|PROT_WRITE,
MAP_SHARED, open("/dev/mem", O_RDWR), 0);
uint64_t gpu_local_mem = 0; // Allocate with hsaKmtAllocMemory()
uint64_t cb_gpu_addr = gpu_local_mem;
// Set up the queue
gpu_setup(&queue);
// Copy shader code to GPU memory
gpu_memcpy(&queue, gpu_local_mem, trivial, sizeof(trivial));
// Run the shader
gpu_compute(&queue, gpu_local_mem, cb_gpu_addr, 16);
// Clean up
munmap(gpu_mmio_ptr, PAGE_SIZE);
hsaKmtFreeMemory(gpu_local_mem, sizeGpuMem);
return 0;
}
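For readers unfamiliar with what a helper like PM4SetShaderRegPacket would have to produce, here is a minimal sketch of a type-3 PM4 SET_SH_REG packet laid out as raw dwords. It assumes the PACKET3 header layout and the SET_SH_REG opcode and range start from the amdgpu kernel headers. The function names, the register offset, and the shader address are illustrative placeholders, not values taken from the POC above, and the lo/hi split simply mirrors what the POC does.

```python
PACKET_TYPE3 = 3
PACKET3_SET_SH_REG = 0x76          # opcode, per the amdgpu kernel headers
PACKET3_SET_SH_REG_START = 0x2C00  # first dword offset of the SH register range


def pm4_type3_header(opcode, count):
    """Build a type-3 header; count is the number of body dwords minus one."""
    return (PACKET_TYPE3 << 30) | ((opcode & 0xFF) << 8) | ((count & 0x3FFF) << 16)


def set_sh_reg(reg_dword_offset, values):
    """Return the dwords of a SET_SH_REG packet writing consecutive registers."""
    body = [reg_dword_offset - PACKET3_SET_SH_REG_START]
    body += [v & 0xFFFFFFFF for v in values]
    return [pm4_type3_header(PACKET3_SET_SH_REG, len(body) - 1)] + body


# Hypothetical inputs, for illustration only.
shader_addr = 0x00007F1234567000   # made-up GPU virtual address
EXAMPLE_PGM_LO = 0x2E0C            # placeholder offset; check the register headers
packet = set_sh_reg(EXAMPLE_PGM_LO, [shader_addr & 0xFFFFFFFF, shader_addr >> 32])
print([hex(d) for d in packet])
```

Writing two consecutive registers in one packet is why the POC can pass PGM_LO and PGM_HI together: the packet carries a start offset followed by one value per register.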
GRBM / TA counters
gem5 - amdgpu
Found this: http://doxygen.gem5.org/develop/structgem5_1_1GEM5__PACKED.html
PQ=PrimaryQueue?
GEM5: "The gem5 simulator is a modular platform for computer-system architecture research"
They seem to have some interesting stuff on the repo too: https://github.com/gem5/gem5/blob/stable/src/dev/amdgpu/pm4_packet_processor.cc
https://www.gem5.org/documentation/general_docs/gpu_models/vega
info about gfxhub
https://www.mail-archive.com/[email protected]/msg104391.html
+First you have the memory hub, gfxhub and mmhub. gfxhub is the
+memory hub used for graphics, compute, and sdma on some chips. mmhub
+is the memory hub used for multi-media and sdma on some chips.
Unable to attempt to run crash script (note: 6800 not 7900 XTX might be user error)
I was attempting to test this script to see if it would trigger an issue similar to the one I am having on my RX 6800. However, I get the following error when attempting to run it:
Traceback (most recent call last):
File "/home/noah/Documents/AI/7900xtx/crash/driver.py", line 43, in <module>
kio = ioctls_from_header()
^^^^^^^^^^^^^^^^^^^^
File "/home/noah/Documents/AI/7900xtx/crash/driver.py", line 41, in ioctls_from_header
fxns[name.replace("AMDKFD_IOC_", "").lower()] = functools.partial(kfd_ioctl, idirs[idir], int(nr, 0x10), getattr(kfd, "struct_"+sname))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'kfd' has no attribute 'struct_kfd_ioctl_criu_args'. Did you mean: 'struct_kfd_ioctl_svm_args'?
This seems to be an issue in the kfd.py script itself, and not related to me running an older card or anything like that. However, I don't understand the low-level code well enough to say for sure.
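The traceback shows getattr failing because the generated kfd module lacks one struct that the header names. One possible workaround, sketched below under assumptions, is to skip ioctls whose struct is missing instead of aborting. The entries list, fake_kfd, and fake_ioctl are stand-ins invented for this example so it runs without a GPU; only the getattr and functools.partial pattern comes from the traceback, and build_ioctl_table is a hypothetical helper, not a function in driver.py.

```python
import functools
import types


def build_ioctl_table(entries, structs_module, ioctl_fn):
    """Map ioctl names to callables, skipping any whose struct is missing.

    entries: iterable of (name, direction, nr_hex, struct_name) tuples.
    """
    table, skipped = {}, []
    for name, direction, nr_hex, sname in entries:
        struct = getattr(structs_module, "struct_" + sname, None)
        if struct is None:
            skipped.append(sname)  # report it rather than raise AttributeError
            continue
        key = name.replace("AMDKFD_IOC_", "").lower()
        table[key] = functools.partial(ioctl_fn, direction, int(nr_hex, 0x10), struct)
    return table, skipped


# Stand-ins for the generated kfd module and the real ioctl wrapper.
fake_kfd = types.SimpleNamespace(struct_kfd_ioctl_svm_args=object)


def fake_ioctl(direction, nr, struct, fd):
    return (direction, nr, struct, fd)


# Illustrative entries; the real ones are parsed from the kernel header.
entries = [
    ("AMDKFD_IOC_SVM", "_IOWR", "0x20", "kfd_ioctl_svm_args"),
    ("AMDKFD_IOC_CRIU_OP", "_IOWR", "0x22", "kfd_ioctl_criu_args"),
]
table, skipped = build_ioctl_table(entries, fake_kfd, fake_ioctl)
print(sorted(table), skipped)
```

Skipping only hides the mismatch. If the script later needs the missing ioctl, kfd.py would have to be regenerated against a header that matches the one being parsed.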
tools and links
Radeon GPU analyzer
https://gpuopen.com/rga/
Radeon GPU profiler
https://gpuopen.com/rgp/
General Kernel gpu docs
https://docs.kernel.org/gpu/index.html?highlight=amd
llvm backend
https://llvm.org/docs/AMDGPUUsage.html
python interface for smi
https://rocmdocs.amd.com/projects/amdsmi/en/latest/py-interface_readme_link.html
searched for sdma; the search (magnifying glass) on rocmdocs can sometimes find useful stuff
https://rocmdocs.amd.com/en/latest/conceptual/gpu-memory.html
good comment about MES Locks
https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h#L403-L448
there is apparently a Linux tool called SystemTap; this repo has some good call diagrams as SVG files, under both callgraph and rocm
https://github.com/jinsdb/amdgpu-systemtap-toolkit
some options to enable debugging for rocm
https://rocmdocs.amd.com/en/develop/how-to/system-debugging.html
MINER TERMS
A bitcoin miner is rewarded with a premium distribution at its own frequencies.
For values subject to high codes, the one-day (1) value of the machines that must be paid for is 24 dollars.
Each code is limited to a maximum of two miners, and to four for paired accounts.
When the reward pool is full, miners' withdrawals are made using a bitcoin wallet number.
If each pool is to be transferred to a high-processor code, an arrangement payment of 2,258.25 US dollars applies (one-time payment).
[Question] Any way to disable AMD PSP / Platform Security Processor ?
First, props to George for the work
I have security & safety concerns about AMD "Platform Security Processor" or PSP
Intel "Management Engine" (ME) is much the same: it runs its own OS ("Minix"), which loads before the user's OS and runs with privileges superior to the user's OS kernel
Both Intel ME & AMD PSP are closed source.
Intel ME has been exploited before by malware
https://www.eweek.com/security/newly-revealed-flaw-in-intel-processors-allows-undetectable-malware/
Additionally, "security by obscurity", or hiding code from the public, has been widely regarded as bad security practice.
Lastly I am a customer buying hardware, and I want control over what code is running on my home & work computer.
Thus, I would like to completely disable or remove AMD PSP as it appears as a security flaw to me and I do not trust the unknown & closed source code it is running.
If anyone knows how to do this, I would be very thankful if you shared it with us.
7900XT and rocm 6.1
Interesting notes (and yes, I know, userspace should never be able to crash the driver):
It pretty much reproduces on a 7900 XT (without the last X).
I don't get into that unrecoverable state, though (or maybe I have not run it for long enough).
ROCm v6.1 was released a few days ago (12th April); it behaves way better with Ubuntu 20.04 and the default 5.15 kernel.
Modified driver.py: indented the print, added a counter, added a 1-second sleep, and destroyed the queue.
print(nq, ii)
ii += 1
time.sleep(1)
qq = kio.destroy_queue(fd)
With ROCm 6.0.3 and kernel 5.15 it will shortly (before 100 loops) start to write these to dmesg:
[ 337.541867] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=14
and you notice that it starts to run a bit slower. Yes, it will run for a long time, over 1000 loops, though spewing the errors.
With ROCm 6.1 and kernel 5.15 it will run for over 1000 loops
and mostly just writes these warnings:
[drm] Skip scheduling IBs!
They have not just silenced the errors; it seems to actually work better, otherwise it would slow down like it did with 6.0.3.
bob@melee:~/dev/7900xtx/crash$ uname -a
Linux melee 5.15.0-105-generic #115~20.04.1-Ubuntu SMP Mon Apr 15 17:33:04 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Updated to kernel 6.5, and with both ROCm 6.0.3 and 6.1... eww, it spews errors quickly.
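To compare ROCm and kernel combinations more concretely than "spewing errors quickly", one option is to tally the two dmesg messages quoted above. This is a rough sketch, not something from the original test setup: the first sample line is the one from this report, the two "Skip scheduling IBs!" lines carry made-up timestamps, and tally_dmesg is a hypothetical helper name.

```python
import re
from collections import Counter

# Patterns taken from the dmesg lines quoted in this report.
MES_ERROR = re.compile(r"\*ERROR\* MES failed to response msg=(\d+)")
SKIP_IBS = "Skip scheduling IBs!"


def tally_dmesg(text):
    """Count MES errors (per msg id) and skipped-IB warnings in dmesg text."""
    counts = Counter()
    for line in text.splitlines():
        match = MES_ERROR.search(line)
        if match:
            counts["mes_error_msg_" + match.group(1)] += 1
        elif SKIP_IBS in line:
            counts["skip_scheduling_ibs"] += 1
    return counts


# Sample input; the timestamps on the last two lines are invented.
sample = """\
[  337.541867] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=14
[  338.100000] [drm] Skip scheduling IBs!
[  339.200000] [drm] Skip scheduling IBs!
"""
print(dict(tally_dmesg(sample)))
```

In practice the text would come from the output of dmesg captured after a fixed number of loops, so that the counts from different setups are comparable.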
how to load registry map into ghidra
George, stop using Google and just use Claude.ai / Opus.
"I wish I had a NSA engineer helping me..." Claude.ai! It's read every repo, every line of code on GitHub. Ask it about tinygrad, and push it with questions you would think it wouldn't understand; it knows everything.
https://gist.github.com/johndpope/add694bcc04f0df134aa9938c12f72ce
UPDATE
Just realized this is written for Jython.
This seems to fix that:
https://pypi.org/project/ghidra-bridge/
[PATCH] drm/amdgpu : Add mes_log_enable to control mes log feature