
Fan Setting/Reading Issues (gpu-utils, closed)

Ricks-Lab commented on August 28, 2024
Fan Setting/Reading Issues

from gpu-utils.

Comments (8)

Ricks-Lab commented on August 28, 2024

@csecht I just checked the math and it looks right. I did assume that all GPUs use the PWM range from 0 to 255. Can you verify this for your card? Determine the HWMON directory using amdgpu-ls, then cat the files pwm1_min and pwm1_max.
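For anyone who would rather script that check than cat the files by hand, here is a minimal sketch. It assumes the hwmon directory exposes pwm1_min and pwm1_max as plain integer text files; read_pwm_range is a hypothetical helper name, not part of gpu-utils.

```python
from pathlib import Path


def read_pwm_range(hwmon_dir):
    """Return (pwm_min, pwm_max) read from a hwmon directory.

    Assumes both files hold a single integer, as hwmon attributes normally do.
    """
    base = Path(hwmon_dir)
    pwm_min = int((base / "pwm1_min").read_text().strip())
    pwm_max = int((base / "pwm1_max").read_text().strip())
    return pwm_min, pwm_max
```

Call it with the HWMON path that amdgpu-ls reports for the card. A result of (0, 255) would confirm the assumed range.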

csecht commented on August 28, 2024

Ricks-Lab commented on August 28, 2024

Part of the issue could be that pac shows the actual fan speed instead of the setting. I will check if I can get the setting instead. Since there was no delay between writing the settings and the pac display update, the fans are still slowing down when the display refreshes, so if you hit refresh a bunch of times, you can see the setting change. I added a 500 ms wait after writing settings to minimize this effect. But even now, the actual fan speed can be very different from what you specify. I will continue to investigate this. The reset command just changes it to manual mode. The latest on master has the 500 ms delay and basic function for Radeon VII.
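The write-then-wait idea can be sketched roughly as follows. This is an illustration only, not the code in amdgpu-pac: write_pwm_then_read is a hypothetical name, and it assumes the fan value lives in a pwm1 file in the hwmon directory, that the card is already in manual fan mode, and that the caller has permission to write the file.

```python
import time
from pathlib import Path


def write_pwm_then_read(hwmon_dir, pwm, settle_s=0.5):
    """Write a PWM value, wait for the fan to settle, then read it back.

    settle_s defaults to the 500 ms delay mentioned in the thread.
    """
    pwm_file = Path(hwmon_dir) / "pwm1"
    pwm_file.write_text(f"{pwm}\n")
    time.sleep(settle_s)  # give the fan time to react before reading
    return int(pwm_file.read_text().strip())
```

If the returned value differs from the value written, the gap comes from the card or driver rather than from the display refreshing too early.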

csecht commented on August 28, 2024

I downloaded the latest master. I'm not sure there's any difference in how fans are set. I've done some more exploring into actual pmw values from amdgpu-ls, instead of just looking at % speed in the monitor. What I see is that with each PAC save the pmw decrements by a maximum of 8. As the value gets closer to a stable setting (equivalent to 0, 20, 40, 60, 80, 100%), the decrement of the pmw setting becomes smaller until it hits a stable value. I only tested two stable points: pmw=102=40% and pmw=153=60% (same for both cards). PAC 'Save' no longer decrements pmw once one of these stable settings is reached. The observed 3% decrements (sometimes 2%) are a rounding or binning issue; for example, I entered a PAC value of 48% and pressed Save, which then set these values:
pmw=122=45%, Save->
pmw=114=42%, Save->
pmw=107=40%, Save->
pmw=102=40%, Save->
pmw=102=40%, etc...
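The sequence above is consistent with a truncating percent-to-PWM conversion on a 0 to 255 scale, applied each time to the percentage read back after the previous Save. The sketch below only reproduces that arithmetic. Truncation is an assumption inferred from the numbers in this post, and percent_to_pwm and replay_saves are hypothetical names, not gpu-utils functions.

```python
PWM_MAX = 255  # assumed full-scale value, per the 0 to 255 assumption in this thread


def percent_to_pwm(percent, pwm_max=PWM_MAX):
    """Convert an integer percentage to a PWM value, truncating the fraction."""
    return percent * pwm_max // 100


def replay_saves(percent_values):
    """Map each entered or read-back percentage to the PWM value a Save would write."""
    return [percent_to_pwm(p) for p in percent_values]


# 48% entered, then the read-back percentages 45, 42 and 40 saved again in turn
print(replay_saves([48, 45, 42, 40]))  # [122, 114, 107, 102]
```

Under that assumption each Save writes what the display showed rather than what was originally requested, which would explain why the value keeps stepping down until the set and reported speeds agree.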

csecht commented on August 28, 2024

CORRECTION: First, I got all dyslexic and wrote pmw instead of pwm, sorry. The other error in my previous post is that I didn't get the pwm values from amdgpu-ls, but from the amdgpu-pac --execute_pac terminal window, which printed the shell script on execution.

Ricks-Lab commented on August 28, 2024

I have checked it out here and it looks like the pwm values written to the card are correctly converted from the percentage value entered into the interface, but the resultant fan speed is different from what is specified. I will continue to investigate and research other implementations like rocm-smi.

csecht commented on August 28, 2024

UPDATE: I tested fan settings and readings with rocm-smi, and it does what amdgpu-pac does: it reports the fan speed (level) lower than what is set. A slight difference is that rocm-smi shows a 5-unit decrement in the fan level (on a 0 to 255 scale), where amdgpu-pac and amdgpu-monitor show an 8-unit decrement. Both programs have the same stable points where the reported and set speeds agree (40% in the example below). I was running amdgpu-monitor during this run of rocm-smi commands and it showed the same integer values as rocm-smi for % fan speed (e.g. 47% for rocm-smi's 47.84%).

Trimmed terminal output from a sequence of rocm-smi commands:

$ ./rocm-smi -d 1 --setfan 50%
GPU[1] : Successfully set fan speed to Level 127

$ ./rocm-smi -d 1 --showfan
GPU[1] : Fan Level: 122 (47.84)%

$ ./rocm-smi -d 1 --setfan 122
GPU[1] : Successfully set fan speed to Level 122

$ ./rocm-smi -d 1 --showfan
GPU[1] : Fan Level: 117 (45.88)%

$ ./rocm-smi -d 1 --setfan 40%
GPU[1] : Successfully set fan speed to Level 102

$ ./rocm-smi -d 1 --showfan
GPU[1] : Fan Level: 102 (40.0)%
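To make the gap between set and reported levels easier to see, the output above can be tabulated with a few lines of Python. This is a sketch that reuses only the numbers printed by rocm-smi in this post. level_to_percent is a hypothetical helper and assumes a maximum level of 255.

```python
def level_to_percent(level, max_level=255):
    """Convert a fan level to a percentage, rounded to two decimals."""
    return round(level * 100 / max_level, 2)


# (level set, level reported) pairs copied from the rocm-smi output above
pairs = [(127, 122), (122, 117), (102, 102)]

for set_level, read_level in pairs:
    drop = set_level - read_level
    print(f"set {set_level} -> read {read_level} "
          f"({level_to_percent(read_level)}%), drop {drop}")
```

The first two pairs show the 5-unit drop, and the 40% stable point shows no drop at all.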

Ricks-Lab commented on August 28, 2024

This seems to be a driver limitation. We have covered the anomaly in the user guide.

Will revisit if behavior changes in newer driver or kernel releases.
