Monday, April 05, 2021

Power Limit as the Final Solution to Eileen-II's Heating Issues?

It's the first Monday of April, immediately after Easter Sunday.

Yesterday, I spent a little bit of time tweaking my Python 3 script that brute-force resamples images until their JPEG file size is in the ballpark of a target. The script is useful for adjusting my One Punch Man CBR collection for easy reading. It reduced the CBR from about 2.1 GB to about 1.36 GB, so I would say that it was very successful.
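
For the curious, here is a minimal sketch of the brute-force idea. It is not the actual script: `fit_to_target`, `make_pillow_encoder`, the 5% step, and the JPEG quality of 85 are all illustrative assumptions, and the Pillow part assumes that library is installed.

```python
import io


def fit_to_target(encode, target_bytes, step_percent=5, min_percent=10):
    """Shrink the scale until encode(scale) fits within target_bytes.

    encode is a hypothetical callback taking a scale factor (0..1] and
    returning the encoded image as bytes. Returns (scale, data) for the
    first scale that fits, or for the smallest scale tried if none fits.
    """
    data = b""
    percent = 100
    # Brute force: walk down from 100% in fixed steps, re-encoding each time.
    for percent in range(100, min_percent - 1, -step_percent):
        data = encode(percent / 100)
        if len(data) <= target_bytes:
            break
    return percent / 100, data


def make_pillow_encoder(path, quality=85):
    """Build an encode(scale) callback using Pillow (imported lazily)."""
    from PIL import Image  # third-party; assumed to be installed

    source = Image.open(path).convert("RGB")

    def encode(scale):
        width, height = source.size
        size = (max(1, int(width * scale)), max(1, int(height * scale)))
        buffer = io.BytesIO()
        resized = source.resize(size, Image.LANCZOS)
        resized.save(buffer, format="JPEG", quality=quality)
        return buffer.getvalue()

    return encode
```

Something like `fit_to_target(make_pillow_encoder("page.jpg"), 300_000)` would then return the largest tried scale whose JPEG fits in roughly 300 kB; the file name and the target are made up for the example.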

I'm up to page 579/872 for Tools of Titans, and read The Laws of Thermodynamics: A Very Short Introduction by Peter Atkins as well.

I have also started on Junji Itō's The Junji Itō Horror Comic Collection. My first exposure to Itō's work was The Enigma of Amigara Fault. The story premise is fairly clean, without over-explanation. That made me curious about Itō's other works, hence this look at the Comic Collection.

I actually have a little trepidation about reading his works, considering that The Enigma of Amigara Fault is considered mild. I'm not creeped out enough from The Enigma of Amigara Fault to have nightmares, but I am not so sure about the other works.

------

In the ever-evolving saga of keeping Eileen-II from melting my fingers off, I did more tweaking.

I was running Cyberpunk 2077 on the full-power setting, using an external keyboard to avoid having to deal with the much heightened temperatures, and it was alright for quite a bit. Sure, the temperatures were consistently above 95 °C, but the CPU was still thermal throttling at 100 °C, so it was still sort of working as designed.

That is, until the ``Core x Critical Temperature'' flags for all the CPU cores were tripped in HWiNFO64 and the ``CPU Package'' temperature maxed out at 107 °C.

That raised too many alarm bells in my head. Tripping the ``Core x Critical Temperature'' flag is a big no-go: it is the point where really bad things can happen. I mean, getting thermal throttled is bad, but at least it is still within the operating temperatures---tripping the critical temperature flag means the operating temperatures have been exceeded.

I did some more light experiments to study the problem. The fundamental issue is that the power rating of the cooling system (i.e. the fans, the heat pipes/sinks, and whatever the ambient temperature/humidity allows) is much too low compared to the thermal power that the CPU dissipates.

Now, I could, in theory, rip it out and re-paste the thermal compound to ensure that the thermal energy is more effectively transferred to the heat pipes/sinks, but I don't think that is as major an issue as the fact that a non-air-conditioned room in a hot and humid environment is an absolutely rubbish place to dump excess thermal energy into: it simply cannot take the heat away fast enough.

So, the temperature limits being busted is an indication of an inadequate dissipation power rating, or in layman's terms, the CPU is generating thermal energy faster than the cooling system can remove it.
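
To put that balance into symbols, here is a rough sketch using a single lumped thermal resistance. Everything in it is an assumption for illustration: the function names, the 30 °C ambient, and especially the 1.5 °C/W thermal resistance, which is a made-up figure and not a measurement of Eileen-II.

```python
def steady_state_temp_c(power_w, ambient_c, r_th_c_per_w):
    """Temperature at which heat removed equals heat generated.

    Lumped model: T = T_ambient + P * R_th, where R_th (in degrees C per
    watt) stands in for the whole cooling path, room included.
    """
    return ambient_c + power_w * r_th_c_per_w


def max_sustained_power_w(limit_c, ambient_c, r_th_c_per_w):
    """Largest constant power that settles at or below limit_c."""
    return (limit_c - ambient_c) / r_th_c_per_w


# Illustrative numbers only (assumed, not measured):
AMBIENT_C = 30.0       # hypothetical warm room
R_TH_C_PER_W = 1.5     # hypothetical thermal resistance of the cooling path

sustainable_w = max_sustained_power_w(100.0, AMBIENT_C, R_TH_C_PER_W)
```

The point of the model is only the shape of the relationship: a hotter room or a worse cooling path (larger `R_TH_C_PER_W`) lowers the power that can be sustained below a given temperature, which is why capping the power is the lever that remains.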

I cannot rely on thermal throttling because its lag time is clearly too long, given that the critical temperature flags were tripped.

So it was time to get back to basics, and figure out something else altogether.

Under the standard profile (i.e. profile 1 or ``performance'') in ThrottleStop, I set the voltage offset to −40.0 mV---this has been established to be stable for Eileen-II.

My new approach is to consider the use of power limiting instead of thermal throttling.

To understand power limiting, I needed to understand the PL1, PL2, and τ settings. This chart explains their relationship, and I am replicating it here for reference:

Put simply, PL2 is the peak power we allow Turbo Boost to reach, τ is the time (in seconds) we allow the CPU to run at PL2 power, and PL1 is the power level that the CPU is throttled back to afterwards (via messing with the Core x ratios).
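
The interplay of the three settings can be sketched as a toy simulation. As far as I understand Intel's description, τ acts as the time constant of a running (exponentially weighted) average of package power rather than a hard stopwatch, so that is what the sketch assumes. The function name, the one-second time step, and the idle starting point are all my own simplifications, not how the firmware actually does it.

```python
def simulate_power_limit(pl1_w, pl2_w, tau_s, demand_w, duration_s, dt_s=1.0):
    """Return the package power (W) granted at each time step.

    Toy model: the CPU may draw up to PL2 while the running average of its
    power stays below PL1; once the average reaches PL1 it is held to PL1.
    """
    avg_w = 0.0  # running average power, starting from an idle package
    granted = []
    for _ in range(int(duration_s / dt_s)):
        cap_w = pl2_w if avg_w < pl1_w else pl1_w
        power_w = min(demand_w, cap_w)
        granted.append(power_w)
        # Exponentially weighted moving average with time constant tau.
        avg_w += (power_w - avg_w) * dt_s / tau_s
    return granted


# A sustained 107 W load with PL1 at the 45 W spec-sheet TDP: the burst at
# PL2 ends once the running average catches up with PL1.
limited = simulate_power_limit(45, 107, 56, demand_w=107, duration_s=300)

# The same load with PL1 equal to PL2: the cap never drops at all.
unlimited = simulate_power_limit(107, 107, 56, demand_w=107, duration_s=300)
```

In this model `limited` starts at 107 W and falls back to 45 W after roughly half a minute, while `unlimited` stays at 107 W for the whole run, which is the situation where only thermal throttling is left to rein things in.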

Recall that Eileen-II sports an Intel Core i7-10750H processor. The spec-sheet Thermal Design Power (TDP) ranges from 35 W to 45 W, depending on which of the two published numbers one wants to look at.

With ThrottleStop, I found that the PL1 and PL2 were set to 107 W, with τ being 56 s.

No wonder everything was frying under heavy load. Urgh.

Anyway, long story short, I tried to adjust the three parameters. After experimenting, I decided on just tweaking PL1 to a level that keeps me somewhere in the 90--99 °C range when playing Cyberpunk 2077, leaving PL2 and τ alone. During a burst, I am relying on thermal throttling to kick in first, with power limiting then being forced just to prevent things from getting out of hand.

The rationale is that if I have a heavy (but short) duration load, perhaps it is better to just allow a quick burst of speed (and thus a large power spike) to complete it. I just think that the sustained temperature is a bigger problem than spikes, unless the said spike exceeds the critical temperature.

Oh, and I set the Alienware Command Center setting to ``Balanced''. It did end up shaving off 10--20 fps in Cyberpunk 2077 (i.e. from 55+ fps to around 35--47 fps), but the plus side was that I didn't have to use an external keyboard, and that I was not running a helicopter when I was gaming.

It is at times like this that I wonder if it would have been a better idea to build another PC instead. But then I do get more mobility with this setup than with a desktop PC, so I really shouldn't complain too much.

------

I think that the Old Testament does not get enough love. Many people like to talk about Mosaic Law, or the first five books of the Old Testament, and many cite a lot of things from there. But there is just so much more about the ancient Israelites that is covered in the other books of the Old Testament that does not seem to get much notice, and to me, those books expound upon the ramifications of disobeying the Mosaic Law much better than anything else. I just think that it is a shame that the vocal ones who decide to quote from Scripture don't seem to realise that there are 66 books in all, and not everything is just from the first 5---if the first 5 alone were sufficient, then the remaining 61 would be completely unnecessary.

Anyway, that's all I have for today. Till the next update then.
