    Overclocking the Pi3b+ GPU (Results)

    General Discussion and Gaming
    Tags: pi3 b+, overclock, gpu
    133 Posts 18 Posters 39.8k Views
      dankcushions Global Moderator @mitu
      last edited by

      @mitu i didn't but that is a better idea! :)

        hhromic @dankcushions
        last edited by hhromic

        @dankcushions @mitu @BuZz while setting the governor to performance by default in runcommand is a tempting idea, I would advocate against it. Doing so could make RPis overheat without users being aware of it. In my experience at least, I have never needed to set the governor to performance, and things run nicely for me.
        I think the best approach here is to educate users about the CPU governor rather than force a potentially troublesome option on them without their knowledge.

          BuZz administrators @mitu
          last edited by

          @mitu I understood. I don't want to default to switching the governor on launch.

          To help us help you - please make sure you read the sticky topics before posting - https://retropie.org.uk/forum/topic/3/read-this-first

            quicksilver
            last edited by

            Setting the governor to performance is a small change that really shouldn't make a significant difference to wear and tear on a Pi, but I agree with BuZz that it should be up to the user to make that decision. I think there is some documentation on the CPU governor modes in the RetroPie docs, but maybe we can clarify that performance mode does help a few games/systems run smoother (perhaps it already says this; I need to review). I would be happy to make any edits to the wiki if people feel that it is pertinent info.

              mitu Global Moderator @quicksilver
              last edited by

              @quicksilver There is a note in the Wiki on https://retropie.org.uk/docs/Speed-Issues/ and also on the N64 page, but maybe we can also add it to the Overclock or Advanced configuration pages.

                dankcushions Global Moderator @quicksilver
                last edited by

                it would be useful to see some specific examples (games, benchmarks, etc) as i don't really get why ondemand (which i think is the default) would be slower than performance.

                since ondemand ramps up the speed with load, i would have thought there should be no difference in runtime cpu frequency between the governors in cpu-heavy applications. they should both be running the cpu at full speed in a cpu-limited emulator, right?

                  shavecat
                  last edited by shavecat

                  I'm always tinkering with the overclock, and this is what I get (Crazy Taxi 2 runs really nicely).
                  I have just now set core_freq=600,
                  so I'll give it a few days to see if it's stable.
                  d3bef088-453f-4493-a292-a61d1d3df926-image.png

                    hhromic
                    last edited by hhromic

                    Something I don't see mentioned very often, and which I think is important to keep in mind, is that the RPis have a "warranty bit" that is burned when you overvolt too aggressively. This way, RMA or support staff can tell whether a user broke the RPi through misuse or the device was faulty from the factory.

                    over_voltage
                    (...) Values above 6 are only allowed when force_turbo is specified: this sets the warranty bit if over_voltage_* is also set.

                    force_turbo
                    (...) Enabling this may set the warranty bit if over_voltage_* is also set.

                    never_over_voltage
                    Sets a bit in the OTP memory (one time programmable) that prevents the device from being overvoltaged. This is intended to lock the device down so the warranty bit cannot be set either inadvertently or maliciously by using an invalid overvoltage.

                    Ref: https://www.raspberrypi.org/documentation/configuration/config-txt/overclocking.md
                    Ref: https://www.raspberrypi.org/forums/viewtopic.php?p=176865#p176865
                    Ref: https://www.raspberrypi.org/blog/introducing-turbo-mode-up-to-50-more-performance-for-free/
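For anyone wanting to check whether the warranty bit has already been set on a board, here is a minimal sketch. It assumes the new-style revision code reported in the `Revision` field of /proc/cpuinfo, where (as far as I understand the Raspberry Pi revision-code layout) bit 25 is the warranty bit. The function names are made up for illustration, and old-style revision codes are not handled.

```python
# Sketch: test the warranty bit in a Raspberry Pi revision code.
# Assumption: new-style revision codes, where bit 25 is the warranty bit.
WARRANTY_BIT = 1 << 25


def warranty_bit_set(revision_hex: str) -> bool:
    """Return True if bit 25 of the hexadecimal revision code is set."""
    return bool(int(revision_hex, 16) & WARRANTY_BIT)


def revision_from_cpuinfo(text: str):
    """Extract the 'Revision' field from /proc/cpuinfo contents, or None."""
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Revision":
            return value.strip()
    return None
```

On a Pi, something like `warranty_bit_set(revision_from_cpuinfo(open("/proc/cpuinfo").read()))` would report the state. A 3B+ normally shows `a020d3`; the same board with the bit set would show `2a020d3`.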

                      quicksilver @hhromic
                      last edited by

                      @hhromic the interesting thing is that the current model Pis can't overvolt any higher than a value of 4. At over_voltage=4 the core voltage equals 1.394v and it will not increase any higher than that. Values of 6-8 still only give 1.394v (as @Rascas noted earlier). I think the stock core voltage is set higher on current model Pis. I am sure that force_turbo would set the warranty bit, but does over_voltage=4 now also set it? The official RPi documents are vague about this.

                        Parabolaralus @Brunnis
                        last edited by

                        @Brunnis I really don't know if it would actually apply that setting, and I found no tangible benefit to clocking it that high, but I do know that if I set it even to 735 it would eventually freeze on me, usually within 20-30 minutes of gameplay, so 735 is not stable.

                        My testing method with 733, as well as other settings, involved starting a PS1 game (hence the moderate load mentioned earlier) and leaving it for a few days. I normally do not play video games during the week, so I could leave it running without it being a pain in the butt. Three days later I'd find the game still running and call it stable.

                        My other Pi (3B+), as mentioned before, would flat out freeze on me with so much as pushing the RAM 10 MHz higher. I guess it's the silicon lottery: one is a slight score while the other is a dud for OCing.

                        On the governor thing: I never really thought about changing that, TBH... I think performance would work in its place, but I noticed absolutely no slowdowns using force_turbo and kind of left it at that.
                        Does anyone have any way to see if it is actually downclocking while running a core/ROM with ondemand, or is this something that's more user experience?

                          mitu Global Moderator @Parabolaralus
                          last edited by

                          @Parabolaralus said in Overclocking the Pi3b+ GPU (Results):

                          On the governor thing: I never really thought about changing that, TBH... I think performance would work in its place, but I noticed absolutely no slowdowns using force_turbo and kind of left it at that.

                          Using force_turbo overrides the CPU governor and runs the cores at max frequency, so you're already using a 'performance' profile:

                          By default (force_turbo=0) the "On Demand" CPU frequency driver will raise clocks to their maximum frequencies when the ARM cores are busy and will lower them to the minimum frequencies when the ARM cores are idle.

                          force_turbo=1 overrides this behaviour and forces maximum frequencies even when the ARM cores are not busy.
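To make that rule concrete, here is a rough sketch that decides whether a given setup keeps the clocks pinned high. The helper names are hypothetical, and the parser is deliberately simplified: it ignores config.txt conditional sections such as `[pi3]` and assumes the last occurrence of a key wins.

```python
def config_value(config_text: str, key: str, default: str = "0") -> str:
    """Return the last uncommented `key=value` setting found in config.txt text."""
    value = default
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        name, _, val = line.partition("=")
        if name.strip() == key:
            value = val.strip()
    return value


def runs_at_max_clock(config_text: str, governor: str) -> bool:
    """True if force_turbo=1 is set or the governor is 'performance'."""
    forced = config_value(config_text, "force_turbo") == "1"
    return forced or governor.strip() == "performance"
```

On a running system the two inputs would typically come from /boot/config.txt and /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor.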

                            Brunnis @dankcushions
                            last edited by Brunnis

                            @dankcushions said in Overclocking the Pi3b+ GPU (Results):

                            it would be useful to see some specific examples (games, benchmarks, etc) as i don't really get why ondemand (which i think is the default) would be slower than performance.

                            since ondemand ramps up the speed with load, i would have thought there should be no difference in runtime cpu frequency between the governors in cpu-heavy applications. they should both be running the cpu at full speed in a cpu-limited emulator, right?

                            It might mainly be a problem if you decrease buffering, for example by setting video_max_swapchain_images=2. I believe the issue is caused by the ondemand CPU governor not being able to handle the spiky CPU load. The CPU will emulate one frame and then push it to the GPU. While the GPU waits for a frame flip, the CPU will more or less idle, before kicking off emulation of the next frame. My guess is that the governor spins down the CPU and loses too much time when spinning it back up again during the next frame.

                            I would say using the performance governor as the default for runcommand would be safe. The user should expect (and want) the CPU to be in the high performance state anyway when running an emulator (and have the necessary cooling in place). The fact that the CPU may not always hit or stay at max frequency is the actual unexpected part here.

                            EDIT: On second thought, I guess the reduced buffering just makes the issue more likely to crop up. The unwanted CPU frequency reduction probably happens all the time at default settings as well, it’s just that there’s an additional frame buffered that will mostly cover the performance drop and prevent frame rate hitches.

                            @quicksilver said in Overclocking the Pi3b+ GPU (Results):

                            @hhromic the interesting thing is that the current model Pis can't overvolt any higher than a value of 4. At over_voltage=4 the core voltage equals 1.394v and it will not increase any higher than that. Values of 6-8 still only give 1.394v (as @Rascas noted earlier). I think the stock core voltage is set higher on current model Pis. I am sure that force_turbo would set the warranty bit, but does over_voltage=4 now also set it? The official RPi documents are vague about this.

                            My Pi 3 B+ actually hits 1.39V already at over_voltage=1. That’s actually pretty high on 40nm, so I wouldn’t want to push it more. I guess the A53 really isn’t made to cope with high frequencies... 1.5GHz on 40nm and 1.4V is pretty abysmal.

                              BuZz administrators
                              last edited by

                              Most likely safe, but I prefer end users to make decisions like this. Also, not everything would benefit - for some things that get launched you may want the clock to drop when not in use. Frotz? Kodi? No doubt things will run hotter and consume more power. And in many cases it would be a waste. What about handhelds?

                              It's not going to be changed :-)

                                hhromic @Brunnis
                                last edited by hhromic

                                @Brunnis the ondemand governor is not that primitive when it comes to switching speeds.
                                While it is true that the CPU idles more with these emulators that use the GPU, that idle time is very short and the governor won't be micro-switching the frequency that fast. In particular, there is a setting for how often the governor monitors the load to make adjustments:

                                * sampling_rate:
                                
                                  Measured in uS (10^-6 seconds), this is how often you want the kernel
                                  to look at the CPU usage and to make decisions on what to do about the
                                  frequency.  Typically this is set to values of around '10000' or more.
                                  Its default value is (cmp. with users-guide.txt): transition_latency
                                  * 1000.  Be aware that transition latency is in ns and sampling_rate
                                  is in us, so you get the same sysfs value by default.  Sampling rate
                                  should always get adjusted considering the transition latency.  To set
                                  the sampling rate 750 times as high as the transition latency in the
                                  bash (as said, 1000 is default), do:
                                
                                  $ echo $(($(cat cpuinfo_transition_latency) * 750 / 1000)) > ondemand/sampling_rate
                                

                                And there are many other settings to control the rather advanced ondemand governor :)

                                Ref: https://www.kernel.org/doc/Documentation/cpu-freq/governors.txt
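The unit conversion in that excerpt is easy to trip over, so here is the same arithmetic as a small sketch. The function name and the 10000 ns latency in the usage note are made-up illustration values, not numbers read from a Pi.

```python
def sampling_rate_us(transition_latency_ns: int, multiplier: int = 1000) -> int:
    """sampling_rate in microseconds for a given latency multiplier.

    Mirrors the shell arithmetic latency * multiplier / 1000, using
    integer division as the shell does.
    """
    return transition_latency_ns * multiplier // 1000


def samples_per_frame(rate_us: int, refresh_hz: float = 60.0) -> float:
    """How many governor samples fit into one video frame (assumes 60 Hz)."""
    frame_us = 1_000_000 / refresh_hz
    return frame_us / rate_us
```

With a hypothetical latency of 10000 ns, the default multiplier gives a sampling_rate of 10000 us and a multiplier of 750 gives 7500 us. At 10000 us the governor would look at the load fewer than two times per 60 Hz frame.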

                                  Brunnis @BuZz
                                  last edited by

                                  @BuZz said in Overclocking the Pi3b+ GPU (Results):

                                  Most likely safe, but I prefer end users to make decisions like this. Also, not everything would benefit - for some things that get launched you may want the clock to drop when not in use. Frotz? Kodi? No doubt things will run hotter and consume more power. And in many cases it would be a waste. What about handhelds?

                                  It's not going to be changed :-)

                                  Yep, for Kodi and the like it's definitely not a good idea to use the performance governor. For handhelds, the performance governor would still be the way to go for emulation, since it's still the predictable and stable mode. Any battery life issues should be handled by adjusting frequencies instead.

                                  I definitely understand your stance, though. Development is full of compromises and I'm happy with just changing the setting via the runcommand menu.

                                    Brunnis @hhromic
                                    last edited by Brunnis

                                    @hhromic said in Overclocking the Pi3b+ GPU (Results):

                                    @Brunnis the ondemand governor is not that primitive when it comes to switching speeds.
                                    While it is true that the CPU idles more with these emulators that use the GPU, that idle time is very short and the governor won't be micro-switching the frequency that fast. In particular, there is a setting for how often the governor monitors the load to make adjustments:

                                    * sampling_rate:
                                    
                                      Measured in uS (10^-6 seconds), this is how often you want the kernel
                                      to look at the CPU usage and to make decisions on what to do about the
                                      frequency.  Typically this is set to values of around '10000' or more.
                                      Its default value is (cmp. with users-guide.txt): transition_latency
                                      * 1000.  Be aware that transition latency is in ns and sampling_rate
                                      is in us, so you get the same sysfs value by default.  Sampling rate
                                      should always get adjusted considering the transition latency.  To set
                                      the sampling rate 750 times as high as the transition latency in the
                                      bash (as said, 1000 is default), do:
                                    
                                      $ echo $(($(cat cpuinfo_transition_latency) * 750 / 1000)) > ondemand/sampling_rate
                                    

                                    And there are many other settings to control the rather advanced ondemand governor :)

                                    Ref: https://www.kernel.org/doc/Documentation/cpu-freq/governors.txt

                                    Thanks for the info. If it’s not the idle time that’s causing the issue, it’s something about the code itself being executed that fools the governor into thinking it’s okay to downclock. I believe I once saw it mentioned that the code used to emulate SuperFX games on the SNES could cause this issue. No idea if there’s any truth to it, though. I believe I’ve only seen the issue in SuperFX games, but at the same time they’re usually the most demanding ones I emulate, so they’d naturally be the first ones to have issues if margins are slim.

                                    EDIT:

                                    This section is pretty interesting (from the link you posted):

                                    * up_threshold:
                                    
                                      This defines what the average CPU usage between the samplings of
                                      'sampling_rate' needs to be for the kernel to make a decision on
                                      whether it should increase the frequency.  For example when it is set
                                      to its default value of '95' it means that between the checking
                                      intervals the CPU needs to be on average more than 95% in use to then
                                      decide that the CPU frequency needs to be increased.
                                    

                                    In my own tests, I've noticed that the Pi 3 may need as much as 5 ms between pushing a 1080p frame to the GPU and the frame flip occurring. In that time, unless there are additional frame buffers to render to, the CPU is mostly idle. That gives an average CPU usage of ~70%, which would not be enough to stay at the highest frequency. EDIT: Actually, the above talks about what's needed to initially increase clocks... There's also this:

                                    * sampling_down_factor:
                                    
                                      This parameter controls the rate at which the kernel makes a decision
                                      on when to decrease the frequency while running at top speed. When set
                                      to 1 (the default) decisions to reevaluate load are made at the same
                                      interval regardless of current clock speed. But when set to greater
                                      than 1 (e.g. 100) it acts as a multiplier for the scheduling interval
                                      for reevaluating load when the CPU is at its top speed due to high
                                      load. This improves performance by reducing the overhead of load
                                      evaluation and helping the CPU stay at its top speed when truly busy,
                                      rather than shifting back and forth in speed. This tunable has no
                                      effect on behavior at lower speeds/lower CPU loads.
                                    

                                    It's not completely clear, but I'm guessing that, by default, decisions about down clocking are made using the same sampling period and load evaluation as when increasing the clocks. Anyone know for sure?
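The ~70% estimate above can be reproduced with a few lines. This sketch assumes a 60 Hz display (the post does not state the refresh rate) and uses the up_threshold default of 95 from the quoted kernel documentation. The function names are only for illustration.

```python
def cpu_busy_percent(idle_ms: float, refresh_hz: float = 60.0) -> float:
    """Average CPU load if the CPU idles for idle_ms out of every frame."""
    frame_ms = 1000.0 / refresh_hz  # about 16.67 ms at 60 Hz
    return 100.0 * (frame_ms - idle_ms) / frame_ms


def exceeds_up_threshold(busy_percent: float, up_threshold: float = 95.0) -> bool:
    """True if the average load is high enough for ondemand to raise the clock."""
    return busy_percent > up_threshold
```

With 5 ms of idle time per frame this gives roughly 70% load, well short of the 95% threshold. It would take less than about 0.83 ms of idle time per frame to average above 95%.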

                                      Brunnis
                                      last edited by

                                      24h test complete. No issues found (Quake 3 + memtester 512 + sysbench (2 threads)) at the following settings:

                                      arm_freq=1475
                                      core_freq=600
                                      v3d_freq=400
                                      sdram_freq=550
                                      over_voltage=1
                                      temp_soft_limit=70
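When running a long stability test with settings like these, it can help to check afterwards whether the firmware throttled the board at any point. The sketch below decodes the value printed by `vcgencmd get_throttled` (for example `throttled=0x0`). The bit meanings are as I understand them from the Raspberry Pi documentation, so treat the table as an assumption and verify it against your firmware. The function name is hypothetical.

```python
# Assumed bit layout of the get_throttled value (verify against current docs).
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output: str) -> list:
    """Turn 'throttled=0x...' into a list of human-readable flags."""
    value = int(output.strip().split("=", 1)[-1], 16)
    return [text for bit, text in THROTTLE_FLAGS.items() if value & (1 << bit)]
```

An empty list after a 24h run would suggest the soft limit was never reached during the test.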
                                      
                                        robertvb83 @Brunnis
                                        last edited by robertvb83

                                        @Brunnis said in Overclocking the Pi3b+ GPU (Results):

                                        24h test complete. No issues found (Quake 3 + memtester 512 + sysbench (2 threads)) at the following settings:

                                        arm_freq=1475
                                        core_freq=600
                                        v3d_freq=400
                                        sdram_freq=550
                                        over_voltage=1
                                        temp_soft_limit=70
                                        

                                        I have played a little with overclocking on the 3B+... I have no issues with temperature, and games play well without any trouble.

                                        However, I have issues with compiling (updating from source) on an overclocked 3B+. Updating mame2003-plus almost always freezes or stops with errors... Any idea about that? (I don't know my exact settings, but I had this issue with many settings found around the internet, even for moderate overclocking.)

                                        Edit: I also did a sysbench stress test without issues

                                        My full size arcade cabinet Robotron vs. Octolyzer

                                          mitu Global Moderator @robertvb83
                                          last edited by

                                          @robertvb83 said in Overclocking the Pi3b+ GPU (Results):

                                          updating mame2003-plus almost always freezes or stops with errors..

                                          What kind of errors? If they're memory-related errors (not enough memory), then you can increase the amount of swap added during compilation to get over those issues. Do you get the same kind of errors without overclocking?

                                            Brunnis
                                            last edited by Brunnis

                                            @BuZz @hhromic To expand on the discussion regarding CPU governor and lower than expected performance: I've been watching the output of the 'top' command now, while running some SNES loads and below are some results. "Tweaked video settings" below means:

                                            video_driver="dispmanx"
                                            video_threaded="false"
                                            video_max_swapchain_images=2
                                            

                                            7ac80baa-c347-4733-aa33-3b8f95e31dd3-image.png

                                            For the ondemand CPU governor tests above, I also ran a script that read actual CPU frequency every second. Turns out the ondemand CPU governor leads to frequent downclocking (to 600 MHz) in all test cases (whether running Super Mario World or Super Mario World 2 and whether using default or tweaked video settings). Here are the printouts:

                                            Test 2: Governor ondemand (tweaked video settings) - SMW
                                            Test 2: Governor ondemand (tweaked video settings) - SMW2
                                            Test 4: Governor ondemand (default video settings) - SMW
                                            Test 4: Governor ondemand (default video settings) - SMW2

                                            So, to conclude, it doesn't look like the ondemand CPU governor handles this well. The constant ping-ponging of the CPU frequency (even with default RetroPie settings) is hardly optimal and may lead to performance issues in some cases. For most situations, the additional frame buffering used on a default installation seems to mask the impact of the reduced CPU frequency. Removing that buffering (i.e. using video_max_swapchain_images=2) reveals the issue in an obvious way, with stuttering performance in demanding situations (such as SMW2).
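For anyone who wants to repeat this kind of measurement, here is a rough sketch of a frequency logger. This is not the script used for the results above, just one way to do it. It assumes the standard cpufreq sysfs file scaling_cur_freq (values in kHz), and the function names are made up.

```python
import time

CUR_FREQ = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq"


def read_khz(path: str = CUR_FREQ) -> int:
    """Read the current CPU frequency in kHz from sysfs."""
    with open(path) as f:
        return int(f.read().strip())


def sample(count: int, interval_s: float = 1.0, path: str = CUR_FREQ) -> list:
    """Collect `count` frequency readings, `interval_s` seconds apart."""
    readings = []
    for i in range(count):
        readings.append(read_khz(path))
        if i < count - 1:
            time.sleep(interval_s)
    return readings


def downclock_share(readings_khz: list) -> float:
    """Fraction of readings that were below the highest frequency seen."""
    if not readings_khz:
        return 0.0
    top = max(readings_khz)
    return sum(1 for r in readings_khz if r < top) / len(readings_khz)
```

Running `downclock_share(sample(60))` over SSH while a game is playing would give the share of one-second samples spent below the top clock. Sampling once per second can miss shorter dips, so a low number does not prove the clock never dropped.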
