RetroPie forum home

Overclocking the Pi3b+ GPU (Results)

General Discussion and Gaming
Tags: pi3 b+, overclock, gpu
133 Posts 18 Posters 39.5k Views
  • H
    hhromic @quicksilver
    last edited by 8 Feb 2019, 16:15

    @quicksilver said in Overclocking the Pi3b+ GPU (Results):

    So force turbo or any amount of over voltage applied will set the warranty bit.

    No, it is force_turbo=1 and over_voltage_* > 0. If you don't use force_turbo, you are fine.

    The specific circumstances where the overclock bit is set are if force_turbo is set to 1 and any of the over_voltage_* options are set to a value > 0.
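As a rough illustration of the "AND" in that rule, the sketch below scans a config.txt-style file and succeeds only when both conditions hold. The function name `sets_warranty_bit` and the default path `/boot/config.txt` are assumptions of this example, and the parsing is simplified: it ignores `[filter]` sections and included files.

```shell
# Hypothetical helper: succeeds only when force_turbo=1 AND at least one
# over_voltage* option is greater than 0 in the given config file.
sets_warranty_bit() {
    local config="${1:-/boot/config.txt}"
    local turbo=0
    local overvolt=0
    local line value
    while IFS= read -r line || [ -n "$line" ]; do
        line="${line%%#*}"                                  # drop comments
        line="$(printf '%s' "$line" | tr -d '[:space:]')"   # drop whitespace
        case "$line" in
            force_turbo=1)
                turbo=1 ;;
            over_voltage*=*)
                value="${line#*=}"
                case "$value" in
                    ''|*[!0-9]*) ;;   # negative or non-numeric: not an overvolt
                    *) [ "$value" -gt 0 ] && overvolt=1 ;;
                esac ;;
        esac
    done < "$config"
    [ "$turbo" -eq 1 ] && [ "$overvolt" -eq 1 ]
}
```

For example, `sets_warranty_bit /boot/config.txt && echo "both conditions met"` would print the message only when both settings are present.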

    Q 1 Reply Last reply 8 Feb 2019, 16:17 Reply Quote 0
    • Q
      quicksilver @hhromic
      last edited by 8 Feb 2019, 16:17

      @hhromic Ah thank you for the clarification! I completely missed the "AND".

      1 Reply Last reply Reply Quote 1
      • H
        hhromic
        last edited by 8 Feb 2019, 16:22

        Interesting discussions and investigations guys!
        I still advocate leaving ondemand as the system default and educating users on how overclocking works and how to use the governor runcommand option, as it's the soundest/safest approach. This topic should definitely be used to update/populate the Wiki entry on the topic.

        The only improvement I would consider in this subject would be to implement a per-command governor setting in runcommand, similar to how video modes are set currently. For example create a governors.cfg file alongside videomodes.cfg.

        This would give the flexibility for the governor to be configured per-emulator as necessary, e.g. performance for lr-mupen64plus and default for lr-gambatte, or any other customisation.
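To make the proposal concrete, here is a minimal sketch of how such a per-emulator lookup could work. Everything in it is hypothetical: `governors.cfg` does not exist in RetroPie, the `emulator = governor` line format is assumed, and `governor_for` is a name invented for this example.

```shell
# Hypothetical lookup: print the governor configured for an emulator in a
# governors.cfg-style file, or "default" when the emulator has no entry.
governor_for() {
    local emulator="$1"
    local cfg="$2"
    local key value
    while IFS='=' read -r key value || [ -n "$key" ]; do
        key="$(printf '%s' "$key" | tr -d '[:space:]')"
        value="$(printf '%s' "$value" | tr -d '[:space:]')"
        if [ "$key" = "$emulator" ] && [ -n "$value" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done < "$cfg"
    printf 'default\n'
}
```

With a file containing `lr-mupen64plus = performance`, the call `governor_for lr-mupen64plus governors.cfg` would print `performance`, while any unlisted emulator falls back to `default`.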

        What do you think @buzz? It should be a fairly easy thing to code, taking the videomode functionality as a template.

        B 1 Reply Last reply 8 Feb 2019, 16:25 Reply Quote 0
        • B
          BuZz administrators @hhromic
          last edited by BuZz 2 Aug 2019, 16:25 8 Feb 2019, 16:25

          @hhromic The interface is busy enough as it is. I don't think this warrants that level of configuration. So no thanks.

          To help us help you - please make sure you read the sticky topics before posting - https://retropie.org.uk/forum/topic/3/read-this-first

          H 1 Reply Last reply 8 Feb 2019, 16:35 Reply Quote 0
          • H
            hhromic @BuZz
            last edited by 8 Feb 2019, 16:35

            @BuZz umm I could have sworn there was already a menu entry for the cpu governor in runcommand, but you are right, it is only read from the global options and configured externally in the runcommand scriptmodule. I agree that adding this to the menu would add two more entries to the already crowded interface.

            The idea was more for these advanced tinker users (like in this topic!), so if you reconsider it in the future, perhaps we can add the functionality without exposing any menu items, i.e. requiring editing the governors config file manually.

            B 1 Reply Last reply 8 Feb 2019, 16:36 Reply Quote 0
            • B
              BuZz administrators @hhromic
              last edited by 8 Feb 2019, 16:36

              @hhromic advanced users can do this via an onstart/onend script if they want.

              H 1 Reply Last reply 8 Feb 2019, 16:41 Reply Quote 0
              • H
                hhromic @BuZz
                last edited by 8 Feb 2019, 16:41

                @BuZz you mean adding something like this (and the corresponding reverting snippet in onend):

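                #!/usr/bin/env bash
                # runcommand passes the system as $1 and the emulator as $2
                system="$1"
                emulator="$2"
                if [[ "$emulator" == "lr-mupen64plus" ]]; then
                    # switch every core's cpufreq governor to "performance"
                    for cpu in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
                        echo performance | sudo tee "$cpu" >/dev/null
                    done
                fi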

                Instead of adding something like this to a governors.cfg file?

                lr-mupen64plus = performance
                

                :)

                1 Reply Last reply Reply Quote 0
                • B
                  BuZz administrators
                  last edited by BuZz 2 Aug 2019, 16:55 8 Feb 2019, 16:50

                  Yes. If you're going to ignore the work involved putting it into runcommand and future maintenance of the code also.

                  But putting your sarcasm to one side - You can simplify that script on the RPI by just using cpu0 and skipping the loop.

                  no need for a corresponding reverting snippet either - can just be one line to restore to ondemand.
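A sketch of what that simplification might look like, writing only cpu0's node and restoring ondemand with a single call. The helper name `set_governor` is made up for this example, and the script names in the comments assume the usual `runcommand-onstart.sh` / `runcommand-onend.sh` pair.

```shell
# Sketch only: set_governor is a name made up for this example.
# On the Pi all cores share one clock, so writing cpu0's node is enough.
set_governor() {
    local governor="$1"
    local file="${2:-/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor}"
    if [ -w "$file" ]; then
        printf '%s\n' "$governor" > "$file"
    else
        # sysfs nodes are normally root-only, hence sudo as in the script above
        printf '%s\n' "$governor" | sudo tee "$file" >/dev/null
    fi
}

# In runcommand-onstart.sh ($2 is the emulator name):
#   [ "$2" = "lr-mupen64plus" ] && set_governor performance
# In runcommand-onend.sh, the single restoring line:
#   set_governor ondemand
```

The optional second argument only exists so the function can be pointed at a different file when trying it out away from a Pi.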

                  H 1 Reply Last reply 8 Feb 2019, 17:00 Reply Quote 1
                  • H
                    hhromic @BuZz
                    last edited by hhromic 2 Aug 2019, 17:03 8 Feb 2019, 17:00

                    @BuZz sorry I didn't mean to be rude, I'm genuinely being friendly here. I realise being sarcastic wasn't a good move. Apologies.

                    Of course I'm not ignoring the work needed to code this functionality, and I was going to volunteer to do it and test it myself if you felt it was a useful addition to the system. I understand your safety/maintainability concerns very well and respect your wishes as the project leader. If you don't think it is worth it, no hard feelings and all good :thumbsup:.

                    no need for a corresponding reverting snippet either - can just be one line to restore to ondemand.

                    I was just referring to the actual nice approach in runcommand where it saves the current governor and restores it on exit :)

                    Actually runcommand has all the functionality built-in to set/unset the governor already and is robust, that's why I liked the idea of implementing it in there instead of onstart/onend scripts.
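For readers who prefer the onstart/onend route but still want the save-and-restore behaviour described here, a minimal sketch follows. It is not runcommand's own code: the function names and the state file under `/dev/shm` are assumptions of this example.

```shell
# Sketch only; not runcommand's own code. Both paths can be overridden.
# The state file location under /dev/shm is an assumption of this example.
save_governor() {
    local gov_file="${1:-/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor}"
    local state_file="${2:-/dev/shm/saved_governor}"
    cat "$gov_file" > "$state_file"
}

restore_governor() {
    local gov_file="${1:-/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor}"
    local state_file="${2:-/dev/shm/saved_governor}"
    local saved
    [ -f "$state_file" ] || return 0    # nothing was saved, nothing to do
    saved="$(cat "$state_file")"
    if [ -w "$gov_file" ]; then
        printf '%s\n' "$saved" > "$gov_file"
    else
        printf '%s\n' "$saved" | sudo tee "$gov_file" >/dev/null
    fi
    rm -f "$state_file"
}
```

Calling `save_governor` before switching to performance in onstart and `restore_governor` in onend would put back whatever governor was active, rather than assuming ondemand.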

                    B 1 Reply Last reply 8 Feb 2019, 17:10 Reply Quote 2
                    • B
                      BuZz administrators @hhromic
                      last edited by 8 Feb 2019, 17:10

                      @hhromic no worries. the functionality in runcommand is technically overkill on the RPI as the cores are not independently controllable (hence why using cpu0 is enough).

                      1 Reply Last reply Reply Quote 1
                      • P
                        Parabolaralus @Brunnis
                        last edited by 8 Feb 2019, 19:09

                        @Brunnis Thank you for taking the time to research and post this data!

                        1 Reply Last reply Reply Quote 1
                        • R
                          Rion
                          last edited by 8 Feb 2019, 20:24

                          @Brunnis I have also noticed the slowdowns happening in certain games using ondemand CPU governor.

                          I never bothered with Overclocking but just changed to performance instead.

                          But it would be interesting to see if there is any way to optimize the ondemand CPU governor.

                          FBNeo rom filtering
                          Mame2003 Arcade Bezels
                          Fba Arcade Bezels
                          Fba NeoGeo Bezels

                          1 Reply Last reply Reply Quote 0
                          • B
                            Brunnis @dankcushions
                            last edited by Brunnis 2 Aug 2019, 21:35 8 Feb 2019, 21:30

                            @dankcushions

                            no, i'm just articulating what i mean when i say that the data presented is not the "smoking gun", but your personal observations of a stutter is.

                            Fair enough.

                            i think the sampling_down_factor might be the one we would tweak:

                            • sampling_down_factor:
                              This parameter controls the rate at which the kernel makes a decision
                              on when to decrease the frequency while running at top speed. When set
                              to 1 (the default) decisions to reevaluate load are made at the same
                              interval regardless of current clock speed. But when set to greater
                              than 1 (e.g. 100) it acts as a multiplier for the scheduling interval
                              for reevaluating load when the CPU is at its top speed due to high
                              load. This improves performance by reducing the overhead of load
                              evaluation and helping the CPU stay at its top speed when truly busy,
                              rather than shifting back and forth in speed. This tunable has no
                              effect on behavior at lower speeds/lower CPU loads.

                            I don't think that will work either. The problem is, again, that the average load is too low. Whether we stretch out the sample period over 1, 2, 10 frames, the average load will be close to the same and far below the required 95% that's needed to stay at the highest speed.

                            The way I see it, rapid highly periodic loads like these are hard to handle. The same issue occurs when running RetroArch on Windows 10 machines with modern Core processors, so it's not isolated to the Raspberry Pi. The only possible solutions I've been able to come up with so far are to:

                            1. Decrease the sample period, so that reactions to load changes can be carried out faster. If the default sample period is really 10 ms, that means more than half the execution time of a frame can be spent at the lower frequency before the CPU is instructed to increase clocks. The sample period would need to be drastically reduced in order to minimize the time spent down clocked after beginning actual timing critical work.

                            2. Use the "performance" governor. This completely eliminates the inefficiency of needing to sample CPU load before reacting.
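The "more than half" figure in point 1 can be checked with a little integer arithmetic. This sketch assumes a 60 frames-per-second emulator and takes the 10 ms sample period from the post; the function name is invented here.

```shell
# Share of one frame that can elapse before the governor samples load again.
# Assumptions: the frame rate is passed in (60 fps below) and the sample
# period is given in microseconds, as cpufreq reports it.
frame_share_percent() {
    local sample_us="$1"
    local fps="$2"
    local frame_us=$(( 1000000 / fps ))    # about 16666 us per frame at 60 fps
    echo $(( sample_us * 100 / frame_us ))
}

frame_share_percent 10000 60    # prints 60
```

So with a 10 ms sample period, up to roughly 60% of a 16.7 ms frame could pass at the lower clock before the governor reacts.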

                            That's it for me on the topic. I'm fine with using the run command settings to control this, like I always have before. Sometimes it's just fun to try to understand the mechanics behind a behavior. :-)

                            @Rion said in Overclocking the Pi3b+ GPU (Results):

                            I have also noticed the slowdowns happening in certain games using ondemand CPU governor.

                            Even without using video_max_swapchain_images=2?

                            @Rion said in Overclocking the Pi3b+ GPU (Results):

                            I never bothered with Overclocking but just changed to performance instead.

                            Yeah, that's the correct approach. Starting out with overclocking would be bad, since you're then just working against a mechanism that's now even more prone to try to lower the frequency. So, first change the governor, then overclock if performance still isn't good enough. :-)

                            @Rion said in Overclocking the Pi3b+ GPU (Results):

                            But it would be interesting to see if there is any way to optimize the ondemand CPU governor.

                            I think the nature of the load makes it hard. It will never be as performant as simply using the "performance" governor. Well, if you tweak the "ondemand" governor so that it considers the emulator load to be high enough to not down clock in between frames, then it will perform the same as the "performance" governor. But then there's no point in doing the optimization in the first place, since it won't save you any power consumption over the "performance" governor anyway!

                            D 1 Reply Last reply 9 Feb 2019, 07:12 Reply Quote 3
                            • R
                              Riverstorm
                              last edited by 9 Feb 2019, 01:46

                              Performance is one of the first things I turn on with a new build, and has been for the past few years. There are several MAME titles where you can feel the difference between ondemand and performance when gaming. I don't know if it's coincidental, but the more demanding titles seem to really show it, which almost seems contradictory, but maybe another component is bottlenecking them.

                              If you open SSH and use watch you can see it constantly yo-yo while playing almost any game. It never stays at maximum performance like when using the performance setting.
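The post does not say which command was being watched. One possibility, sketched here as an assumption, is cpu0's `scaling_cur_freq` node, which reports the current clock in kHz; the helper names are invented for this example.

```shell
# Convert the kHz value reported by cpufreq into MHz for display.
khz_to_mhz() {
    echo "$(( $1 / 1000 )) MHz"
}

# Print the current CPU clock; the optional argument overrides the sysfs path.
current_freq() {
    local file="${1:-/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq}"
    khz_to_mhz "$(cat "$file")"
}
```

Over SSH, something like `watch -n 0.5 cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq` should show the value bouncing between the minimum and maximum clock while a game runs under ondemand.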

                              It takes a minute to turn it on, done! :)

                              1 Reply Last reply Reply Quote 0
                              • P
                                pi2user
                                last edited by 9 Feb 2019, 03:14

                                Another tool you can use to check up on CPU performance is nmon (Nigel's Monitor). It is a standard Debian package that was originally developed for monitoring enterprise-level POWER systems.
                                It shows performance on a per-core basis, so you can see how much any individual core is being used. It is interesting to see how load is spread out over all the cores even in a single-core task like retroarch. The system appears to frequently reassign load to different cores, so each core gets a turn at running fully loaded.
                                Install and run nmon, then press c for cpu display per-core, and then t for top procs. There's also l for long-term cpu stats but this is overall and not per-core. q quits out of nmon.

                                1 Reply Last reply Reply Quote 1
                                • D
                                  dankcushions Global Moderator @Brunnis
                                  last edited by 9 Feb 2019, 07:12

                                  @Brunnis said in Overclocking the Pi3b+ GPU (Results):

                                  @dankcushions

                                  i think the sampling_down_factor might be the one we would tweak:

                                  • sampling_down_factor:
                                    This parameter controls the rate at which the kernel makes a decision
                                    on when to decrease the frequency while running at top speed. When set
                                    to 1 (the default) decisions to reevaluate load are made at the same
                                    interval regardless of current clock speed. But when set to greater
                                    than 1 (e.g. 100) it acts as a multiplier for the scheduling interval
                                    for reevaluating load when the CPU is at its top speed due to high
                                    load. This improves performance by reducing the overhead of load
                                    evaluation and helping the CPU stay at its top speed when truly busy,
                                    rather than shifting back and forth in speed. This tunable has no
                                    effect on behavior at lower speeds/lower CPU loads.

                                  I don't think that will work either. The problem is, again, that the average load is too low. Whether we stretch out the sample period over 1, 2, 10 frames, the average load will be close to the same and far below the required 95% that's needed to stay at the highest speed.

                                  agree. i think you’d also have to raise that threshold (which is possible)

                                  Well, if you tweak the "ondemand" governor so that it considers the emulator load to be high enough to not down clock in between frames, then it will perform the same as the "performance" governor. But then there's no point in doing the optimization in the first place, since it won't save you any power consumption over the "performance" governor anyway!

                                  for those situations, absolutely, but the issue is that using the performance governor puts ALL applications launched via run command at full speed, which includes multithreaded or otherwise low-load applications (kodi, pixel desktop (??), maybe even some emulators like gambatte, etc). i still like the idea of finding a way to make the ondemand governor work for us.

                                  B 1 Reply Last reply 9 Feb 2019, 08:44 Reply Quote 0
                                  • B
                                    Brunnis @dankcushions
                                    last edited by Brunnis 2 Sept 2019, 14:02 9 Feb 2019, 08:44

                                    @dankcushions said in Overclocking the Pi3b+ GPU (Results):

                                    for those situations, absolutely, but the issue is that using the performance governor puts ALL applications launched via run command at full speed, which includes multithreaded or otherwise low-load applications (kodi, pixel desktop (??), maybe even some emulators like gambatte, etc). i still like the idea of finding a way to make the ondemand governor work for us.

                                    I was still stuck in thinking emulation (for which I’m not convinced ondemand is a great idea). I agree for Kodi and the likes.

                                    As for trying out an optimization: I’d start out by either testing a sampling_rate half as long OR decreasing the up_threshold to something like 50%. It won’t fix all possible performance issues, but there’s a good chance it’s much better than the defaults for the RetroPie use case.
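A sketch of how those two experiments could be tried from a shell, assuming the ondemand tunables live under `/sys/devices/system/cpu/cpufreq/ondemand` (the path that appears later in the thread). The function names are invented for this example, and values written this way do not survive a reboot.

```shell
# Write one ondemand tunable; the optional third argument overrides the
# directory so the function can be tried against ordinary files.
set_ondemand_tunable() {
    local name="$1"
    local value="$2"
    local dir="${3:-/sys/devices/system/cpu/cpufreq/ondemand}"
    if [ -w "$dir/$name" ]; then
        printf '%s\n' "$value" > "$dir/$name"
    else
        printf '%s\n' "$value" | sudo tee "$dir/$name" >/dev/null
    fi
}

# Experiment 1: halve whatever sampling_rate is currently configured.
halve_sampling_rate() {
    local dir="${1:-/sys/devices/system/cpu/cpufreq/ondemand}"
    local current
    current="$(cat "$dir/sampling_rate")"
    set_ondemand_tunable sampling_rate "$(( current / 2 ))" "$dir"
}

# Experiment 2 would be: set_ondemand_tunable up_threshold 50
```

Testing one change at a time, as suggested, makes it easier to tell which tunable actually affects the stutter.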

                                    1 Reply Last reply Reply Quote 1
                                    • R
                                      Rascas
                                      last edited by 9 Feb 2019, 16:47

                                      Here are the default settings of LibreELEC:
                                      https://github.com/LibreELEC/LibreELEC.tv/blob/libreelec-9.0/projects/RPi/initramfs/platform_init
                                      I believe they are the same now on Raspbian/RetroPie. The only difference is io_is_busy, which improves performance during heavy reading/writing to the SD card, for example when streaming a torrent. But it is probably not so useful for emulation.

                                      H 1 Reply Last reply 9 Feb 2019, 23:26 Reply Quote 2
                                      • H
                                        hhromic @Rascas
                                        last edited by 9 Feb 2019, 23:26

                                        @Rascas said in Overclocking the Pi3b+ GPU (Results):

                                        I believe they are the same now on Raspbian/RetroPie.

                                        Can confirm from one of my Raspbian devices:

                                        /sys/devices/system/cpu/cpufreq/ondemand/io_is_busy: 0
                                        /sys/devices/system/cpu/cpufreq/ondemand/up_threshold: 50
                                        /sys/devices/system/cpu/cpufreq/ondemand/sampling_rate: 100000
                                        /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor: 50
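The post does not show how that listing was produced. One way to get output in the same `path: value` shape is sketched below; the function name is invented, and the list of tunables is simply the four shown above.

```shell
# Print selected ondemand tunables as "path: value" lines.
# The optional argument overrides the sysfs directory.
print_ondemand_tunables() {
    local dir="${1:-/sys/devices/system/cpu/cpufreq/ondemand}"
    local name
    for name in io_is_busy up_threshold sampling_rate sampling_down_factor; do
        if [ -r "$dir/$name" ]; then
            printf '%s: %s\n' "$dir/$name" "$(cat "$dir/$name")"
        fi
    done
}
```

Running `print_ondemand_tunables` before and after any tweak gives a quick record of what was changed.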
                                        1 Reply Last reply Reply Quote 0
                                        • S
                                          Silent
                                          last edited by 10 Feb 2019, 09:31

                                          I really like the snippet that sets individual emulators to the performance governor! I'd love to have that set e.g. for PSX, but I wouldn't necessarily want it for GBC or Kodi. Onstart sounds like a good way to make optimal use of the option!

                                          1 Reply Last reply Reply Quote 0
