Overclocking the Pi3b+ GPU (Results)

General Discussion and Gaming
Tags: pi3 b+, overclock, gpu
133 Posts 18 Posters 39.5k Views
  • B
    Brunnis
    last edited by 8 Feb 2019, 10:21

    24h test complete. No issues found (Quake 3 + memtester 512 + sysbench (2 threads)) at the following settings:

    arm_freq=1475
    core_freq=600
    v3d_freq=400
    sdram_freq=550
    over_voltage=1
    temp_soft_limit=70
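
When repeating a stability run like this, it can help to log the real clock, temperature and throttle state next to the stress tools. Here is a minimal Python sketch that parses the output of `vcgencmd measure_clock arm`, `vcgencmd measure_temp` and `vcgencmd get_throttled`. The helper names are made up for illustration, and the output formats in the comments are what the firmware tool commonly prints; treat them as assumptions and check them on your own Pi.

```python
import re
import subprocess


def run_vcgencmd(*args: str) -> str:
    """Run vcgencmd (only available on a Raspberry Pi) and return its stdout."""
    result = subprocess.run(
        ["vcgencmd", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_clock_hz(output: str) -> int:
    """Parse output such as 'frequency(48)=1475000000' into Hz (assumed format)."""
    match = re.fullmatch(r"frequency\(\d+\)=(\d+)", output.strip())
    if match is None:
        raise ValueError(f"unexpected measure_clock output: {output!r}")
    return int(match.group(1))


def parse_temp_c(output: str) -> float:
    """Parse output such as "temp=48.3'C" into degrees Celsius (assumed format)."""
    match = re.fullmatch(r"temp=(-?\d+(?:\.\d+)?)'C", output.strip())
    if match is None:
        raise ValueError(f"unexpected measure_temp output: {output!r}")
    return float(match.group(1))


def parse_throttled(output: str) -> int:
    """Parse output such as 'throttled=0x50005' into an integer bit field."""
    key, _, value = output.strip().partition("=")
    if key != "throttled" or not value:
        raise ValueError(f"unexpected get_throttled output: {output!r}")
    return int(value, 16)


def soft_temp_limit_active(flags: int) -> bool:
    """Bit 3 of the throttled flags: soft temperature limit currently active."""
    return bool(flags & 0x8)


# Example use on a Pi (not executed here):
#   hz = parse_clock_hz(run_vcgencmd("measure_clock", "arm"))
#   temp = parse_temp_c(run_vcgencmd("measure_temp"))
#   flags = parse_throttled(run_vcgencmd("get_throttled"))
```

With temp_soft_limit=70 as in the settings above, the soft-limit bit is the one to watch during a long run.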
    
    • R
      robertvb83 @Brunnis
      last edited by robertvb83 8 Feb 2019, 11:06 (posted 8 Feb 2019, 11:05)

      @Brunnis said in Overclocking the Pi3b+ GPU (Results):

      24h test complete. No issues found (Quake 3 + memtester 512 + sysbench (2 threads)) at the following settings:

      arm_freq=1475
      core_freq=600
      v3d_freq=400
      sdram_freq=550
      over_voltage=1
      temp_soft_limit=70
      

      I have played a little with overclocking on the 3B+. I have no issues with temperature, and games play well without any trouble.

      However, I have issues with compiling (updating from source) on an overclocked 3B+. Updating mame2003-plus almost always freezes or stops with errors. Any idea about that? (I don't know my exact settings, but I had this issue with many settings found around the internet, even for moderate overclocking.)

      Edit: I also did a sysbench stress test without issues

      My full size arcade cabinet Robotron vs. Octolyzer

      • M
        mitu Global Moderator @robertvb83
        last edited by 8 Feb 2019, 11:21

        @robertvb83 said in Overclocking the Pi3b+ GPU (Results):

        updating mame2003-plus almost always freezes or stops with errors..

        What kind of errors? If they're memory-related errors (not enough memory), then you can increase the amount of swap added during compilation to get past them. Do you get the same kind of errors without overclocking?

        • B
          Brunnis
          last edited by Brunnis 8 Feb 2019, 12:32 (posted 8 Feb 2019, 11:38)

          @BuZz @hhromic To expand on the discussion regarding CPU governor and lower than expected performance: I've been watching the output of the 'top' command now, while running some SNES loads and below are some results. "Tweaked video settings" below means:

          video_driver="dispmanx"
          video_threaded="false"
          video_max_swapchain_images=2
          

          [Image: chart of test results (7ac80baa-c347-4733-aa33-3b8f95e31dd3-image.png)]

          For the ondemand CPU governor tests above, I also ran a script that read actual CPU frequency every second. Turns out the ondemand CPU governor leads to frequent downclocking (to 600 MHz) in all test cases (whether running Super Mario World or Super Mario World 2 and whether using default or tweaked video settings). Here are the printouts:

          Test 2: Governor ondemand (tweaked video settings) - SMW
          Test 2: Governor ondemand (tweaked video settings) - SMW2
          Test 4: Governor ondemand (default video settings) - SMW
          Test 4: Governor ondemand (default video settings) - SMW2
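
For anyone who wants to reproduce the once-per-second frequency readout, here is a minimal Python sketch. It is not the script used for the tests above. It assumes the standard cpufreq sysfs file `scaling_cur_freq` for cpu0, which reports kHz, and the function names are illustrative only. Note that a 1 s poll only catches snapshots; a governor that re-evaluates every 10 ms can switch many times between two readings.

```python
import time
from pathlib import Path
from typing import Callable, List

# Standard cpufreq sysfs location for core 0 (assumed present on the Pi).
SCALING_CUR_FREQ = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq")


def khz_text_to_mhz(text: str) -> float:
    """Convert the sysfs text (frequency in kHz) to MHz."""
    return int(text.strip()) / 1000.0


def read_cpu_mhz() -> float:
    """Read the current core 0 frequency in MHz from sysfs."""
    return khz_text_to_mhz(SCALING_CUR_FREQ.read_text())


def log_cpu_mhz(
    samples: int,
    interval_s: float = 1.0,
    reader: Callable[[], float] = read_cpu_mhz,
) -> List[float]:
    """Print and return `samples` frequency readings, one per interval."""
    readings = []
    for _ in range(samples):
        mhz = reader()
        print(f"{time.strftime('%H:%M:%S')}  {mhz:.0f} MHz")
        readings.append(mhz)
        time.sleep(interval_s)
    return readings


# Example use on a Pi while an emulator is running (not executed here):
#   log_cpu_mhz(samples=60)
```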

          So, to conclude, it doesn't look like the ondemand CPU governor handles this well. The constant ping-ponging of the CPU frequency (even with default RetroPie settings) is hardly optimal and may lead to performance issues in some cases. For most situations, the additional frame buffering used on a default installation seems to mask the impact of the reduced CPU frequency. Removing that buffering (i.e. using video_max_swapchain_images=2) exposes the issue clearly, with stuttering in demanding situations (such as SMW2).

          • D
            dankcushions Global Moderator @Brunnis
            last edited by 8 Feb 2019, 11:55

            @Brunnis what's your 'Est. single CPU load (%)' column about? with video_threaded="false" retroarch should be entirely operating on one core. even with video_threaded="true" the threaded video tasks are very minor.

            • B
              Brunnis @dankcushions
              last edited by 8 Feb 2019, 11:59

              @dankcushions
              That's just converting top's CPU load (which is for all four cores) to the estimated resulting single core load. So:

              ("Total CPU load"/25)*100 gives you the value in the "Est. single CPU load" column.
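
As a worked example of that conversion, here is a small Python sketch. It assumes the total is the combined %Cpu(s) figure from top's summary line (0 to 100 across all four cores) and that the load sits on a single core; the function name is made up for illustration.

```python
def est_single_core_load(total_cpu_pct: float, cores: int = 4) -> float:
    """Estimate single-core load from a combined all-core percentage.

    With 4 cores, one fully loaded core shows up as 25% of the total,
    so the estimate is (total / 25) * 100.
    """
    share_per_core = 100.0 / cores
    return total_cpu_pct / share_per_core * 100.0


# A combined load of 12.5% on a quad-core corresponds to about half of one core.
half_core = est_single_core_load(12.5)
# A combined load of 25% corresponds to one core being fully busy.
full_core = est_single_core_load(25.0)
```

The estimate is only meaningful when nothing else is running on the other cores.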

              • D
                dankcushions Global Moderator @Brunnis
                last edited by dankcushions 8 Feb 2019, 12:05 (posted 8 Feb 2019, 12:04)

                @Brunnis said in Overclocking the Pi3b+ GPU (Results):

                @dankcushions
                That's just converting top's CPU load (which is for all four cores) to the estimated resulting single core load. So:

                ("Total CPU load"/25)*100 gives you the value in the "Est. single CPU load" column.

                actually top's percentage is cumulative. eg, 100% load on 4 cores would appear on top as 400%

                that said, these emulators are not threaded so they won't be using the other cores, so top's total load will be - or very close to - the load on one core (some OS tasks might be working on other cores)

                if you press 1 within top you get a % per core - https://unix.stackexchange.com/a/146090

                • B
                  Brunnis @dankcushions
                  last edited by Brunnis 8 Feb 2019, 12:17 (posted 8 Feb 2019, 12:13)

                  @dankcushions said in Overclocking the Pi3b+ GPU (Results):

                  @Brunnis said in Overclocking the Pi3b+ GPU (Results):

                  @dankcushions
                  That's just converting top's CPU load (which is for all four cores) to the estimated resulting single core load. So:

                  ("Total CPU load"/25)*100 gives you the value in the "Est. single CPU load" column.

                  actually top's percentage is cumulative. eg, 100% load on 4 cores would appear on top as 400%

                  that said, these emulators are not threaded so they won't be using the other cores, so top's total load will be - or very close to - the load on one core (some OS tasks might be working on other cores)

                  if you press 1 within top you get a % per core - https://unix.stackexchange.com/a/146090

                  The %Cpu(s) value at the top (which is what I looked at, should have just looked at RetroArch in the process list below instead) is not cumulative unless you press 1. So, unless you press 1, a full load on all four cores will show as a combined value of 100. But thanks for the tip about pressing 1. Didn't know that!

                  I'll see if I can update the figures with slightly more accurate ones anyway.

                  • B
                    Brunnis
                    last edited by 8 Feb 2019, 12:34

                    I just updated the chart to be a bit more clear on what it's showing.

                    • D
                      dankcushions Global Moderator @Brunnis
                      last edited by 8 Feb 2019, 12:37

                      @Brunnis yeah i couldn't quite work out why you were "estimating" them but that checks out :)

                      i guess i still don't see a smoking gun with the figures being given, especially when the issue is only apparent using video settings where stutter is a known risk under cpu load situations. however if it's a binary thing to your eyes where the stutter is eliminated once the performance governor is set, i guess that is all that needs to be said.

                      this seems like a perfect test case for my benchmarking script that i never got back to :) https://github.com/dankcushions/retropie-auto-testing/blob/master/retropie-auto-testing.sh

                      • B
                        Brunnis
                        last edited by Brunnis 8 Feb 2019, 13:19 (posted 8 Feb 2019, 12:56)

                        @dankcushions said in Overclocking the Pi3b+ GPU (Results):

                        i guess i still don't see a smoking gun with the figures being given, especially when the issue is only apparent using video settings where stutter is a known risk under cpu load situations. however if it's a binary thing to your eyes where the stutter is eliminated once the performance governor is set, i guess that is all that needs to be said.

                        Well, in this case the stuttering does not occur because the CPU isn't fast enough, but because the ondemand governor is not able to determine that the CPU should stay at max frequency. That's a pretty big difference in my eyes.

                        The figures I posted above show us that the ondemand governor doesn't work as we'd expect and that the resulting performance issue is simply masked by buffering with the default settings. With this testing alone, I can't say for sure that it doesn't affect some marginal games even at default settings. It's certainly possible that only video_max_swapchain_images=2 exposes it. In that case, it would of course be okay to leave the governor at the current default.

                        I didn't post this to press for a change of default governor (since BuZz has already said it won't happen). However, I thought the figures were interesting, since the frequency rollercoaster behavior at default settings didn't seem to be common knowledge.

                        • R
                          robertvb83 @mitu
                          last edited by 8 Feb 2019, 13:01

                          @mitu said in Overclocking the Pi3b+ GPU (Results):

                          @robertvb83 said in Overclocking the Pi3b+ GPU (Results):

                          updating mame2003-plus almost always freezes or stops with errors..

                          What kind of errors? If they're memory-related errors (not enough memory), then you can increase the amount of swap added during compilation to get past them. Do you get the same kind of errors without overclocking?

                          I did not have any errors without overclocking!
                          this is where compiling ends when overclocked:
                          [Image: compile output at the point of failure (no description available)]

                          My full size arcade cabinet Robotron vs. Octolyzer

                          • D
                            dankcushions Global Moderator @Brunnis
                            last edited by 8 Feb 2019, 14:34

                            @Brunnis said in Overclocking the Pi3b+ GPU (Results):

                            @dankcushions said in Overclocking the Pi3b+ GPU (Results):

                            i guess i still don't see a smoking gun with the figures being given, especially when the issue is only apparent using video settings where stutter is a known risk under cpu load situations. however if it's a binary thing to your eyes where the stutter is eliminated once the performance governor is set, i guess that is all that needs to be said.

                            Well, in this case the stuttering does not occur because the CPU isn't fast enough, but because the ondemand governor is not able to determine that the CPU should stay at max frequency. That's a pretty big difference in my eyes.

                            The figures I posted above show us that the ondemand governor doesn't work as we'd expect and that the resulting performance issue is simply masked by buffering with the default settings.

                            forgive me, but i don't think they necessarily show that. they only show that the governor has decided the CPU should be downclocked (to 600) during some tests. we know the emulators in question do not exert a constant load on the cpu (~90% usage). from the earlier information, it looks like the governor should be checking CPU load and making this decision every 0.01 of a second (sampling_rate defaults to 10000 usecs?), so given that fidelity i am now not surprised that you will see it downclocking every so often. it could be changing the core speed up to 100 times a second.

                            the performance issue you observe must be caused by this process, agreed, but that stutter is not specifically measured in the above data, if you get what i mean. we need an fps benchmark for that.

                            anyway, it seems to me like a good fix might be to increase the sampling_rate to something north of a frame, eg over 16667 usec (one frame at 60 Hz). that way, applications that are generally low load will still downclock, but cpu-heavy emulators will stay full speed. i don't know if that's a good idea, just my initial thought.

                            • D
                              dankcushions Global Moderator @robertvb83
                              last edited by 8 Feb 2019, 14:36

                              @robertvb83 said in Overclocking the Pi3b+ GPU (Results):

                              I did not have any errors without overclocking!
                              this is where compiling ends when overclocked:

                              remember that retropie compiles use 2 of the 4 cores, but emulation mostly uses 1 core, so an overclock that is stable in games can definitely be unstable in compiles.

                              • B
                                Brunnis @dankcushions
                                last edited by 8 Feb 2019, 15:17

                                @dankcushions said in Overclocking the Pi3b+ GPU (Results):

                                forgive me, but i don't think they necessarily show that. they only show that the governor has decided the CPU should be downclocked (to 600) during some tests. we know the emulators in question do not exert a constant load on the cpu (~90% usage). from the earlier information, it looks like the governor should be checking CPU load and making this decision every 0.01 of a second (sampling_rate defaults to 10000 usecs?), so given that fidelity i am now not surprised that you will see it downclocking every so often. it could be changing the core speed up to 100 times a second.

                                Yes, I think I may have expressed myself a bit unclearly. I agree that the governor probably just behaves according to spec. The "unexpected" part is that it affects the performance in a negative way in certain cases. Ideally, the "ondemand" governor should produce the same (or very close to the same) end result (i.e. performance) as the "performance" governor.

                                the performance issue you observe must be caused by this process, agreed, but that stutter is not specifically measured in the above data, if you get what i mean. we need an fps benchmark for that.

                                Yeah, we can definitely measure the performance delta, but what would we do with the data? Would it affect the current discussion in any significant way? For an initial discussion on whether the ondemand governor can cope with the load without affecting the result, audio-visual cues are certainly sufficient. The regression in the end result is not exactly subtle.

                                anyway, it seems to me like a good fix might be to increase the sampling_rate to something north of a frame, eg over 16667 usec (one frame at 60 Hz). that way, applications that are generally low load will still downclock, but cpu-heavy emulators will stay full speed. i don't know if that's a good idea, just my initial thought.

                                They write sample rate in the docs, but I guess they mean period? Increasing the sampling period wouldn't really help. The average load over the period would then often be too low to clock up at all. We'd instead need a really small sample period, so that the downclocked core spins up as fast as possible when the load increases (i.e. the next frame rendering kicks off). The current issue is probably that the default of, say, 10 ms means that once the core is clocked down and the emulator kicks off again, you're spending up to 10 ms rendering the frame at 600 MHz before the governor checks the load and decides to clock back up. Then it's too late and you won't be able to submit the frame on time.
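
To put rough numbers on that argument, here is a small Python sketch of the worst case. It is a simplified model under several assumptions: the frame starts right after a governor sample while the core sits at 600 MHz, the governor jumps straight to the maximum at the next sample, switching itself costs no time, the maximum is the stock 1400 MHz of the 3B+, and a frame needs about 90% of the 60 Hz budget at full speed (the ~90% figure mentioned earlier in the thread). The function name is made up for illustration.

```python
def frame_time_ms(
    load_at_max: float,
    f_max_mhz: float,
    f_min_mhz: float,
    sample_period_ms: float,
    refresh_hz: float = 60.0,
) -> float:
    """Worst-case time to finish one frame that starts at the minimum clock.

    Work is counted in MHz*ms (thousands of cycles). The core runs at
    f_min until the next governor sample, then at f_max for the rest.
    """
    budget_ms = 1000.0 / refresh_hz
    work = load_at_max * budget_ms * f_max_mhz
    slow_capacity = sample_period_ms * f_min_mhz
    if work <= slow_capacity:
        # The frame finishes before the governor looks again.
        return work / f_min_mhz
    return sample_period_ms + (work - slow_capacity) / f_max_mhz


budget = 1000.0 / 60.0                          # about 16.7 ms per frame
slow_start = frame_time_ms(0.9, 1400, 600, 10)  # 10 ms sampling period
fast_start = frame_time_ms(0.9, 1400, 600, 1)   # hypothetical 1 ms period
print(f"budget {budget:.1f} ms, 10 ms period {slow_start:.1f} ms, "
      f"1 ms period {fast_start:.1f} ms")
```

Under these assumptions the 10 ms case comes out at roughly 20.7 ms, which misses the 16.7 ms budget, while a 1 ms period would finish in about 15.6 ms. That fits the point above that a shorter period, not a longer one, is what helps in this scenario.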

                                • Q
                                  quicksilver @hhromic
                                  last edited by quicksilver 8 Feb 2019, 16:42 (posted 8 Feb 2019, 15:53)

                                  @hhromic just found this in the official RPI documentation:

                                  "NOTE: Setting any overclocking parameters to values other than those used by raspi-config may set a permanent bit within the SoC, making it possible to detect that your Pi has been overclocked. The specific circumstances where the overclock bit is set are if force_turbo is set to 1 and any of the over_voltage_* options are set to a value > 0. See the blog post on Turbo Mode for more information."

                                  So force turbo AND any amount of over voltage applied will set the warranty bit.

                                  • D
                                    dankcushions Global Moderator @Brunnis
                                    last edited by 8 Feb 2019, 15:58

                                    @Brunnis said in Overclocking the Pi3b+ GPU (Results):

                                    @dankcushions said in Overclocking the Pi3b+ GPU (Results):

                                    forgive me, but i don't think they necessarily show that. they only show that the governor has decided the CPU should be downclocked (to 600) during some tests. we know the emulators in question do not exert a constant load on the cpu (~90% usage). from the earlier information, it looks like the governor should be checking CPU load and making this decision every 0.01 of a second (sampling_rate defaults to 10000 usecs?), so given that fidelity i am now not surprised that you will see it downclocking every so often. it could be changing the core speed up to 100 times a second.

                                    Yes, I think I may have expressed myself a bit unclearly. I agree that the governor probably just behaves according to spec. The "unexpected" part is that it affects the performance in a negative way in certain cases. Ideally, the "ondemand" governor should produce the same (or very close to the same) end result (i.e. performance) as the "performance" governor.

                                    the performance issue you observe must be caused by this process, agreed, but that stutter is not specifically measured in the above data, if you get what i mean. we need an fps benchmark for that.

                                    Yeah, we can definitely measure the performance delta, but what would we do with the data? Would it affect the current discussion in any significant way?

                                    no, i'm just articulating what i mean when i say that the data presented is not the "smoking gun", but your personal observation of a stutter is.

                                    i think the sampling_down_factor might be the one we would tweak:

                                    • sampling_down_factor:
                                      This parameter controls the rate at which the kernel makes a decision
                                      on when to decrease the frequency while running at top speed. When set
                                      to 1 (the default) decisions to reevaluate load are made at the same
                                      interval regardless of current clock speed. But when set to greater
                                      than 1 (e.g. 100) it acts as a multiplier for the scheduling interval
                                      for reevaluating load when the CPU is at its top speed due to high
                                      load. This improves performance by reducing the overhead of load
                                      evaluation and helping the CPU stay at its top speed when truly busy,
                                      rather than shifting back and forth in speed. This tunable has no
                                      effect on behavior at lower speeds/lower CPU loads.
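
If someone wants to experiment with that tunable, here is a minimal Python sketch for reading and writing it. The directory used is an assumption: on kernels with a single global ondemand instance the tunables usually live under `/sys/devices/system/cpu/cpufreq/ondemand/`, while other kernels place them under a per-policy directory, so check where the files actually are on your system. Writing needs root, the setting does not survive a reboot, and the function names are illustrative only.

```python
from pathlib import Path

# Assumed location of the ondemand tunables; may differ between kernels.
ONDEMAND_DIR = Path("/sys/devices/system/cpu/cpufreq/ondemand")


def tunable_path(name: str, base: Path = ONDEMAND_DIR) -> Path:
    """Return the sysfs path of an ondemand tunable such as sampling_rate."""
    return base / name


def read_tunable(name: str, base: Path = ONDEMAND_DIR) -> int:
    """Read an integer tunable, e.g. sampling_rate (in microseconds)."""
    return int(tunable_path(name, base).read_text().strip())


def set_sampling_down_factor(factor: int, base: Path = ONDEMAND_DIR) -> None:
    """Write sampling_down_factor; 1 is the default per the text quoted above."""
    if factor < 1:
        raise ValueError("sampling_down_factor must be at least 1")
    tunable_path("sampling_down_factor", base).write_text(f"{factor}\n")


# Example use as root on a Pi running the ondemand governor (not executed here):
#   print(read_tunable("sampling_rate"))
#   set_sampling_down_factor(100)
```

Whether a larger factor removes the stutter would still need to be tested with the tweaked video settings.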

                                    • H
                                      hhromic @quicksilver
                                      last edited by 8 Feb 2019, 16:15

                                      @quicksilver said in Overclocking the Pi3b+ GPU (Results):

                                      So force turbo or any amount of over voltage applied will set the warranty bit.

                                      No, it is force_turbo=1 and over_voltage_* > 0. If you don't use force_turbo, you are fine.

                                      The specific circumstances where the overclock bit is set are if force_turbo is set to 1 and any of the over_voltage_* options are set to a value > 0.
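
The AND condition can be written down as a quick check against a config.txt. This is a minimal Python sketch of the rule exactly as quoted, not of what the firmware does internally. It ignores conditional sections such as [pi3] and treats every key starting with over_voltage as one of the over_voltage_* options, which is an assumption; the function names are made up for illustration.

```python
from typing import Dict


def parse_config(text: str) -> Dict[str, str]:
    """Parse key=value lines from config.txt text, skipping comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def _as_int(value: str) -> int:
    """Return the value as an int, or 0 if it is not a plain integer."""
    try:
        return int(value)
    except ValueError:
        return 0


def sets_overclock_bit(settings: Dict[str, str]) -> bool:
    """True only if force_turbo=1 AND some over_voltage* option is > 0."""
    force_turbo = _as_int(settings.get("force_turbo", "0")) == 1
    over_volted = any(
        key.startswith("over_voltage") and _as_int(value) > 0
        for key, value in settings.items()
    )
    return force_turbo and over_volted


# Example use (not executed here):
#   with open("/boot/config.txt") as f:
#       print(sets_overclock_bit(parse_config(f.read())))
```

By this rule, the settings posted at the top of the thread (over_voltage=1 without force_turbo) would not set the bit.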

                                      • Q
                                        quicksilver @hhromic
                                        last edited by 8 Feb 2019, 16:17

                                        @hhromic Ah thank you for the clarification! I completely missed the "AND".

                                        • H
                                          hhromic
                                          last edited by 8 Feb 2019, 16:22

                                          Interesting discussions and investigations guys!
                                          I still advocate leaving ondemand as the system default and educating users on how overclocking works and how to use the governor runcommand option, as it's the soundest/safest approach. This topic should definitely be used to update/populate the Wiki entry on the topic.

                                          The only improvement I would consider in this subject would be to implement a per-command governor setting in runcommand, similar to how video modes are set currently. For example create a governors.cfg file alongside videomodes.cfg.

                                          This would give the flexibility for the governor to be configured per-emulator as necessary, e.g. performance for lr-mupen64plus and default for lr-gambatte, or any other customisation.

                                          What do you think @buzz? It should be fairly easy to code, taking the videomode functionality as a template.
