Raspberry Pi OS 64bit version
-
Hi,
i know it's not supported yet, but i noticed this commit in the retropie setup script, which seems to indicate that installing retropie on raspberry pi os 64bit is an experimental option now we can try.
Does anyone already have experience with retropie on raspberryPi OS 64bit on a pi4? If so, is there a significant increase in emulation performance?
-
@akamming said in Raspberry Pi OS 64bit version:
i know it's not supported yet, but i noticed this commit in the retropie setup script, which seems to indicate that installing retropie on raspberry pi os 64bit is an experimental option now we can try.
Does anyone already have experience with retropie on raspberryPi OS 64bit on a pi4? If so, is there a significant increase in emulation performance?
I mean, you could head to the SBCgaming discord/subreddit (or monka.sbcgaming.net) and look for monkabuntu, which is a 64bit build for pi4 with Emulationstation and Retroarch (essentially retropie) installed.
But it's still a work in progress and is more of a desktop OS.
Note: Not officially affiliated or endorsed by the people at Retropie.
-
@SgtJimmyRustles tx... it's more that i'm interested in the performance gain when i compare armhf vs aarch64 on the pi4...
if you look at 32bit vs 64bit performance comparisons like this, there can be a big performance gain, but that depends on the type of operations you are doing. So i was curious to know the performance impact on emulation
but anyway. I had a spare SD card and decided to run a test myself. On 64 bit OS, the retropie installation takes a very long time (no precompiled packages, everything needs to be compiled locally), but i was happy to see it works!.
Did not have a lot of time, so only ran a few psp (lr-ppsspp) and n64 (lr-mupen64plus-next) games.
ppsspp was about the same, but lr-mupen64plus-next seems to benefit a lot from 64bit: e.g. Cruis'n World has a much higher framerate (around 29fps on 32 bit, around 55fps on 64 bit, with the same settings on the same pi 4 at 640x480 resolution).
If other people have experiences like this i'd be very interested....
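To put a number on that difference, here is a minimal Python sketch that turns the two rough framerates quoted above into a relative gain. The helper name percent_gain is made up for illustration, and the inputs are only the approximate figures from this one quick test, so treat the result as a ballpark.

```python
def percent_gain(before_fps: float, after_fps: float) -> float:
    """Relative change from before_fps to after_fps, in percent."""
    if before_fps <= 0:
        raise ValueError("before_fps must be positive")
    return (after_fps - before_fps) / before_fps * 100.0


# Approximate Cruis'n World framerates from the quick test above
# (32-bit OS vs 64-bit OS, same Pi 4, same settings).
gain = percent_gain(29, 55)
print(f"{gain:.1f}% faster")
```

With those two numbers the gain works out to roughly 90%, which is for a single game on a single core and says nothing about other emulators.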
-
Some emulators could also run slower, although those may be ones that already run at full speed. That's because they have Arm 32bit neon optimisations which can't be used without rewriting them for aarch64.
Btw I saw the comment on the bugtracker. I've not yet gone through the experimental packages. I'm aware some stuff isn't working :-)
-
Clear…
i stopped my test as i found out rpi os 64 bit is too unstable. Had issues with the display driver. when i googled i found out i wasn't the only one
based on the performance gain on lr-mupen64plus-next it is a promising development though, so i hope rpi 64 bit os will be supported in the future..
for now: i will keep away from it until it is a more stable environment...
-
Have there been any updates for a 64 bit image?
-
@billymild Raspberry Pi OS 64-bit version is still in beta, so i wouldn't expect an image until it's at least official.
-
@dankcushions where is the beta version hosted?
-
@billymild For fun, you could try using the 64-bit kernel in your existing 32-bit RetroPie image.
You need to add arm_64bit=1 to your /boot/config.txt file and reboot. (More detail in this page for example: how-to-make-your-raspberry-pi-4-faster-with-a-64-bit-kernel)
It's easy to comment out or remove when you're finished, and reboot back into the regular all-32-bit configuration.
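If you would rather script the toggle than edit the file by hand, something along these lines might do. It is only a sketch: the function name set_arm_64bit is hypothetical, it works on the file's text rather than touching /boot/config.txt itself (writing that file needs root), and it assumes the file has no conditional sections such as [pi4] that a line appended at the end could fall under.

```python
import re

# Matches an arm_64bit line, whether active or commented out with '#'.
_SETTING = re.compile(r"^\s*#?\s*arm_64bit\s*=.*$")


def set_arm_64bit(config_text: str, enabled: bool) -> str:
    """Return config.txt content with arm_64bit=1 enabled or commented out."""
    new_line = "arm_64bit=1" if enabled else "#arm_64bit=1"
    out = []
    found = False
    for line in config_text.splitlines():
        if _SETTING.match(line):
            if not found:
                out.append(new_line)  # rewrite the first occurrence in place
                found = True
            # any further duplicate arm_64bit lines are dropped
        else:
            out.append(line)
    if not found and enabled:
        out.append(new_line)  # nothing to rewrite, so append the setting
    return "\n".join(out) + "\n"


# Demo on a made-up config fragment, not a real /boot/config.txt.
sample = "dtparam=audio=on\n"
print(set_arm_64bit(sample, True))
```

Keep a backup of the original file before writing the result back, and reboot for the change to take effect.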
In a really quick test I found that the 64-bit kernel reduces audio glitching in lr-mupen64plus-next. (And probably the better mupen emulators too, like mupen64-gliden64).
Unfortunately running the 64-bit kernel with the 32-bit RetroPie also causes lr-genesis_plus_gx to crash soon after starting any game I tried. (From memory, perhaps because the 64-bit kernel is a bit stricter about how code should be organised in memory).
In the full 64-bit kernel+OS both of those emulators are stable (but then lr-parallel_n64 won't start for me). The full 64-bit RetroPie also has a big chunk of packages not yet available, but it is safe enough to try on a spare partition or SD card.
You need to install the base Raspberry Pi OS 64-bit beta, update it, and then install RetroPie from source.
-
@busywait that won't work properly - the userland will be 32-bit still.
-
@dankcushions said in Raspberry Pi OS 64bit version:
@busywait that won't work properly - the userland will be 32-bit still.
Yes, 32-bit userland still used.
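For anyone unsure which combination they are actually running, a quick check along these lines should tell the two apart. It is a sketch under assumptions: describe_setup is a made-up helper, it only recognises ARM machine names, and it takes the pointer size of the Python interpreter as a stand-in for the userland (which holds if Python came from the OS packages).

```python
import platform
import struct


def describe_setup(kernel_machine: str, userland_bits: int) -> str:
    """Summarise kernel vs userland bitness from a machine name and pointer size."""
    # 'aarch64' is what a 64-bit ARM kernel reports; 32-bit kernels report e.g. 'armv7l'.
    kernel_bits = 64 if kernel_machine in ("aarch64", "arm64") else 32
    return f"{kernel_bits}-bit kernel + {userland_bits}-bit userland"


if __name__ == "__main__":
    machine = platform.machine()      # architecture reported by the running kernel
    bits = struct.calcsize("P") * 8   # pointer size of this Python build, in bits
    print(describe_setup(machine, bits))
```

On a 32-bit RetroPie image booted with arm_64bit=1 this would be expected to report a 64-bit kernel with a 32-bit userland, which is the mixed setup discussed here.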
I think that the reason that I did see a difference depending on the arm_64bit setting (32 bit kernel vs 64-bit kernel) could be because of the way that data is copied in the HDMI driver. I'm using the full KMS and HDMI/arm-side audio driver if that makes any difference.
Edit: yes, I know this is an unsupported configuration for RetroPie; no, I don't use it regularly, I was just interested to see what happened.
-
64-bit will only lead to a substantial gain in performance when all components of Retropie(Cores, Retroarch etc.) are properly rewritten in 64-bit.
Expect this to take another few years.
-
@molokkoplus the majority of default cores, retroarch, etc already compile in 64-bit via RetroPie, but you will need the 64-bit userland for that to happen (ie, the raspberry pi OS 64-bit beta version). but retropie doesn't officially support this, yet, so it's at your own risk...
PS, not an obvious gain in performance, to me (but then very little is CPU-bound on retropie for pi 4 anyway).
-
@molokkoplus said in Raspberry Pi OS 64bit version:
64-bit will only lead to a substantial gain in performance when all components of Retropie(Cores, Retroarch etc.) are properly rewritten in 64-bit.
Expect this to take another few years.
@dankcushions said in Raspberry Pi OS 64bit version:
PS, not an obvious gain in performance, to me (but then very little is CPU-bound on retropie for pi 4 anyway).
I run my RetroPie “test build” exclusively on 64-bit RPiOS with the KMS video and audio drivers and have had a lot of stability with it.
Building the cores/emulators in 64-bit is fairly trivial as most of them already support other arm64/aarch64 platforms.
The main issue with 64-bit RPiOS is the lack of hardware acceleration for video decoding but that isn’t much of a problem for RetroPie and this feature will be implemented in the near term at Pi Towers.
@dankcushions is 100% right, the real hw limitation on the Pi4 is the GPU. The first GB of the Pi4’s memory is only available to the GPU through CMA so when you start to crank up the resolution on 3D emulators the GPU will become the bottleneck. The Vulkan driver helps but it can only do so much until the 1GB limitation rears its ugly head.
The Vulkan driver is getting optimization improvements every few weeks but the only way to get that driver is from building from MESA gitlab. I suspect that the user base won’t see the driver in apt until the next Debian major release “Bullseye” whenever that happens.
-
@bluestang most pi4 are sold with 4gb of memory. Why is there a 1 GB bottleneck?
-
@billymild it's a 1 GB access limitation for the GPU. the CPU has access to the full memory.
i'm not convinced that any viable emulation task would ever get near 1GB GPU memory, though. i believe the internal bandwidth of the pi and various bottlenecks in the GPU would slow framerates to a crawl at the kinds of tasks that need 1GB VRAM, regardless.
realtime VRAM usage used to be easy enough to measure:
sudo vcdbg reloc
(i presume this doesn't work correctly on pi4, though.)
-
Specific to MAME, there are measurable improvements comparing 32bit to 64bit.
I benchmarked 650 games in MAME on an RPi3B+ and an RPi4B, and compared stock speeds, overclocked, 32bit and 64bit (full kernel+os+binary). These are tested on Raspbian Buster (based on Debian 10 Buster, GCC8.3).
Write up and results are here:
https://stickfreaks.com/misc/raspberry-pi-mame-benchmarks
Within that link you can find the raw CSV data, as well as a Google Sheet link with some silly stats I did. Across all 650 games there's an average jump of around 18% on an RPi4, however there are individual games that see well over 50% speedups (and a very small amount that go backwards). Overall it seems to be very positive.
I can't speak for other emulators or RetroArch cores, but I would assume similar results for those given the inherent nature of what emulation does. If anyone knows of a similar method of benchmarking other emulators, please let me know. (MAME makes this easy with a "-bench" command line flag, and the ability to automate the process and output the results to a text file for easy logging). I'm happy to repeat the tests on other emulators if there's an objective way to do so that doesn't involve manually recording frame rates from a GUI overlay.
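For reference, the kind of automation described above can be sketched in a few lines of Python. This is not the script used for the published results, just an illustration under assumptions: the function names are made up, the binary is assumed to be called mame and on the PATH, the 90 second run length is a placeholder, and the parser assumes MAME ends a benchmark run with a line of the form "Average speed: NN.NN% (N seconds)".

```python
import re
import subprocess
from typing import Optional

# Assumed format of MAME's end-of-run summary line.
_AVERAGE = re.compile(r"Average speed:\s*([0-9.]+)%")


def parse_average_speed(output: str) -> Optional[float]:
    """Extract the average emulation speed (percent) from MAME's output, if present."""
    match = _AVERAGE.search(output)
    return float(match.group(1)) if match else None


def bench_game(rom: str, seconds: int = 90, mame: str = "mame") -> Optional[float]:
    """Run one game with -bench and return its average speed, or None if not found."""
    result = subprocess.run(
        [mame, rom, "-bench", str(seconds)],
        capture_output=True,
        text=True,
        check=False,  # a failed run simply yields no summary line
    )
    # Search both streams, since the summary may go to either.
    return parse_average_speed(result.stdout + result.stderr)
```

Looping bench_game over a list of ROM names and writing each name and speed to a CSV file would give one result set per configuration (32bit, 64bit, overclocked and so on) that can then be compared game by game.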
Similar results happened with MAME on x86 when it moved to 64bit. Discussion from back in 2007 here: http://forum.arcadecontrols.com/index.php?topic=74600.0
-
@elvis said in Raspberry Pi OS 64bit version:
Specific to MAME, there are measurable improvements comparing 32bit to 64bit.
I benchmarked 650 games in MAME on an RPi3B+ and an RPi4B, and compared stock speeds, overclocked, 32bit and 64bit (full kernel+os+binary). These are tested on Raspbian Buster (based on Debian 10 Buster, GCC8.3).
Write up and results are here:
https://stickfreaks.com/misc/raspberry-pi-mame-benchmarks
Within that link you can find the raw CSV data, as well as a Google Sheet link with some silly stats I did. Across all 650 games there's an average jump of around 18% on an RPi4, however there are individual games that see well over 50% speedups (and a very small amount that go backwards). Overall it seems to be very positive.
nice! did you run within the desktop, though? because MAME on pi4 via X is allegedly much faster, to the point that i'm not really sure there's any value in any other type of optimization, at least via retropie (which is not run within X), until we've investigated this further (via kms vs fkms, later sdl2 versions, etc).
I can't speak for other emulators or RetroArch cores, but I would assume similar results for those given the inherent nature of what emulation does.
i'm afraid very few of the poorly performing emulators on pi are cpu bound. it's true that emulation is traditionally cpu-intensive, but most emulators in retropie on pi have dynarecs and/or accelerated graphics (leveraging gpu) that make the cpu requirements often quite low, and it's more often the GPU or system bandwidth that is the bottleneck (the latter may be improved in 64-bit, mind). mame is an exception to this as they don't typically use dynarecs or accelerated 3d graphics, opting for full-fat cpu emulation.
that said, it could be useful for those who want faster than fullspeed performance in 2d emulators, for fast-forward functions, etc.
If anyone knows of a similar method of benchmarking other emulators, please let me know. (MAME makes this easy with a "-bench" command line flag, and the ability to automate the process and output the results to a text file for easy logging). I'm happy to repeat the tests on other emulators if there's an objective way to do so that doesn't involve manually recording frame rates from a GUI overlay.
it's possible to harvest benchmarking data from retroarch cores via verbose logging. an unfinished POC: https://github.com/dankcushions/retropie-auto-testing
-
@dankcushions said in Raspberry Pi OS 64bit version:
nice! did you run within the desktop, though? because MAME on pi4 via X is allegedly much faster, to the point that i'm not really sure there's any value in any other type of optimization, at least via retropie (which is not run within X), until we've investigated this further (via kms vs fkms, later sdl2 versions, etc).
I used the MAME supplied "-bench" flag. It renders video and audio internally, but doesn't send them to the GPU/screen/sound hardware. This is useful in that you see the true "raw" performance of MAME, free from any display technology or driver issues.
I'm happy to retest games under actual display modes if required. MAME supports a number of screen display and rendering modes (including a number of legacy lossy-compressed framebuffer modes like YUV420, which were brilliant back in the day of poor 2D performance in Linux), but also your usual candidates like OpenGL, SDL, X11 etc. That will also show exactly how much overhead actually spitting out a picture adds on top of the upper bound of CPU performance.
I'm currently running the overclocked Pi4 64bit tests again under Debian Bullseye (merely dist-upgrading Raspbian from Buster) and a newly compiled MAME via GCC10 (up from GCC8 on Buster). If you have specific video outputs you want me to test, let me know (I'll fetch a complete list and post it when I'm back at a PC).
it's possible to harvest benchmarking data from retroarch cores via verbose logging. an unfinished POC: https://github.com/dankcushions/retropie-auto-testing
Brilliant, cheers, I'll check that out. My overall goal was less around comparing 32bit to 64bit, and more just trying to come up with an objective yes/no answer to the "will a Pi run my favourite game/system?" question. So even with all the dynarec points acknowledged, this will help me with my original goal.