    Versatile C++ game scraper: Skyscraper

      fartsparkles

      Does anyone have any idea why I can't scrape two specific ROMs? I'm using screenscraper, and I can see both the game and the exact file I have in their database (filename, CRC, MD5, and SHA1 are exact matches). This only happens with two ROMs that also happen to have period (.) characters in their names (not sure if this is relevant). The ROMs are in binary form, not zipped, as per the screenscraper.fr DB entry.

      Running Skyscraper 3.0.1 via the RetroPie Setup script.

        mitu Global Moderator @fartsparkles

        @fartsparkles Maybe it would help to give the names/crcs for those ROMs and what system they're in. They might exist on SS, but have a different system.

          fartsparkles

          I tested with Skraper and had no issue scraping the ROMs. These are the specific ROMs that won't scrape with Skyscraper:
          https://www.screenscraper.fr/rominfos.php?romid=315352
          https://www.screenscraper.fr/rominfos.php?romid=314736
          I'm not a C++ person, so I haven't stepped through the code; however, I find it interesting that both these ROMs have periods in the name.

              muldjord @fartsparkles

              @fartsparkles I've been looking into this a bit, and I can rule out that it has anything to do with the "." in the file name (at least on the Skyscraper side of things). I tested this by giving Screenscraper a custom query with the --query option, asking specifically for the md5 or sha1 checksum of the ROM you mention. Even then, I still get a direct "Rom not found" back from Screenscraper. So this almost has to be a Screenscraper bug, not a Skyscraper bug. If Screenscraper does not provide a result for a direct md5 or sha1 checksum query, I simply can't do anything about it.

              EDIT: If you want to mess around with this yourself, here's the command line needed to try with specific md5 sums. You can also change that to sha1 if you want:

              Skyscraper -p atari2600 -s screenscraper --verbosity 3 --query "md5=<CHECKSUM>" "<FILENAME>"
              

              Verbosity 3 simply gives you a bit more detail in the output.
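
For anyone who wants to try this, here is a minimal sketch of filling in the checksum and file name from a ROM file. It assumes the `md5sum` tool from GNU coreutils is installed. The helper names `rom_md5` and `print_query_command` are made up for illustration, and the second helper only prints the command rather than running Skyscraper.

```shell
#!/bin/sh
# Hypothetical helper: print the MD5 checksum of a file (requires md5sum).
rom_md5() {
    md5sum "$1" | cut -d ' ' -f 1
}

# Hypothetical helper: print (not run) the Skyscraper command quoted above,
# with the checksum and file name filled in.
print_query_command() {
    platform="$1"
    rom="$2"
    printf 'Skyscraper -p %s -s screenscraper --verbosity 3 --query "md5=%s" "%s"\n' \
        "$platform" "$(rom_md5 "$rom")" "$rom"
}
```

Calling `print_query_command atari2600 "/path/to/rom.bin"` prints a command line that can be copied into a terminal. For a sha1 query, swapping `md5sum` for `sha1sum` and `md5=` for `sha1=` should work the same way.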

              EDIT2: Something is seriously wrong with Screenscraper right now. I can't get it to find any ROM based on checksum alone. I just tried with Super Mario World for SNES, and that only returns a result when queried by file name, not when I try with just the checksums. This usually works just fine.

              My best advice is to give it a week and see if it resolves itself. I'm guessing they are having some issues with their database that need resolving before this will work.

                Brunnis @muldjord

                @muldjord Maybe I suffer from the same issue as @fartsparkles ? The strange thing is that Steven Selph's scraper, using Screenscraper, finds these 40 games that Skyscraper can't identify. I'll re-post what I just wrote on Reddit:

                "I've been setting up a new RetroPie setup and thought I'd use Skyscraper. However, I'm having a strange issue with it. I have a curated list of NES roms which are all verified against No-Intro. Both RetroArch's own DB (which checks against No-Intro hashes) and Steven Selph's scraper (using Screenscraper) correctly identify these ROMs. However, Skyscraper fails on 40 of them, with "No returned matches".

                Here's an example: Batman (U) [!].nes (MD5: 2e9f52556273aa735d0e75649541d812)

                This ROM can be found at screenscraper.fr with the exact same hash: https://www.screenscraper.fr/rominfos.php?romid=135447

                I've tried both versions 2.9.5 and 3.0.1 of Skyscraper. Same issue. Anyone have any ideas? Maybe /u/muldjord

                I feel like I must be missing something obvious here... Still, the other methods mentioned above work out of the box with these ROMs."

                  muldjord

                  I'm currently following some threads on their forums that might be related to this. For the time being Skyscraper seems to be blocked, and I have no idea why. I'm awaiting a reply from them.

                  EDIT: Skyscraper is currently banned from using the Screenscraper database due to the 4 passes I do with varying checksums and file name. I'm working on resolving this. It will most likely mean that Skyscraper will get a new API key and all users will have to update to a new version of Skyscraper when the new key has been issued. Stay tuned.

                    Brunnis @muldjord

                    Ahh, okay! Do you think this ties into the issue I was seeing regarding not finding ROMs by using checksums?

                      muldjord @Brunnis

                      @Brunnis Sort of. :) I can't test anything right now as Skyscraper is blocked, so we'll see when the situation clears up.

                        mo418 @muldjord

                        @muldjord I was about to ask why I get « no game found » for all my roms using screenscraper.

                        Thanks for the heads up. Keep us posted (and keep up the good work!)

                        Regards :)

                          muldjord

                          The good people of Screenscraper have been very helpful in figuring out how to move forward with this, and the issue is now resolved. Stay tuned for more info soon...

                            mo418 @muldjord

                            @muldjord I confirm. Working like a charm now.

                            Thanks for your support :)

                            Edit: Lots of games are not found though, but this is only the second time I've used your scraper, so I don't know whether it's a settings issue or something else.

                              muldjord @mo418

                              @mo418 This will probably be better once the new release (3.0.2) is out. There are issues with the screenscraper module currently in addition to the previous key ban.

                                muldjord

                                Skyscraper 3.0.2 released: https://github.com/muldjord/skyscraper

                                • Upped the rom limit from 5 to 35 for the "igdb" module
                                • Upped the rom limit from 25 to 35 for the "mobygames" module
                                • Added media cache config options to module section
                                • Added Sharp X1 platform as "x1"
                                • Now exits nicely when running low on disk space
                                • Added 'spaceCheck=<BOOL>' to config.ini
                                • Fixed crash when using '--startat' and '--endat' where the '--endat' file name came before the '--startat' file name. Note! What 'ls' reports as alphabetical order is not always what Skyscraper sees, as it is locale specific. So be aware of this. A huge thanks to 'Gemba' for taking the time to investigate this bug thoroughly.
                                • Fixed bug in game list metadata preservation when using relativePaths and '<folder>' entries (thank you to 'HumanRob' for reporting this)
                                • Fixed game list entries skipping for 'relativePaths' and '<folder>' instances
                                • Skyscraper now saves the cache and exits nicely on ctrl+c (SIGINT) (thanks to 'krcroft' for pointing this out)
                                • The 'screenscraper' module now includes 'systemeid' in the query for better results
                                • Now skips the game list assembling when in gathering mode
                                • Now skips cache saving when in game list generation mode
                                • Output now says whether it was a gathering run or a game list generation run
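
To illustrate the note about locale-specific ordering in the '--startat' / '--endat' item above, here is a small sketch that sorts a few made-up file names under the byte-order "C" locale. The function name `sort_c` and the file names are hypothetical. It only shows that ordering depends on the locale; it does not show which ordering Skyscraper itself uses.

```shell
#!/bin/sh
# Hypothetical helper: sort lines from stdin using the "C" locale, where
# comparison is by byte value, so uppercase ASCII sorts before lowercase.
sort_c() {
    LC_ALL=C sort
}

printf '%s\n' 'b.nes' 'A.nes' 'a.nes' 'B.nes' | sort_c
# A.nes
# B.nes
# a.nes
# b.nes
```

Running the same input through plain `sort` (or looking at `ls`) under another locale may interleave upper and lower case differently, which is why a '--startat' name can end up after the '--endat' name.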

                                Fixed a bunch of stuff in the game list skipping and metadata preservation code. This was sort of b0rked before when people used relative paths. Should work as expected now. Let me know if you find cases where it doesn't work.
                                The screenscraper module now includes 'systemeid' in the query which should give better results for all platforms.
                                I've also included another quite important feature, which is the "exit nicely" behavior when the user presses ctrl+c to stop a scraping run. Before, it would simply force the process to die. Now it lets the currently running threads finish up the entries they are working on and then saves the cache for the stuff it has gone through. Before, this data was lost. So this is a huge improvement.

                                  Brunnis @muldjord

                                  @muldjord That’s awesome! I can’t test it out until tomorrow morning. Do you have reason to believe this also fixes the issue where Skyscraper missed hits on hashes that should have been found?

                                    muldjord @Brunnis

                                    @Brunnis Yes, I can do manual md5 searches now, so that should be fixed too.

                                      mitu Global Moderator

                                      Yes, I just updated and tried an MD5 search (the one reported first, with the Atari 2600 E.T. game) and it's found now.

                                        scocasso @muldjord

                                        @muldjord said in Versatile C++ game scraper: Skyscraper:

                                        Currently you have the option to either scrape from scratch, or to only scrape entries that don't already exist.

                                        I cannot find this option. I don't want to accidentally scrape all 2000+ games, I only need to scrape about 50 which are new.

                                        I'm using Skyscraper 3.0.1 on RetroPie, installed from the RetroPie tools menu.

                                          mo418

                                          When it says [...] PASS 1 ——— Game [gametitle] not found :( ———

                                          Does it mean it will make other passes until it finds the ROM media in screenscraper?

                                          Updated to 3.0.2 and I still get a lot of them (while UXS found them all, except very few titles) so there must be something going on.

                                          Also, is there a way to export the titles of the games that were not found?

                                          Thanks in advance!

                                            mitu Global Moderator @mo418

                                            @mo418 When running with --unattend, the list of titles that were not found is saved in ~/.skyscraper/skipped-<scraper>.txt. It will tell you so at the end of the run:

                                            [...]
                                            Total number of games: 2
                                            Successfully processed games: 1
                                            Skipped games: 1 (Filenames saved to '~/.skyscraper/skipped-thegamesdb.txt')
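
As a rough sketch of working with that file afterwards, the snippet below counts its entries. It assumes the file holds one file name per line, which the thread does not confirm, and the helper name `count_skipped` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print how many entries a skipped-list file contains,
# assuming one file name per line (an assumption, not confirmed above).
count_skipped() {
    awk 'END { print NR }' "$1"
}
```

For example, `count_skipped ~/.skyscraper/skipped-thegamesdb.txt` would print the number of skipped file names from the run shown above, and `cat` on the same path lists them.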
                                            