    Versatile C++ game scraper: Skyscraper

      muldjord @timb

      @timb Thank you, I'll give it some thought. I'm leaning towards a solution that does it automatically for each localdb entry when it is requested. That is, if I decide to go ahead with it at all. :) I'm also quite happy with the solution as it is if I get that weird bug fixed you posted about earlier.

        timb @muldjord

        @muldjord
        Speaking of the weird bug, it gets weirder!

        Total number of games: 733
        Successfully scraped games: 661
        Skipped games: 72 (Filenames saved to '~/.skyscraper/skipped-screenscraper.txt')
        

        Now, these same 733 games scraped just fine as extracted ROMs. If I use 7z on the Pi to extract these problem files (the same 7z Skyscraper is using) they extract fine and show the correct hashes. So I know it's not the archive or ROMs that are the problem.

        All 72 of the skipped files are returning the same hash (that I posted earlier, it doesn't appear to change). Weird, right?

          muldjord @timb

          @timb I have a guess. Maybe it's because it checksums "no data" (or an error message) because of the way I use the QProcess read or something. It's very hacked. Haven't had time to look properly into this yet, I'm just guessing without having entirely checked your descriptions yet.

            timb @muldjord

            @muldjord
            I had similar thoughts earlier. An easy way to see if it’s checksumming an error message from 7zip is to entirely suppress all output from it. We can do that with the following flags:

            7z e -y -bd -bso0 -bse0 -bsp0 -so
            

            That essentially completely disables console, error and progress output and also answers ‘yes’ to any prompts. It’s the closest you can get to a “-qq” flag. I’ll add that to the source this afternoon, recompile and see if it makes any difference.
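One way to check the "checksumming no data" theory from the shell: the SHA-1 of empty input is a fixed, well-known value, so if every skipped file reports that digest, nothing reached the hasher. This is only a minimal sketch, assuming `sha1sum` from GNU coreutils is installed; the function names are made up for illustration and this is not how Skyscraper does it internally.

```shell
#!/bin/sh
# SHA-1 of zero bytes of input (a fixed, well-known value).
EMPTY_SHA1=da39a3ee5e6b4b0d3255bfef95601890afd80709

# Print only the hex digest of whatever arrives on stdin.
sha1_of_stdin() {
    sha1sum | cut -d ' ' -f 1
}

# Succeeds if the given digest is the "no data" digest.
is_empty_hash() {
    [ "$1" = "$EMPTY_SHA1" ]
}

# Hypothetical helper: hash an archive's extracted contents using the
# quiet 7z flags from this post (needs 7z; defined here but not run).
archive_sha1() {
    7z e -y -bd -bso0 -bse0 -bsp0 -so "$1" | sha1_of_stdin
}
```

Something like `is_empty_hash "$(archive_sha1 game.7z)" && echo "7z produced no data"` would then flag archives where the extraction step delivered nothing. If the repeated hash is some other value, the hasher is more likely being fed an error message than empty input.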

            No rush on this, by the way. I appreciate you implementing it at all. :)

              muldjord

              Please git pull and try it again. I've improved it quite a bit and added further error checking. I'm interested in knowing if you still get those weird same-SHA1 errors. Also, I've added a 20 meg limit to the unpacking feature to try to avoid running into memory limitations on the Pi. Hopefully this is temporary, as I would like it to read chunks instead, but so far I haven't gotten that to work reliably. So for now, calculating the checksums takes up as much RAM as the ROM takes up on disk. And if you run several threads AND the compositor is working with images, that spells trouble.
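For anyone wanting to pre-check their own collection against a size limit like the one described above, here is a rough shell sketch. It assumes the limit is 20 MiB (the post only says "20 meg") and that it applies to the file size on disk; `MAX_UNPACK_BYTES` and `within_unpack_limit` are hypothetical names, not part of Skyscraper.

```shell
#!/bin/sh
# Assumed limit: 20 MiB (the post only says "20 meg").
MAX_UNPACK_BYTES=$((20 * 1024 * 1024))

# Succeeds if the file exists and is no larger than the limit.
# An optional second argument overrides the limit (in bytes).
within_unpack_limit() {
    [ -f "$1" ] || return 1
    size=$(($(wc -c < "$1")))
    [ "$size" -le "${2:-$MAX_UNPACK_BYTES}" ]
}
```

For example, `within_unpack_limit game.7z || echo "game.7z is over the limit"` would list files that might not be unpacked.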

                timb @muldjord

                @muldjord

                Whatever you did seems to have fixed it! I’ll let you know how the other platforms go (I’ve only tried NES so far). I’m just putting the finishing touches on my script that swaps the SHA1 hashes in the DB files; I want to run it first before scraping the other platforms.

                A 20MiB limit seems sensible for now. I don’t think any of my ROMs come close to that. (I keep my N64 stuff uncompressed as the emulator doesn’t support on the fly decompression.)

                  A Former User

                  The only time I could see the 20 MiB limit coming into play is if someone has zipped PSX PBP files. I am unsure if the PSX emulators allow for zipped media though.

                    muldjord @A Former User

                    @livefastcyyoung Yes, it would probably only be relevant for the "newer" platforms such as psx and n64 so I think it'll be ok. Thanks for your input guys, I appreciate all of it!

                      timb @muldjord

                      @muldjord
                      So once I converted all the hashes in the database with my script, it seems to have successfully scraped everything just fine. Thanks for getting the basics of this feature working! :)

                        muldjord @timb

                        @timb Glad it works :) And you're welcome! I'll probably make a release with this in a few days.

                          SteveW25561

                          This tool is amazing! Thanks for your work on this, @muldjord

                          Is there a way to get Skyscraper to scrape ALL of the systems in my RetroPie directory (or a selected set), rather than specifying one at a time? The sselph scraper allows you to choose "ALL" or "Selected" systems, and I was looking for the same in Skyscraper but don't see it in the docs.

                          Also, I just scraped a bunch of MAME games and many of the videos Skyscraper got back were not playable (initial simple mode scrape). I can see altbeast.mp4 (2.8 MB) or centiped.mp4 (928 KB) for example, and they won't play via Mac or on the Pi. Any way to fix this?

                            muldjord @SteveW25561

                            @stevew25561 Thank you, glad you like it! :) No, there is no way to scrape all platforms, you'll have to script that yourself. :)

                            Did you get those videos from screenscraper or arcadedb? I just scraped centiped from both arcadedb and screenscraper and they play just fine with mplayer. I did notice that it has some weird dimensions because centipede is a vertical game. So I'm guessing it's because EmulationStation has issues with that. Not really something I can fix I'm afraid, as I basically just download the videos and save them as is.

                              parasven

                              @muldjord
                              Is there a way to get the media from screenscraper that actually belongs to the scraped game's region? For example, there are multiple covers for this game:

                              007 - The World Is Not Enough:
                              https://www.screenscraper.fr/gameinfos.php?gameid=102752&action=onglet&zone=gameinfosmedias

                              The covers are different for different regions of the game. Is it actually possible to get the cover for the German version through Skyscraper?

                              In the Skyscraper source I found the following regions:
                              eu
                              us
                              ss
                              uk
                              wor
                              jp

                              Are there more options for the region parameter?
                              What do the parameter ss and wor stand for?
                              wor = world?

                              https://github.com/parasven

                                muldjord @parasven

                                @parasven If you want the German ones, just use '--region de' I believe it is. 'ss' simply means 'screenscraper' and is a generic region they apply to any media they don't have a region for. The regions listed in the source are just the priority list I use internally. 'wor' just means world I'm guessing. Probably for games that are regionless.

                                There are many more options, they are listed with this call (Chrome can view this url directly, otherwise save it and open it in a text editor):
                                https://www.screenscraper.fr/api/regionsListe.php?devid=xxx&devpassword=yyy&softname=zzz&output=xml&ssid=test&sspassword=test

                                Btw, keep in mind that even though you set '--region de' it doesn't mean it will find German versions of all the covers. It always falls back to the internal regions if the one provided manually can't be found. So you will still see the others for the ones where a 'de' version didn't exist.
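The fallback behaviour described above can be pictured with a small shell sketch. It assumes the internal priority order is the one quoted earlier in the thread (eu, us, ss, uk, wor, jp); the real order and logic inside Skyscraper may well differ, and `pick_region` is a made-up name for illustration only.

```shell
#!/bin/sh
# Assumed internal priority order, copied from the list quoted earlier
# in the thread; the real order in Skyscraper may differ.
FALLBACK_REGIONS="eu us ss uk wor jp"

# pick_region PREFERRED "AVAILABLE REGIONS"
# Prints the first region that has media, trying the preferred one first.
pick_region() {
    for region in "$1" $FALLBACK_REGIONS; do
        case " $2 " in
            *" $region "*) printf '%s\n' "$region"; return 0 ;;
        esac
    done
    return 1
}
```

With this sketch, `pick_region de "us jp de"` picks `de`, while `pick_region de "jp us"` falls back to `us` because no German cover is available.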

                                  parasven @muldjord

                                  @muldjord
                                  Thank you very much for that list. I found some of these regions by hand through trial and error hehe

                                  Love your scraper btw. It is very fast and works like a charm. The localDB is a very cool feature to be honest :)

                                  https://github.com/parasven

                                    Used2BeRX @dorkvader

                                    @dorkvader Just noticed this now man.

                                    Damned if I can see your email there though man. Is there something I'm supposed to do when I load your profile? I just can't find it.

                                      SteveW25561 @muldjord

                                      @muldjord Thanks for the reply.

                                      I used the simple mode to scrape so I can't tell what video source they came from. I'll try again in manual mode and specify screenscraper or arcadedb.

                                      Additional questions:

                                      1. Is there a command to scrape and overwrite just one rom file? For example, I have centiped.zip in /home/pi/RetroPie/roms/mame-libretro

                                      2. What does scraping one rom do to the gamelist.xml file?

                                      Thanks...

                                        muldjord @SteveW25561

                                        @stevew25561 You can just put the full or partial path of a rom as the last part of the command line like so:

                                        $ Skyscraper -p [platform] -s [source] --videos /partial/or/full/path/to/romfile.zip
                                        

                                        It won't touch the gamelist.xml when you do that. But in order to make use of the data afterwards, you should always rescrape with just:

                                        $ Skyscraper -p [platform] --videos
                                        

                                        Which will regenerate the gamelist.xml from all of the locally cached data. This step is also where you can see where the videos are from. The source will be listed in parentheses.

                                        Also remember that if you always scrape with videos enabled you can just put this in '~/.skyscraper/config.ini'

                                        [main]
                                        videos="true"
                                        

                                        Then you won't need to add the --videos every time you scrape. Check more options in the '~/.skyscraper/config.ini.example' file.
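The two-step flow described above (scrape one ROM into the cache, then regenerate gamelist.xml) could be wrapped in a small shell function. This is only a sketch built from the two commands in this post; `rescrape_rom` and the `SKYSCRAPER` override variable are hypothetical names added for illustration.

```shell
#!/bin/sh
# rescrape_rom PLATFORM SOURCE ROMFILE
# Step 1 scrapes one ROM into the local cache; step 2 regenerates
# gamelist.xml from the cache, and only runs if step 1 succeeded.
# Set SKYSCRAPER="echo Skyscraper" to print the commands instead (dry run).
rescrape_rom() {
    ${SKYSCRAPER:-Skyscraper} -p "$1" -s "$2" --videos "$3" &&
        ${SKYSCRAPER:-Skyscraper} -p "$1" --videos
}
```

Usage would look like `rescrape_rom [platform] [source] /partial/or/full/path/to/romfile.zip`. If videos are already enabled in config.ini, the `--videos` flags can be dropped.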

                                          dorkvader @Used2BeRX

                                          @used2berx OK, it should be fixed now. I guess I didn't save the setting to show email.

                                            timb @SteveW25561

                                            @stevew25561

                                            For what it’s worth, this is how I scrape multiple platforms:

                                            Skyscraper -p nes -s screenscraper --unpack -u user:pass && \
                                            Skyscraper -p snes -s screenscraper --unpack -u user:pass && \
                                            Skyscraper -p n64 -s screenscraper -u user:pass && \
                                            Skyscraper -p gb -s screenscraper --unpack -u user:pass && \
                                            Skyscraper -p gbc -s screenscraper --unpack -u user:pass && \
                                            Skyscraper -p gba -s screenscraper --unpack -u user:pass && \
                                            Skyscraper -p megadrive -s screenscraper --unpack -u user:pass && \
                                            Skyscraper -p sega32x -s screenscraper --unpack -u user:pass && \
                                            Skyscraper -p segacd -s screenscraper --unpack -u user:pass
                                            

                                            The && between commands tells the shell to run the next command only if the previous command completed without errors (exit status 0). So, in the example above, if the NES, SNES and N64 instances all scrape fine, but the GB instance runs into a problem and quits, the chain will stop there. The \ allows you to split a single command among multiple lines. (A backslash at the very end of a line tells the shell to ignore the newline and continue the command on the next line, instead of executing it right away.)

                                            This is a quick and dirty way to do it, but works fine. I could whip up a short shell script that would be a lot cleaner and allow you to pass it a list of platforms to scrape and args to pass to Skyscraper if you'd like. (Something like this: ./skywrapper.sh -p 'nes megadrive snes n64' -wargs '-s screenscraper -u user:pass')
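A loop-based version of the chain above might look roughly like the following. It is a sketch and not the skywrapper.sh mentioned here: `scrape_platforms` and the `SKYSCRAPER` override variable are hypothetical names, and it assumes Skyscraper exits with a non-zero status when a scrape fails.

```shell
#!/bin/sh
# scrape_platforms "PLATFORM LIST" [ARGS FOR SKYSCRAPER...]
# Runs Skyscraper once per platform and stops at the first failure,
# mirroring the behaviour of the && chain.
# Set SKYSCRAPER="echo Skyscraper" to print the commands instead (dry run).
scrape_platforms() {
    platforms=$1   # space-separated list, e.g. "nes snes n64"
    shift          # remaining arguments are passed to Skyscraper unchanged
    for platform in $platforms; do
        ${SKYSCRAPER:-Skyscraper} -p "$platform" "$@" || return 1
    done
}
```

For example, `scrape_platforms "nes snes gb" -s screenscraper --unpack -u user:pass` covers the platforms that use `--unpack`, and a second call without `--unpack` handles n64.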
