    Versatile C++ game scraper: Skyscraper

      parasven

      @muldjord
      Is there a way to get the media from screenscraper that actually belongs to the scraped game's region? For example, there are multiple covers for this game:

      007 - The World Is Not Enough:
      https://www.screenscraper.fr/gameinfos.php?gameid=102752&action=onglet&zone=gameinfosmedias

      The covers are different for different regions of the game. Is it actually possible to get the cover for the German version through Skyscraper?

      In the Skyscraper source I found the following regions:
      eu
      us
      ss
      uk
      wor
      jp

      Are there more options for the region parameter?
      What do the parameters ss and wor stand for?
      wor = world?

      https://github.com/parasven

        muldjord @parasven

        @parasven If you want the German ones, just use '--region de', I believe. 'ss' simply means 'screenscraper' and is a generic region they apply to any media they don't have a region for. The regions listed in the source are just the priority list I use internally. 'wor' just means world, I'm guessing; probably for games that are regionless.

        There are many more options, they are listed with this call (Chrome can view this url directly, otherwise save it and open it in a text editor):
        https://www.screenscraper.fr/api/regionsListe.php?devid=xxx&devpassword=yyy&softname=zzz&output=xml&ssid=test&sspassword=test

        Btw, keep in mind that even if you set '--region de', it won't necessarily find German versions of all the covers. It always falls back to the internal regions if the one provided manually can't be found. So you will still see the other regions' covers for games where no 'de' version exists.
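
A rough shell sketch of the fallback just described. The priority order is simply the list parasven quoted from the source; the real order and matching logic inside Skyscraper may differ. `pick_region` is a made-up helper name, not part of Skyscraper.

```shell
#!/bin/sh
# Hypothetical helper illustrating the fallback idea; not Skyscraper code.
# $1 = region requested with --region
# $2 = space-separated regions that actually have media for the game
pick_region() {
    wanted=$1
    available=$2
    # Try the requested region first, then the assumed internal priority list.
    for candidate in "$wanted" eu us ss uk wor jp; do
        for region in $available; do
            if [ "$region" = "$candidate" ]; then
                echo "$candidate"
                return 0
            fi
        done
    done
    return 1  # no media in any known region
}

pick_region de "us de jp"   # a 'de' cover exists, so it is chosen
pick_region de "jp us"      # no 'de' cover, falls back through the list
```

The first call prints de and the second prints us, which is why covers from other regions still show up after setting '--region de'.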

          parasven @muldjord

          @muldjord
          Thank you very much for that list. I found some of these regions by hand through trial and error, hehe.

          Love your scraper btw. It is very fast and works like a charm. The localDB is a very cool feature to be honest :)

          https://github.com/parasven

            Used2BeRX @dorkvader

            @dorkvader Just noticed this now man.

            Damned if I can see your email there though man. Is there something I'm supposed to do when I load your profile? I just can't find it.

              SteveW25561 @muldjord

              @muldjord Thanks for the reply.

              I used the simple mode to scrape so I can't tell what video source they came from. I'll try again in manual mode and specify screenscraper or arcadedb.

              Additional questions:

              1. Is there a command to scrape and overwrite just one rom file? For example, I have centiped.zip in /home/pi/RetroPie/roms/mame-libretro

              2. What does scraping one rom do to the gamelist.xml file?

              Thanks...

                muldjord @SteveW25561

                @stevew25561 You can just put the full or partial path of a rom as the last part of the command line like so:

                $ Skyscraper -p [platform] -s [source] --videos /partial/or/full/path/to/romfile.zip
                

                It won't touch the gamelist.xml when you do that. But in order to make use of the data afterwards, you should always rescrape with just:

                $ Skyscraper -p [platform] --videos
                

                This regenerates the gamelist.xml from all of the locally cached data. This step is also where you can see where the videos came from: the source is listed in parentheses.

                Also remember that if you always scrape with videos enabled you can just put this in '~/.skyscraper/config.ini'

                [main]
                videos="true"
                

                Then you won't need to add the --videos every time you scrape. Check more options in the '~/.skyscraper/config.ini.example' file.
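
The two-step flow above (scrape one rom into the cache, then regenerate the gamelist) could be wrapped roughly like this. It is only a sketch: `rescrape_one` and the `SKYSCRAPER` variable are names invented here, and it uses only the flags already shown in this post.

```shell
#!/bin/sh
# SKYSCRAPER is a hypothetical override so the commands can be previewed,
# e.g. SKYSCRAPER="echo Skyscraper". By default the real binary is run.
SKYSCRAPER=${SKYSCRAPER:-Skyscraper}

# rescrape_one PLATFORM SOURCE ROMFILE
rescrape_one() {
    # Step 1: scrape a single rom into the cache (gamelist.xml is untouched).
    $SKYSCRAPER -p "$1" -s "$2" --videos "$3" || return 1
    # Step 2: regenerate gamelist.xml from the locally cached data.
    $SKYSCRAPER -p "$1" --videos
}
```

Called as `rescrape_one [platform] [source] /path/to/romfile.zip`, the second step only runs if the first one succeeded.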

                  dorkvader @Used2BeRX

                  @used2berx OK, it should be fixed now. I guess I didn't save the setting to show email.

                    timb @SteveW25561

                    @stevew25561

                    For what it’s worth, this is how I scrape multiple platforms:

                    Skyscraper -p nes -s screenscraper --unpack -u user:pass && \
                    Skyscraper -p snes -s screenscraper --unpack -u user:pass && \
                    Skyscraper -p n64 -s screenscraper -u user:pass && \
                    Skyscraper -p gb -s screenscraper --unpack -u user:pass && \
                    Skyscraper -p gbc -s screenscraper --unpack -u user:pass && \
                    Skyscraper -p gba -s screenscraper --unpack -u user:pass && \
                    Skyscraper -p megadrive -s screenscraper --unpack -u user:pass && \
                    Skyscraper -p sega32x -s screenscraper --unpack -u user:pass && \
                    Skyscraper -p segacd -s screenscraper --unpack -u user:pass
                    

                    The && between commands tells the shell to run the next command only if the previous one exited successfully. So, in the example above, if the NES, SNES and N64 instances all scrape fine but the GB instance runs into a problem and quits, the chain stops there. The trailing \ lets you split a single command across multiple lines. (It tells the shell to ignore the newline that follows and keep reading the command on the next line, instead of executing what it has so far.)

                    This is a quick and dirty way to do it, but it works fine. I could whip up a short shell script that would be a lot cleaner and let you pass it a list of platforms to scrape and args to pass to Skyscraper, if you'd like. (Something like this: ./skywrapper.sh -p 'nes megadrive snes n64' -wargs '-s screenscraper -u user:pass')
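
For illustration, a loop version of the chain could look like the sketch below. It is not the skywrapper.sh mentioned above: `scrape_all` and the `SKYSCRAPER` variable are names invented here, and the same extra arguments are applied to every platform (so the N64 line, which has no --unpack in the chain above, would need its own call).

```shell
#!/bin/sh
# SKYSCRAPER is a hypothetical override; set it to "echo Skyscraper"
# for a dry run that only prints the commands.
SKYSCRAPER=${SKYSCRAPER:-Skyscraper}

# scrape_all "PLATFORM LIST" [extra Skyscraper args...]
scrape_all() {
    platforms=$1
    shift
    for platform in $platforms; do
        # Stop at the first failure, like the && chain does.
        $SKYSCRAPER -p "$platform" "$@" || return 1
    done
}
```

Usage would be along the lines of `scrape_all "nes snes gb gbc gba" -s screenscraper --unpack -u user:pass`.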

                      easye9inches

                      Ubuntu is new to me and driving me crazy. Can someone help? I have my ROMs on an external drive and want to scrape the data to it.

                      My mount point is /media/usb0

                      The virtualboy for example, would be: /media/usb0/All ROMs/virtualboy

                      This is what is confusing me when I go to scrape:

                      Platform: 'virtualboy'
                      Scraper module: 'screenscraper'
                      Input folder: '/media/usb0/’All'
                      Game list folder: '/media/usb0/’All'
                      Covers folder: '/media/usb0/’All/covers'
                      Screenshots folder: '/media/usb0/’All/screenshots'
                      Wheels folder: '/media/usb0/’All/wheels'
                      Marquees folder: '/media/usb0/’All/marquees'
                      Videos folder: '/media/usb0/’All/videos'
                      Local db folder: 'dbs/virtualboy'

                      DID YOU KNOW: You can force a refresh of the locally cached data using the '--refresh' option. Skyscraper will then refetch the requested entries from the scraping sources, instead of loading it from cache. Sort of like Ctrl+F5 in a browser.

                      Forcing 1 threads as this is the anonymous limit in the ScreenScraper scraping module. Sign up for an account at https://www.screenscraper.fr and support them to gain more threads. Then use the credentials with Skyscraper using the '-u [user:password]' command line option or by setting 'userCreds=[user:password]' in '~/.skyscraper/config.ini'.

                      Looking for optional 'priorities.xml' file in local db folder... Found!
                      Priorities loaded successfully!

                      Input folder '/media/usb0/’All' doesn't exist or can't be seen by current user. Please check path and permissions.

                        muldjord

                        You need to quote the entire path since you have spaces in it. Otherwise the shell will see 'All ROMs' as two separate arguments, and the path will just stop at 'All'. So basically put in:

                        $ Skyscraper -p virtualboy -i "/media/usb0/All ROMs/virtualboy" -s screenscraper
                        

                        But instead of doing that, I would just add this to the ~/.skyscraper/config.ini

                        [virtualboy]
                        inputFolder="/media/usb0/All ROMs/virtualboy"
                        

                        Then you don't have to type it in all the time. Check ~/.skyscraper/config.ini.example for more available options, and also check the output of $ Skyscraper --help for all command line options. :)

                        EDIT: I just realized I've forgotten to add inputFolder as a possible option of the [main] section of config.ini. I'll fix this in the next release. When that's fixed you can add it as:

                        [main]
                        inputFolder="/media/usb0/All ROMs"
                        

                        Then it'll be used as the base for all platforms.
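
On the quoting point, a small shell demonstration of how the path gets split. `count_args` is a throwaway helper defined only for this example. The ’ character in the log output above suggests that typographic quotes were typed; that is an assumption, but the shell does treat those as ordinary characters and not as quotes.

```shell
#!/bin/sh
# count_args is a throwaway helper: it prints how many arguments it received.
count_args() {
    echo "$#"
}

count_args /media/usb0/All ROMs/virtualboy       # unquoted: split at the space
count_args "/media/usb0/All ROMs/virtualboy"     # straight quotes: one path
count_args /media/usb0/’All ROMs’/virtualboy     # curly quotes: still split
```

This prints 2, 1 and 2. Only the straight double (or single) quotes keep the path together as one argument.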

                          easye9inches @muldjord

                          @muldjord Ahhhh. Thanks, it works flawlessly now!

                          Question: I know you can have only one picture of each kind (screenshot, cover, etc.). If I wanted to copy the covers, marquees, etc. from the dbs folder over to the external drive as well, why are those pictures named like 0b809d4d49d064bd95d84b2865cc0f9304750b9d (3D Tetris cover)? Must I rename every one of them, lol?

                            muldjord @easye9inches

                            @easye9inches I've just released 2.7.4 earlier today. That allows you to add the 'inputFolder' variable to the [main] section in config.ini. Then it works for all platforms automatically.

                              Mick2K

                              I have a problem. I can't update. Any ideas?!

                                muldjord @Mick2K

                                @mick2k Please run the following and try again, I have been messing around with the 2.7.4 release because I found a silly bug, so I re-released.

                                Run this:

                                $ cd
                                $ cd skysource
                                $ rm VERSION
                                $ ./update_skyscraper.sh
                                

                                That should fix it and update to 2.7.4. Let me know how it goes.

                                  Mick2K @muldjord

                                  @muldjord said in Versatile C++ game scraper: Skyscraper:

                                  ./update_skyscraper.sh

                                  It worked. Thank you very much.

                                    easye9inches

                                    Is there a certain way to name a sub-folder so that it also gets scraped into the gamelist.xml? For instance, I added a "#JP Games" folder to the genesis library instead of having both a megadrive and a genesis system selection, just to condense the system selection down. But it did not scrape that sub-folder. Is that possible?

                                    Edit: Never mind. I see they were scraped and included in the gamelist.xml.

                                      muldjord @AnalogHero

                                      @analoghero Hey man, I've implemented the checksum option you suggested some time back. It's currently on the master branch and will be in 2.7.5 to be released soon. :)

                                      It works through the command line option '--query', which takes either a search query for the filename-based scraping modules, or one of 'sha1=[checksum]', 'md5=[checksum]' or 'romnom=[filename]' (rom name in French). It also requires a single rom to be passed on the command line, like so:

                                      $ Skyscraper -p [platform] -s screenscraper --refresh --query sha1=[checksum] /[path]/[to]/[romfile.zip]
                                      

                                      This will allow you to override the checksums used when searching for the game. So you can look one up at screenscraper, and just use that.

                                        AnalogHero @muldjord

                                        @muldjord Nice! Thanks for putting your time into this. Hope that others can use this option, too.

                                          muldjord @AnalogHero

                                          @analoghero It's a niche feature for sure, but I got another request for it, and I figured out a way to implement it in a way I was satisfied with. :)

                                            muldjord

                                            Skyscraper version 2.7.5 released: https://github.com/muldjord/skyscraper

                                            • Fixed a bug where 'brackets="false"' in config.ini would be flipped (Thanks to Vynce for reporting this)
                                            • Completely refactored pass procedures for cleaner code and to enable '--query' option
                                            • Added '--query' command line option. This option requires a single rom file to be passed on the command line as well, otherwise it will be ignored (Thank you to AnalogHero and Vynce for suggesting this)
                                            • Added scrapers to 'psx' and 'pc' platforms when using Simple Mode

                                            To elaborate on the "--query" option, this is how it works: For most modules a search query is sent to the scraping module in a URL format. That means that a filename such as "Rick Dangerous.lha" becomes "rick+dangerous". The '+' here means a space. You could probably also use the URL-encoded space, "rick%20dangerous", but my tests show that most modules expect spaces as '+'. And it is the "rick+dangerous" that you, as the user, can pass as the query, like so:

                                            $ Skyscraper -p [platform] -s [module] --query "rick+dangerous" [filename]
                                            

                                            Remember to also add a filename that you wish to use the override with. Otherwise the query will be ignored.
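
As a sketch of the filename-to-query step described above: the helper below only mimics the one example given (drop the extension, lowercase, spaces become '+'). `to_query` is a hypothetical name, and Skyscraper's own clean-up of filenames is likely more involved than this.

```shell
#!/bin/sh
# to_query is a hypothetical helper, not Skyscraper's actual logic.
to_query() {
    name=${1%.*}   # strip the last extension, if any
    printf '%s\n' "$name" | tr '[:upper:]' '[:lower:]' | tr ' ' '+'
}

to_query "Rick Dangerous.lha"   # prints rick+dangerous
```

The result can then be passed by hand as `--query "rick+dangerous"`, together with the rom filename.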

                                            But not all of the scraping modules are search-name based. For instance, the "screenscraper" module can use a variety of different search methods. So for screenscraper, you also have the option of overriding the checksums it uses to search for a game. This is especially convenient in cases where a filename exists multiple times in their database and your own local file doesn't match any of the connected checksums (maybe you've compressed the rom yourself or whatever).
                                            In this case you can look up one of the working checksums on screenscraper's website (screenscraper.fr) and override the checksum like these examples:

                                            $ Skyscraper -p [platform] -s [module] --query sha1=[checksum] [filename]
                                            $ Skyscraper -p [platform] -s [module] --query md5=[checksum] [filename]
                                            $ Skyscraper -p [platform] -s [module] --query "sha1=[checksum]&md5=[checksum]&romnom=[exact url encoded filename]" [filename]
                                            

                                            The last example combines two of the checksum options and even the "romnom" option which is "rom name" in French (this is a screenscraper thing, not a Skyscraper thing). You obviously only need one of the checksum options, it's just to show that you can combine them if you really need to.
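
To compare a local file against the checksums listed on screenscraper.fr, something like the sketch below can print its SHA-1 and MD5 in the 'sha1=...&md5=...' form. `checksum_query` is a hypothetical helper name, and the sketch assumes the coreutils `sha1sum` and `md5sum` tools are installed. Whether the site expects lowercase or uppercase hex is not covered here.

```shell
#!/bin/sh
# checksum_query is a hypothetical helper; it prints the checksums of a
# local file in the 'sha1=...&md5=...' form used with --query.
checksum_query() {
    # Read via stdin so the output is just "<hash>  -" whatever the filename.
    sha1=$(sha1sum < "$1" | cut -d ' ' -f 1)
    md5=$(md5sum < "$1" | cut -d ' ' -f 1)
    printf 'sha1=%s&md5=%s\n' "$sha1" "$md5"
}
```

When a combined query is typed on the command line, it needs to be wrapped in quotes, because a bare & is a shell operator that would put the command in the background.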

                                            The '--query' option is clearly an "experts only" option, but for those that like to go down the rabbit hole, I am your humble servant. Down you go... :D

                                            And happy scraping! :)
