    Versatile C++ game scraper: Skyscraper

    • muldjordM
      muldjord @AnalogHero
      last edited by muldjord

      @analoghero Ah, I know why! It's because the search match becomes too low for one of the games because of how the title is prioritized from the different scraping modules. So one title makes the search match go below 50% which it didn't when scraping with the source originally. So basically it is /meant/ to do this, but I should still look into it. I'll figure something out for 1.8.3.

      Bottom line, it's not an error, just lower the minMatch to '0' with '-m 0' when scraping with localdb and you'll get the nice result again. :) But I do want this to make sense to the user, so I'll do something about it anyways.

      I do have another very minor bug I found while investigating this though, so I'm glad you brought it up. Either way, 1.8.2 is working just fine, so no need to hold your breath for 1.8.3. :)

      EDIT: I've had some more time to think about this "problem", and I've decided to leave it as is, simply because it isn't an error at all. I can see how it seems confusing that adding more resources from more scrapers suddenly makes it skip more games than before. But the explanation is simply that the title of a game from a certain module is prioritized above the other titles, and that title makes the match go below the minimum match percentage threshold. This is easily "fixed" by lowering the threshold. And it is /supposed/ to work like this. The priorities can easily be changed inside the 'priorities.xml' file as described in the documentation on GitHub. So in the case of the YouTube howto I made, lowering the 'title' priority for 'screenscraper' to go below that of 'mobygames' will /also/ make the "problem" go away. I absolutely agree that this seems puzzling if you don't understand or make use of the priorities. It seems like it is "going backwards" when you scrape with more scrapers, when in fact it just got a title from a new scraper that had a higher priority and gave a worse percentage match to the filename. That's all it is. And it's not an error. :)
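To make the effect concrete, here is a minimal Python sketch of the behaviour described above. It is not Skyscraper's code: the similarity measure (`difflib.SequenceMatcher`), the `TITLE_PRIORITY` list, the function names and the example titles are all assumptions made for illustration. It only shows how a higher-priority title can push the match below the threshold, and why lowering the threshold or reordering the priorities both make the skip go away.

```python
from difflib import SequenceMatcher

# Hypothetical priority order for the 'title' resource; earlier entries win.
TITLE_PRIORITY = ["screenscraper", "mobygames"]


def pick_title(titles_by_module, priority=TITLE_PRIORITY):
    """Return the title from the highest-priority module that provided one."""
    for module in priority:
        if module in titles_by_module:
            return titles_by_module[module]
    raise ValueError("no title available from any prioritized module")


def match_percent(filename_stem, title):
    """Rough similarity in percent. A stand-in, not Skyscraper's own metric."""
    ratio = SequenceMatcher(None, filename_stem.lower(), title.lower()).ratio()
    return round(ratio * 100)


def is_skipped(filename_stem, titles_by_module, min_match=50,
               priority=TITLE_PRIORITY):
    """A game is skipped when the prioritized title matches below min_match."""
    title = pick_title(titles_by_module, priority)
    return match_percent(filename_stem, title) < min_match


if __name__ == "__main__":
    # Illustrative data only: one module knows the game under another name.
    one_source = {"mobygames": "Contra"}
    two_sources = {"mobygames": "Contra", "screenscraper": "Probotector / Gryzor"}
    print(is_skipped("Contra", one_source))
    print(is_skipped("Contra", two_sources))
    print(is_skipped("Contra", two_sources, min_match=0))
    print(is_skipped("Contra", two_sources,
                     priority=["mobygames", "screenscraper"]))
```

With a single source the only title available is compared to the filename. Once a second, higher-priority source supplies a different title, that title is the one compared, which is the whole "going backwards" effect.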

      • Used2BeRXU
        Used2BeRX
        last edited by Used2BeRX

        Every one of my gamelists is in home/pi/RetroPie/Gamelists.

        That way, when I re-run the script meleu has been working on and the gamelists get updated there, nothing has to be moved. Symbolic links in the locations where EmulationStation looks for gamelists point to the files in that single folder, so the updates take effect on a reboot instead of my having to manually move 12 to 15 files to as many locations every time an update is made.

        Might want to look into doing something like that yourself. Saves a ton of time, especially when you're probably updating things all the time with the testing you're doing.
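A minimal Python sketch of that symlink setup, under stated assumptions: the central folder is assumed to hold one subfolder per system with a `gamelist.xml` inside, and both folder arguments are placeholders you would fill in yourself. The function name `link_gamelists` is made up for this example.

```python
from pathlib import Path


def link_gamelists(central_dir, es_gamelists_dir):
    """For every <system>/gamelist.xml under central_dir, make
    es_gamelists_dir/<system>/gamelist.xml a symlink pointing at it."""
    linked = []
    for source in sorted(Path(central_dir).glob("*/gamelist.xml")):
        system = source.parent.name
        target_dir = Path(es_gamelists_dir) / system
        target_dir.mkdir(parents=True, exist_ok=True)
        link = target_dir / "gamelist.xml"
        if link.is_symlink():
            # Replace an older link so the function can be re-run safely.
            link.unlink()
        elif link.exists():
            # Keep a real file as a backup instead of overwriting it.
            link.rename(link.with_name(link.name + ".bak"))
        link.symlink_to(source.resolve())
        linked.append(link)
    return linked
```

Once the links exist, updating a file in the central folder is enough; the linked location sees the new content without any copying.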

        • muldjordM
          muldjord
          last edited by

          EmuMovies has accepted my request for developer access. I'll get the details soon and start looking into an implementation. I'll keep you posted. :)

          • HurricaneFanH
            HurricaneFan
            last edited by

            @muldjord Is there a way to have boxart without a drop shadow and not centered? Like just a raw boxart image?

            • muldjordM
              muldjord @HurricaneFan
              last edited by

              @hurricanefan Yes, please read the readme on github.

              • S
                StephanePare
                last edited by

                I've noticed that out of 700 NES titles, there are 32 that your scraper can't recognize purely because the article (the, a) is already at the end in the original file name (as per the GoodTools standard), which messes with the name recognition. For example:

                Punisher, the

                Simple name, but big problems finding it, even though the default EmulationStation scraper finds it no problem. For now I'll simply rename the offending ROMs and re-run the scraper when I have time, but I thought this might be worth mentioning. Originally there were 50 titles, but after running from multiple sources it dropped down to 32.

                • muldjordM
                  muldjord @StephanePare
                  last edited by muldjord

                  @StephanePare Yeah, it's a well-known problem. The thing is this: If I do it one way, it won't work with X scraping module (Because it has the name as 'Punisher, The') and if I do it the other way it won't work with Y scraping module (Because it has it as 'The Punisher').

                  If any of you guys have ideas for how to better handle "The", let me know. I was thinking that perhaps I could just leave out "The" altogether if it exists at the beginning or the end of a name (and of course add it back in later). Not sure if that would cause other problems though. I've kind of reached a place where the puzzle is so complete that changing things in one place will screw with things in other places, so it's a bit of a fidgety thing to get improvements in there now that won't make other matches worse... Makes sense?
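The "leave out 'The' at the beginning or the end" idea could look roughly like the Python sketch below. This is only an illustration of the idea floated above, not what Skyscraper does; `strip_article` and `same_game` are made-up names, and only "The" is handled.

```python
import re

# "The Punisher" style: article at the start, followed by whitespace.
_LEADING_THE = re.compile(r"^the\s+", re.IGNORECASE)
# "Punisher, The" style: article at the very end, after a comma.
_TRAILING_THE = re.compile(r",\s*the$", re.IGNORECASE)


def strip_article(title):
    """Drop 'The' from the start or the end of a title for comparison."""
    title = title.strip()
    title = _LEADING_THE.sub("", title)
    title = _TRAILING_THE.sub("", title)
    return title.strip()


def same_game(name_a, name_b):
    """Compare two names with the article left out, ignoring case."""
    return strip_article(name_a).lower() == strip_article(name_b).lower()
```

Two caveats that tie in with the "other problems" worry: two titles that differ only by the article would now compare as equal, and a file name with region tags after the article (anything following ", The") would not be caught by the trailing pattern as written.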

                  • RionR
                    Rion @StephanePare
                    last edited by

                    @stephanepare May I ask why you are using GoodTools collections? Go with No-Intro sets instead.

                    FBNeo rom filtering
                    Mame2003 Arcade Bezels
                    Fba Arcade Bezels
                    Fba NeoGeo Bezels

                    • S
                      StephanePare @Rion
                      last edited by

                      @rion It's the first time I've ever heard of any set other than GoodTools. I've always assumed it was the only standard and that it didn't change. I'll google my way into that "No-Intro" thing.

                      @muldjord I've always thought that was how some scrapers worked, but I'm obviously no coder.

                      • HurricaneFanH
                        HurricaneFan @StephanePare
                        last edited by

                        @stephanepare My understanding is there are no-intro sets for the following systems. Not all of these work in RetroPie though:

                        Atari - 5200
                        Atari - 7800
                        Atari - Jaguar
                        Atari - Lynx
                        Atari - ST
                        Bandai - WonderSwan Color
                        Bandai - WonderSwan
                        Casio - Loopy
                        Casio - PV-1000
                        Coleco - ColecoVision
                        Commodore - 64 (PP)
                        Commodore - 64 (Tapes)
                        Commodore - 64
                        Commodore - Amiga
                        Commodore - Plus-4
                        Commodore - VIC-20
                        Emerson - Arcadia 2001
                        Entex - Adventure Vision
                        Epoch - Super Cassette Vision
                        Fairchild - Channel F
                        Funtech - Super Acan
                        GamePark - GP32
                        GCE - Vectrex
                        Hartung - Game Master
                        Leapfrog - Leapster Learning Game System
                        Magnavox - Odyssey2
                        Microsoft - MSX 2
                        Microsoft - MSX
                        NEC - PC Engine - TurboGrafx 16
                        NEC - Super Grafx
                        Nintendo - Famicom Disk System
                        Nintendo - Game Boy Advance (e-Cards)
                        Nintendo - Game Boy Advance
                        Nintendo - Game Boy Color
                        Nintendo - Game Boy
                        Nintendo - Nintendo 64
                        Nintendo - Nintendo 64DD
                        Nintendo - Nintendo Entertainment System
                        Nintendo - Pokemon Mini
                        Nintendo - Satellaview
                        Nintendo - Sufami Turbo
                        Nintendo - Super Nintendo Entertainment System
                        Nintendo - Virtual Boy
                        Nokia - N-Gage
                        Philips - Videopac+
                        RCA - Studio II
                        Sega - 32X
                        Sega - Game Gear
                        Sega - Master System - Mark III
                        Sega - Mega Drive - Genesis
                        Sega - PICO
                        Sega - SG-1000
                        Sinclair - ZX Spectrum +3
                        SNK - Neo Geo Pocket Color
                        SNK - Neo Geo Pocket
                        Tiger - Game.com
                        VTech - CreatiVision
                        VTech - V.Smile
                        Watara - Supervision

                        • RionR
                          Rion @StephanePare
                          last edited by Rion

                          @stephanepare GoodTools is obsolete. It's best to go with No-Intro instead.

                          Google is your friend here if you are not part of a larger community cough private trackers cough

                          Just search for "No-Intro 2017"

                          • muldjordM
                            muldjord
                            last edited by

                            I got my hands on an EmuMovies / gamesdbase API key now (Are they the same? It's a bit confusing). I spent a bit of time checking things out in their demo VB project, and I think I get the gist of the implementation. I will write my own implementation of it, and it will then be added to Skyscraper as a scraping module.

                            • L
                              LocVez @muldjord
                              last edited by

                              @muldjord Fantastic, looking forward to this :)

                              • L
                                LocVez
                                last edited by

                                @muldjord - Would it be possible to redirect the localdb folder from [homefolder]/.skyscraper/ to [install folder]/.skyscraper/? I copied the install files to a USB HDD because I left Skyscraper running, scripted to scrape my entire collection over the last 3 or 4 days, and when I came back to it today I realised the SD card (a 64 GB card) was full. So I'm trying to transfer everything over to my 1 TB HDD.

                                I had a weird error where EmulationStation was crashing on boot, but I think it had to do with the full card. I'm going to have to try copying the contents to the USB HDD and see how it goes :)

                                • muldjordM
                                  muldjord @LocVez
                                  last edited by muldjord

                                  @LocVez Yes, just use '-d [dbs folder]'. Check the readme :) Just remember that using '-d' points to the platform dbs folder you are gonna be scraping with localdb. So, for instance, if you wanted to scrape 'nes' with a custom local nes db path, you would put in '-d [whatever]/.skyscraper/dbs/nes'.

                                  EDIT: To elaborate: You can't change the Skyscraper folder, but if you want it to be seamless, you can always just create a symbolic link from ~/.skyscraper to wherever your usb hdd is mounted. Or, simply mount your usb hdd at ~/.skyscraper. :)
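The symbolic link route from the EDIT above, as a minimal Python sketch. It assumes you want to move the existing folder to the other drive first and then leave a link behind in its old place; `relocate_with_symlink` is a made-up helper name and both paths are yours to supply.

```python
import shutil
from pathlib import Path


def relocate_with_symlink(folder, new_parent):
    """Move folder into new_parent and leave a symlink at the old location."""
    folder = Path(folder)
    new_parent = Path(new_parent)
    if folder.is_symlink():
        # Already relocated earlier; just report where it points.
        return folder.resolve()
    destination = new_parent / folder.name
    if destination.exists():
        raise FileExistsError(f"{destination} already exists")
    new_parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(folder), str(destination))
    # Link to an absolute path so the link works from any working directory.
    destination = destination.resolve()
    folder.symlink_to(destination, target_is_directory=True)
    return destination
```

Anything that opens the old path afterwards ends up on the other drive without needing to know about the move. Stop any running scrape before moving the folder.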

                                  • L
                                    LocVez @muldjord
                                    last edited by LocVez

                                    @muldjord Nice one, thanks! :) And yes, I will now go read the readme <blush>

                                    • L
                                      LocVez
                                      last edited by

                                      Sorry @muldjord, another suggestion or two... Can we have a switch to set a timeout for scraping each ROM file? On a handful of occasions I've noticed that scraping seems to take 10 minutes for a few particular files. I'm unsure whether the fault is on the scraper or the scrapee side, but it would be great if the scraper could skip a file and move on when it takes longer than 10 seconds or so (ideally with a timeout the user can set manually).

                                      Also, could we list the database and platform we are scraping in the text that says xxxx/xxxx --- Pass 1, Pass 2 ------ <rom name> etc etc, so that in the event of a "stuck" scrape it can be cancelled and that database can be omitted from the script?

                                      I have the script set up in the following way:

                                      Skyscraper -p megadrive -s gamesdatabase --unattend
                                      Skyscraper -p megadrive -s mobygames --unattend
                                      etc, etc

                                      But it's impossible to know which database is causing issues :(

                                      Thanks again!
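Until the scraper prints the module itself, a small wrapper around the script above can at least say which module is running and how long each one took. The Python sketch below is a workaround under assumptions, not a Skyscraper feature: `build_command` and `run_modules` are made-up names, and the command uses only the flags already shown in the script above.

```python
import subprocess
import time


def build_command(platform, module):
    """Same flags as in the script above; nothing else is added."""
    return ["Skyscraper", "-p", platform, "-s", module, "--unattend"]


def run_modules(platform, modules, runner=subprocess.run):
    """Run one scrape per module, announcing the module and timing each run."""
    timings = {}
    for module in modules:
        command = build_command(platform, module)
        print(f"[{platform}] scraping with '{module}': {' '.join(command)}",
              flush=True)
        started = time.monotonic()
        runner(command)
        timings[module] = time.monotonic() - started
        print(f"[{platform}] '{module}' finished after {timings[module]:.1f}s",
              flush=True)
    return timings


if __name__ == "__main__":
    run_modules("megadrive", ["gamesdatabase", "mobygames"])
```

If a run appears stuck, the last "scraping with" line in the terminal names the module responsible, and the returned timings show which modules are consistently slow.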

                                      • muldjordM
                                        muldjord @LocVez
                                        last edited by muldjord

                                        @LocVez

                                        1. Sounds really odd. I have a 30 seconds timeout on the network connections (tested and working well), so it has to be a problem elsewhere, perhaps on your system. I've never had my scraper wait for 10 minutes while scraping (and I've scraped A LOT!). Maybe it could be related to saving data to the SD card. This is not something that can be fixed as it's system related. If you can investigate a bit further it might help, but for the moment I am going to assume it's a problem with your system.

                                        2. I've wanted this myself, so I'll think about it. :) The platform is already part of the output, but I could add an output line about the current scraping module.

                                        EDIT: Btw, you can actually figure out where it stopped. Just look at the 'skipped*' files. The one that has been changed last, is the one where it was stopped.
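Finding the most recently changed 'skipped*' file can be scripted. A minimal Python sketch follows; the folder holding those files is an assumption left to the caller, and the helper name is made up.

```python
from pathlib import Path


def last_touched_skipped_file(folder):
    """Return the 'skipped*' file in folder that was modified last, or None."""
    candidates = [p for p in Path(folder).glob("skipped*") if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Point it at whichever folder contains the 'skipped*' files and the result is the file written by the run that was stopped.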

                                        EDIT2: Another thing I just thought of. If you have been scraping a lot, it might also be that some of the sites have started throttling you. That would result in transfers taking a loooong time, without it being a timeout as such.

                                        Have you noticed if it's any particular scraping module that is slow?

                                        EDIT3: 'Scraper' is now included in the output per entry but only when using the '--verbose' option. It is redundant information, so I didn't want it per default. I think it works well when it's only shown when using '--verbose'. That's the whole point of verbose. Will be in 1.8.3.

                                        • L
                                          LocVez
                                          last edited by

                                          Thanks @muldjord, I've mapped the .skyscraper folder to my USB HDD, but it was doing this with my SD card as well as the USB HDD. As suggested, it is likely a website throttling or refusing the connection. I did notice tonight when I shut it down that the "gamesdatabase" website had banned me again, so I wonder if that was it. At the moment I'm running a system one scraper at a time to check that all is OK.

                                          I will eagerly await the addition to verbose :)

                                          Note - the platform is only part of the output if it successfully scrapes. If, as in my case, it doesn't find anything and it's taking 10 mins to scrape, it doesn't display this information. Thinking more about it, taking such a long time to scrape and returning "no results" more than likely does indicate a ban from the scraper website... Looking forward to the EmuMovies addition :D

                                          • muldjordM
                                            muldjord
                                            last edited by

                                            Just added a check for "bad scraping runs" which basically means that Skyscraper will quit if the first 30 files are missed. This indicates that the scraping module that is being used doesn't support the platform. Will be in 1.8.3.
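For readers curious what such a check amounts to, here is a minimal Python sketch of the rule as described: give up when the first 30 files are all missed. It is not the actual 1.8.3 code; the function name, the boolean-per-file input, and the choice to return False when fewer than 30 files exist are assumptions.

```python
def is_bad_scraping_run(results, first_n=30):
    """results yields True for a file that was found, False for a miss.
    Returns True only if the first first_n files were all missed."""
    checked = 0
    for found in results:
        if found:
            # A single hit among the first files means the run is fine.
            return False
        checked += 1
        if checked >= first_n:
            return True
    # Fewer than first_n files in total: not enough evidence to bail out.
    return False
```

Because the function stops reading as soon as it has an answer, it can be fed results while the scrape is still in progress.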

