    Versatile C++ game scraper: Skyscraper

    Ideas and Development
    Tags: skyscraper, scraper, gamelist.xml, scraping, github
    1.6k Posts 113 Posters 1.6m Views
      circo @muldjord

      I've read the conversation, they were surprisingly hostile about it... :/
      I'm sorry, that sucks.

        AnalogHero

        Sad to hear. I can't find the thread on their forums, maybe because I'm not registered there.

        I can understand that servers cost money, and scrapers can produce a lot of traffic.
        I just tested 2.5.0 with thegamesdb as the scraper module and it worked (it counted down from 1000 uses). I can't understand what was wrong with this solution.

          Used2BeRX @AnalogHero

          Well if anybody can figure out a good way to host this so everybody could use everything without any hassle or drama by the time I put my release out, I'll be happy to do the work there so everybody can enjoy it.

          So far 2,118 unique NES/FDS games are covered. This includes (or will include by release time) Box Art, Cartridge/Disk Art, Title and Action screenshots, "3D" Box Art, synopsis files that contain all the game information for the gamelist.xml tags and a lot of info that RetroPie doesn't have tags for, HD video previews, Game Manuals (either PDF or Zipped JPGs or both... so far around 950 of the games have them), GameFAQs zipped for most official titles and some other goodies.

          At some point when I feel they're ready, I could probably release the synopsis files first so you guys aren't waiting another 6+ months for at least that part. I proofread a ton on those, and I believe they're the best descriptions available out there for the games, including many of the obscure pirates and unlicensed games that usually aren't covered on any of the major gaming sites that would be scraped with this scraper. I also removed all of the "weird" characters that don't show up properly in either RetroPie or on the Xbox, so there are no strange "empty box" characters in the descriptions anymore.

          I've got a few days off. I'm really going to make an effort to get my spreadsheet where I want it to be for a public release so everybody can see what progress has been made so far and follow along if they're bored. :)

            muldjord @AnalogHero

            @analoghero I have re-added support for 'thegamesdb'. Just be wary of the limit. Time will tell if it changes.

              muldjord @Used2BeRX

              @used2berx Have you considered somehow uploading the information you are creating to screenscraper.fr? That would be the optimal way of making use of it in Skyscraper. It sounds like you have a pretty much perfect collection of data on your hands, I'm pretty sure they would appreciate the data if it could be automated somehow. I don't know if you're interested in working with them on that. They have been very friendly towards me whenever I've contacted them, so you could consider that if you wish.

                cyperghost @muldjord

                @muldjord You're right! Maybe they will reconsider their decision; time will tell. But I think 1000 entries per unique IP are enough for a user.

                What is this "queue" thing? I understand it as a connection request to their server, and with one request you can do 20 actions. So in theory you can retrieve 10,000-20,000 entries per IP, or am I wrong?

                  muldjord @cyperghost

                  @cyperghost If I understand it correctly, which I might not, their API can return up to 20 game results per request. So optimally, if I knew the IDs of the games in their database beforehand, I could request a comma-separated list of 20 specific IDs per request, and all of those 20 games would be returned to me in a JSON response. The problem is that I do not know the IDs; that's what Skyscraper is trying to figure out. So I'd have to search for the filename one at a time instead, find the best result and its ID, and then fetch the data. So I'd use 2-3 requests per game.

                  But I might have misunderstood this completely, which it seems I have a habit of doing when it comes to the new API. Pretty embarrassing to be honest.
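
To put numbers on the limit discussed above, here is a minimal sketch of the arithmetic, assuming the 1000-requests-per-IP allowance and the 2-3 requests per game mentioned in this thread. The function name is made up for illustration and is not part of Skyscraper.

```python
def games_within_budget(request_limit: int, requests_per_game: int) -> int:
    """Return how many games can be scraped before the request limit is hit."""
    if requests_per_game <= 0:
        raise ValueError("requests_per_game must be positive")
    # Integer division: a partially fetched game does not count.
    return request_limit // requests_per_game


limit = 1000  # per-IP allowance mentioned in this thread (assumed unchanged)
for per_game in (2, 3):
    print(per_game, "requests per game ->", games_within_budget(limit, per_game), "games")
# 2 requests per game -> 500 games
# 3 requests per game -> 333 games
```

So under these assumptions a single IP would cover roughly 333 to 500 games per allowance, not the 10,000-20,000 entries that would be possible if all IDs were known in advance.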

                    cyperghost @muldjord

                    @muldjord Well ... I think there is a possibility to get these IDs. I think it's just a checksum of the ROM files (surely rearranged and changed with arithmetic)

                    Or am I completely wrong this time?

                      mitu Global Moderator @cyperghost

                      @cyperghost said in Versatile C++ game scraper: Skyscraper:

                      Or am I completely wrong this time?

                      I'd say yes, the ID refers to the (internal) identifier of the game in thegamesdb database, not the hash of the file. You search by a game name (don't know if their API has a 'search by hash' option) and you get a list of games with their IDs.
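
The "search by name, then pick an ID" step can be sketched roughly as follows. This is only an illustration using Python's standard `difflib`; it is not how Skyscraper or thegamesdb actually rank results, and the result list, IDs and function name are invented for the example.

```python
from difflib import SequenceMatcher


def best_match(query, results):
    """Return the (game_id, title) pair whose title is most similar to query.

    results is a list of (game_id, title) tuples; returns None if it is empty.
    """
    def score(item):
        # Case-insensitive similarity ratio between 0.0 and 1.0.
        return SequenceMatcher(None, query.lower(), item[1].lower()).ratio()

    return max(results, key=score, default=None)


# Hypothetical search results: the IDs are placeholders, not real database IDs.
results = [
    (1, "Sonic the Hedgehog 2"),
    (2, "Sonic the Hedgehog"),
    (3, "Sonic Spinball"),
]
match = best_match("sonic the hedgehog", results)
print(match)
```

The ID from the best match would then be used for the follow-up request that fetches the actual game data, which is where the second request per game comes from.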

                        cyperghost @mitu

                        @mitu Yes you're right ;)
                        The id for Sonic the Hedgehog is just 5544
                        So the call to this is only ...
                        httpx://thegamesdb.net/game.php?id=5544

                          muldjord @mitu

                          @mitu Correct, they don't currently support hash searches as screenscraper does. The id is just a numeric identifier starting from 0 and going upwards for any new game added it seems.

                            cyperghost @muldjord

                            @muldjord Well I think for a single IP 1000 calls is okay ;)
                            You can check whether data is received and, if there is a failure, report it to the user ... and I think you're fine with this. Not the best solution to satisfy all needs but good enough to go.

                              muldjord @cyperghost

                              @cyperghost Yes, I agree that it should be usable for some minor installations. And obviously it is not a good thing if people are scraping 50000 games at a time no matter what source they are hammering. So I have never been against limits, they are necessary.

                              I would love it if TheGamesDb would support md5 and sha1 hashes as well though. Then I could fetch 1 game per request instead of using two requests for 1 game. But I think I've worn out my welcome for now, so I'll leave them be and spare myself any further embarrassment.
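
For reference, computing the md5 and sha1 hashes that a hash-based lookup would need is straightforward. The following is a minimal sketch using Python's standard `hashlib`; the function name and chunk size are arbitrary choices for illustration and say nothing about how Skyscraper or screenscraper compute or submit hashes.

```python
import hashlib


def rom_hashes(path, chunk_size=65536):
    """Return (md5_hex, sha1_hex) for the file at path.

    The file is read in chunks so that large ROMs are not loaded into memory at once.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


# Usage (the path is a placeholder):
# md5_hex, sha1_hex = rom_hashes("/home/pi/RetroPie/roms/gb/SomeGame.gb")
```

With a hash search the file content itself identifies the game, so no name matching is needed and a single request per game would be enough.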

                                circo @muldjord

                                @muldjord said in Versatile C++ game scraper: Skyscraper:

                                So I'd use 2-3 requests per game.

                                You could try stringing them together? As in, first you send the requests for the individual games to get the IDs. Then you string those IDs together and send a single request, with the comma-separated IDs, for the metadata of every game being scraped at once.
                                This could reduce the number of requests to n+1, where n is the number of games that are being scraped.
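
The batching idea can be sketched as below, assuming the limit of 20 games per request described earlier in the thread. Under that assumption the total is n searches plus one metadata request per batch of 20 IDs, so n+1 only holds for up to 20 games. The function names and placeholder IDs are invented for the example.

```python
def batch_ids(game_ids, batch_size=20):
    """Split IDs into comma-separated strings of at most batch_size IDs each."""
    ids = [str(game_id) for game_id in game_ids]
    return [",".join(ids[i:i + batch_size]) for i in range(0, len(ids), batch_size)]


def total_requests(game_count, batch_size=20):
    """One search request per game plus one metadata request per batch."""
    batches = -(-game_count // batch_size)  # ceiling division
    return game_count + batches


placeholder_ids = range(1, 46)  # 45 made-up IDs
batches = batch_ids(placeholder_ids)
print(len(batches), "metadata requests")        # 3 metadata requests
print(total_requests(45), "requests in total")  # 48 requests in total
```

Compared with 2 requests per game (90 for 45 games), this would save close to half of the allowance, at the cost of the restructuring described in the reply below.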

                                  muldjord @circo

                                  @circo Thank you for your suggestion, but it's more a question of development time. I would have to rewrite almost the entirety of the scraper system in Skyscraper for this to work as it is not designed for multigame-requests. It's not an overnight change I'm afraid. It's a substantial rewrite that would probably take me months to complete. I do not have that kind of time to work on this project.

                                    circo @muldjord

                                    @muldjord Ah, I see.
                                    I know from experience that it's difficult to work with APIs that don't fit with the rest :/ But Skyscraper is working really nicely, so at least the rest are cooperating!

                                      chipsnblip

                                      i finally got around to scraping my roms with Skyscraper. a few minor hiccups, but overall it's been really great so far. over the last few days, i noticed quite a few screenshots it was pulling in were not properly cropped, so i would like to share an example of how to autocrop them, for those who are interested.

                                      here's a before & after example:

                                      [image: before]

                                      [image: after]

                                      it's easy to batch-process a database's artwork after it's been scraped. this requires installing ImageMagick first (apt-get install imagemagick). then run, for example, mogrify -trim /home/pi/.skyscraper/dbs/n64/screenshots/screenscraper/*.* and rescrape from the localdb. note that mogrify overwrites the files in place, so back them up first if you want to keep the originals.

                                      i hope this is useful to someone.

                                        muldjord @chipsnblip

                                        @chipsnblip Fantastic, thank you for the tip. And that gave me the idea to implement this automatically into my compositor. I can simply look for black borders, and remove them. I'll add that to my todo list. :)
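
The "look for black borders and remove them" idea can be sketched as follows. This is not Skyscraper's compositor code, just a minimal pure-Python illustration on an in-memory grid of RGB tuples; the function name and the darkness threshold of 16 are assumptions. For real image files the pixels would have to be loaded with an image library first.

```python
def crop_black_borders(pixels, threshold=16):
    """Return the image with near-black outer rows and columns removed.

    pixels is a list of rows, each row a list of (r, g, b) tuples.
    A pixel counts as black if every channel is <= threshold.
    """
    def is_dark(pixel):
        return all(channel <= threshold for channel in pixel)

    # Indexes of rows that contain at least one non-dark pixel.
    rows = [y for y, row in enumerate(pixels) if not all(is_dark(p) for p in row)]
    if not rows:
        return []  # the whole image is dark; nothing to keep
    width = len(pixels[0])
    # Indexes of columns that contain at least one non-dark pixel.
    cols = [x for x in range(width) if not all(is_dark(row[x]) for row in pixels)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in pixels[top:bottom + 1]]


BLACK, RED = (0, 0, 0), (200, 30, 30)
image = [
    [BLACK, BLACK, BLACK, BLACK],
    [BLACK, RED, RED, BLACK],
    [BLACK, RED, RED, BLACK],
    [BLACK, BLACK, BLACK, BLACK],
]
cropped = crop_black_borders(image)
print(len(cropped), "x", len(cropped[0]))  # 2 x 2
```

Only the outer border is removed: dark rows or columns that lie between bright content are kept, because the crop uses the first and last non-dark index in each direction.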

                                          Used2BeRX @muldjord

                                          @muldjord I didn't even know it was possible to script that. Any chance you could post a standalone script to remove black borders from images?

                                          A few months back I re-shot all 4,236 Title and Action screenshots for NES and FDS to make sure they were all the same size and were taken while using the SONY CSX palette, but just in case I don't want to re-invent that particular wheel every time I tackle a new system down the road, that code of yours might just come in handy. :)

                                            chipsnblip

                                            i wasn't too impressed with the gameboy screenshots that were coming in, so i gave in and downloaded a pack of them from emumovies and imported to my skyscraper database. but something was still a bit off, the grayscale images didn't really fit my theme or look very appealing on my modest 13" CRT screen.

                                            so i set out to make them appear more or less like the puke-green original gameboy using Skyscraper's built in compositor, a simple marquee overlay made in gimp, a snippet of xml, and a dash of bash. the result looks pretty ok, so thought i'd go ahead and share my findings with you all.

                                            Before: [images: before01, before02]

                                            After: [images: after01, after02]

                                            depending on your priorities.xml settings for <screenshot>, you may need to fiddle with the brightness/contrast/etc in the artwork_gb.xml file. in this example i'm using imported screenshots that i downloaded from emumovies, which for the most part are the same ones from screenscraper.fr (i think).

                                            instructions for use:

                                            drop this "marquee" file in /home/pi/.skyscraper/import/marquees/

                                            create a file /home/pi/.skyscraper/artwork_gb.xml and paste the following code into it:

                                            <?xml version="1.0" encoding="UTF-8"?>
                                            <artwork>
                                                <output type="screenshot" width="730" height="480">
                                                    <layer resource="screenshot" x="20" width="433" height="390" align="center" valign="middle">
                                                        <balance red="1" green="-10" blue="0"/>
                                                        <brightness value="20" />
                                                        <contrast value="-45" />
                                                        <rounded radius="10" />
                                                        <stroke width="5" />
                                                    </layer>
                                                    <layer resource="marquee" x="20" width="433" height="390" align="center" valign="middle" mode="hardlight">
                                                        <opacity value="80"/>
                                                        <rounded radius="10" />
                                                        <stroke width="5" />
                                                    </layer>
                                                    <layer resource="cover" height="212" x="0" y="-10" valign="bottom">
                                                        <shadow distance="5" softness="5" opacity="69" />
                                                    </layer>
                                                    <layer resource="wheel" width="250" x="-10" align="right">
                                                        <shadow distance="5" softness="5" opacity="69" />
                                                    </layer>
                                                </output>
                                            </artwork>
                                            

                                            next you can go one of two routes: download this zip package (851.0 kB) containing 1,870 pre-made .png copies of the marquee, which should cover most no-intro rom file names out there, or, open a terminal and run a command that will make copies of the marquee for just your set of roms. here's a basic command that assumes your roms are zip files. if not just exchange both instances of .zip with .7z/.gb:

                                            ls ~/RetroPie/roms/gb | grep '\.zip$' | sed -e 's/^/cp "gb_marquee.png" "/' -e 's/\.zip$/.png"/' > ~/.skyscraper/import/marquees/gb_roms.txt && cd ~/.skyscraper/import/marquees && bash gb_roms.txt && rm gb_roms.txt
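
The same per-ROM copy step can also be sketched in Python, which avoids generating and running a temporary script and copes with unusual characters in file names. This is an alternative illustration, not part of Skyscraper; the function name is made up, and the paths in the usage comment are the ones used in this post.

```python
import shutil
from pathlib import Path


def copy_marquee_per_rom(rom_dir, marquee, dest_dir, extension=".zip"):
    """Copy the marquee into dest_dir once per ROM, named <rom name>.png.

    Returns the list of files that were written.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    created = []
    # Only files ending in the given extension are considered ROMs.
    for rom in sorted(Path(rom_dir).glob("*" + extension)):
        target = dest / (rom.stem + ".png")
        shutil.copyfile(marquee, target)
        created.append(target)
    return created


# Usage with the paths from this post (change extension to ".7z" or ".gb" if needed):
# marquees = Path.home() / ".skyscraper" / "import" / "marquees"
# copy_marquee_per_rom(Path.home() / "RetroPie" / "roms" / "gb",
#                      marquees / "gb_marquee.png", marquees)
```

The original gb_marquee.png stays in the marquees folder afterwards, just as with the shell command above.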
                                            

                                            the final step of course is to import them into your Skyscraper database, and that's pretty much it.
                                            Skyscraper -p gb -s import --pretend && Skyscraper -p gb -s localdb --nobrackets -a ~/.skyscraper/artwork_gb.xml --region us --updatedb

                                            so feel free to post your results and/or improvements (adding a dot matrix grid to the overlay might look nice..).

                                            i hope this will be useful to someone, even though it barely scratches the surface of what is possible with Skyscraper =)

                                            happy scraping!
