Versatile C++ game scraper: Skyscraper
-
Skyscraper 3.1.0 released: https://github.com/muldjord/skyscraper
- MAJOR: Added '--cache edit' command line option which allows viewing, editing and deleting cached resources for the roms in the queue. Narrow the queue down by providing file names on command line or by using '--startat <FILENAME>' and '--endat <FILENAME>'
- Added 'zx81' platform. Note! The only module that supports it is the 'screenscraper' module
I finally found the time to work on the planned resource cache editing features of Skyscraper. This allows you to easily add and remove resources to / from any given rom simply by using the '--cache edit' option. With that option you can edit cached resources for a few roms simply by providing their filenames on command line. Or you can use the '--startat' and '--endat' option(s). If no filenames are provided on command line or both '--startat' and '--endat' are left out, Skyscraper will edit all files in your input folder one by one.
Any resources added by the user will be marked as 'user' and will be prioritized above all other types. The edit feature will also allow you to get a look at what data will currently be used for any given rom when generating the game list. You can also list all resources for any given rom. And it also enables you to remove resources as you see fit, both single resources or all resources of specific types or from specific modules.
Have fun! And let me know if this is useful to you. I've already used it myself quite a bit and find it to be very handy indeed.
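For anyone scripting this, here is a minimal Python sketch of how such a command line could be assembled. The helper name cache_edit_command is hypothetical, and the '-p <PLATFORM>' option is an assumption that is not mentioned in this post; only '--cache edit', '--startat' and '--endat' come from the release notes above. The sketch only builds and prints the command, it does not run Skyscraper.

```python
import shlex


def cache_edit_command(platform, filenames=(), startat=None, endat=None):
    """Build an argv list for a Skyscraper '--cache edit' session.

    Assumption: the platform is selected with '-p'. The other options
    are the ones named in the 3.1.0 release notes.
    """
    argv = ["Skyscraper", "-p", platform, "--cache", "edit"]
    if startat:
        argv += ["--startat", startat]  # first rom of the queue to edit
    if endat:
        argv += ["--endat", endat]      # last rom of the queue to edit
    argv += list(filenames)             # or name a few roms explicitly
    return argv


# Print a shell-safe version of the command (file names with spaces are quoted).
print(shlex.join(cache_edit_command("snes", startat="Mario.sfc", endat="Zelda.sfc")))
```

Passing neither file names nor a start/end pair leaves the whole input folder in the queue, as described above.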
-
@muldjord must be a screenscraper issue then. My credentials are definitely being accepted but for whatever reason some roms (I have found others now) that will scrape only partial data. It will get the screenshot but not the box art or vice-versa even though all the data is there on the screenscraper website. Hopefully they continue to improve the website. Thanks again for your help.
-
@quicksilver said in Versatile C++ game scraper: Skyscraper:
@muldjord must be a screenscraper issue then. My credentials are definitely being accepted but for whatever reason some roms (I have found others now) that will scrape only partial data. It will get the screenshot but not the box art or vice-versa even though all the data is there on the screenscraper website. Hopefully they continue to improve the website. Thanks again for your help.
Please try rescraping those roms with the '--refresh' option enabled or by providing the filenames on command line to scrape just those few roms that are giving you difficulties. That will most likely fix your issue. I'm very interested in knowing if this works for you. If not, then there's something else going on that I am not aware of.
-
@muldjord the refresh option took care of my issue. Once again thank you for your help!
-
@muldjord - a quick question; can the "dbs" folder be relocated anywhere else? My uSD card is full and nothing can be scraped any further despite all my ROMs and skyscraper cache being located on the external 2TB HDD. The dbs folder contains 2.3GB of data, by all accounts.
Can it be relocated? I'd empty it but I'm not sure what effect that would have, and it will be a PITA to have to do that every time I want to scrape a collection.
RetroPie is running from a 32GB uSD card that I cannot access physically, hence I've attached the HDD to the board so that ~/pi/RetroPie is now accessed from there. I edited config to move the cache there also, but the dbs folder doesn't seem to be configurable?
Edit: Even though my config has been edited to move the cache to ~/pi/RetroPie/skyscraper/cache, Skyscraper is still using the original location on my uSD card :(
-
@ZXDunny Can you post your 'config.ini' file and your Skyscraper version? Since 3.x the 'dbs' folder is named 'cache', so you might be using an older version.
-
How does Skyscraper handle/display arcade roms? Obviously there is no "box art". Anyone willing to post a sample? Curious before I start scraping 1500 roms.
-
@quicksilver Here are some samples:
Note that I only used ScreenScraper for scraping - they do have boxart for some of the Arcade titles.
-
@mitu Thank you! This is exactly what I was looking for. I have had good results with screenscraper for console scraping, so I will probably stick with it for arcade too. I haven't gotten very accurate results with thegamesdb (probably because it searches based on rom name).
-
I was just scraping my GBA games and noticed I got a lot of errors like this:
ScreenScraper APIv2 returned invalid XML for the following query: romnom=Dragon%20Ball%20Z%20-%20Buu_s%20Fury%20%28USA%29.gba&crc=1C1707F&md5=3A74FCE97F1EA2B28C2A50EC3DF0ACEE&sha1=F1C4B07554D2A3B1AD2F325307051E775CE68087&romtaille=8388608
It used to scrape all my GBA games perfectly, so I tried updating to the latest version, but the problem persists, with a fail rate of about 6-7% of the games, whereas earlier versions of Skyscraper found basically every game.
-
@Halvhjearne You may be running into https://retropie.org.uk/forum/topic/11826/versatile-c-game-scraper-skyscraper/1123. It's not a Skyscraper problem.
-
@mitu said in Versatile C++ game scraper: Skyscraper:
@Halvhjearne You may be running into https://retropie.org.uk/forum/topic/11826/versatile-c-game-scraper-skyscraper/1123. It's not a Skyscraper problem.
I'm a registered donor and Skyscraper recognizes that I have 6 threads available ...
-
@Halvhjearne The screenscraper site dashboard shows their CPU at 294% right now, so I wouldn't be surprised if their system overload would result in the errors you're getting.
-
@mitu scraping again without the --refresh seems to work fine ...
-
@Halvhjearne Why are you using '--refresh' by default? It forces Skyscraper to re-fetch the data when you already have it in cache.
-
@mitu I'm not using it by default, but I do sometimes run Skyscraper with --refresh to see if any missing artwork has become available ...
-
That's not what '--refresh' should be used for. If you don't have the corresponding artwork in the cache (marquee, cover, screenshot, video), Skyscraper would try and get it, so you don't have to add '--refresh' to get missing artwork. @muldjord can correct me if I'm wrong, but forcing '--refresh' just negates any advantage of having a cache.
-
@mitu said in Versatile C++ game scraper: Skyscraper:
That's not what '--refresh' should be used for. If you don't have the corresponding artwork in the cache (marquee, cover, screenshot, video), Skyscraper would try and get it, so you don't have to add '--refresh' to get missing artwork. @muldjord can correct me if I'm wrong, but forcing '--refresh' just negates any advantage of having a cache.
This is true, but there is one issue where it helps. When the screenscraper servers are overloaded, some of the media might not be scraped for a certain game, even though the textual data is (each artwork resource is a separate request that might fail due to the high load). So in those cases (hopefully rare, especially for registered users) they will need to use '--refresh' to grab them.
With that said, I would always recommend scraping only a few roms with '--refresh' enabled. Just add those few files on the command line to grab the missing data. That is much better than scraping everything again.
-
@muldjord Is there a way to use the '--refresh' function only on roms that are detected as missing some media? I.e. screenshot, box art or description are missing? I'm not sure if what I am asking makes sense. Unfortunately for me I scraped hundreds of roms before realizing that credentials for screenscraper made such a difference. Now there are tons of roms that are missing some media. Going through and checking them one by one will be a huge task.
-
@quicksilver said in Versatile C++ game scraper: Skyscraper:
@muldjord Is there a way to use the '--refresh' function only on roms that are detected as missing some media? I.e. screenshot, box art or description are missing? I'm not sure if what I am asking makes sense. Unfortunately for me I scraped hundreds of roms before realizing that credentials for screenscraper made such a difference. Now there are tons of roms that are missing some media. Going through and checking them one by one will be a huge task.
What you are asking makes perfect sense, but it is not possible to do this at the moment I'm afraid. I had not anticipated so many requests being rejected, so this is actually quite a big problem at the moment. And it's a bit of a vicious circle to get into. People will notice media being missing, and then they will start using refresh, which in turn puts even more load on the source servers and so on...
I am wondering if screenscraper has any sort of server caching installed on their service. This could potentially alleviate the problem. I've asked on their forum.
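Until something like that exists in Skyscraper itself, one possible workaround is to look for incomplete entries in an already generated game list and pass only those files on command line together with '--refresh'. The following is a rough Python sketch under several assumptions that are not from this thread: the game list is an EmulationStation-style gamelist.xml with 'path', 'desc' and 'image' elements per game, the platform and module are selected with '-p' and '-s', and the helper names roms_missing_media and refresh_command are made up for illustration. It only builds the command, it does not run it.

```python
import xml.etree.ElementTree as ET

# Tags treated as required here; adjust to whatever your frontend uses.
REQUIRED_TAGS = ("desc", "image")

# Tiny inline sample standing in for a real gamelist.xml file.
SAMPLE = """<gameList>
  <game><path>./Complete (USA).gba</path><desc>Text</desc><image>./media/a.png</image></game>
  <game><path>./No Image (USA).gba</path><desc>Text</desc></game>
  <game><path>./Empty Desc (USA).gba</path><desc></desc><image>./media/c.png</image></game>
</gameList>"""


def roms_missing_media(gamelist_xml, required=REQUIRED_TAGS):
    """Return the rom paths whose entry lacks (or has empty) required tags."""
    root = ET.fromstring(gamelist_xml)
    missing = []
    for game in root.iter("game"):
        path = (game.findtext("path") or "").strip()
        if not path:
            continue  # skip entries that do not point at a file
        if any(not (game.findtext(tag) or "").strip() for tag in required):
            missing.append(path)
    return missing


def refresh_command(platform, module, rom_paths):
    """Build an argv list that refreshes only the given roms."""
    return ["Skyscraper", "-p", platform, "-s", module, "--refresh", *rom_paths]


incomplete = roms_missing_media(SAMPLE)
if incomplete:
    print(refresh_command("gba", "screenscraper", incomplete))
```

The paths in such a game list are usually relative to the rom folder, so the resulting command would need to be run from there or the paths made absolute first. Refreshing in small batches also keeps the extra load on the source servers down.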