Versatile C++ game scraper: Skyscraper
-
@muldjord Thank you
-
Hi,
I recently scraped the PC platform with about thirty titles.
Unfortunately, the generation step assigns the wrong resources to games.
Ten games end up with the same snap and/or description. My setup for the Windows PC platform consists of AutoIt (.au3) files. These files differ only in name; the content is the same. Could it depend on this?
For example
#2/30 (T4) ---- Game 'BlazeRush' found! :) ---- Scraper: cache Title: 'Yoku's Island Express' (thegamesdb) Platform: 'PC' (thegamesdb)
#3/30 (T2) ---- Game 'Cuphead' found! :) ---- Scraper: cache Title: 'Yoku's Island Express' (thegamesdb) Platform: 'PC' (thegamesdb)
-
@o0alucard0o Yes, that is exactly the reason. For most files Skyscraper does a sha1 checksum of the contents to id them. But for some script files I do the sha1 checksum on the filename instead, since they either change often or have the same contents. .au3 files are not currently on this list, so if your files have the same contents, they will have the same id in Skyscraper's cache.
I will add .au3 files to this list, so in future releases this won't be a problem. Thanks for reporting it.
-
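The id scheme described in the post above can be sketched roughly as follows. This is only an illustration, not Skyscraper's actual code: the function name cache_id and the SCRIPT_EXTENSIONS set are hypothetical, and the real exception list in Skyscraper likely contains other extensions as well.

```python
import hashlib
from pathlib import Path

# Hypothetical exception list for illustration only. The thread confirms
# that '.au3' gets added; the rest of the real list is not shown here.
SCRIPT_EXTENSIONS = frozenset({".au3"})


def cache_id(path: Path, script_extensions=SCRIPT_EXTENSIONS) -> str:
    """Return a SHA-1 hex digest to use as a cache key for a game file."""
    if path.suffix.lower() in script_extensions:
        # Script files: hash the file name, because the contents may be
        # identical across games or may change often.
        data = path.name.encode("utf-8")
    else:
        # All other files: hash the file contents.
        data = path.read_bytes()
    return hashlib.sha1(data).hexdigest()
```

With an empty exception list, two launcher scripts with identical contents get the same id and therefore share one cache entry, which matches the symptom reported above. With '.au3' on the list, the ids differ as long as the file names differ.
-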
@muldjord Any reason why, after I rescraped my NES games, many of them now show "load screens" (e.g., the Select 1 or 2 player screen) instead of gameplay screens?
-
@AlCzervik What source is it using for the screenshots? (You can check it with Skyscraper -p nes --cache edit FILENAME.) The reason is that sometimes the source has the title screens in their databases as screenshots. It's quite normal, but is a bit of a challenge when scraping from them, since I don't know one screenshot from another.
-
@muldjord I'm using screenscraper. Is it random? As in, if I scrape again would it pull different images they've stored? Some examples I know changed were Super Mario 3, Metroid, Zelda 2.
-
@AlCzervik When scraping with -s screenscraper it will first look for a screenshot of the type ss, which is usually a normal screenshot. If it doesn't find that, it will look for one called sstitle which, obviously, is the game title screen screenshot. So in this case, my guess is that it doesn't find any of type ss and therefore uses the sstitle screenshots.
-
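The fallback order described in the post above (ss first, then sstitle) could look something like this. It is a minimal sketch under assumptions: the function name pick_screenshot and the shape of the media list (dicts with a "type" key) are made up for illustration and are not Skyscraper's or ScreenScraper's actual data structures.

```python
from typing import Dict, List, Optional

# Preference order as described in the thread: a regular screenshot first,
# the title screen only as a fallback.
SCREENSHOT_TYPES = ("ss", "sstitle")


def pick_screenshot(medias: List[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Return the first media entry matching the preferred screenshot types."""
    for wanted in SCREENSHOT_TYPES:
        for media in medias:
            if media.get("type") == wanted:
                return media
    # Neither type is available for this game.
    return None
```

If a game's ss entry disappears from the source, the same lookup silently falls through to sstitle, which would explain gameplay screens turning into title screens after a rescrape.
-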
@muldjord That must be it for those titles but it is strange that they changed recently. These are older NES titles so perhaps screenscraper changed for some of those titles. Not a big deal, just wanted to point it out. Thanks!
-
@muldjord i've got a quick question about skyscraper, i've tried messing with the --cache settings but haven't been able to quite hit my use case. i scraped all my roms over a couple days with skyscraper from screenscraper, but i had videos disabled. now, i'd like to add videos, but i haven't been able to figure out how to get skyscraper to just scrape videos. because the roms already have metadata in the cache, skyscraper seems to want to either skip them entirely, or with a --cache refresh completely scrape them fresh. is there a way to tell skyscraper "the metadata is fine, i just want you to scrape the videos that are missing and not rescrape all the other fields?"
i've tried running it with just --flags videos and it asks me if i want to skip already existing game list entries. saying yes obviously skips all the roms, and saying no will show that it finds the rom, show all the metadata and media i have (with a big red NO() for video), and then do nothing and proceed to the next rom. if i scrape with --cache refresh it finds the rom and the video and will download and include it. i was under the impression skyscraper would automatically add media that is missing but that doesn't seem to be the case, or more likely i am doing something wrong. guidance would be appreciated! i don't want to spend another 2-3 days hammering screenscraper.fr to get videos when i already have 90% of the scrapes i need
edit: i also tried with --cache edit:new=videos but it errors saying it only supports certain resources. not sure that would have solved it for me anyway since i fear it would have expected manual input rather than scraping but i tried it regardless
-
@theshadowzero said in Versatile C++ game scraper: Skyscraper:
is there a way to tell skyscraper "the metadata is fine, i just want you to scrape the videos that are missing and not rescrape all the other fields?"
No, it will always scrape the metadata since it's "free". When a game is looked up, the metadata is already there in the data received from the server. So it would be pretty silly to not use it.
It's the media that might take some time to grab - especially the videos of course.
I would simply rescrape all of the games with videos enabled. That's the easiest for this use case. You will need --cache refresh for that to work.
i also tried with --cache edit:new=videos but it errors saying it only supports certain resources. not sure that would have solved it for me anyway since i fear it would have expected manual input rather than scraping but i tried it regardless
Correct, you cannot use --cache edit for this use case. It can only edit metadata and add metadata. Not media.
-
Skyscraper 3.5.9 released: https://github.com/muldjord/skyscraper
- Implemented the new IGDB v4 authentication method. IGDB will now work again, and requires free credentials. Read more about that here
- Improved memory consumption when handing entries back to main thread
- Added '.au3' file extension to id script exception list (Thank you to 'o0alucard0o' for reporting this)
Most prominent feature is the updated IGDB v4 API authentication method. This was broken due to IGDB moving away from their old v3 API. This is now fixed and requires free personal credentials. Read more about that here.
Let me know if you run into issues.
-
@muldjord Perfect timing as ScreenScraper is currently down! 😉
Seriously, thanks for still working on this awesome tool. 👍
-
@Clyde Yeah, they seem to be hit hard this time. I hope they aren't about to shut down... :S
EDIT: Ok, just visited your link. So they are working on it. Good to know. :) Thanks.
-
Hi,
I just noticed that Skyscraper doesn't seem to generate a valid gamelist.xml for Daphne, at least not from the file structure for Daphne that's described in the Docs.
Since that structure requires directories instead of archive files, Skyscraper writes a <folder> element instead of <game>:
<?xml version="1.0"?>
<gameList>
  <folder>
    <path>./ace.daphne</path>
    ...
This makes EmulationStation ignore the game's entry:
Nov 19 23:26:24 lvl2: Parsing XML file "/home/pi/RetroPie/roms/daphne/gamelist.xml"...
Nov 19 23:26:24 lvl1: gameList: folder doesn't already exist, won't create
Nov 19 23:26:24 lvl0: Error finding/creating FileData for "/home/pi/RetroPie/roms/daphne/ace.daphne", skipping.
Can you confirm this? Should I open an issue on Github?
Thanks
Clyde
-
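To check whether a gamelist.xml is affected by the behaviour reported above, something like the following could be used. It is a minimal sketch, not part of Skyscraper or EmulationStation: the function name daphne_folder_entries is hypothetical, and it assumes the entries sit directly under the root <gameList> element as in the excerpt above.

```python
import xml.etree.ElementTree as ET
from typing import List


def daphne_folder_entries(gamelist_xml: str) -> List[str]:
    """List paths of <folder> entries that look like Daphne game directories."""
    root = ET.fromstring(gamelist_xml)
    paths = []
    # Only direct children of the root are inspected (assumed layout).
    for folder in root.findall("folder"):
        path = folder.findtext("path", default="")
        if path.endswith(".daphne"):
            paths.append(path)
    return paths
```

Calling it with the contents of the gamelist.xml returns the entries that were written as <folder> rather than <game>; an empty list means no such entries were found.
-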
And another question about Daphne:
SS scrapes Space Ace wrongly as the MAME aircraft game Ace, probably because of SA's rom directory ace.daphne. How can I fix that?
Thanks again
Clyde
-
@clyde I spent quite a lot of time getting Daphne to work some time ago, I think, but I can't remember how it works. I'll have to spend some time looking into that when I get the time.
EDIT: @Clyde, please check the release info here. That version (3.5.0 and 3.5.1) is where I implemented daphne properly.
-
@muldjord Of course, take your time. And please tell me if I can help you with that in any way (other than coding, which I am not proficient in).
-
@muldjord said in Versatile C++ game scraper: Skyscraper:
EDIT: @Clyde, please check the release info here. That version (3.5.0 and 3.5.1) is where I implemented daphne properly.
Alas, I'm using version 3.6.1 already, and I have four roms in daphne/roms (ace.zip, dle21.zip, lair.zip, lair2.zip).
-
@clyde said in Versatile C++ game scraper: Skyscraper:
And another question about Daphne:
SS scrapes Space Ace wrongly as the MAME aircraft game Ace, probably because of SA's rom directory ace.daphne. How can I fix that?
Thanks again
Clyde
By SS do you mean Skyscraper or Screenscraper in this case? :D