Versatile C++ game scraper: Skyscraper
-
PLEASE UPDATE ASAP!!! Quick note: I urge you all to completely remove your "[homedir]/.skyscraper/dbs" folder, as it contains faulty data. If you don't, Skyscraper will just reuse that data and it will appear as if the mixed entry bug hasn't been fixed at all!
Skyscraper version 2.3.5 released: https://github.com/muldjord/skyscraper
- IMPORTANT: Fixed bug that caused resources to be mixed up between games because Qt's network cache wasn't cleared (Probably not a Qt bug, but a DAMN hard bug to spot either way). All previous Skyscraper releases have this bug, SO PLEASE UPDATE!!!
- Made 'scummvm' parsing look for config in homedir as well ('.scummvmrc')
- Now always removes brackets from returned titles
- Now always shows current scraping module in output
- Optimized search matching even more
- No longer asks user about skipping entries if filenames are provided on command line
This release should make mixed entries a thing of the past. I thought this bug was related to the local database cache (and I thought I had already fixed it). But it was not! It was caused by Qt's internal network request cache sometimes returning the data of the previous request. Since I added a full clear of that cache, I can no longer make it mix up entries. I've tested it well over 15 times on thousands of entries over the last two days. Before this fix I hit the problem within 3 tries. I haven't been able to reproduce it since.
I've also improved the search matching even more, so some correct matches that were filtered out before should now be back in. Quite a lot, actually. With that said, there will still be the occasional false positive. It can't be prevented 100%. If you are very strict about this, please remember that you can set the "Minimum search percentage match" with the "-m" command line parameter.
If you experience problems with this release, please let me know. Also, I will be without internet for the next couple of weeks, so I won't be able to provide support or comment.
Please update asap! And happy scraping! :)
-
@muldjord Have you made an enquiry about adding this scraper to the standard RetroPie installation?
-
@cyperghost No, I'm not sure how that works. Someone told me to contact someone on the forum, but I haven't had the time to look into it yet.
-
@muldjord Thanks for the update. Quick question: Is ~/skyscraper the new homedir? Before it was ~/.skyscraper. Just want to be sure.
-
@analoghero Homedir is still "~/.skyscraper". I can see the confusion. Perhaps I should change the github instructions a bit.
EDIT: Changed the instructions to "skysource" instead of "skyscraper". It makes no functional difference; it's purely for clarity, to avoid confusion.
-
@muldjord I've read the documentation but couldn't find this: can I change the folder of the localdb? I am running out of space on my SD card. I already have the roms on a mounted external HD, so I thought I could move the localdb to the HD and then, maybe through the config file or something similar, tell Skyscraper to use the localdb on the HD. Thanks!
-
-
@analoghero Thanks, I could try a little hacking (modifying the source) here to see if I can force the localdb onto the mounted HD. It's beyond my coding skills, though. Having a Windows port would be a lot easier. I tried Cygwin, but as muldjord told me, that is not going to work so easily.
-
@bleuge The ability to define the path for the localdb is an option I would also like to see. Great for those of us who use USB sticks and HDDs.
-
@bleuge @Rion Not sure how you are running Skyscraper, but there is a "-d" switch which may be of interest. If you run the command "Skyscraper --help", you can see all the available options.
Here is the relevant output for the "-d" switch:
-d <folder> Set local resource database folder. (default is '[homedir]/.skyscraper/dbs/[platform]')
-
@dudleydes I should have checked before I wrote. Looks like it isn't hardcoded then.
-
@dudleydes Thank you for pointing that out. 😀
-
Edit - Regarding @bleuge's request to change the localdb location due to limited space on the SD, as well as my diatribe below:
As suggested, I've "solved" this by mounting a network store on the .skyscraper/dbs folder (via an fstab mod, although the recommended way is through autostart.sh).
This gives the best solution for me, as it caches the downloaded media files in the NAS storage for when I'm home/on my LAN. That's where I would be doing any updates to the scraping anyway. When I take the Pi with me, I don't need the cached files, just those I'm actually using through the final "Skyscraper localdb" command.
I believe the other comments and readme docs mean you could put the localdb on either a separate USB attached drive, or potentially a network share.
However...is it acceptable to just delete the localdb files once you've updated? Assuming you understand the risk that they (obviously) aren't available anymore to reference if you decide to do a full update.
More specifically, when you run updates later for added ROMs, will it delete or corrupt the old information when you overwrite the XML if it can't find the media in the localdb?
In my scenario, with updating initially from import, I could have THREE copies of the same file on the SD. And if we're talking video, that's a lot of space.
1st copy: You import the video files to .skyscraper/import/videos
2nd copy: Run "Skyscraper -p [platform] -s import --videos" and the videos are all copied (and renamed) to the .skyscraper/dbs/[platform]/videos/import folder.
3rd copy: Run "Skyscraper -p [platform] -s localdb --videos" and the videos are all copied again (and renamed back to the original) to the roms/[platform]/media/videos folder.
I assume I can delete the 1st copy once I'm done, but at minimum I still have two copies of each video.
Wouldn't a hardlink between the database and the actual video in the rom folder be better at saving space?
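To make the hardlink idea concrete, here is a minimal Python sketch. Nothing in this thread says Skyscraper works this way; the helper name `link_or_copy` and all paths are made up for illustration. It also assumes the database and the rom folder sit on the same filesystem, since hard links cannot cross filesystems (so it would not help with a NAS-mounted dbs folder, where it falls back to a plain copy).

```python
import os
import shutil
import tempfile


def link_or_copy(src, dst):
    """Hard-link dst to src when possible, otherwise fall back to a copy.

    Hypothetical helper for illustration only; not part of Skyscraper.
    Returns "link" or "copy" depending on what was done.
    """
    parent = os.path.dirname(dst)
    if parent:
        os.makedirs(parent, exist_ok=True)
    if os.path.exists(dst):
        os.remove(dst)
    try:
        os.link(src, dst)  # both names now refer to the same data on disk
        return "link"
    except OSError:
        # Different filesystem (e.g. a NAS mount) or links not supported.
        shutil.copy2(src, dst)
        return "copy"


if __name__ == "__main__":
    # Demo in a scratch directory; the folder layout is made up.
    root = tempfile.mkdtemp()
    cached = os.path.join(root, "dbs", "snes", "videos", "example.mp4")
    os.makedirs(os.path.dirname(cached))
    with open(cached, "wb") as f:
        f.write(b"\0" * 1024)
    exported = os.path.join(root, "roms", "snes", "media", "videos", "example.mp4")
    how = link_or_copy(cached, exported)
    print(how, os.path.samefile(cached, exported))
    shutil.rmtree(root)
```

With a hard link, deleting either name leaves the data reachable through the other, so the space is only freed once both are gone.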
Regardless, as above, can I delete the localdb media files once they've been scanned and copied to the ROM folder without risk of when I update it says "oops, no file, better update the XML file to say it doesn't exist?"
Thanks - I tried to make this as clear as possible.
-
@timekills Sure, you can delete the dbs folder. I wouldn't recommend it though, as Skyscraper would need to download everything again if you ever want to rescrape.
Maybe you're better off zipping it and storing it elsewhere, so you can reuse it later.
Btw: What's the best way to get video files for your roms? No luck with Skyscraper atm.
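On the zipping suggestion, a small Python sketch of packing the dbs folder into a single archive before deleting it. The function name `archive_folder` and the destination in the comment are hypothetical; the only path taken from this thread is the default "~/.skyscraper/dbs". Video files are already compressed, so expect this to help with moving the folder around rather than with saving space.

```python
import os
import shutil
import tempfile


def archive_folder(folder, archive_base):
    """Pack `folder` into archive_base + '.tar.gz' and return the archive path.

    Hypothetical helper for illustration only; not part of Skyscraper.
    """
    folder = os.path.abspath(folder)
    return shutil.make_archive(
        os.path.abspath(archive_base),
        "gztar",
        root_dir=os.path.dirname(folder),   # archive paths are relative to this
        base_dir=os.path.basename(folder),  # so entries start with "dbs/..."
    )


if __name__ == "__main__":
    # Demo on a scratch folder. A real run might look like this
    # (the destination is made up):
    #   archive_folder(os.path.expanduser("~/.skyscraper/dbs"),
    #                  "/media/usb/skyscraper-dbs-backup")
    root = tempfile.mkdtemp()
    dbs = os.path.join(root, "dbs")
    os.makedirs(os.path.join(dbs, "snes"))
    with open(os.path.join(dbs, "snes", "example.txt"), "w") as f:
        f.write("cached data")
    print(archive_folder(dbs, os.path.join(root, "dbs-backup")))
    shutil.rmtree(root)
```

To reuse the data later, unpack the archive back into the ".skyscraper" folder so that "dbs" ends up where it was.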
-
@analoghero I figured as much, but thank you for confirming. It just seems to me there is a better way to store the database if you don't have a need to keep duplicates. Even a choice with a warning to remove once they've been transferred to the ROM folder (or whatever location). I understand anything short of keeping a copy, either local or on a remote localdb location means redownloading.
Regarding video download location, I don't have a great suggestion ATM because I'm fortunate enough to have downloaded all the video snaps over the years. My go-to is always EmuMovies, as I've had an account with them for years. All the way back to when they actually mailed me a full seven DVD set of snaps for everything they had at that time.
-
https://www.reddit.com/r/RetroPie/comments/828a0p/best_way_to_scrape_games_in_2018
Skyscraper wins ;-)
Also, thanks for the -d option. I checked the help, but I don't know why I missed it.
-
@bleuge This makes me happy. :) Thanks to all of you for your support.
I've just arrived home from a trip to Africa. I'll probably take a couple days to rejoin the internet and then I'll get right back on Skyscraper duties. :)
-
@muldjord Welcome back :)
Have to report good things about using Skyscraper. I bought a new (bigger) microSD and thought maybe it's time to do video snaps. I backed up my (known good) dbs folders, but deleted them later (because of owner issues, another story). So I rescraped everything again (I think I've done it 10 times now).
Results are amazing. :)
Have some ideas to improve skyscraper, but just take your time now to sort your things first.
-
@analoghero Happy to hear that! I am currently investigating some stuff with Dom from the Amiberry team for optimizing Amiga scraping and I also need to implement the Mobygames(.com) scraping module. But just post your ideas here and I'll check them out.
-
Has anyone tried scraping SNES recently and successfully had the ratings included?
Across the multiple sites that Skyscraper uses, only thegamesdb seems to get ratings pulled by Skyscraper, and it's not very complete.
However, if I use another scraper (UXS, SSelph, etc.), it gets the ratings from other sites that Skyscraper also uses, at a much higher completion rate (nearing 100% from screenscraper...).
Of course, the problem is that none of the different scrapers can agree on how to format the gamelist.xml file, much less where to store the media, so each one overwrites the other.
If there were a way to import the data for ALL the games at once, it would be fine. But breaking a gamelist.xml file apart into one file for each of 1000+ roms and naming each file... not really viable.
Bottom line: is there any good way to either
- Get the ratings when scraping (specifically SNES, can't speak to other games yet), or
- Merge the gamelist.xml file from another scraper without having to create 1000+ individually named txt files?
(Note: I could not find in priorities.xml which setting controls which site's ratings are used, or how to set just priorities when importing your own data files. I assume it's part of "description", but I don't need or want to overwrite the WHOLE description - just the ratings.)
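For the second option, one possible workaround outside Skyscraper is to copy just the ratings from the other scraper's gamelist.xml into the one you want to keep. The Python sketch below is an assumption-heavy illustration, not a Skyscraper feature: it assumes both files use the usual EmulationStation layout (a "gameList" root with "game" entries that contain "path" and optionally "rating"), it matches games on the file name in "path", and the name `merge_ratings` is made up.

```python
import os
import xml.etree.ElementTree as ET


def merge_ratings(target_root, donor_root):
    """Copy <rating> from donor <game> entries into matching target entries.

    Games are matched on the file name in <path>. Only ratings are touched,
    and ratings already present in the target are left alone.
    Returns the number of ratings copied.
    """
    donor = {}
    for game in donor_root.iter("game"):
        path = game.findtext("path")
        rating = game.findtext("rating")
        if path and rating:
            donor[os.path.basename(path)] = rating

    copied = 0
    for game in target_root.iter("game"):
        path = game.findtext("path")
        if not path or game.findtext("rating"):
            continue  # no path to match on, or a rating is already set
        rating = donor.get(os.path.basename(path))
        if rating is not None:
            node = game.find("rating")
            if node is None:
                node = ET.SubElement(game, "rating")
            node.text = rating
            copied += 1
    return copied


if __name__ == "__main__":
    # Tiny made-up gamelists to show the behaviour.
    target = ET.fromstring(
        "<gameList>"
        "<game><path>./Game A.sfc</path><name>Game A</name></game>"
        "<game><path>./Game B.sfc</path><rating>0.5</rating></game>"
        "</gameList>"
    )
    donor = ET.fromstring(
        "<gameList>"
        "<game><path>./Game A.sfc</path><rating>0.8</rating></game>"
        "<game><path>./Game B.sfc</path><rating>0.9</rating></game>"
        "</gameList>"
    )
    print(merge_ratings(target, donor))
    print(ET.tostring(target, encoding="unicode"))
```

For real files you would load each one with `ET.parse(path).getroot()` and write the result back with `ElementTree.write`, after backing up the original. Whether a later Skyscraper run keeps or overwrites the merged ratings is not something this thread answers.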