Versatile C++ game scraper: Skyscraper
-
@aphyx Check the artwork.xml documentation. You can add an <artwork type="screenshot"> node that has the <layer type="cover"> inside. It will then export the cover as the screenshot (which populates the <image> node as you request).
Concerning the <cover> node, I can't remember if I export that currently. I think I do, but I'm on vacation at the moment, so I'll have to check when I get back home in a week's time. :)
-
So, I'm currently working on the option of providing filename(s) on the command line. My current approach is this: If one or more filenames are provided on the command line, it will scrape those using the platform provided with '-p'. It will NOT alter the game list you have, and it will NOT process artwork files. It WILL cache or update the local data and media for the games provided, meaning that a subsequent scraping with '-s localdb' will make use of the data.
The usefulness, as I see it, is that it would allow you to experiment with changing the filename to get a better result from the scraping. And when you've found one that works, you can scrape everything again.
Would this be useful to you guys? Comments? Suggestions? It's been requested quite a few times, so I'm very interested in feedback on this.
-
Skyscraper 2.2.6 released: https://github.com/muldjord/skyscraper
- Now always caches resources locally, even if pretend is set
- Optimized 'simple mode' generated script. Now has '--pretend' set for all non-local modules to avoid artwork processing on those runs. This is a lot faster and provides the same result
- Added the possibility to supply one or more filenames on the command line - it will then ONLY scrape those particular files. Platform still has to be set with '-p' for this to work
- Fixed bug where [tags] would be appended twice when using '--forcefilename'
The much requested feature of providing filenames on command line is implemented in this release. When used, it will scrape those files exclusively and cache the resulting data in the local database cache. To make use of the data afterwards you need to rescrape the entire platform using '-s localdb'. Please give it a go and let me know what you think.
Happy scraping! :)
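
For anyone who wants to script the two-step workflow described above, here is a minimal Python sketch. It only builds and prints the two command lines, using nothing but the '-p' and '-s localdb' options mentioned in this post. The helper names and the example ROM filename are made up for illustration, and whether you need a full path to the file depends on your setup.

```python
import shlex


def scrape_files_cmd(platform, filenames):
    """Step 1: scrape only the given files; results go into the local cache."""
    return ["Skyscraper", "-p", platform, *filenames]


def rebuild_from_cache_cmd(platform):
    """Step 2: rescrape the whole platform from the local cache."""
    return ["Skyscraper", "-p", platform, "-s", "localdb"]


if __name__ == "__main__":
    # "Monopoly (U).gba" is a hypothetical filename used only as an example.
    for cmd in (scrape_files_cmd("gba", ["Monopoly (U).gba"]),
                rebuild_from_cache_cmd("gba")):
        # shlex.join quotes arguments with spaces so the line can be pasted
        # into a shell as-is.
        print(shlex.join(cmd))
```

Nothing is executed here; you can copy the printed lines into a terminal or hand the lists to subprocess.run once you are happy with them.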
-
@muldjord whatever I do, I just can't get the textual data to scrape. I've created the definitions.dat folder and made the ROM base.txt file. What am I doing wrong? These are the files that I have made:
Description: ###DESCRIPTION###
Developer : ###DEVELOPER###
Publisher : ###PUBLISHER###
Rating : ###RATING###
Genre : ###TAGS###
Description: Being a real-estate magnate used to be hard work. Thanks to technology, the computer does all the hard stuff, like rolling dice, for you. Freed of the anxiety of arguing over leaners, you have plenty of time to strategize your next financial move. Up to four players, or you and three computer opponents, take turns around the board. A player wins when the others go belly up. You can also play a quick game mode or a time limit game. In any of the game modes, you can institute your own house rules. Some of those could include awarding a person for landing on free parking, dealing out some properties at random to begin the game, or changing the number of properties you have to own in a group before you can build houses. The easy-to-use interface makes it easy to manage your finances. When you have multiple players, the games can take a long time to finish, but that's the nature of Monopoly.
Developer : Takara
Publisher : Destination Software Inc
Rating : 3.5
Genre : Strategy
The name of the file is Monopoly (U).txt
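
For anyone following along, the relationship between the two files above (a template with ###PLACEHOLDER### markers, and a per-game text file laid out the same way) can be sketched in Python as below. This is only an illustration of the idea, not Skyscraper's actual import parser; the function names are hypothetical and the sample description is shortened.

```python
import re

# Matches markers such as ###DESCRIPTION### and captures the name inside.
PLACEHOLDER = re.compile(r"###([A-Z]+)###")

TEMPLATE = """Description: ###DESCRIPTION###
Developer : ###DEVELOPER###
Publisher : ###PUBLISHER###
Rating : ###RATING###
Genre : ###TAGS###"""

SAMPLE = """Description: Being a real-estate magnate used to be hard work.
Developer : Takara
Publisher : Destination Software Inc
Rating : 3.5
Genre : Strategy"""


def template_to_regex(template):
    """Turn the template into a regex with one named group per marker."""
    parts = PLACEHOLDER.split(template.strip())
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            # Even positions are literal text such as "Developer : ".
            pattern += re.escape(part)
        else:
            # Odd positions are marker names; capture as little as possible.
            pattern += f"(?P<{part.lower()}>.*?)"
    return re.compile(pattern + r"\s*\Z", re.DOTALL)


def parse_game_text(template, text):
    """Return a dict of fields, or None if the text does not fit the template."""
    match = template_to_regex(template).match(text.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_game_text(TEMPLATE, SAMPLE))
```

The point of the sketch is that the literal parts ("Developer : " and so on, including spacing) have to line up exactly between the two files, so a stray space or a different label is enough to make the whole file fail to match.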
-
EDIT: Never mind, I read your post wrong. Looks OK to me, not sure what is wrong. I recommend using an XML-based format instead; try that.
-
@muldjord I tried that and got the exact same result (I tried it first).
-
@muldjord what should I do?
-
All I can say is that if you've followed the documentation completely, it WILL work. I'm assuming you might have something wrong with where you've placed the files, what you've named them or what command line you run Skyscraper with. And since I don't know any of these things, I can't tell you what to do.
-
@muldjord I sent pictures, and the command I use is Skyscraper -p gba -s import
-
Visit my Twitter: https://twitter.com/samsaju04_boy?s=09
-
@muldjord the pictures are on Twitter - the first thing you see
-
Please see next post.
-
Please try rerunning it with 'Skyscraper -p gba -s import --updatedb'. That might be the problem.
-
Skyscraper 2.2.6a released: https://github.com/muldjord/skyscraper
- Now always sets '--updatedb' when using 'import' scraping module
Get this release, then it'll work without '--updatedb'
-
@muldjord Thank you so much!
-
Hi @muldjord, I used your software for scraping the images for my roms and I loved it.
It seems to work almost OK, but sometimes when game names are nearly the same, e.g. gameName 1 and gameName 2, it messes up the descriptions for these games. There might be a chance that these descriptions are already messed up in the location it gets them from in the first place, dunno.
When I checked Lakka, which is another retro gaming frontend, it seems that they get images and thumbnails from libretro. Dunno if it is based only on the name of the game.
Any chance to also use this location, which Lakka uses, for the game images? :)
https://github.com/libretro/libretro-thumbnails/tree/master/
-
@jura It doesn't mess them up, rather it sometimes gets faulty results from the sources, which is a side-effect of the highly automated way Skyscraper works. For instance, if you have a relatively unknown game called "Star Blob" and a game called "Star Fox", it will quite likely get a correct hit for "Star Fox" since it's a well-known game. BUT, it will probably also return the data for "Star Fox" when it searches for "Star Blob" since that's how the source search engines work. So it doesn't mess them up, it just, sometimes, gets faulty results where it seems like the previous game has "spilled over" into the next entry, when in fact it only does so because the names are closely related. In this example by the "Star" in the name.
You can force it to not accept these by changing the minimum match percentage with the '-m' flag on command line. :) It defaults to 50%, which means that "Star Blob" and "Star Fox" gives a positive result.
I can't use a GitHub repository for scraping, simply because I do not have permission from GitHub to do so. GitHub is not meant to be used for game media scraping and I highly doubt they would allow it if they knew people used it for that purpose.
EDIT: On a sidenote, I plan on completely rewriting the way matches are achieved, so matches will improve in a future version. No ETA though.
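
To illustrate the "Star Blob" / "Star Fox" example and what a minimum match percentage does, here is a rough Python sketch. Note that this is NOT how Skyscraper calculates its matches; it uses Python's difflib as a stand-in similarity measure, and the function names and the min_match parameter are made up for the example.

```python
from difflib import SequenceMatcher


def match_percent(search_name, result_name):
    """Similarity of two titles as a percentage (difflib stand-in)."""
    ratio = SequenceMatcher(None, search_name.lower(), result_name.lower()).ratio()
    return ratio * 100.0


def accept(search_name, result_name, min_match=50.0):
    """Accept a search result only if it reaches the minimum match percentage."""
    return match_percent(search_name, result_name) >= min_match


if __name__ == "__main__":
    score = match_percent("Star Blob", "Star Fox")
    print(f"Star Blob vs Star Fox: {score:.1f}%")
    print("accepted at 50%:", accept("Star Blob", "Star Fox", min_match=50.0))
    print("accepted at 80%:", accept("Star Blob", "Star Fox", min_match=80.0))
```

With this stand-in measure the shared "Star " prefix alone pushes the two titles above a 50% threshold, while a stricter threshold rejects the hit. Raising the minimum trades fewer wrong hits for more games with no hit at all.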
-
A huge thank you to @chipsnblip for spreading the word about Skyscraper on reddit. I notice these things, and I appreciate it a lot! :)
-
@muldjord it's my pleasure to help! i've been lurking in this thread since day one, and to tell the truth haven't even used skyscraper yet (but plan on it very soon-- life keeps getting in the way). i was just so impressed watching one person's hard work and dedication to a project of this nature. seeing people pour their heart into the retro gaming community is truly inspiring.
-
On a related note: If any of you good people feel Skyscraper is a project you want to support, then a good way to do so is simply to spread the word about it. :) I'm not exactly a PR guru myself, so I'll concentrate on the coding part of it. When people do lend a hand it means quite a lot, and is quite motivating.
As you know, Skyscraper is completely open source and free to use (I'm a huge advocate of openness in general). I do this work entirely in my spare time simply because I enjoy working on it. Part of that is watching the community actually make use of the project. So a thank you to all of you for simply using it is in order as well. :)