Versatile C++ game scraper: Skyscraper
-
@muldjord I still have some more NES games to add and some more synopsis tweaks, but I'm going to be away from this work for a few weeks unfortunately. I won't forget about this though and you'll get the pack of updated NES synopsis.txt files when they're done. I've got this page bookmarked so I can catch up when I'm back to it.
-
@Used2BeRX No hurries, I'll look into it when you have things ready. :)
-
Is there a way to change the output of the <name> tag that is written to the gamelist.xml file such that the country information is not shown in ES? It looks like it is currently using the file name of the rom. For example:
Super Fun Game Deluxe (USA, Europe, Japan).zip
should be displayed in the ES game list as
Super Fun Game Deluxe
Is this already possible? I looked through the readme and the command line switches but didn't see anything relevant. Thank you!
-
@incunabula No, this is not possible at the moment. It is not actually using the file name. What it does is register any round and square bracket "notes", as I call them, from the file name. It then uses the web result title, and adds the notes back in.
I'll add a '--nobrackets' option in the next release that will disable the bracket tags. :) You will also be able to set 'brackets="false"' in the config.ini file under both [main] and [platform]. That should give you plenty of options for disabling it. :)
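To make the bracket handling a bit more concrete, here is a minimal sketch of the idea, not Skyscraper's actual code. The function names `bracketNotes` and `displayName` and the `keepBrackets` flag are made up for illustration; the sketch only assumes that notes are whatever sits inside () or [] in the file's base name.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not taken from Skyscraper: collects every "(...)"
// and "[...]" note found in a file's base name, in order of appearance.
std::vector<std::string> bracketNotes(const std::string &baseName) {
  std::vector<std::string> notes;
  std::size_t pos = 0;
  while ((pos = baseName.find_first_of("([", pos)) != std::string::npos) {
    const char closer = (baseName[pos] == '(') ? ')' : ']';
    const std::size_t end = baseName.find(closer, pos);
    if (end == std::string::npos) break; // unbalanced bracket: stop looking
    notes.push_back(baseName.substr(pos, end - pos + 1));
    pos = end + 1;
  }
  return notes;
}

// Builds the displayed name from the scraped web title. With keepBrackets
// set to false (the behaviour '--nobrackets' is meant to give), the notes
// are simply left out.
std::string displayName(const std::string &webTitle,
                        const std::string &baseName, bool keepBrackets) {
  std::string name = webTitle;
  if (keepBrackets) {
    for (const std::string &note : bracketNotes(baseName)) {
      name += " " + note;
    }
  }
  return name;
}
```

With the example above, passing the web title "Super Fun Game Deluxe" and the base name "Super Fun Game Deluxe (USA, Europe, Japan)" gives the full name with the country note when `keepBrackets` is true, and just the title when it is false.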
Also, stay tuned for attractmode support, also coming in the next release.
-
@muldjord Hey guys. I won't be around for a few weeks, but I'll get right back to work on the NES synopsis stuff when I do.
I will only have the NES/Famicom/FDS games done at that point, but the problem you mentioned above won't be an issue if you scrape from the top line of the synopsis. I have a spreadsheet that displays all of this, but I wasn't able to get it ready for a public release before it was time to wrap it up unfortunately.
The spreadsheet shows the file name for the roms, synopsis and all associated media. It shows the top line of the synopsis, which is the name displayed in the romlist using meleu's script, as well as using the XBox emulators. It also shows which games have a manual, which ones have videos, and the exact dimensions of the raw artwork files for Box Front and Cart images. Every single game has these two images. Hundreds of them were made by me personally for the more obscure games, hundreds more have been touched up to varying degrees, and anywhere from 1,000 to 1,500 of them were cropped slightly to have a uniform look and to get rid of any beat up edges. (Most US games had great restorations done by other people, but a lot of foreign, pirate and other games had some pretty shoddy boxes even if they were HD images.)
Anyways.... gotta go for now, but I'll be back soon. Good luck on your project muldjord.
-
@Used2BeRX Have fun dude! I'll be here when you get back. :)
-
@muldjord That would be great, thank you! If you need someone to test out new builds or whatever, i'm happy to help.
-
@incunabula If I can get you to test the current build, that would be great. The tag option I mentioned is in there and I've fixed a bunch of other stuff as well. Any feedback would be very welcome before I release it officially:
https://github.com/muldjord/skyscraper/archive/b031b26889c827f2559995fc6592f827e4b00a4b.zip
-
OK, i'll give it a go. Something i noticed last night was that my MSX and Game Gear roms had to be unzipped before they could be scraped. All other platforms that i've scraped so far worked fine as zipped.
-
@incunabula Yes, according to the RetroPie wiki, the GameGear and MSX emulators don't support .zip files. So that's why it isn't included. If you can confirm that it works with those filenames without unzipping them, I can easily add it. Using zipped roms does have a few disadvantages though, so I don't recommend ever using zipped roms.
EDIT: Let me elaborate a bit on that. I use the sha1 checksum for storage of local resources as a means of having a unique key per rom. For best results the actual rom data is preferred.
Another disadvantage is that the 'screenscraper' module uses the sha1 checksum of rom data for identifying them. If they are zipped, it can't do that. And unzipping them internally makes no sense, since zips often contain more than 1 rom.
The only reason I can see to actually zip roms is that it makes it easier to pack together different roms for the same game. It doesn't save much space because of the type of data anyways, and if you use a zip with multiple roms, it actually costs you space, since you have a bunch of roms inside the zip you won't ever use.
So, those are my thoughts and concerns on the subject. :) Not trying to tell you what to do, just thought I'd give a bit of background for why I think zipped roms are a problem.
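As a rough illustration of the "unique key per rom" point, the sketch below derives a key from the file's bytes rather than its name. To keep it short it uses the FNV-1a 64-bit hash as a stand-in for sha1, so it is not what Skyscraper itself does; `contentKey` is a hypothetical name and the byte values in the comments are just example data.

```cpp
#include <cstdint>
#include <vector>

// FNV-1a 64-bit hash over the raw bytes. This is only a compact stand-in
// for sha1 in this sketch: the point is that the key depends on content.
std::uint64_t contentKey(const std::vector<unsigned char> &data) {
  std::uint64_t hash = 14695981039346656037ULL; // FNV offset basis
  for (unsigned char byte : data) {
    hash ^= byte;
    hash *= 1099511628211ULL; // FNV prime
  }
  return hash;
}

// Example data (made up): the same rom bytes always give the same key, no
// matter what the file is called. Once the bytes are wrapped in an archive,
// for instance with a leading "PK" zip signature and compressed payload,
// the bytes on disk are different and so is the key.
//   raw rom:  4E 45 53 1A ...
//   zip file: 50 4B 03 04 ...
```

The same reasoning applies to sha1: two zips of the same rom made with different tools or compression levels would end up with different checksums, while the unzipped rom always gives the same one.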
-
That makes sense - the checksum could be anything depending on what application was used to compress the file, what level of compression was used, etc. Ok, i'm fine with unzipping these roms (rather small file sizes already) but i can confirm that MSX (using lr-bluemsx) and GG (using lr-genesis-plus-gx) both do in fact work with zipped roms.
-
Is there a way you can add local folders as a scraping option? For instance if i already had a folder of boxart and it has some images that the scraper modules are not returning could it be possible to "scrape" images from a default path like %roms%/boxart ?
-
@incunabula I'm considering options for this, same thing I am working on with Used2BeRX. The problem here of course being that people tend to have varying ways of saving these files so I would need to handle "all of them" so to speak.
It would be really cool if you could sort of "import" a snap folder into the localdb and then just scrape using the 'localdb' module and it would use those snaps. But all local db resources are identified by sha1 checksum, and the snap images are not easily identifiable other than from their filenames. And that poses a big problem.
I think a first implementation will be to define an xml format that your snaps would need to be in. Then you can scrape from that. That would then automatically add it to the localdb. So when you scrape from 'localdb' afterwards, it would use those snaps as well as any other snaps you might have acquired.
EDIT: I realize what I just wrote might seem a bit fuzzy :D Bottom line is that I am considering it. And if I find "the right way"(tm) I will implement it.
-
@incunabula said in Versatile C++ game scraper: Skyscraper:
That makes sense - the checksum could be anything depending on what application was used to compress the file, what level of compression was used, etc. Ok, i'm fine with unzipping these roms (rather small file sizes already) but i can confirm that MSX (using lr-bluemsx) and GG (using lr-genesis-plus-gx) both do in fact work with zipped roms.
Exactly. But I will add .zip to both MSX and GameGear, no problem.
EDIT: zip support is implemented in this release, feel free to test it: https://github.com/muldjord/skyscraper/archive/a820fbd8368372c09c5f3a00338dbf040bc23e43.zip
-
@muldjord I understand, and i don't mean to be pushy by requesting new features. Just suggesting things that might be useful to others. So far 3 out of 3 platforms are working OK using the --nobrackets switch. :)
-
@incunabula Suggestions are always more than welcome. :)
-
Is adding support for other platforms as simple as adding additional if/else statements in the Skyscraper::run() function? I know nothing about C++ but have done a bit of VB .net so i apologize if this is an ignorant question or if i'm using the wrong terminology hehe :)
-
@incunabula Almost. Adding new platforms is probably the easiest thing to do in Skyscraper. :) As long as the platform doesn't have any special demands. If it's just an ordinary platform with ordinary file formats that need to be run, then I can do it real easy. Just let me know which ones you'd like in there.
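For anyone curious what an "ordinary platform" boils down to, here is a small sketch. It is not how Skyscraper is actually structured: the lookup table, the function names and the extension lists are all assumptions made for illustration. The idea is simply that a plain platform is a name plus the file formats it accepts, whether that is written as if/else branches or as a table.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical lookup table. The platform names and extensions are examples
// only, not Skyscraper's real list.
const std::map<std::string, std::vector<std::string>> &platformFormats() {
  static const std::map<std::string, std::vector<std::string>> formats = {
      {"nes", {".nes", ".zip"}},
      {"gamegear", {".gg", ".zip"}},
      {"msx", {".rom", ".zip"}},
  };
  return formats;
}

// Returns true if the given platform is known and accepts the extension.
bool isSupported(const std::string &platform, const std::string &extension) {
  const auto &formats = platformFormats();
  const auto it = formats.find(platform);
  if (it == formats.end()) return false; // unknown platform
  for (const std::string &ext : it->second) {
    if (ext == extension) return true;
  }
  return false;
}
```

In a layout like this, adding a platform without special demands would be one more line in the table.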
-
Skyscraper 1.7.0 released: https://github.com/muldjord/skyscraper
- MAJOR: Fixed and refined 'attractmode' frontend implementation, now works in a basic manner
- 'attractmode' can now skip existing entries
- 'emulationstation' now properly adds brackets to 'name' on skipped entries
- Added check for 'db.xml' when doing '--cleandb'
- Refactored GameEntry variables
- Changed GameEntry from struct to class
- Added 'Overall title similarity' to final output
- Added 'Overall completeness' to final output
- Code refactoring here, there and everywhere
- Now accepts results where we have low editDistance, but high similarity (For instance "Disney's Darkwing Duck" with fileName "Darkwing Duck").
- Added '--nobrackets' option that disables [] and () tags in the frontend game titles. (Thanks for the feedback 'incunabula')
- Fixed bracket parsing
- Now always uses completeBaseName since some filenames have more than one '.'
- Completely rewrote sorting algorithm. 30 lines became one with a nifty C++11 lambda :D
- Added zip format to GameGear and MSX platforms
- Now uses filenames for output image files again
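The title matching and the lambda sort mentioned in the list above can be sketched roughly like this. This is not the code from the release: `editDistance` is a textbook Levenshtein distance, and `titleContains` and `rankByDistance` are hypothetical helpers that only show why a pair like "Disney's Darkwing Duck" and "Darkwing Duck" can be far apart in edits and still clearly be the same game.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Textbook Levenshtein distance using a single rolling row.
int editDistance(const std::string &a, const std::string &b) {
  std::vector<int> row(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j) row[j] = static_cast<int>(j);
  for (std::size_t i = 1; i <= a.size(); ++i) {
    int diagonal = row[0]; // distance for prefixes (i-1, j-1)
    row[0] = static_cast<int>(i);
    for (std::size_t j = 1; j <= b.size(); ++j) {
      const int above = row[j]; // distance for prefixes (i-1, j)
      const int substitution = diagonal + (a[i - 1] == b[j - 1] ? 0 : 1);
      row[j] = std::min({above + 1, row[j - 1] + 1, substitution});
      diagonal = above;
    }
  }
  return row[b.size()];
}

// Hypothetical similarity check: true if one title fully contains the other.
bool titleContains(const std::string &a, const std::string &b) {
  const std::string &longer = a.size() >= b.size() ? a : b;
  const std::string &shorter = a.size() >= b.size() ? b : a;
  return longer.find(shorter) != std::string::npos;
}

// Sorts candidate titles so the closest match to fileName comes first,
// using a C++11 lambda as the comparison.
std::vector<std::string> rankByDistance(std::vector<std::string> titles,
                                        const std::string &fileName) {
  std::sort(titles.begin(), titles.end(),
            [&fileName](const std::string &x, const std::string &y) {
              return editDistance(x, fileName) < editDistance(y, fileName);
            });
  return titles;
}
```

For the Darkwing Duck pair the distance is 9 (the nine characters of "Disney's " have to be removed), yet `titleContains` is true, which is the kind of case a scraper would want to accept.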
Have a great weekend everyone!
-
Skyscraper 1.7.1c released: https://github.com/muldjord/skyscraper
- Moved all source files to 'src' folder
- '[homedir]/.skyscraper' is now default folder for all files used by Skyscraper
- '/usr/local/bin/Skyscraper' is now default location for Skyscraper executable
- Refined '--help' output a bit
- Fixed lemon64 scraping
- Added 'lemonamiga' scraping module
- Added '--skipped' command line option
- Added 'make install' for correct installation of files
This is more of a clean-up release. First of all, I've rewritten the compile and install procedure in the readme on github, so be sure to use it exactly as written. Basically what I've added is a "sudo make install" command, which will take care of installing the files in the correct locations on your Pi. This ensures that each time you compile a new version of Skyscraper, it will install it in the same place, thus preserving all of your configs and local dbs properly. It was a bit of a mess before, sorry about that.
In other news, I've fixed the lemon64 scraper and added the 'lemonamiga' scraping module. I've also added the '--skipped' option, which allows you to always include skipped entries in the resulting gamelists. It is especially useful for attractmode.
As always, if you run into trouble, let me know. I expect a few bugs in this release, since I've moved a lot of code around and refactored quite a bit to enable the '--skipped' option.
Have fun!