Versatile C++ game scraper: Skyscraper
-
@analoghero Glad to hear it :)
-
Working on improving the attractmode export module. Currently testing game descriptions implementation with user qqplayer. Requests for further attractmode improvements should come now, as this is what I am focused on. So let me know if you have anything that should be supported.
-
Skyscraper version 2.4.6 released: https://github.com/muldjord/skyscraper
- Added 'overview' support for AttractMode. It will now create the necessary cfg files to show the game descriptions
- Added '<kidgame>bool</kidgame>' output to Emulationstation gamelist generation
- Added 'ages' support in 'screenscraper' module. Will convert PEGI and ESRB to numeric
- Added 'ages' support in 'thegamesdb' module. Will convert PEGI and ESRB to numeric
The 'kid friendly release'. As per user request, I've implemented the <kidgame> Emulationstation tag and added classification scraping support for both the 'thegamesdb' and 'screenscraper' modules. This means that you can now make use of the "kids" mode in Emulationstation, which will then hide all non-kid-friendly games from the lists.
IMPORTANT! If you want to scrape the 'ages' on games you already have cached, you have to scrape using the '--updatedb' flag. Otherwise it will just scrape with the pre-2.4.6 cached data which doesn't contain the ages resources.
Lastly, I've also added 'description' support for the 'attractmode' frontend module. I'm not an Attract-Mode user myself, but I've done basic testing on it, and user 'qqplayer' has verified that it works as expected.
:)
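For anyone curious what "convert PEGI and ESRB to numeric" could look like, here is a minimal Python sketch. It is not Skyscraper's actual code: the ESRB-to-age table, the function names `rating_to_age` and `kidgame_tag`, and the `max_kid_age` cut-off are all assumptions made for illustration, as is rendering the bool as "true"/"false".

```python
import re
from typing import Optional

# Assumed mapping from ESRB letter ratings to a minimum age.
# These values are illustrative, not taken from Skyscraper's source.
ESRB_TO_AGE = {
    "EC": 3,
    "E": 6,
    "E10+": 10,
    "T": 13,
    "M": 17,
    "AO": 18,
}


def rating_to_age(rating: str) -> Optional[int]:
    """Return a numeric minimum age for a PEGI or ESRB rating string, or None."""
    text = rating.strip().upper()
    # PEGI ratings already carry the age, e.g. "PEGI 12" or "PEGI:16".
    match = re.fullmatch(r"PEGI[\s:]*(\d{1,2})\+?", text)
    if match:
        return int(match.group(1))
    # Strip an optional "ESRB" prefix, e.g. "ESRB T" becomes "T".
    text = re.sub(r"^ESRB[\s:\-]*", "", text)
    return ESRB_TO_AGE.get(text)


def kidgame_tag(age: Optional[int], max_kid_age: int = 10) -> str:
    """Render a <kidgame> element; max_kid_age is a hypothetical cut-off."""
    is_kid = age is not None and age <= max_kid_age
    return "<kidgame>{}</kidgame>".format("true" if is_kid else "false")
```

With these assumptions, a game with an unknown rating is treated as not kid-friendly, which seems like the safer default for a "kids" mode.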
-
@muldjord I've seen this very nice and complete MSX db: https://www.msxgamesworld.com/index.php
Maybe you want to check it for scraping.
Thanks as usual!
-
This post is deleted!
-
[deleted, makes no sense]
-
@muldjord That's bad news. Good that there are still some other good sources left.
-
Hey bud,
I know it's been a long time since we talked. I'm still plugging away at my NES project.
I can't remember if we discussed it at all back in the day, but I was wondering if you'd be interested in scraping off of my synopsis files whenever I finally release them? I'd be putting them in several places, but I do think that github would probably be the best for something like this. I still haven't attempted uploading anything there myself, but I've seen archives other people have made with artwork and such that would seem to be a perfect fit with my work and the RetroPie experience.
I eventually plan on doing all console and handheld systems between Odyssey II and the SNES/Genesis era, but who knows when I'll actually complete them all.
So far, 2,118 unique NES and FDS games are accounted for, including all complete translations, pirates, unlicensed, English Japanese games, Europe games that weren't released in the US, and hundreds of game hacks. I probably won't be adding many more at this point except for all new translations that come out before my release as well as any interesting looking hacks.
I've still got a lot of work to do and I think I want to release everything at once, so I can't even give a timeframe of when I'd make this available to you, but you're welcome to scrape it when it's out there. :)
-
Thank you, it's appreciated. The problem here being finding a place where I can scrape from. I am almost certain I am not allowed to bash away on the github servers for this purpose (which I will respect of course), so people would have to clone your files themselves and scrape from it using the import module somehow.
We can look into the options when you have it online somewhere.
-
[deleted, makes no sense]
-
@muldjord said in Versatile C++ game scraper: Skyscraper:
As I mentioned before, it won't work without an api key
So each user will need an API key? They don't have per-app keys?
-
[Deleted, too much negativity.]
Bottom line: 'thegamesdb' is rendered useless for automated scrapers with the new api, since they decided to go with a developer api limit instead of a user request limit.
I don't even know why I am implementing the new api. It won't work for anyone except myself...
-
@muldjord I read your comment (even the deleted one, with too much negativity)
Well, you're right in what you say. But there might be another way. Maybe with your API key you can ask for unlimited access: just write to the guys from 'thegamesdb' and tell them what your project is for. To be honest, I'm the wrong person to talk to here, because I never scraped any game ;) I just use the bare file structure.
If this doesn't work, then consider that with every closed door a new one opens... Why not build your own database? You have a mass of users here who will likely help to build it. But on the other hand there are also a bunch of other databases ;)
-
@cyperghost I've just created a post on their forums, asking them about their stance on automated scrapers. Their answers will basically decide the fate of the 'thegamesdb' module in Skyscraper. We'll see...
EDIT: I keep asking myself why I keep working on this project. There's so much negativity related to using these databases. It's always a feeling of being unwanted. The only site that implemented a great solution for this is screenscraper. They basically let their users earn requests, either by paying or by updating the database. It's a true community effort.
-
@muldjord said in Versatile C++ game scraper: Skyscraper:
Thank you, it's appreciated. The problem here being finding a place where I can scrape from. I am almost certain I am not allowed to bash away on the github servers for this purpose (which I will respect of course), so people would have to clone your files themselves and scrape from it using the import module somehow.
We can look into the options when you have it online somewhere.
Cool man.
I didn't think of the draw it might have on github. I was actually thinking of uploading all of the matching artwork there as well based off of other people's suggestions, but that would probably be a much larger draw on github's resources.
If anybody knows definitively if this would be something github would frown upon, please let me know. :)
As for the synopsis, it shouldn't be a huge deal for somebody to just download a zip, extract it to the correct folder(s) and then run your scraper. In total the NES/FDS synopsis is only around 2MB. After re-writing them from scratch I did my best to keep each entry to 1kb or less per game, and that includes all the other tags as well as the description. My original synopsis files were much larger, and some of them were damn near novels in length. On the Xbox this made sense, where you could pull up the screen to read it and scroll through the info and trivia at your leisure. On the RetroPie, where it's displayed as part of the romlist info and auto-scrolls, that didn't make any sense.
-
There's so much negativity related to using these databases. It's always a feeling of being unwanted.
That's easy to explain!
These databases are sold together with ROMs on Amazon, eBay, Alibaba... for a lot of money, as a full-featured, ready-made, no-brainer setup... and the database itself is "FREE" for private usage. It's the same with RetroPie: the software is sold for lots of money... but RetroPie is "FREE" for private usage.
No joke, people pay much too much for things that they can get for free.
Why? A system that is able to run out of the box. I saw 128GB SD cards with a romset and scrapes being sold for $110 - nuts!
-
That's an entirely different discussion, but not an irrelevant one at that. Yes, people who sell anything related to these projects are a pest.
But in the case of games databases, the big problem is server load. It costs a lot of money to keep those databases up and running. And allowing unlimited access for everyone is not really an option. That's why I love how they did it with 'screenscraper'. It makes sense. The most active users, will be the ones who have the better access. As it should be imo.
-
what about some sort of [decentralized] peer-to-peer distribution system? i'm not familiar with the technical side of things, but bittorrent comes to mind. just a thought, sorry if this was out of line with the discussion at hand.
-
@muldjord said in Versatile C++ game scraper: Skyscraper:
That's an entirely different discussion, but not an irrelevant one at that. Yes, people who sell anything related to these projects are a pest.
But in the case of games databases, the big problem is server load. It costs a lot of money to keep those databases up and running. And allowing unlimited access for everyone is not really an option. That's why I love how they did it with 'screenscraper'. It makes sense. The most active users, will be the ones who have the better access. As it should be imo.
Honestly, when I complete a system, it won't require any server load or scraping at all. It could just be torrented pretty much anywhere, unpackaged, FTP'd and with a little rom hunting on the end user's part it should all work out of the box. The media will be packaged exactly as it should be transferred to the Pi file system and the gamelist.xml will also be created for it that points to every available piece for each game.
The only thing that won't be included is the roms themselves, but I intend to do everything humanly possible to make this as pain-free as possible. I think there may be a way to create datfiles with sub-directories, which should make the process much easier to ensure that not only are your roms all named correctly but you have all of your games in the correct folders for all of the media to work. If not, I will have to figure out how to create a script that will take the good roms and move them to the proper directories after you've run your own through Romcenter or clrMame. Rest assured that it will be the absolute easiest possible method when I take this very important step to make sure everyone can enjoy the sets without having to have a degree in rocket science.
My spreadsheets also list the CRC values of the roms as well as the matching GoodTools, No-Intro and TOSEC file names if they are available. I also have a "COUNTS" page, which shows out of the entire number of games in the collections how many each of those mainstream sets are missing. Before I release everything, I will likely add a column for "NonGood" sets and/or anything else I can come across that would make finding the missing games even easier. A majority of the current "missing" ones are hacks and translations that were made between 2010 and today. I do believe there are actually a few lesser known collections that should have a lot of matches for those.
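As a rough illustration of how CRC values like these can be used to identify ROM files, here is a short Python sketch. The `known_crcs` table (CRC to canonical file name) and the helper names are hypothetical stand-ins for whatever the spreadsheet or a datfile would provide; only `zlib.crc32` is a real library call.

```python
import zlib
from pathlib import Path
from typing import Dict, Optional


def crc32_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the CRC32 of a file as 8 uppercase hex digits."""
    crc = 0
    with open(path, "rb") as handle:
        # Read in chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return format(crc & 0xFFFFFFFF, "08X")


def expected_name(path: Path, known_crcs: Dict[str, str]) -> Optional[str]:
    """Look up a ROM's canonical name by CRC; known_crcs is a hypothetical table."""
    return known_crcs.get(crc32_of_file(path))
```

A file whose CRC is not in the table simply returns `None`, which is how the "missing" or unrecognised ones would show up.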
I would like to put these somewhere that users of your scraper could get. Not just the synopsis files, but the artwork as well. If github wouldn't be a good option, I'd love to continue a discussion about where the best place would be as I get closer to my release.
I still haven't done new videos or finished working on game manuals yet, so it will be quite a bit of time before I finally put this out there.
I should be able to open the spreadsheet to the public soon though to show what work has been done and so anybody interested can see the progress as it's made. :)
-
@chipsnblip This is certainly an interesting concept. I've asked about the issues at TheGamesDb's forum, and they actually seem to seek out a decentralized system (not p2p though). If I understand it correctly, they want to allow a bunch of copies of their database to be downloaded from the get-go. Then those will be placed on mirrors that are used by any app. And instead of re-downloading the entire database each month, they supply updates instead.
Unfortunately that doesn't help much in Skyscraper's case, as I don't have the option of running a Skyscraper game database anywhere.
So I guess they see thegamesdb.net as the central hub, where everyone will update and add games. And then they have, sort of, slaves that mirror the updates every once in a while to stay updated. And any app would use those instead of the central one.
Their core api is still not finished, so time will tell how they decide to do this.
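To make the "supply updates instead of re-downloading everything" idea a bit more concrete, here is a small Python sketch of how a mirror might apply incremental updates. This is purely an assumption about how such a scheme could work, not a description of TheGamesDb's API: the update record format (sequence number, game id, changed fields or `None` for a deletion) and the function name `apply_updates` are invented for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical update record: (sequence number, game id, changed fields).
# A fields value of None means the game was removed from the central hub.
Update = Tuple[int, str, Optional[dict]]


def apply_updates(mirror: Dict[str, dict], last_seq: int,
                  updates: Iterable[Update]) -> int:
    """Apply updates newer than last_seq to the mirror, in sequence order.

    Returns the sequence number of the last update applied, so the mirror
    knows where to resume the next time it syncs.
    """
    for seq, game_id, fields in sorted(updates, key=lambda u: u[0]):
        if seq <= last_seq:
            continue  # Already applied during an earlier sync.
        if fields is None:
            mirror.pop(game_id, None)
        else:
            mirror.setdefault(game_id, {}).update(fields)
        last_seq = seq
    return last_seq
```

The mirror only has to remember the last sequence number it saw, and each sync transfers just the records that changed since then.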