Versatile C++ game scraper: Skyscraper
-
@livefastcyyoung Thank you!!! This is really helpful, it tells me a lot about what is going on. I still don't know why it scrapes ALL games like this for AnalogHero though. I'll investigate this further and provide a fix once I figure out how to do it properly.
For now, if any of you guys are having the same issue, please let me know!
EDIT: It seems to be an error on screenscraper's part, returning faulty results, maybe because of some server issues they are having. So this problem might go away by itself. Still problematic though, as the localdb will be updated with this faulty data, so anyone having the issue will have to rescrape with '-s screenscraper --updatedb' once the problem is fixed on their end.
-
@muldjord After further tests on my end, I think screenscraper has issues as you said. Even when scraping a single rom it returns .hack-link :(
-
@analoghero yeah, but it doesn't seem to be a general problem. I don't get the errors here. Maybe they have 1 server that causes this. And because they use load-balancing I just happen to reach one of their servers that works fine.
-
@muldjord Not your fault of course, but this problem makes 2.3.0 not very useful atm, because screenscraper has the most images.
-
@analoghero Yes, agreed. But as you say, I can't really fix that unfortunately... I will look into the problem further when I get home though. I'm curious as to what is going on and would like to know why it returns the ".hack-Link" result in the first place... :S
EDIT: And let me just say that I find it quite unfortunate that it had to coincide with my 2.3.0 release. :D I had been testing and testing and to read about such a weird problem seemingly breaking the results for you made me think I had overlooked something really obvious... So I'm "glad" it is seemingly not a problem on my end. Sad that screenscraper is having issues of course, it really is a fantastic source when it works.
-
@screech Might be able to add some suggestion as to what is happening at Screenscraper. Unfortunately it looks like he hasn't been on the forum in the last month.
-
@muldjord Many, many thanks for the hard work on this version. As I said, best scraper around.
@analoghero said in Versatile C++ game scraper: Skyscraper:
.hack link
I was getting this a lot the other day.
Maybe Skyscraper could simply check whether .hack link is returned from screenscraper and, if the words ".hack link" are not in the filename, simply skip it? Better to have nothing than a wrong result?
-
@bleuge Agreed, I would like to filter those out so it doesn't overwrite valuable data in the localdb. Still curious as to why screenscraper returns it in the first place. I hope they know about it and fix it.
-
@muldjord I hope so, too. I don't know how to contact them, though.
-
@muldjord I confirmed that sselph does the same. Does screenscraper work with hashes in the API? Maybe there is some error that returns wrong data when no hash matches.
-
I've now implemented a simple check for ".hack-Link" which will simply filter it out. It will be in 2.3.1 soon. I am currently looking into the "players" parsing to make it single digit.
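A minimal sketch of what such a check could look like. This is not Skyscraper's actual code: the function names are hypothetical, and it assumes the variant suggested earlier in the thread, where a ".hack-Link" result is only rejected when the rom's own filename does not contain those words.

```cpp
#include <cctype>
#include <string>

// Keep only lowercase letters and digits, so that ".hack-Link",
// ".hack link" and ".hack//Link" all normalize to "hacklink".
static std::string normalizeTitle(const std::string &s)
{
    std::string out;
    for (unsigned char c : s) {
        if (std::isalnum(c)) {
            out += static_cast<char>(std::tolower(c));
        }
    }
    return out;
}

// Hypothetical helper: returns true when the scraped title looks like the
// faulty ".hack-Link" result and the rom file name gives no hint that the
// game really is ".hack-Link".
bool isBogusHackLink(const std::string &returnedTitle,
                     const std::string &romFileName)
{
    const std::string marker = "hacklink";
    const bool titleMatches =
        normalizeTitle(returnedTitle).find(marker) != std::string::npos;
    const bool fileMatches =
        normalizeTitle(romFileName).find(marker) != std::string::npos;
    return titleMatches && !fileMatches;
}
```

A caller would skip the localdb update whenever `isBogusHackLink(title, fileName)` returns true, so existing data is not overwritten by the faulty result.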
-
@muldjord Great!
Installed the new version. I occasionally see "libpng error: Read Error" in the on-screen output.
Can I suggest --showlocaldb_stats or something like this?
I'd love to see the total differences in my localdb between runs.
Per-platform stats and a total at the end would also be great! Sorry for asking so much :) ... Could a debug log be enabled, so that reporting errors and testing could be made a lot easier?
Thanks muldjord!
-
@bleuge I could make it so it shows how many new entries have been added for the current run. That would be useful I think.
Don't know about debug log. I could easily add a '--debuglog [filename]' which would save whatever data I added to it. But the hard part is knowing what data to put in there. I might look into it at some point in the future.
-
I was just making a suggestion on the GitHub of another scraping tool when I came across this one.
I noticed someone suggested adding .scummvm for scraping ScummVM. May I suggest instead...
For ScummVM, simply parse the scummvm.ini file (\retropie\configs\scummvm\scummvm.ini) which contains all the information about installed games. This way there is no need to create dummy files to "trick" the scraper.
It shouldn't be difficult to parse this ini file and grab all the game names and paths from there. It has the benefit that the descriptions are the same for everyone (ScummVM always uses the same description for each game variant) so matching by name alone should be no problem.
Hashing ScummVM games does not work well, in my experience. There are far too many different variants of each game (compressed/uncompressed/stripped of extra files, etc.). If you use the values from the .ini file you can forget hashing entirely as ScummVM automatically identifies the games when you add them.
-
@analoghero You can contact Screech and the rest of the screenscraper team on IRC. Use the CHAT IRC button at the top of the screenscraper.fr site.
They seem to speak pretty good English as well as French :)
-
@muldjord Thanks!
Also, regarding Skyscraper halting on my Pi while scraping (the issue I opened on GitHub): I've just finished a full scrape of all my platforms (I built a big skyscript.sh covering all platforms, as I mentioned). It worked perfectly, no more halts or anything. Every run finished OK.
-
@stoo It's an interesting idea. But doesn't ES need the dummy files to have anything on the list at all? I assume it simply runs scummvm through runcommand with the dummy filename or similar.
-
@muldjord ES uses .svm as dummy files for generating the gamelist and, presumably, for the runcommand, and not .scummvm which is used, I believe, for Recalbox.
However, these files are empty by default, and the hack to get most scrapers to acknowledge them is to put the FULL NAME of the game as text into the dummy file.
This seems pointless, given that the .ini file is 100% accurate (ScummVM detects the games itself when you add them) and contains all the information you need already.
-
@stoo Skyscraper looks for both .svm and .scummvm files.
It won't get rid of the dummy files, but what I can do is, for each dummy file, look that filename up in the scummvm.ini file and use the name from that file when scraping. Then you could call the dummy files whatever you like, they would be recognized regardless.
EDIT: 'players' has been fixed to always be a single digit, or in some cases '4+', now. I currently look for the formats "1 Player", "1-3 Players", "1-3" and "1 - 2 (2 simultaneous)" and convert them to "1", "3", "3" and "2". If players exceeds 9 it handles that as well.
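For illustration only, here is one way such a conversion could be written. It is a sketch, not the code used in Skyscraper: the function name is hypothetical, it takes the largest number found before any parenthesised note, and it assumes that values above 9 are simply capped at "9" (the post does not say how that case is handled). The '4+' case is not modeled.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: reduce a free-form players string to one digit.
// Returns an empty string when the input contains no number.
std::string normalizePlayers(const std::string &raw)
{
    // Drop trailing notes such as "(2 simultaneous)".
    const std::string text = raw.substr(0, raw.find('('));

    int maxPlayers = 0;
    bool found = false;
    std::size_t i = 0;
    while (i < text.size()) {
        if (!std::isdigit(static_cast<unsigned char>(text[i]))) {
            ++i;
            continue;
        }
        // Read one whole number; clamp while reading to avoid overflow.
        int value = 0;
        while (i < text.size() &&
               std::isdigit(static_cast<unsigned char>(text[i]))) {
            value = std::min(value * 10 + (text[i] - '0'), 1000);
            ++i;
        }
        maxPlayers = std::max(maxPlayers, value);
        found = true;
    }
    if (!found) {
        return "";
    }
    // Assumption: anything above 9 is capped so the result stays one digit.
    return std::to_string(std::min(maxPlayers, 9));
}
```

With this sketch, the four formats listed above map to "1", "3", "3" and "2", and a range such as "1 - 4" maps to "4".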
If you happen upon more formats Skyscraper needs to understand, let me know in a comment. :)
-
@muldjord - I have seen a few "1 - 4" as well. Might as well add those to cover the bases, since that's the maximum number of players (traditionally) supported by consoles.