Versatile C++ game scraper: Skyscraper
-
@muldjord I know you're taking a break and I don't want to disturb you, so consider this something to look at whenever you feel like it. When scraping Amiga, the filenames (minus extension) are displayed in ES instead of the game names. Example: Agony_v1.3_0960 should be Agony.
Don't know if other systems are affected too.
-
@analoghero Either you're using '--forcefilename' or it hasn't been found. Those are the only two reasons for a game to show up with the filename. :) Otherwise, if it was found as "Agony" it will be shown as "Agony".
-
@muldjord Strange issue. I never used --forcefilename.
#4/63 (T2) ---- Game 'Agony_v1.3_0960' found! :) ----
Scraper: localdb
Search match: 100 %
Compare title: 'Agony'
Result title: 'Agony_v1.3_0960' (import)
Platform: 'Amiga' (thegamesdb)
Release Date: '1992-01-01' (openretro)
Developer: 'Art and Magic' (openretro)
Publisher: 'Psygnosis' (openretro)
Players: '1' (openretro)
Tags: 'Animalprotagonist, Autoscroll, Horizontal, Powerup, Shootemup, Sideways' (openretro)
Rating (0-1): '0.7' (thegamesdb)
Cover: YES (openretro)
Screenshot: YES (openretro)
Wheel: NO ()
Marquee: NO ()
Video: YES (import)
Description: (thegamesdb)
This horizontally-scrolling shoot 'em up features six long levels, all with detailed and mellow background graphics, aiming for a less hectic feel than contemporaries such as Project X. As a magician's apprentice, you have been turned into an owl to give you the best chance of destroying the many dark creatures to be faced, and thus discovering the secret of cosmic strength. These dark creatures include piranhas, giant ants and mosquitoes. Extra weapons and invincibility periods can be collected. The technical details include 3 layers of multi-directional parallax scrolling, background animation, and different title and in-game music.
Elapsed time: 00:00:04
Estimated time: 00:01:10
Here's an example from my gamelist.xml:
<?xml version="1.0"?>
<gameList>
  <game>
    <path>/home/pi/RetroPie/roms/amiga/Agony_v1.3_0960.lha</path>
    <name>Agony_v1.3_0960</name>
    <cover />
    <image>/home/pi/RetroPie/roms/amiga/media/screenshots/Agony_v1.3_0960.png</image>
    <marquee />
    <video>/home/pi/RetroPie/roms/amiga/media/videos/Agony_v1.3_0960.mp4</video>
    <rating>0.7</rating>
    <desc>This horizontally-scrolling shoot 'em up features six long levels, all with detailed and mellow background graphics, aiming for a less hectic feel than contemporaries such as Project X. As a magician's apprentice, you have been turned into an owl to give you the best chance of destroying the many dark creatures to be faced, and thus discovering the secret of cosmic strength. These dark creatures include piranhas, giant ants and mosquitoes. Extra weapons and invincibility periods can be collected. The technical details include 3 layers of multi-directional parallax scrolling, background animation, and different title and in-game music.</desc>
    <releasedate>19920101</releasedate>
    <developer>Art and Magic</developer>
    <publisher>Psygnosis</publisher>
    <genre>Animalprotagonist, Autoscroll, Horizontal, Powerup, Shootemup, Sideways</genre>
    <players>1</players>
  </game>
-
@analoghero Ah, that's because the title is prioritized from the "import" module. Just remove the "<source>import</source>" line from the "<order type="title">...</order>" block in your priorities.xml under "[homedir]/.skyscraper/dbs/[platform]/priorities.xml" and rescrape with localdb.
:)
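For anyone who would rather script that edit than do it by hand, here is a minimal Python sketch. It only relies on the `<order type="title">` and `<source>` elements mentioned above; the root element name (`priorities`), the sample content and the helper name `drop_title_source` are assumptions made for illustration and are not taken from Skyscraper itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. Only <order type="title"> and <source> come from the
# thread; the root element name and the second <order> block are assumed.
SAMPLE = """<priorities>
  <order type="title">
    <source>import</source>
    <source>thegamesdb</source>
  </order>
  <order type="description">
    <source>import</source>
  </order>
</priorities>"""


def drop_title_source(xml_text, source_name):
    """Remove <source>source_name</source> from title orders only."""
    root = ET.fromstring(xml_text)
    for order in root.iter("order"):
        if order.get("type") != "title":
            continue  # leave every other resource type untouched
        # Copy the list first, since we remove children while looping.
        for source in list(order.findall("source")):
            if (source.text or "").strip() == source_name:
                order.remove(source)
    return ET.tostring(root, encoding="unicode")


result = drop_title_source(SAMPLE, "import")
print(result)
```

After writing the result back to priorities.xml, a rescrape with localdb should pick the title from the next source in the list. Back up the original file first, since the real file may be laid out differently from this sample.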
-
@muldjord Yes, you're right! Although there was no <source>import</source> tag under the title order in priorities.xml, I added <source>thegamesdb</source>. Now it uses the correct title.
Thank you for your help.
-
@analoghero Glad I could help. :)
-
@muldjord I'm having an issue with a few of my roms scraping incorrectly in simple mode.
They are showing incorrect names in EmulationStation.
I can give specific examples if that helps.
-
@maroonout09 It is not uncommon for a few roms to scrape incorrectly. I assume the names they are scraped as are quite close to the ones you expect. Skyscraper is based on filename searches for some modules and checksum searches for others, and uses several different tricks to try to be as precise as possible. But there will be false positives; it cannot be avoided.
But yes, please give examples and also what version of Skyscraper you are running (important). I would like to make sure it is the expected behaviour and not something else entirely.
Quick note: If you want to avoid false positives completely, set '-m 100' on the command line or 'minMatch' in '[homedir]/.skyscraper/config.ini'. Then it will only allow 100% correct results. But keep in mind that you will also lose a lot of the correct results if you do so. It's a bit of a balancing act.
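To make the balancing act concrete, here is a small Python sketch of a minimum-match filter. It is not Skyscraper's code: the function name `filter_by_min_match` and the candidate list are made up for illustration, and only the 83% figure for Robopon is quoted later in this thread.

```python
# Each candidate is (result title, match percentage).
candidates = [
    ("Robopon: Sun Version", 83),  # false positive, 83% as quoted later in this thread
    ("Wario Land 4", 100),         # hypothetical exact match
    ("Deluxe Pacman", 90),         # hypothetical correct, but not a perfect match
]


def filter_by_min_match(candidates, min_match):
    """Keep only the titles whose match percentage reaches min_match."""
    return [title for title, percent in candidates if percent >= min_match]


lenient = filter_by_min_match(candidates, 80)
strict = filter_by_min_match(candidates, 100)
print(lenient)
print(strict)
```

With the threshold at 100 the false positive is gone, but so is the correct 90% result, which is the trade-off described above.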
-
@muldjord Here are the games that I found that were scraped incorrectly:
Filename: Pokemon_-_Yellow_Version.gbc
Scraped Name: Robopon: Sun Version
Comments: The scrape also included the description for Robopon: Sun Version, and for some reason, the images for Pokemon: Gold Version.
Filename: Super_Mario_Advance.gba
Scraped Name: Chaoji Maliou Shijie
Comments: The scrape had the correct description and images.
Filename: Super_Mario_Advance_3_-_Yoshi's_Island.gba
Scraped Name: Yaoxi Dao
Comments: The scrape had the correct description and images.
Filename: Wario_Land_4.gba
Scraped Name: Waliou Xunbao Ji
Comments: The scrape had the correct description and images.
I think those may have been the only ones that scraped incorrectly.
I'm using Skyscraper v2.4.3.
-
@maroonout09
Just tested all of them. These are the reasons and what you can do about it:
Pokemon_-_Yellow_Version.gbc:
It returns a match for Robopon: Sun Version because of the "-" in the filename (the dash is included in the search, which messes with it; I will consider removing these dashes automatically in 2.4.4). And since that name matches 83%, it accepts it. You can make it work by changing the name of that file to "Pokemon_Yellow.gbc".
Super_Mario_Advance.gba / Super_Mario_Advance_3_-_Yoshi's_Island.gba / Wario_Land_4.gba:
These titles are actually correct; they are just the 'wor' region titles, which are the titles ScreenScraper returns for them. I was not aware that the 'wor' titles were sometimes the Japanese titles, so I'll prioritize the 'eu' and 'us' titles higher for the next release (2.4.4). In the meantime, please set 'region' manually with '--region us' or '--region eu' to prevent this from happening.
Thank you for reporting this, I appreciate it.
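The region issue comes down to which regional title is tried first. Here is a minimal Python sketch of priority-based title selection. The dictionary layout, the function name `pick_title` and the 'us' title are assumptions for illustration; they do not show how ScreenScraper or Skyscraper actually store the data.

```python
# Titles per region for one game. The 'wor' title is the one reported in this
# thread; the 'us' title is assumed from the filename.
titles_by_region = {
    "wor": "Chaoji Maliou Shijie",
    "us": "Super Mario Advance",
}


def pick_title(titles_by_region, region_priority):
    """Return the title of the first region in the priority list that has one."""
    for region in region_priority:
        title = titles_by_region.get(region)
        if title:
            return title
    return None  # no title available for any listed region


wor_first = pick_title(titles_by_region, ["wor", "us", "eu"])
us_first = pick_title(titles_by_region, ["us", "eu", "wor"])
print(wor_first)
print(us_first)
```

Putting 'us' or 'eu' ahead of 'wor' is roughly what setting '--region us' or '--region eu' is meant to achieve in the meantime.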
-
@muldjord For Amiga: Deluxe Pacman is searched for as "Deluxe Pac Man" and is not found. Rock n Roll is not found either.
With .lha files it doesn't add [AGA] anymore. Not really important though.
Edit: Shame that we can't use LemonAmiga or HOL.
-
@analoghero [AGA]'s will be back in 2.4.4. :) And so will [CD32], [CDTV] and [Demo].
You can change the filenames of your lha's if you want better results. Try changing "DeluxePacManxxx.lha" to "DeluxePacmanxxx.lha" for instance, that might fix it. But for now many Amiga games with the .lha suffix will scrape incorrectly, since I have to convert the filenames on the fly to add spaces, and that is just bound to be a problem.
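As a rough illustration of why the capitalization matters, here is a Python sketch of one way such an on-the-fly conversion could work. It is a guess, not the conversion Skyscraper uses: the function name `lha_search_name`, the rule of cutting at the first underscore, and the "_v1.0_0000" suffix on the Deluxe Pacman names are all assumptions.

```python
import re
from pathlib import Path


def lha_search_name(filename):
    """Turn an lha filename into a spaced search name (illustrative only)."""
    stem = Path(filename).stem        # drop the ".lha" extension
    base = stem.split("_", 1)[0]      # drop "_v1.3_2294" style suffixes
    # Insert a space wherever a lowercase letter or digit is followed by
    # an uppercase letter: "RickDangerous" -> "Rick Dangerous".
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", base)


print(lha_search_name("RickDangerous_v1.3_2294.lha"))
print(lha_search_name("DeluxePacMan_v1.0_0000.lha"))
print(lha_search_name("DeluxePacman_v1.0_0000.lha"))
```

With a rule like this, "PacMan" becomes "Pac Man" while "Pacman" stays one word, which is why renaming the file can change what the search finds.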
I'm working with Dom from the Amiberry team for a better solution in the future. But for now, this will have to do. I also would like to point out that Skyscraper is the only scraper to even support the .lha's at this point, so I guess anything is better than nothing. Skyscraper scrapes about 75% of the lha's at the moment.
EDIT: Agreed, I actually supported LemonAmiga and HOL half a year ago, but had to remove support since I couldn't get official permission to scrape from their sites... :S I never got a reply to my emails if I recall correctly. And without permission I won't use them of course.
-
@muldjord Yes, I know that they were once supported but then removed. I think they assume a scraper for a well-known platform such as RetroPie will cause a lot of traffic. Good idea to just rename the files. Will try that. :)
-
@muldjord Thank you very much for your help!
-
@maroonout09 You're welcome. Good luck with it! :)
-
Just for reference, we have also been testing this on our RetroPie base image for the Odroid XU4 and it works well. The only item of note we have found is that with that board a lot more folks use a small eMMC or microSD card for the base system and an external drive for their games/media. Since the db appears to store duplicates in the cache for quicker results when rescraping, it is easy to fill up the remaining space on the OS "drive" very quickly. Excellent work though, both the metadata returned for the gamelists and the media itself are great.
-
@fnkngrv Thank you, glad you like it. You can change the dbFolder with '-d', and I will make sure it can be set in the config.ini file for the next release as well. Then you can create a config.ini and add 'dbFolder="[db base folder]"' in the main section of it, and it will put the cache there for all platforms in subfolders. That should give you the flexibility you are looking for. Will be in 2.4.4.
-
@muldjord Can you control in the source code which image a scraper module downloads? When scraping Amiga with openretro it sometimes gives strange results. For example, RickDangerous_v1.3_2294.lha returns a screenshot of the trainer menu. It looks like this [image], but I can't find it on openretro.
Since screenscraper isn't an option with lha files, I manually imported some screenshots to replace the strange ones.
-
@analoghero I found the flaw. It seems that it returns all screenshots, including those from the cracked versions, most of them are just hidden but still exist in the source. So my function to return the screenshot even looks through the hidden ones. I'll fix this in 2.4.4 so it only chooses between the main ones. Thank you for reporting this. :) It's really helpful!
-
@muldjord Feeling bad for disturbing your break from Skyscraper. I don't know how, but if I can help you with development in any form other than reporting minor bugs, don't hesitate to ask. Maybe I can do something.