Scraping ScummVM games
-
I am trying to scrape a few basic ScummVM games (mainly Sierra), but I am having no luck. I thought it was possible to scrape via platform name; for example, I expected King's Quest I to be scraped using kq1.svm, but I am getting errors. I was looking at this spreadsheet for the platform name.
2016/04/15 10:38:03 INFO: Starting: kq1.svm
2016/04/15 10:38:03 ERR: error processing kq1.svm: hash not found
2016/04/15 10:38:03 INFO: Starting: kq2.svm
2016/04/15 10:38:03 ERR: error processing kq2.svm: hash not found
2016/04/15 10:38:03 INFO: Starting: kq3.svm
2016/04/15 10:38:03 ERR: error processing kq3.svm: hash not found
2016/04/15 10:38:03 INFO: Starting: kq4.svm
2016/04/15 10:38:03 ERR: error processing kq4.svm: hash not found
-
If they haven't been added to sselph's scraper, you may need to provide the hashes to him (I had to do that for all the HE games).
-
@herb_fargus said in Scraping ScummVM games:
If they haven't been added to sselph's scraper, you may need to provide the hashes to him (I had to do that for all the HE games).
Hey Herb, I saw you had added several from the comments. I am a big fan of the HE games also, especially the Putt-Putt games. They've been played from my Mom down to the great grand children including my own girls.
I am not quite sure what to do. Is there a way to tell which games have been added? Here are the checksums from Windows, but they are zero-byte files. Also, should I request file additions through the issue Add scummvm support #37 (https://github.com/sselph/scraper/issues/37)? Sorry for all the questions. Scraping ScummVM is new for me, and it's really painful adding and modifying games one at a time. :)
File: kq1.svm
CRC-32: 00000000
MD4: 31d6cfe0d16ae931b73c59d7e0c089c0
MD5: d41d8cd98f00b204e9800998ecf8427e
SHA-1: da39a3ee5e6b4b0d3255bfef95601890afd80709
-
They are blank files, so I guess the scraper needs to do it from the filename. I thought it did scrape scummvm, but I would have to check.
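A quick way to see why the checksums above cannot identify a game: they are the well-known digests of zero-length input, so every empty .svm file hashes to the same values. A minimal Python sketch using only the standard library (MD4 is left out because hashlib does not always provide it):

```python
import hashlib
import zlib

# Digests of zero bytes of input, i.e. what any empty .svm file produces.
empty = b""

crc32 = format(zlib.crc32(empty), "08x")
md5 = hashlib.md5(empty).hexdigest()
sha1 = hashlib.sha1(empty).hexdigest()

print("CRC-32:", crc32)
print("MD5:   ", md5)
print("SHA-1: ", sha1)
```

The printed values match the CRC-32, MD5 and SHA-1 lines posted above, which is why a hash lookup on these files has nothing to work with.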
-
@BuZz said in Scraping ScummVM games:
They are blank files, so I guess the scraper needs to do it from the filename. I thought it did scrape scummvm, but I would have to check.
It does go through several files in all the subdirectories and fails. The scraper seems to try all files in all subfolders? From what I understand that's why they were focusing on the .svm files but I might have misunderstood how scraper works for ScummVM games.
It scraped most of the games fine but surprisingly none of the Sierra family series including King's Quest 1-6, Space Quest 1-5, Leisure Suit Larry 1-6 (minus 4 which doesn't exist ;) and Hoyle Book of Games 1-3.
All the HE games are fine, as are the Indiana Jones and The Secret of Monkey Island games.
scraper --help:
pi@retropie:~/RetroPie/roms/scummvm $ scraper --help
Usage of scraper:
  -add_not_found
        If true, add roms that are not found as an empty gamelist entry.
  -append
        If the gamelist file already exist skip files that are already listed and only append new files.
  -download_images
        If false, don't download any images, instead see if the expected file is stored locally already. (default true)
  -extra_ext string
        Comma separated list of extensions to also include in the scraper.
  -gdb_img string
        Comma seperated order to prefer images, s=snapshot, b=boxart, f=fanart, a=banner, l=logo. (default "b")
  -hash_file file
        The file containing hash information.
  -image_dir directory
        The directory to place downloaded images to locally. (default "images")
  -image_path path
        The path to use for images in gamelist.xml. (default "images")
  -image_suffix suffix
        The suffix added after rom name when creating image files. (default "-image")
  -img_format jpg or png
        jpg or png, the format to write the images. (default "jpg")
  -img_workers N
        Use N worker threads to process images. If 0, then use the same value as workers.
  -mame
        If true we want to run in MAME mode.
  -mame_img string
        Comma separated order to prefer images, s=snap, t=title, m=marquee, c=cabniet. (default "s,t,m,c")
  -max_width width
        The max width of images. Larger images will be resized. (default 400)
  -missing file
        The file where information about ROMs that weren't scraped is added.
  -nested_img_dir
        Use a nested img directory structure that matches rom structure.
  -no_thumb
        Don't add thumbnails to the gamelist.
  -output_file file
        The XML file to output to. (default "gamelist.xml")
  -overview_len N
        If set it will truncate the overview of roms to N characters + ellipsis.
  -refresh
        Information will be attempted to be downloaded again but won't remove roms that are not scraped.
  -retries N
        Retry a rom N times on an error. (default 2)
  -rom_dir directory
        The directory containing the roms file to process. (default ".")
  -rom_path path
        The path to use for roms in gamelist.xml. (default ".")
  -scrape_all
        If true, scrape all systems listed in es_systems.cfg. All dir/path flags will be ignored.
  -skip_check
        Skip the check if thegamesdb.net is up.
  -start_pprof
        If true, start the pprof service used to profile the application.
  -strip_unicode
        If true, remove all non-ascii characters. (default true)
  -thumb_only
        Download the thumbnail for both the image and thumb (faster).
  -thumb_suffix suffix
        The suffix added after rom name when creating thumb files. (default "-thumb")
  -use_filename
        If true, use the filename minus the extension as the game title in xml.
  -use_gdb
        Use the hash.csv and theGamesDB metadata. (default true)
  -use_nointro_name
        Use the name in the No-Intro DB instead of the one in the GDB. (default true)
  -use_ovgdb
        Use the OpenVGDB if the hash isn't in hash.csv.
  -version
        Print the release version and exit.
  -workers N
        Use N worker threads to process roms. (default 1)
-
AFAIK the scraper has support for scummvm games - https://github.com/sselph/scraper/issues/37
How are you scraping: via the command line or our scraper module? What version of the scraper?
-
If they aren't on sheet two here:
https://docs.google.com/spreadsheets/d/1HUj7nnDsJU2kpB_pFkzzHbO8rpwsdLQivZtHbPO3xTE/edit?usp=sharing
and in thegamesdb online, you'll have to add them to thegamesdb and the sheet, then notify sselph to update the scraper with your changes.
-
@BuZz said in Scraping ScummVM games:
AFAIK the scraper has support for scummvm games - https://github.com/sselph/scraper/issues/37
How are you scraping: via the command line or our scraper module? What version of the scraper?
I changed directory with cd RetroPie/roms/scummvm and just ran scraper with no additional options. The version I downloaded is v1.0.10. I also tried it through the menu, but it found zero games that way (with just scummvm checked for systems).
-
@herb_fargus said in Scraping ScummVM games:
If they aren't on sheet two here:
https://docs.google.com/spreadsheets/d/1HUj7nnDsJU2kpB_pFkzzHbO8rpwsdLQivZtHbPO3xTE/edit?usp=sharing
and in thegamesdb online, you'll have to add them to thegamesdb and the sheet, then notify sselph to update the scraper with your changes.
Ok, which file are you using for the hash checksum? That way I will add the Sierra games I have.
-
If posting output to the forum, please style it for easier reading - http://commonmark.org/help/
-
@BuZz said in Scraping ScummVM games:
If posting output to the forum, please style it for easier reading - http://commonmark.org/help/
Sorry Buzz, I corrected it with better formatting. Is using a code block ok? I'm not great with some of my forum etiquette. ;)
-
Looks better. You can specify the language etc. I just put "text" in there, and it doesn't try to syntax highlight it now.
-
Ok, I had to edit the post to see how it was done; it's much better without all the highlights. I am all for being tidy, especially since threads that are clean are much easier to browse through as years of information accumulate.
-
The scraper does support scummvm. It does so by just using the file names, since those were generated by the system and are hardcoded in scummvm. My current list was generated by taking all the values listed in the source code of scummvm at the time, then letting people match those to thegamesdb.
If there are more changes that need to be made, feel free to open up a new issue or reopen the scummvm issue with any updates. Check out sheet 2 of https://docs.google.com/spreadsheets/d/1HUj7nnDsJU2kpB_pFkzzHbO8rpwsdLQivZtHbPO3xTE/edit#gid=1414861348 to see what data is expected.
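As a rough illustration of the filename-based matching described above, the sketch below strips the extension from each .svm file name and checks the result against a set of known short names. The `KNOWN_IDS` set and the `unmatched_svm` helper are hypothetical stand-ins for illustration only; this is not the scraper's actual code, and the real list is whatever sheet 2 contains.

```python
from pathlib import Path
import tempfile

# Hypothetical sample of ScummVM short names; the real list lives in sheet 2.
KNOWN_IDS = {"monkey", "atlantis"}


def unmatched_svm(rom_dir, known_ids):
    """Return the sorted short names of .svm files that are not in known_ids."""
    names = (p.stem.lower() for p in Path(rom_dir).glob("*.svm"))
    return sorted(n for n in names if n not in known_ids)


# Demo with empty placeholder files, like the zero-byte .svm files in this thread.
with tempfile.TemporaryDirectory() as d:
    for name in ("kq1.svm", "kq2.svm", "monkey.svm"):
        (Path(d) / name).touch()
    missing = unmatched_svm(d, KNOWN_IDS)
    print(missing)
```

Pointing a check like this at the roms/scummvm directory, with the short names copied from sheet 2, would give a list of names worth reporting in the scummvm issue.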
-
Hey,
I tried my luck with the scraper. It works on most systems, but on ScummVM (even though I checked that my filenames match the ones in sheet 2) I get the following error:
XML syntax error on line 22: element <meta> closed by </head>
Any hints on where to look for the problem? Thanks
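One possible reading of that message, offered as an assumption rather than a diagnosis: a strict XML parser was handed an HTML page (for example an error page) instead of the XML it expected. HTML allows an unclosed `<meta>` tag, while XML does not. The sketch below reproduces the same class of error with Python's standard XML parser; the `sample` string is made up for illustration, and Python words its error differently from the scraper.

```python
import xml.etree.ElementTree as ET

# Made-up HTML snippet: <meta> is never closed, which is legal HTML but not XML.
sample = '<html><head><meta charset="utf-8"></head><body></body></html>'

try:
    ET.fromstring(sample)
    error = None
except ET.ParseError as exc:
    error = str(exc)

print(error)
```

If this reading is right, the thing to check would be what the remote service is returning at that moment, not the .svm file names.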