Versatile C++ game scraper: Skyscraper
-
I've been reading through the docs and haven't found my answer (or if it is possible)...
I like to save all of my media to a single folder using file suffixes (like sselph scraper does). For example, in
/roms/arcade/images/
I would have:
zaxxon-image.png
zaxxon-video.mp4
zaxxon-marquee.png
zaxxon-launching.png
Is this possible with Skyscraper?
TIA.
John
-
@johnodon Hi John, You can change the output folder with the option described here, but not the names of the media files themselves.
-
Hi! First things first - I'm impressed with your work @muldjord and I really like Skyscraper a lot! I was looking for a way to handle one specific use case and was advised to post in this thread by @mitu (thank you). I would like to propose a feature suggestion for Skyscraper. I detailed my use case in this thread.
The gist of it is a feature where one could provide a list of files that Skyscraper should avoid scraping completely, i.e. some kind of "unallowed list". This is based on the fact that there are roms in the collection I know will not render any results, and I would like to avoid hammering the online services for these "known failures".
I realize after the replies in the thread that one could use the --excludefiles option. The caveat there is that one would have to assemble (and maintain!) a list of files manually into a string that is passed to the option (or, as also suggested in the thread - change the file names to match a certain pattern for files one would like to avoid).
It would be really neat if one could for example pass a "skipped-<platform>-<source>.txt"-list to Skyscraper when instructing it to scrape for a platform. Perhaps the --excludefiles option could be extended to also take such a file as input?
-
@tomfury Hi, thanks for the suggestion. I've implemented this in my local development version now. I just need to test it some more, then I will release it.
It will work by using either the CLI parameter
--excludefrom FILENAME
or by setting
excludeFrom="FILENAME"
in the config.ini [main] section or a [PLATFORM] section. The FILENAME must contain a list of filenames with full path.
I will post here when it is ready.
-
Skyscraper 3.7.0 out now: https://github.com/muldjord/skyscraper
- Moved '--fromfile' option to '--includefrom'. '--fromfile' still works, but is considered deprecated
- Moved '--includefiles' option to '--includepattern'. '--includefiles' still works, but is considered deprecated
- Moved '--excludefiles' option to '--excludepattern'. '--excludefiles' still works, but is considered deprecated
- Added '--excludefrom' option similar to '--includefrom' only the opposite (Thank you to user 'TomFury' for suggesting this)
- Skyscraper will now ignore any subfolders within the input folder where a file called '.skyscraperignore' is found (Thank you to user 'sromeroi' for suggesting this)
- Added platform 'easyrpg', only usable using the 'screenscraper' scraping module (Thank you to user 'zerojay' for suggesting this)
Documentation for the CLI versions of the new options is here. config.ini documentation is here.
If you run into issues, please let me know. Thanks! :)
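For illustration only, the '.skyscraperignore' rule from the list above can be sketched in Python. This is not Skyscraper's own code (Skyscraper is written in C++). The function name scrape_candidates is hypothetical, and it is an assumption here that folders nested below an ignored subfolder are skipped as well.

```python
import os


def scrape_candidates(input_dir, marker=".skyscraperignore"):
    """Yield file paths under input_dir, skipping ignored subfolders.

    A subfolder counts as ignored when it contains a file named `marker`.
    Assumption: everything below an ignored subfolder is skipped too.
    """
    for root, dirs, files in os.walk(input_dir):
        if root != input_dir and marker in files:
            dirs[:] = []  # prune: do not descend into this subfolder
            continue
        for name in sorted(files):
            yield os.path.join(root, name)
```

Dropping an empty '.skyscraperignore' file into a subfolder is then enough to keep its contents out of the candidate list.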
-
@muldjord Thank you so much! This is a great add-on. I have tested it on version 3.7.2 and it looks like it works! I kind of had the expectation that one could use the generated "skip-list" as an input, but I think using the full path works just as well (and also provides more control).
For my use case the only missing step now is to be able to convert the skipped-<platform>-<scraper>.txt files from a scraping session to an input file compatible with the --excludefrom flag, i.e. for every row change:
'<rom name>', No returned matches
... to:
/full/path/to/<rom name>.<extension>
... and populate my "final" list for excluded files. Perhaps that would be too much to ask for Skyscraper to handle. My use case might be a bit odd, and in that case it's probably better to create an external adapter-script that will do that kind of conversion for me. I mean, I guess most users want to have total control of a list which is used as input instead of relying on a list of unsuccessful scrapes (which might have failed for many reasons; not only that the rom is missing from the database).
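An external adapter script of the kind described above could look roughly like this. It is a minimal sketch, assuming every row in the skipped file has the form '<rom name>', No returned matches and that the quoted name equals the rom's file name without its extension. The function name skipped_to_exclude is hypothetical and not part of Skyscraper.

```python
import re
from pathlib import Path

# Matches rows such as: 'Some Game (USA)', No returned matches
ROW = re.compile(r"^'(?P<name>.*)', No returned matches$")


def skipped_to_exclude(skipped_lines, rom_dir):
    """Return full paths of the files in rom_dir named by the skipped rows.

    Assumption: the quoted rom name is the file name minus its extension.
    Rows that do not match the expected format are silently ignored.
    """
    by_stem = {}
    for path in sorted(Path(rom_dir).iterdir()):
        if path.is_file():
            by_stem.setdefault(path.stem, []).append(str(path.resolve()))

    result = []
    for line in skipped_lines:
        match = ROW.match(line.strip())
        if match:
            result.extend(by_stem.get(match.group("name"), []))
    return result
```

Writing the returned paths to a text file, one per line, should give a list in the format that --excludefrom expects (filenames with full path).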
-
@muldjord said in Versatile C++ game scraper: Skyscraper:
- Moved '--fromfile' option to '--includefrom'. '--fromfile' still works, but is considered deprecated
When I use this option with a relative path, no matter where from, it is expanded relative to ~/.skyscraper and not to {cwd}. This makes it impossible to use tab-completion unless I: 1) use an absolute path, or 2) pushd or cd into the .skyscraper dir first.
(It's not new, --fromfile used to do this also.)
Is this by design?
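For versions that still resolve the path against ~/.skyscraper, one workaround when launching Skyscraper from a script is to turn the relative path into an absolute one first. A minimal sketch, assuming a POSIX system; the helper name includefrom_args is hypothetical:

```python
import os


def includefrom_args(list_file, cwd=None):
    """Build the --includefrom arguments with an absolute path.

    A relative list_file is resolved against cwd (default: the current
    working directory) instead of being left for Skyscraper to expand.
    """
    base = cwd if cwd is not None else os.getcwd()
    absolute = os.path.abspath(os.path.join(base, list_file))
    return ["--includefrom", absolute]
```

The returned list can be appended to the rest of the command line, for example when calling Skyscraper through subprocess.run.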
-
@sleve_mcdichael said in Versatile C++ game scraper: Skyscraper:
Is this by design?
Well, no, not really. And I can see how that is confusing. I'll change it.
@TomFury That was actually the idea and I thought I checked the skipped files, but I probably checked the report files instead. I think I will change the skipped files to include filenames instead and cross my fingers people won't miss the old format.
EDIT: 3.7.4 now out with both of these things fixed.
-
@muldjord said in Versatile C++ game scraper: Skyscraper:
@TomFury That was actually the idea and I thought I checked the skipped files, but I probably checked the report files instead. I think I will change the skipped files to include filenames instead and cross my fingers people won't miss the old format.
EDIT: 3.7.4 now out with both of these things fixed.
Awesome! I will give it a try in the next couple of days :-).
-
@muldjord Thanks for the update! It's great that you still find the time to work on this. 👍
-
Decided to remove the code that replaces : in the Pegasus command string. I can't remember why I put it in there, but things seem to work without it. So maybe I put it in there before I added the newer Pegasus parser (which seems to work with the : in place).
So 3.7.5 is released, where : is now allowed in the Pegasus launch command. It's been requested a few times.
-
@muldjord most of the time I am scraping only new titles, and I want videos, so I have videos="true" enabled in my config.ini so I don't have to type the flag every time.
Let's say I want to re-scrape the screenshots for a bunch of existing titles, so I want to disable videos only temporarily, for one run only. I see I can use
--flags nocovers,nowheels,nomarquees
to skip those assets but I don't see a way to negate the videos without editing my config. Is there something I've overlooked?
[Edit]: ...also, is there a way to output something besides the "screenshot" to the gamelist <image> tag? If I just want the box art, for example. I guess I can output to the screenshot "type" with the cover "resource":
<output type="screenshot" resource="cover"/>
Is this the right/only way?
-
but I don't see a way to negate the videos without editing my config. Is there something I've overlooked?
You can set videos="true" for each individual platform where you want videos, and leave it unset for the ones where you only want artwork.
Example:
[nes]
videos="true"
also, is there a way to output something besides the "screenshot" to the gamelist <image> tag?
Yes, you can set any output node in artwork.xml to substitute an export type to use a different artwork source.
Example:
<output type="screenshot" resource="wheel" mpixels="0.1" width="640" height="400">
The above will export a screenshot but use the wheel artwork as source for it.
-
@muldjord said in Versatile C++ game scraper: Skyscraper:
You can set videos="true" for all individual platforms that you want to use videos. And then not set it for the ones you want to use artwork.
It's not specific to a certain platform. Here's my use case:
I have both images and videos for each title; the video plays in the gamelist and the image is shown on launch.
Yesterday, I wanted to scrape videos. Tomorrow, I will want to scrape videos. In the future when I add new games, I will want to scrape videos.
Today, I want to re-scrape some of my existing titles to get a better screenshot for the image. But I don't need to re-scrape all the videos or other image assets again, I just want the screenshots.
To do that, I edit my config.ini, disable videos, and then save the file. Then I scrape with
--cache refresh --flags nocovers,nowheels,nomarquees
to re-download the screenshots (and textual data) only. Then, once I have all the new screenshots that I want, I edit the config again and re-enable the videos so that when I add more new games again tomorrow, they have videos.
Granted, this isn't something I'll need to do very often, it just would be convenient-er if there was a novideos flag as well instead of editing the config.ini twice.
Yes, you can set any output node in artwork.xml to substitute an export type to use a different artwork source.
Example:
<output type="screenshot" resource="wheel" mpixels="0.1" width="640" height="400">
The above will export a screenshot but use the wheel artwork as source for it.
Well, that was my question. It has to be called "screenshot", then? So I can't, for example, output a cover and a screenshot, and use the one called 'cover' in the gamelist.
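The edit, scrape, restore routine for videos described above could be scripted instead of done by hand. The following is a minimal sketch of one possible approach, not a Skyscraper feature: it produces a copy of the config text with every active videos="true" line switched to "false", leaving commented-out lines alone. The function name without_videos is hypothetical.

```python
import re

# An active (uncommented) videos="true" line, with optional indentation.
VIDEOS_LINE = re.compile(r'^([ \t]*)videos[ \t]*=[ \t]*"true"[ \t]*$', re.MULTILINE)


def without_videos(config_text):
    """Return config_text with active videos="true" lines set to "false"."""
    return VIDEOS_LINE.sub(r'\1videos="false"', config_text)
```

The result can be saved as a separate file, so the regular config.ini is never touched. Whether your Skyscraper version can be pointed at an alternative config file for a single run is an assumption to verify against its CLI help.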
-
It just would be convenient-er if there was a novideos flag as well instead of editing the config.ini twice.
I do consider this a niche case, as you also mention yourself. Having both a videos and a novideos flag seems bloated and contradictory to me. I'll forgo it for the time being.
It has to be called "screenshot", then?
In the case of an EmulationStation gamelist generation, the screenshot is the image used in the <image> tag. You can export any resource to that tag by using my example. If that is not what you mean, I'm not sure I understand what you are describing.
-
@muldjord said in Versatile C++ game scraper: Skyscraper:
If that is not what you mean, I'm not sure I understand what you are describing.
I was describing something like:
<image>/home/pi/.emulationstation/downloaded_media/nes/covers/Castlevania (USA) (Rev 1).png</image>
...but I understand from your response, this is not possible and that I should use type="screenshot" resource="cover" in that situation instead.
New question: what should I expect to see with --flags interactive? In my composite art, using the default artwork.xml, many games show a "title" screenshot instead of a gameplay one. For example:
On www.screenscraper.fr, I can see that for this game there are three "title" screenshots and one regular.
What should I see when I use --flags interactive? I expected to be presented with a choice, perhaps something like:
1. Screenshot Title (Europe)
2. Screenshot Title (Japon)
3. Screenshot Title (Monde)
4. Screenshot (Monde)
...but when I run
Skyscraper -p nes -s screenscraper --flags interactive "/path/to/Castlevania (USA) (Rev 1).7z"
nothing different happens, it just processes the file and exits with no interaction. I even tried removing "unattendSkip" from my config.ini and it was the same. The config.ini I am using right now is:
[main]
## define paths
# inputFolder="/home/pi/RetroPie/roms"
gameListFolder="/home/pi/.emulationstation/gamelists"
mediaFolder="/home/pi/.emulationstation/downloaded_media"
# cacheFolder="/home/pi/.skyscraper/cache"
# importFolder="/home/pi/.skyscraper/import"
## cache settings
# cacheCovers="true"
# cacheScreenshots="true"
# cacheWheels="true"
cacheMarquees="false"
## video settings
#videos="true"
videoSizeLimit="42"
videoConvertCommand="videoconvert.sh %i %o"
videoConvertExtension="mp4"
symlink="true"
## generic
brackets="false"
subdirs="false"
theInFront="true"
#unattendSkip="true"
artworkXml="artwork-wide.xml"
maxLength="10000"
[arcade]
artworkXml="artwork-tall.xml"
[gb]
artworkXml="artwork-tall.xml"
## allow gbc titles in gb folder
# addExtensions="*.gbc"
[gba]
artworkXml="artwork-wide.xml"
[gbc]
artworkXml="artwork-tall.xml"
[genesis]
## allow sega cd titles in genesis folder
addExtensions="*.iso *.cue *.chd"
[pcengine]
## use 'tg16' folders for pce titles
inputFolder="/home/pi/RetroPie/roms/tg16"
gameListFolder="/home/pi/.emulationstation/gamelists/tg16"
mediaFolder="/home/pi/.emulationstation/downloaded_media/tg16"
cacheFolder="/home/pi/.skyscraper/cache/tg16"
importFolder="/home/pi/.skyscraper/import/tg16"
[screenscraper]
## userCreds="user:pass"
userCreds="REDACTED:REDACTED"
[esgamelist]
## import customized ES textual data
cacheRefresh="true"
unattend="true"
videos="false"
cacheCovers="false"
cacheScreenshots="false"
cacheWheels="false"
cacheMarquees="false"
[import]
cacheRefresh="true"
Is it because my rom is (USA) and the "Screenshot" isn't? But none of the "Screenshot Title"s are, either, and the one it is using appears to either be the "(Europe)" or "(Monde)" ("World") version.
-
@sleve_mcdichael said in Versatile C++ game scraper: Skyscraper:
New question: what should I expect to see with --flags interactive?
The interactive flag is meant for situations where a scraping module returns more than one result. screenscraper never does because it is checksum based. Only the name-search-based ones do, such as thegamesdb. The flag is not meant to let you choose between resources from a scraping module. That is not possible currently and probably won't be, sorry.
-
@muldjord ah, I misunderstood the point of it, then. Thanks.