Emulationstation scraping - TheGamesDB.net API change
-
Due to a recent change in the TheGamesDB.net infrastructure, the built-in Emulationstation scraper can no longer download images. This has been reported here:
- https://retropie.org.uk/forum/topic/17912/cannot-scrape-any-images-at-all-with-built-in-scraper-retropie-4-4-1
- https://retropie.org.uk/forum/topic/17911/scrapper-downloading-an-html-file-instead-of-image/ (closed as the poster found the next topic)
- https://retropie.org.uk/forum/topic/17904/emulationstation-scraper-saving-html-file-as-image
and also on the Recalbox forums (https://forum.recalbox.com/topic/14242/erreur-saving-resized-image-lors-du-scrape-via-recalbox/) and on Reddit.
The problem seems to be that the API server returns an incorrect base URL for the image download paths. The scraper correctly requests that URL, but instead of an image it receives a redirect response (302) with a small HTML body containing the redirect message, which it does not follow.
Sample requests
1. The scraper requests the game info from http://thegamesdb.net/api/GetGame.php?id=2 and receives:
    <Data>
      <baseImgUrl>http://thegamesdb.net/banners/</baseImgUrl>
      <Game>
        <id>2</id>
        <GameTitle>Crysis</GameTitle>
        <PlatformId>1</PlatformId>
        <Platform>PC</Platform>
        <ReleaseDate>11/13/2007</ReleaseDate>
        ....
        <Images>
          <fanart>
            <original width="1920" height="1080">fanart/original/2-1.jpg</original>
            <thumb>fanart/thumb/2-1.jpg</thumb>
          </fanart>
          <fanart> [...]
    </Data>
2. The scraper then tries to download an image from http://thegamesdb.net/banners/fanart/original/2-1.jpg, but receives a redirect:
    HTTP request sent, awaiting response... 302 Found
    Location: http://legacy.thegamesdb.net/banners/fanart/original/2-1.jpg
   It saves the response as a .jpg, but the file is actually just a small HTML file with the redirection message.
3. When trying to resize the image, an error is presented, similar to the one shown in the first topic mentioned.

I'm going to page @pjft, @jdrassa for their opinion on how to handle the breakage. I've done a few tests by modifying the scraper's libcurl requests to handle redirects, and this seems to fix the problem, but I'm not sure whether we should report this upstream (TheGamesDB.net), wait for their infrastructure to settle down, or fix it in ES.
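To illustrate steps 2 and 3 above: one way a client could notice that a saved "image" is really an HTML page is to check the file signature before trying to resize it. This is only a minimal sketch and not what ES does; looksLikeImage is a hypothetical helper, and it only knows the JPEG and PNG signatures.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of EmulationStation): returns true when the
// buffer starts with a JPEG or PNG file signature, false otherwise.
bool looksLikeImage(const std::string& data)
{
    // JPEG files start with FF D8 FF; PNG files start with an 8-byte signature.
    static const unsigned char jpeg[] = {0xFF, 0xD8, 0xFF};
    static const unsigned char png[]  = {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

    // Compare the first `len` bytes of the buffer against a signature.
    auto startsWith = [&data](const unsigned char* sig, std::size_t len) {
        if (data.size() < len) return false;
        for (std::size_t i = 0; i < len; ++i) {
            if (static_cast<unsigned char>(data[i]) != sig[i]) return false;
        }
        return true;
    };

    return startsWith(jpeg, sizeof(jpeg)) || startsWith(png, sizeof(png));
}
```

An HTML redirect page such as the one described in step 2 starts with text, not with either signature, so it would be rejected before the resize step.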
-
@mitu This is just my humble opinion. I would remove the internal scraper completely and just call out to a scraper like sselph's scraper or @muldjord's Skyscraper.... Just my opinion ;)
+1 for tagging the right guys ;) They will surely help :D
-
@mitu Thanks for tagging me. I have a PR in to handle the redirect. I am going to post over at their forums as well, in case we can get them to fix it on their end too.
-
@jdrassa Thanks for taking the time to go over it. Your PR is similar to what I've tried. I've also added (via curl_easy_setopt) a max limit on the number of redirects and restricted redirects to http(s) only.
-
@mitu said in Emulationstation scraping - TheGamesDB.net API change:
I've also added (via curl_easy_setopt) a max limit on the number of redirects and restricted redirects to http(s) only.
That's a good idea. I have updated the PR. Thanks.
-
Looks to be back in business as of now. Thanks guys for getting this fixed for us. 👍