Versatile C++ game scraper: Skyscraper
-
@Used2BeRX: The reason ScreenScraper exists is simple ^^ We want to share our work collecting media and data in every language and country we can ;) and with the help of the community we now have a great DB open to everyone who wants to scrape ;) (again, as long as it's free and open source).
That's the power of an open community project ;) Great work by and for everyone, thanks to all of you ;)
And muldjord's work is a part of this ;) so there's no reason to refuse his software. Even more, it's a good reason to help him ;) If these simple words can help other "DB owners" share their work for this project, that's great. I have no connection with other scraping DBs (I know, it's a shame), but if some of you do, don't hesitate to send them a small message (with kindness ;) no aggression).
(For your info, I just checked some stats: we get about 4 million requests a day (with an average load of 20-30% of the server resources), and Skyscraper generates only 1 or 2K... so don't worry about overuse ;) )
-
Things are going rather well. I just got permission to use arcadedb as well, and even got a full description of their API. The only downside is that I am limited to 1 thread when using it, but that is of course completely ok! It's better than not having it at all.
So the list of permissions is growing and it feels good to do it the right way now. Looking forward to getting Skyscraper back online once all permissions are settled and I have coded the new API connections.
-
From reading this thread, I can't believe how far you got by scraping and not using APIs (hats off to you for perseverance though :)).
I use the ScreenScraper API a lot and it's amazing (I use it to look up MD5, CRC and SHA-1 hashes, then write back to the DB to help fill in missing info). You will have a much simpler time building your app when you are using all the APIs from the sites. Most of them are really cool about giving out dev access too :).
Most of the ScreenScraper API doco is in French, so drop me a line if you have any questions. I might be able to help answer them for you; it took me a while to work it out ;).
If you pass in SS login details (the end user's, not yours), the SS API will allow you to have as many threads as the user is entitled to. If the user makes a one-off donation to SS they get something like 5 threads.
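For anyone wondering what the checksum side of such a lookup might involve, here is a minimal C++ sketch of a CRC-32 calculation. It assumes the "CRC" mentioned above is the common CRC-32 variant used by zip files; the function name `crc32Of` is made up for illustration, and nothing here is taken from the ScreenScraper API or from Skyscraper's source.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: bitwise CRC-32 (reflected polynomial 0xEDB88320),
// the variant commonly used by zip. Assumed, not confirmed, to be the
// "CRC" referred to in the post above.
std::uint32_t crc32Of(const std::string &data)
{
  std::uint32_t crc = 0xFFFFFFFFu;
  for (unsigned char byte : data) {
    crc ^= byte;
    for (int bit = 0; bit < 8; ++bit) {
      // The mask is all ones when the lowest bit is set, otherwise zero
      const std::uint32_t mask = 0u - (crc & 1u);
      crc = (crc >> 1) ^ (0xEDB88320u & mask);
    }
  }
  return crc ^ 0xFFFFFFFFu;
}
```

In practice the bytes would come from the rom file rather than a string literal, and a table-driven version would be faster for large files.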
-
@bladehunter Can you give me names of other sites that provide APIs that could be useful to integrate into Skyscraper? APIs are, of course, always preferred. I currently use the official APIs for ScreenScraper and TheGamesDB and I am awaiting details for MobyGames.
-
@muldjord said in Versatile C++ game scraper: Skyscraper:
@bladehunter Can you give me names of other sites that provide APIs that could be useful to integrate into Skyscraper? APIs are, of course, always preferred. I currently use the official APIs for ScreenScraper and TheGamesDB and I am awaiting details for MobyGames.
Apologies, I think I gave the wrong impression. I only have experience with SS (ScreenScraper). I didn't bother integrating with any others, as my goal was always to find one and help build on it (SS being the obvious choice with RetroPie and UXS / sselph's scraper). I built tools that allow me to enhance the SS database with what I can get from other sources. If you are already up and running with the SS API then that's the extent of my experience :).
Keen to chat anyway, we may have some ideas that can help each other. Feel free to drop me a line if you like :).
-
Skyscraper 2.0.0 released: https://github.com/muldjord/skyscraper
- Back to basics: Removed several web sources. Now only allows the ones I have explicit permission to use.
- Properly implemented official API for 'arcadedb' module
- Added scraping module info to output per result but only when using '--verbose'
- Added check for unreasonably bad scraping runs, making Skyscraper exit if 30 of 30 files miss from the get-go
We're back in business! As promised Skyscraper has been brought back online. I have removed most of the scraping modules. They will be put back in if I get permission to use them.
Please let me know if you run into problems with this release. I've only done some minor testing myself and everything seems to work well. :)
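As a rough illustration of the "unreasonably bad scraping run" check from the release notes above, here is a minimal C++ sketch. The class name `BadRunGuard` and its interface are hypothetical and not taken from Skyscraper's source; only the "30 of 30 files miss from the get-go" rule comes from the changelog.

```cpp
#include <cstddef>

// Hypothetical tracker: signals an abort when the first N files of a run
// all fail to match (N = 30 in the changelog above).
class BadRunGuard
{
public:
  explicit BadRunGuard(std::size_t threshold = 30) : threshold_(threshold) {}

  // Record one scraping result. Returns true exactly when the threshold
  // has just been reached without a single hit, so the caller can exit.
  bool record(bool found)
  {
    ++processed_;
    if (found) {
      ++hits_;
    }
    return processed_ == threshold_ && hits_ == 0;
  }

private:
  std::size_t threshold_;
  std::size_t processed_ = 0;
  std::size_t hits_ = 0;
};
```

A scraping loop would call `record()` once per file and stop as soon as it returns true. A single hit among the first 30 files is enough to keep the run going.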
-
@muldjord Nice to see you back in business. Will try it out asap!
-
Currently implementing region support for the 'screenscraper' module. Users can define the region with '-r' in the next release. For instance '-r jp' or '-r de'. The module will then make use of this and use the correct rom name, date, description and so on. It will fall back to 'wor' and then 'us' if the specified region doesn't exist in the data. Still trying to figure out if I should also let the user define language, since region and language aren't the same thing.
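The fallback order described above could be sketched roughly like this in C++. The function name `pickRegionalValue` and the map-based data layout are assumptions made for illustration; they are not taken from Skyscraper's source or from the ScreenScraper API.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: given values keyed by region code (e.g. game names),
// return the one for the user's region, falling back to 'wor' and then 'us'.
std::string pickRegionalValue(const std::map<std::string, std::string> &valuesByRegion,
                              const std::string &userRegion)
{
  const std::vector<std::string> priority = {userRegion, "wor", "us"};
  for (const auto &region : priority) {
    const auto it = valuesByRegion.find(region);
    if (it != valuesByRegion.end() && !it->second.empty()) {
      return it->second;
    }
  }
  return std::string(); // Nothing usable for any region in the chain
}
```

With entries for 'jp', 'wor' and 'us', a request for 'jp' gets the Japanese entry, while a request for 'de' falls through to the 'wor' entry.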
-
@muldjord Is there a way I can private message you or something?
-
@BladeHunter I prefer just to have it on here, so I can check it when I have the time. I have a million projects going on at the same time, and having private messages pop up between them sort of muddles my workflow. :D Unfortunately I don't see any private messaging functionality on these forums.
-
@muldjord Yeah, I'm surprised there is no PM function. I really don't want to post my comment on here (you will understand why when I talk to you). Can you drop me a line at ****@.com please? I don't need anything from you, but I have a comment which will help you :) If you want me to post my comment here I will, but I'm not overly comfortable with it.
-
@bladehunter said in Versatile C++ game scraper: Skyscraper:
...can you drop me a line at ****@.com please?
Man... take that down lol. 8 hours of bots just got your email. I hope it was a junk account.
I second the motion to have a PM feature on this board. That would be pretty sweet. I thought there might have been one but I just hadn't figured out how to use it yet. :)
-
@used2berx there will never be a PM feature on this forum. All it would end up being is people sharing romsites.
-
@herb_fargus Yeah. You're probably right. ;)
Kind of sucks that it's hard to collaborate with people you meet here on things without putting your email address up for the world to see though.
Too bad there isn't an in-between solution. Unless there is? Do you have any suggestions on that?
-
@used2berx collaboration on code is best done on github imo
-
@herb_fargus Cool. I have an account there but haven't looked into it much. Thanks.
-
@used2berx said in Versatile C++ game scraper: Skyscraper:
@bladehunter said in Versatile C++ game scraper: Skyscraper:
...can you drop me a line at ****@.com please?
Man... take that down lol. 8 hours of bots just got your email. I hope it was a junk account.
I second the motion to have a PM feature on this board. That would be pretty sweet. I thought there might have been one but I just hadn't figured out how to use it yet. :)
Hehehe it's my junk account :). I have taken mine out but it's still quoted in muldjord's post. It's not a big issue for me :).
-
@BladeHunter Censored :)