[SOFT] Universal XML Scraper V2 - Easy Scrape with High Quality picture
-
@screech: I've come across some oddness when scraping NeoGeo.
I have a folder with ~130 NeoGeo roms, and when I scrape them about 45 of them come back as "2020 Super Baseball". They also download the images for 2020 Super Baseball.
You can see a copy of the gamelist.xml here.
I've had a look at a few of the affected titles on screenscraper.fr and they all seem to have correct data and images, so I don't know what could be causing it.
I've deleted the gamelist.xml and the images and tried scraping again, but the same thing happened again to the same games.
Any thoughts?
-
wow... that's really weird :S
Can you send me your log : screech[@]free.fr ? Just to check ?
(My neogeo folder scrapes well ^^)
For information: 2.2.0.2 is out:
https://github.com/Universal-Rom-Tools/Universal-XML-Scraper/releases/tag/2.2.0.2
Corrected:
- Full Scrape no longer runs forever.
- SSH kill now works when a scrape is requested.
- A date with only a year now works (the date becomes 'Year/01/01').
- When adding a missing ROM, the name was "(Clone, Beta, Demo, ...)". It's OK now ;)
- When you force JPG or PNG, video stays in MP4.
Modified:
- New progress bar that changes color (Green = OK, Red = timeout or download error, Yellow = not found)
Added:
- New: in case UXS hangs (it never happens ^^), when you relaunch a scrape it will ask whether you want to generate a gamelist.xml from the temporary files it found.
- A new shortcut, 'Silent_UXS', is created at first launch. You can now run UXS silently ;)
- New visual info about engine use: check boxes, one per thread number, show whether each engine is in use.
- 2 new Advanced menu entries: Reset Autoconfiguration Path and Alt Autoconfiguration Path (RetroPie only; it sets the ROM folder path)
- New element type: RomExcluded
Example:
<Element Type="RomExcluded">
    <Source_Type>Variable_Value</Source_Type>
    <Source_Value>%AutoExclude%</Source_Value>
    <AutoExcludeEXT>bin|img|iso|ccd|sub</AutoExcludeEXT>
    <AutoExcludeValue>(Track |[Bios]|(Bios)</AutoExcludeValue>
</Element>
Meaning:
When a duplicate file name (ignoring the extension) is found:
- it checks the extension. If it is in the list (bin|img|iso|ccd|sub), it won't scrape the file
- it checks whether the file name contains one of the values ("(Track ", "[Bios]", "(Bios)"). If yes, it won't scrape the file
-
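One possible reading of the RomExcluded rule above, sketched in Python (UXS itself is written in AutoIt, so this is not its actual code). The function name `files_to_skip` is hypothetical, and comparing extensions case-insensitively is an assumption; the two pipe-separated lists are copied from the example element.

```python
from collections import Counter
from pathlib import PurePath

# Values copied from the <AutoExcludeEXT> and <AutoExcludeValue> example.
AUTO_EXCLUDE_EXT = "bin|img|iso|ccd|sub"
AUTO_EXCLUDE_VALUE = "(Track |[Bios]|(Bios)"


def files_to_skip(filenames, exclude_ext=AUTO_EXCLUDE_EXT,
                  exclude_value=AUTO_EXCLUDE_VALUE):
    """Return the file names this reading of the rule would not scrape."""
    extensions = {ext.lower() for ext in exclude_ext.split("|")}
    markers = exclude_value.split("|")
    # Count how many files share each base name (name without extension).
    stems = Counter(PurePath(name).stem for name in filenames)
    skipped = []
    for name in filenames:
        path = PurePath(name)
        if stems[path.stem] < 2:
            continue  # not a duplicate name, so the rule does not apply
        ext = path.suffix.lstrip(".").lower()  # assumption: case-insensitive
        if ext in extensions or any(marker in path.name for marker in markers):
            skipped.append(name)
    return skipped


if __name__ == "__main__":
    # Hypothetical folder content, for illustration only.
    print(files_to_skip(["Game.cue", "Game.bin", "Other.iso"]))
```

With this reading, "Game.bin" is skipped because "Game.cue" shares its base name, while "Other.iso" is still scraped because nothing else is named "Other".
-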
@screech I used to open this file using the source code zip download by running "Universal_XML_Scraper64.exe" included in the 2.1.0.6 download. I noticed that file is no longer included in the source download and when I try to run either "Scraper.exe" or "Scraper 64.exe" nothing happens.
How do I open the Scraper?
-
@screech said in [SOFT] Universal XML Scraper V2 - Easy Scrape with High Quality picture:
https://github.com/Universal-Rom-Tools/Universal-XML-Scraper/releases/tag/2.2.0.2
Just download Universal_XML_Scraper.exe or Universal_XML_Scraper64.exe, that's all. The other files are the source code...
And scraper.exe and scraper64.exe are the scrape engine; they can't work standalone...
-
@screech Thanks. For some reason I can't get those to download because my browsers detect a 'virus.' I'll just have to disable it. I think you're aware of this and it was covered previously in the thread.
-
Yep :( sorry for that :( The language I use (AutoIt) is well known to antivirus software because in the past a lot of people created malicious software with it...
So every AV blocks every AutoIt program...
I contacted some of the main AV companies to tell them it's a false positive.
Some responded, and it's OK now with some of them...
Some never answered... And the false positive is still there :'(
So you need to "accept" UXS in your AV. Sorry...
(In case of doubt about malicious code, all sources are on GitHub ^^)
-
@screech Ya. I sent it to Microsoft alerting a false positive as their Defender AV is picking it up and really not even letting me open it in any fashion. Really not even download it. I added the download URL to the exclusion list, was able to download it to Edge's temp folder. Would it be possible to include the .exe with the source code .zip in the future? Seemed to circumvent the issue.
-
@screech said in [SOFT] Universal XML Scraper V2 - Easy Scrape with High Quality picture:
wow... that's really weird :S
Can you send me your log : screech[@]free.fr ? Just to check ?
(My neogeo folder scrapes well ^^)
I'm at work at the moment, so it will have to wait until later tonight. I might try scraping with a different version (maybe the new one) and see if that makes a difference.
On that note, congrats on getting the next version out. I know I bug you a bit, but UXS is seriously one of the best programs I use on a regular basis.
-
@screech: I emailed the log files to you.
-
Great ^^
I just received and studied your logs...
What I see :
Your hashes weren't found in the DB...
So UXS tries to find a matching name, and can't...
But you have the "In Zip" option activated, so it unzips the file and tries to match at least 1 of the files in the ZIP.
And it can!!! But the problem on Neogeo is that some EPROMs are exactly the same from one game to another.
Like "000-lo.lo"... (I think it's a Neogeo BIOS file)
It finds this file every time, and UXS/Screenscraper API takes the first match every time :(
That explains your duplicates :'(
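A toy model of that behaviour in Python, only to show why a shared file plus "take the first match" gives the same game every time. The index, the function name `first_match`, and the "Game B"/"Game C" entries are made up for illustration; the real lookup goes through the Screenscraper API and may key on hashes instead of file names.

```python
# Toy index: file found inside a ZIP -> games known to contain that file.
# Entries are hypothetical; the point is that "000-lo.lo" is shared.
INDEX = {
    "000-lo.lo": ["2020 Super Baseball", "Game B", "Game C"],
    "game-b.p1": ["Game B"],
    "game-c.p1": ["Game C"],
}


def first_match(files_in_zip, index=INDEX):
    """Return the first game matched by any file in the archive, or None."""
    for name in files_in_zip:
        games = index.get(name)
        if games:
            return games[0]  # first match wins, even when the file is shared
    return None


if __name__ == "__main__":
    # Two different (hypothetical) archives that both contain the shared file.
    print(first_match(["000-lo.lo", "game-b.p1"]))
    print(first_match(["000-lo.lo", "game-c.p1"]))
```

Both archives resolve to the same title in this model because the shared file is checked first, which is the pattern seen in the gamelist.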
My advice:
In GENERAL OPTION, try to:
- uncheck "Auto select your system"
- uncheck "Unzip File"
- select "2 - Filename" in Research Mode
Then when you launch the scrape it will ask you which system it is.
Select "Neogeo".
If you haven't renamed the files (or they have standard names), as I see in your log, it will work ;)
-
@screech there are some psx games that the scraper can't find, I even changed their names so they match the name in screenscraper
-
@kbronctjr screenscraper can be set to use hashes. Did you try that? It may be the default actually.
-
@kbronctjr Have you tried to find them using the built-in scraper or the sselph scraper?
-
@AlexMurphy I don't think I can use hashes, as most of my games are compressed in pbp format
And yeah I had all games scraped with the builtin scraper, but I wanted to get the collages this scraper provides and also the games description in my language
-
I hate to say this, but sometimes it is just easier to manually enter the ones that the scrapers don't find. They are not always 100% accurate and complete. It is pretty easy to find the videos/images you need for the roms it misses....that is unless you are talking about 100's of misses!
-
@kbronctjr Have you tried? I don't think it caused me any issues. I def have images, the 3 image mix from screenscraper, for .pbp and I only use .pbp for PSX.
-
The problem is the MIX images. I wish the program included a tool to create those mixes using our own images. In that case I could manually get the images from screenscraper and use them as input for this tool.
I'm gonna try in hash mode just in case
-
@TMNTturtlguy Yeah, I like making my own images too.
-
@kbronctjr Well, there are solutions for that as well. Again not ideal, but if you have access to Photoshop it is very easy to do. If you don't have Photoshop, there is a program called GIMP, which is basically a free version of Photoshop but not as user friendly. You can collect the images and simply layer them in GIMP and save. It takes about 5 minutes to do each one. Not fun, but if you are missing 5 roms, it is nice to be able to complete your set.
-
@TMNTturtlguy If you want something easy to use (GIMP can be a bit intimidating), try https://pixlr.com/editor/