RetroPie forum home

[SOFT] Universal XML Scraper V2 - Easy Scrape with High Quality picture

Projects and Themes
Tags: scrape, scraper, uxs
728 Posts 111 Posters 741.8k Views
  • S
    screech
    last edited by 14 Dec 2016, 10:03

    @hansolo77
    First of all: everything is free. We (the UXS and Screenscraper staff, like the RetroPie staff) work only for you and for free. Donations are simply rewarded with a few extra threads, to thank the people who help us pay for a new, more powerful dedicated server and increase the resources dedicated to users... (currently only one person, MarbleMad, pays for all the technical infrastructure). You can use the software and the database for free...

    Gathering data is hard work because of the quantity of data and media needed for a good database. And there aren't many of us working on it (about 10 people working hard on it, plus some "small" contributors).
    With more than 2,000,000 API requests per day (more than 300 unique IPs scraping per day), we just want a little "help", not much of your time.
    Like @vbs says: 1 validated contribution and you have 2 threads, 2 validated contributions and you have 3 threads... If every user contributed just 2 times a day (less than 1 minute), we would have more than 600 new entries per day, making a great DB for all of us... And I can say we are really far from that...

    Just a little translation of what vbs says:

    • 0 contributions: 1 thread max
    • 1 validated contribution: 2 threads max
    • 2-49 validated contributions: 3 threads max
    • 50-199 validated contributions: 4 threads max
    • 200-499 validated contributions: 5 threads max
    • 500-749 validated contributions: 6 threads max
    • 750-999 validated contributions: 7 threads max
    • more than 999 validated contributions: 8 threads max
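A minimal sketch of the tier list above as a lookup, in Python. The names `THREAD_TIERS` and `max_threads` are made up for illustration and are not part of UXS or the Screenscraper API; donation bonus threads are deliberately left out.

```python
# Tiers as listed in the post: (minimum validated contributions, max threads).
# Hypothetical names; this is not code from UXS or Screenscraper.
THREAD_TIERS = [
    (0, 1),
    (1, 2),
    (2, 3),
    (50, 4),
    (200, 5),
    (500, 6),
    (750, 7),
    (1000, 8),
]


def max_threads(validated_contributions: int) -> int:
    """Return the thread limit for a number of validated contributions."""
    if validated_contributions < 0:
        raise ValueError("contribution count cannot be negative")
    limit = 1
    # Tiers are sorted by minimum, so the last matching tier wins.
    for minimum, threads in THREAD_TIERS:
        if validated_contributions >= minimum:
            limit = threads
    return limit
```

For example, `max_threads(2)` and `max_threads(49)` both fall in the 2-49 tier and give 3.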

    If you want to help us pay for the server:

    • from 5 to 9 €/$: +1 bonus thread
    • more than 10 €/$: +5 bonus threads

    It's just a gift and a "thank you" to those who help us, not a way to buy threads. DON'T BUY threads; donate only if you want to (and can)...
    We would prefer that you help build a big, high-quality DB by submitting new data and media (that's why the "awards" are easier to get and bigger for DB participation).

    So, now, how it works:
    We don't want crap in the DB, so every submission is validated one by one by an admin or moderator. It takes a lot of time to check every contribution, but it's the DB's "proof of quality".
    When new submissions are validated, you must wait until around midnight (French time) to get your new threads (that is when the server does its job of calculating stats).

    I just checked some of your submissions:
    Hyper Black Bass '95 on Game Boy is not the same game as Black Bass: Lure Fishing, so I can't validate your submission.
    Soccer Manager on GBC is not the same game as Player Manager 2001, so I can't validate your submission.
    Sorry, we don't accept synopses like "A one player game published and developed by XXX in YYYY." It has no value (all that info is already in the DB; the synopses we want are "real" synopses).
    ....

    Don't forget you can click on the small flag (upper left corner) to change the website language.

    @mattrixk
    I need to update the Wiki (the info is outdated and is for V1 :S sorry).
    Can you send me your new "zip" so I can check what's going on?
    As I saw in the log you put on pastebin, there is a problem with the XML (I think a wrong tag or something like that).

    • H
      hansolo77
      last edited by 14 Dec 2016, 15:51

      Thanks for the feedback and information. I was also noticing a lot of the information I was providing was a little low on descriptions. I figured anything was better than nothing, and the site I found the descriptions at just had that information. I'll avoid posting in simple 1 sentence descriptions in the future. I agree, having a nice full database is the better way to go. As for the non-matching games.. I found through searching for those games that they are just renamed versions of the same game only in a different region. Since the list it provided me had the original region's version, I just linked it to that. I don't know how to ADD a new game to show it's an alternate region clone.

      But yes, thanks again for the helpful feedback. I'll do better. :)

      Who's Scruffy Looking?

      • V
        vbs
        last edited by 14 Dec 2016, 22:05

        One question please: The database improves every day so what do I have to do to re-scrape my system? Is it sufficient to just scrape a system again and everything will be updated?
        I know that the gamelist.xml will be recreated from scratch but what about the images? Will they be re-generated from the latest data or do I first have to delete them manually?

        • V
          vbs @vbs
          last edited by 14 Dec 2016, 22:17

          @vbs
          Well, I just tested and it seems the images get regenerated automatically. So no need to delete them manually.

          • M
            mattrixk @screech
            last edited by 14 Dec 2016, 22:19

            @screech said in [SOFT] Universal XML Scraper V2 - Easy Scrape with High Quality picture:

            Can you send me your new "zip" so I can check what's going on ?

            I've put it in my dropbox here.

            there is a problem with XML (I think a wrong tag or something like that)

            I made a copy of the existing Standard (3img) and literally the only thing I changed was the <Profil> name to match the folder name.
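If a hand-edited profile is suspected of containing a wrong tag, one quick check is whether the file is well-formed XML at all. A small Python sketch using only the standard library; `check_well_formed` is a hypothetical helper, it knows nothing about which tags UXS expects, and it only reports the first syntax error it hits.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_path):
    """Return None if the file parses as XML, else a short error message."""
    try:
        ET.parse(xml_path)
    except ET.ParseError as error:
        # ParseError.position is a (line, column) tuple for the first error.
        line, column = error.position
        return f"{xml_path}: not well-formed at line {line}, column {column}: {error}"
    return None
```

A file that passes this check can still be rejected by UXS for other reasons (for example, an unexpected tag name); the check only rules out unclosed or mismatched tags.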

            My ES themes: MetaPixel | Spare | Io | Indent

            • H
              hansolo77 @vbs
              last edited by 14 Dec 2016, 22:37

              @vbs I was wondering this as well. There is an option to UPDATE, but according to the program it looks like it just adds new ROMs... it doesn't actually UPDATE the metadata/art. I think a feature that should be added would be some extra data in an xml or something that identifies what data was scraped. Then the next time you run the scrape, it compares what's online with what it already has. That will cut down on all the re-creating of identical data, which wastes a lot of time. Things you could track would be all the metadata fields, with the "found.xml" file recording 1's and 0's (like gamename/publisher=1 and gamename/description=0, so it'll skip adding publisher data but get the description if it exists), and the artwork (if using MIX+3 or MIX+4, just have it record 1's and 0's again for whether it has each piece or not). After comparing, it would get the new data, update the gamelist.xml as needed, and recreate any new artwork. I could probably write out all that in BASIC (the only programming language I know lol). So I can see this easily being implemented into the UXS program.
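A rough Python sketch of the comparison step proposed here. This is not how UXS works; the dictionary layout (field name mapped to 1 or 0, standing in for the suggested "found.xml") and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def fields_to_fetch(found: Dict[str, int],
                    remote: Dict[str, Optional[str]]) -> List[str]:
    """List fields marked 0 (or absent) locally that the server now has."""
    return sorted(
        name for name, value in remote.items()
        if value and not found.get(name, 0)
    )


def mark_fetched(found: Dict[str, int], fetched: List[str]) -> Dict[str, int]:
    """Return a copy of the record with the fetched fields set to 1."""
    updated = dict(found)
    for name in fetched:
        updated[name] = 1
    return updated
```

With `found = {"publisher": 1, "description": 0}` and a server record that has both fields filled in, only `"description"` would be fetched, which is the skip-what-you-already-have behaviour described in the post.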

              Who's Scruffy Looking?

              • P
                paradadf
                last edited by 14 Dec 2016, 22:56

                I believe, without a real understanding of how UXS works, that pulling data from the server doesn't take any considerable amount of time, but creating the mix images does. I doubt that comparing anything with the DB will be faster than just downloading the whole data.

                • H
                  hansolo77
                  last edited by 14 Dec 2016, 22:58

                  My biggest slowdown when scraping is all the hash-checking it does. There should at least be a file that stores all the hash numbers so it doesn't have to re-hash every time.
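A sketch of the kind of hash file suggested here, in Python. The MD5 algorithm, the JSON cache layout, and every function name are assumptions for illustration; nothing below reflects which hashes UXS actually computes or how it stores them. A cached hash is reused only while the file's size and modification time are unchanged.

```python
import hashlib
import json
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Hash a file in chunks so large ROMs are not loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_md5(path: Path, cache: dict) -> str:
    """Return the cached hash if size and mtime match, else re-hash and store."""
    stat = path.stat()
    key = str(path.resolve())
    entry = cache.get(key)
    if entry and entry["size"] == stat.st_size and entry["mtime_ns"] == stat.st_mtime_ns:
        return entry["md5"]
    md5 = file_md5(path)
    cache[key] = {"size": stat.st_size, "mtime_ns": stat.st_mtime_ns, "md5": md5}
    return md5


def load_cache(cache_file: Path) -> dict:
    """Load the cache, or start empty if the file does not exist yet."""
    if cache_file.is_file():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    return {}


def save_cache(cache_file: Path, cache: dict) -> None:
    cache_file.write_text(json.dumps(cache, indent=2), encoding="utf-8")
```

The trade-off is that a file changed without its size or timestamp changing would keep a stale hash, which is usually acceptable for a ROM folder.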

                  Who's Scruffy Looking?

                  • P
                    paradadf
                    last edited by 14 Dec 2016, 23:04

                    I don't know how big your ROMs are (what system), but hashing a regular (not CD-based) ROM doesn't take any longer than 0.1 s per file.

                    • H
                      hansolo77
                      last edited by hansolo77 14 Dec 2016, 23:05

                      I'm having a really hard time with this now.
                      (refer to this post)
                      I think UXS has somehow corrupted all my gamelist.xml files. Probably because yesterday I was having trouble getting the new version to work with the correct paths. I suspect it has multiple copies of them somewhere, and it's throwing everything out of whack.

                      But yeah, my Amstrad CPC, Atari 800, and Atari ST systems all took upwards of 4 hours each to hash, and the files are no bigger than 1.2 MB at the most (typical 3.5 in floppy).

                      Who's Scruffy Looking?

                      • V
                        vbs @hansolo77
                        last edited by vbs 14 Dec 2016, 23:11

                        @hansolo77 said in [SOFT] Universal XML Scraper V2 - Easy Scrape with High Quality picture:

                        I'm having a really hard time with this now.
                        (refer to this post)
                        I think UXS has somehow corrupted all my gamelist.xml files. Probably because yesterday I was having trouble getting the new version to work with the correct paths. I suspect it has multiple copies of them somewhere, and it's throwing everything out of whack.

                        Make sure to delete all the partially created gamelist files in all the Rom folders!
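One way to find those files is to walk the ROM folders and list every `gamelist.xml` before removing anything. A Python sketch, assuming the gamelists sit somewhere under a single roms root folder; the function names are hypothetical, and the default is a dry run that deletes nothing.

```python
from pathlib import Path


def find_gamelists(roms_root):
    """Return every gamelist.xml found under the roms root, sorted by path."""
    return sorted(Path(roms_root).rglob("gamelist.xml"))


def delete_gamelists(roms_root, dry_run=True):
    """List the gamelist files; delete them only when dry_run is False."""
    found = find_gamelists(roms_root)
    if not dry_run:
        for path in found:
            path.unlink()
    return found
```

Calling `delete_gamelists(r"\\retropie\roms")` first and reading the returned list is a cheap safety check before running it again with `dry_run=False`.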

                        • P
                          paradadf
                          last edited by paradadf 14 Dec 2016, 23:12

                          I'd put the launcher in a new folder and try again. You scrape over the network, right? To be honest, I've never done that because the Pi is too slow imo. I'm scraping ST right now on my PC with only one thread, and with the 3img profile it is taking about 30 min for 305 ROMs.
                          My results:
                          (image)

                          • H
                            hansolo77
                            last edited by 15 Dec 2016, 00:47

                            I need to either find an alternative rom set or go through and erase a lot of junk ROMs. My AtariST set has 3260 files. I got this set from my GameBase collection on my PC. It took 4 hours to scrape yesterday.
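For a rough comparison, the two timings quoted in the thread can be turned into seconds per file. A small Python sketch; the figures are the approximate ones from the posts (3260 files in about 4 hours, 305 ROMs in about 30 minutes), and the two runs used different machines and possibly different profiles, so this is only an order-of-magnitude check.

```python
def seconds_per_file(total_minutes: float, file_count: int) -> float:
    """Average wall-clock seconds spent per file."""
    return total_minutes * 60 / file_count


# 3260 Atari ST files in about 4 hours (this post).
full_set = seconds_per_file(4 * 60, 3260)   # about 4.4 s per file

# 305 Atari ST ROMs in about 30 minutes, one thread, 3img profile (earlier post).
small_set = seconds_per_file(30, 305)       # about 5.9 s per file

print(f"full set: {full_set:.1f} s/file, small set: {small_set:.1f} s/file")
```

On these numbers the per-file rate is similar in both runs, which suggests the 4 hours comes mostly from the size of the set rather than from an unusually slow scrape.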

                            Who's Scruffy Looking?

                            • R
                              Rion @hansolo77
                              last edited by 15 Dec 2016, 01:15

                              @hansolo77 I think it would be best to scrape your collection on your PC and then copy the images over to your Pi.

                              So make a copy of your roms folder on the PC.

                              Or you could do like I do and run everything off a USB stick. ROM folder only, that is.

                              FBNeo rom filtering
                              Mame2003 Arcade Bezels
                              Fba Arcade Bezels
                              Fba NeoGeo Bezels

                              • C
                                chuzzwuzzer
                                last edited by chuzzwuzzer 15 Dec 2016, 08:56

                                I'm having real problems with this scraper for some reason. I set up the autoconfiguration on a new RetroPie 4.1 install. Autoconfiguration path: \retropie\roms. Kill EmulationStation. Select your system. I choose the system and click the main 'scrape' button on the bottom right, and it asks for the system again. Then it tells me 'rom path cant be found', so I change the path to the standard roms directory manually. Now it's just telling me 'Hashing please wait', which is taking forever. What am I doing wrong here?

                                • P
                                  paradadf @chuzzwuzzer
                                  last edited by 15 Dec 2016, 09:32

                                  I'd say: not waiting XD. Hashing big files takes a long time; there's nothing to be done about it.

                                  • C
                                    chuzzwuzzer
                                    last edited by 15 Dec 2016, 09:49

                                    If I have the 'downloaded images' folder in each system's rom folder backed up from a previous scrape, can I just copy and paste 'downloaded images' into the corresponding rom folders of a new install of RetroPie? Will it automatically recognise the images, or do I have to mess about editing paths etc.?

                                    • P
                                      paradadf
                                      last edited by paradadf 15 Dec 2016, 10:33

                                      You need to copy both, the downloaded_images folder AND the gamelist.xml file. Don't forget to stop EmulationStation before that. But you must put them inside their original folders, otherwise you'll need to change their paths, which is actually not so difficult to do.
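If the files do end up in a different folder, the path change can be scripted. A Python sketch, assuming the gamelist.xml stores artwork locations in `<image>` elements (as EmulationStation gamelists normally do); `rewrite_image_paths` and both prefixes are hypothetical and need adapting to the real paths. Work on a backup copy, since the file is rewritten in place.

```python
import xml.etree.ElementTree as ET


def rewrite_image_paths(gamelist_path, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix in every <image> path; return the count."""
    tree = ET.parse(str(gamelist_path))
    changed = 0
    for image in tree.getroot().iter("image"):
        if image.text and image.text.startswith(old_prefix):
            image.text = new_prefix + image.text[len(old_prefix):]
            changed += 1
    # Overwrites the original file, so keep a backup before running this.
    tree.write(str(gamelist_path), encoding="utf-8", xml_declaration=True)
    return changed
```

For example, `rewrite_image_paths("gamelist.xml", "~/.emulationstation/downloaded_images/nes/", "./downloaded_images/")` would point every matching entry at a folder next to the ROMs.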

                                      • C
                                        chuzzwuzzer
                                        last edited by chuzzwuzzer 15 Dec 2016, 11:04

                                        @paradadf Many thanks for the help. I have an old install on another SD card. I have every system scraped. I can see the images in \configs\all\emulationstation\downloaded_images but they don't show in ES. I have copied the gamelist.xml for each system and the images into the new install on the other SD card, and it just reproduces the same problem with the images not showing. Is the problem somehow with the XML files?
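One way to narrow this down is to list the `<image>` entries in a gamelist.xml that point at files which do not exist. A Python sketch meant to be run where the files live; it assumes paths starting with `~/` are relative to the home folder and paths starting with `./` are relative to the system's rom folder, and `missing_images` is a hypothetical helper, not part of EmulationStation or UXS.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def missing_images(gamelist_path, rom_dir, home_dir):
    """Return (game name, image path) pairs whose image file is not on disk."""
    missing = []
    for game in ET.parse(str(gamelist_path)).getroot().iter("game"):
        image = (game.findtext("image") or "").strip()
        if not image:
            continue  # no artwork recorded for this game
        if image.startswith("~/"):
            resolved = Path(home_dir) / image[2:]
        elif image.startswith("./"):
            resolved = Path(rom_dir) / image[2:]
        else:
            resolved = Path(image)
        if not resolved.is_file():
            missing.append((game.findtext("name") or "?", image))
    return missing
```

If every entry comes back as missing, the paths in the XML most likely do not match where the images were copied; if the list is empty, the problem is probably elsewhere.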

                                        • P
                                          paradadf @chuzzwuzzer
                                          last edited by 15 Dec 2016, 23:55

                                          @chuzzwuzzer I'm sorry, but I can't give any advice about that because I don't use RetroPie. Maybe if you make a pastebin and upload some lines of your XML, someone else can see what the problem is.
