    [SOFT] New Scraper in the works

    Projects and Themes
    Tags: scraping, scrapper, software
    253 Posts 7 Posters 61.6k Views
    • mituM
      mitu Global Moderator @kiro
      last edited by

      @kiro said in [SOFT] New Scraper in the works:

      Is there any process that needs to be taken?

      Any modifications to RetroPie are added via the Github project, at https://github.com/RetroPie/RetroPie-Setup, either directly (by project maintainers) or via Pull Requests.

      are you not accepting any more scrapers? Thanks!

      No, we're open to improvements. I would advise first opening a topic in the forums with installation/operation instructions so users can test it and provide feedback, before submitting it to RetroPie-Setup. Also, while providing the source for your program is not mandatory, it would be useful for users to be able to compile it on their own platform without relying on a distributed binary.

      I tried downloading the program from your web page, but clicking on the download link opens an 'application/json' page filled with binary data, so the download link is not working properly.

      • kiroK
        kiro @mitu
        last edited by

        Hi @mitu Thanks for the explanation, really appreciate it!!

        I'll look into why you're getting that! Thanks! Others have downloaded it without issues. Which browser/OS are you using, by the way?

        I will provide the source for the scraper as soon as I feel it is ready to be distributed; I'm still working on it.

        • mituM
          mitu Global Moderator @kiro
          last edited by mitu

          @kiro said in [SOFT] New Scraper in the works:

          Which browser/os are you using by the way?

          Firefox on Linux/Windows or plain old wget:

          wget http://77.68.23.83/dist/retroscraper.rpi
          --2022-04-06 15:45:37--  http://77.68.23.83/dist/retroscraper.rpi
          Connecting to 77.68.23.83:80... connected.
          HTTP request sent, awaiting response... 200 OK
          Length: 194481976 (185M) [text/plain]
          Saving to: ‘retroscraper.rpi’
          
          retroscraper.rpi                           81%[==================================================================>                ] 150,53M  11,1MB/s    eta 4s
          
          • kiroK
            kiro @mitu
            last edited by kiro

            @mitu and did you try executing the retroscraper.rpi file? (chmod 755 and ./) ?

            Warning: if you execute with the --cli modifier it will start scraping directly!
            It will not overwrite the gamelist.xml until each system is done. I need to implement a backup of gamelist.xml to prevent losing any manual changes that may have been done to them!
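
The backup step mentioned above could look roughly like the following. This is a minimal sketch, not code from retroscraper: the function name `backup_gamelist` and the `.bak` naming scheme are assumptions made for illustration.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def backup_gamelist(gamelist: Path) -> Optional[Path]:
    """Copy gamelist.xml to a timestamped sibling file and return its path.

    Hypothetical helper: the name and the ".bak" suffix are illustrative only.
    """
    if not gamelist.is_file():
        return None  # nothing to back up, e.g. on a first scrape
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = gamelist.with_name(f"{gamelist.name}.{stamp}.bak")
    shutil.copy2(gamelist, backup)  # copy2 also preserves file timestamps
    return backup
```

Calling this once per system, just before the new gamelist.xml is written, would keep any manual edits recoverable.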

            Thx!

            • mituM
              mitu Global Moderator @kiro
              last edited by

              @kiro said in [SOFT] New Scraper in the works:

              and did you try executing the retroscraper.rpi file ?

              Yes, and I didn't get any message; it just sits there doing something/nothing?
              I've tried with --cli and/or --help and nothing is displayed. I interrupted it and noticed it had created an almost empty retroscraper.cfg file.

              • kiroK
                kiro @mitu
                last edited by

                Thanks @mitu ... without --cli flag it will work only if you have a desktop (sdl2) available.... It is strange about the 'nothing happening' with the --cli... Are you trying on a machine that has the es_systems.cfg in the usual place? (/etc/emulationstation)?

                That it creates an empty cfg is normal upon first run.

                Is your machine connected to the internet? I assume yes, but I have not seen anything hitting the backend, so it looks like it is not connecting?

                I'll have a look...thanks again!

                • mituM
                  mitu Global Moderator @kiro
                  last edited by mitu

                  without --cli flag it will work only if you have a desktop (sdl2) available....

                  I see no error message about not having a desktop env.

                  It is strange about the 'nothing happening' with the --cli... Are you trying on a machine that has the es_systems.cfg in the usual place? (/etc/emulationstation)?

                  Yes, I have a standard RetroPie EmulationStation installed.

                  Is your machine connected to the internet? I assume yes, but I have not seen anything hitting the backend, so it looks like it is not connecting?

                  Well, yes, otherwise I wouldn't have been able to download it.

                  To be honest, I don't like the fact that it's trying to connect somewhere without my choosing any options. Which 'backend' does it connect to? I don't see anything mentioned on the download page.

                  • kiroK
                    kiro @mitu
                    last edited by

                    @mitu it connects to my server to verify that the version running is ok, and to download the images (if not running in CLI mode) and translations. It is the same IP as the download server, but it is strange that I do not see it hitting this server (I can see in the logs when scraping is being accessed). I'll try again later on my RPi, but it should at least hit the server to download the startup data. Strange.

                    • mituM
                      mitu Global Moderator @kiro
                      last edited by

                      @kiro said in [SOFT] New Scraper in the works:

                      it connects to my server to verify that the version running is ok

                      I thought it would use one of the various scraping sources. Doesn't it use them for downloading the artwork/metadata?

                      • S
                        sleve_mcdichael @kiro
                        last edited by

                        @kiro said in [SOFT] New Scraper in the works:

                        Are you trying on a machine that has the es_systems.cfg in the usual place? (/etc/emulationstation)?

                        What if I'm not?

                        https://retropie.org.uk/docs/EmulationStation/#es_systemscfg-edits

                        • kiroK
                          kiro @sleve_mcdichael
                          last edited by

                          @sleve_mcdichael it does look in all those possible directories. If it doesn't find the file there, you can edit the config if running in CLI mode, or choose it if you're running in windowed mode.
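
A lookup over several possible locations, with a manual override, might be sketched as below. The candidate paths and the `find_es_systems` name are assumptions for illustration; the thread does not say which directories retroscraper actually checks, or in what order.

```python
from pathlib import Path
from typing import Iterable, Optional

# Assumed candidate locations (not taken from retroscraper); adjust as needed.
DEFAULT_CANDIDATES = (
    Path.home() / ".emulationstation" / "es_systems.cfg",
    Path("/opt/retropie/configs/all/emulationstation/es_systems.cfg"),
    Path("/etc/emulationstation/es_systems.cfg"),
)


def find_es_systems(candidates: Iterable[Path] = DEFAULT_CANDIDATES,
                    override: Optional[Path] = None) -> Optional[Path]:
    """Return the first existing es_systems.cfg, or None if none is found.

    A user-supplied override (e.g. from a config file) is checked exclusively.
    """
    if override is not None:
        return override if override.is_file() else None
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Returning None rather than raising lets the caller decide whether to prompt for a path (windowed mode) or print an error (CLI mode).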

                          • kiroK
                            kiro @mitu
                            last edited by

                            @mitu nope, it uses its own server.

                            • mituM
                              mitu Global Moderator @kiro
                              last edited by

                              @kiro said in [SOFT] New Scraper in the works:

                              nope,

                              I see. Then this is a nope from me also.

                              • kiroK
                                kiro
                                last edited by

                                If anyone is interested, I've released the source code here: https://github.com/zayamatias/retroscraper

                                Enjoy :-)

                                • kiroK
                                  kiro @kiro
                                  last edited by

                                  A new version is out, it allows you to keep your favorites and play count after scraping (among other bug fixes).

                                  https://github.com/zayamatias/retroscraper

                                  • kiroK
                                    kiro @kiro
                                    last edited by

                                    A new slimmed-down version of the scraper is available and should be easier to install and run. Check it out here:

                                    https://github.com/zayamatias/retroscraper-rpie

                                    Thanks for your feedback!

                                    • F
                                      Folly @kiro
                                      last edited by Folly

                                      @kiro

                                      Hi,
                                      Perhaps you knew already but I am the developer of this script.

                                      So for me it would be nice to find a good solution for generating gamelists that can be shared from within the script or perhaps could even be scraped from within the script.

                                      We already have some predefined gamelists with media that can be downloaded from within the script for the categories
                                      konamih, tigerh, etc. (many done by @DTEAM).

                                      The script can install arcade categories like "shooter" or "pinball" too.
                                      However, for these categories there aren't predefined gamelists with media yet.
                                      Sadly, these "categories" are not recognised by your scraper.
                                      That is basically not a big problem, so I renamed them to arcade, scraped them and renamed them back.
                                      Now I should have a proper gamelist+media, right?
                                      Well, it doesn't work that way, because the gamelist.xml contains full paths to files, so I have to rename the roms directory from /arcade/ to /shooter/ to get it working again.

                                      Well, we had a different approach with our predefined gamelists.
                                      For our predefined gamelists, have a look here:
                                      https://drive.google.com/drive/folders/1f_jXMG0XMBdyOOBpz8CHM6AFj9vC1R6m
                                      You will see that we use relative paths rather than full paths.
                                      This makes it easier to copy a gamelist to a differently named folder, or to a computer with another username, without editing the gamelist.xml.
                                      So my question is: could you incorporate that solution?
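
To make the relative-path idea concrete, here is a minimal sketch that rewrites absolute paths in a gamelist to "./" paths relative to the system's rom directory. It is an illustration only, not part of either tool: the `relativize` name and the set of tags handled are assumptions, and paths that do not live under the rom directory are left untouched.

```python
import xml.etree.ElementTree as ET

# Tags assumed to hold file paths in a <game> entry.
PATH_TAGS = ("path", "image", "marquee", "video")


def relativize(gamelist_xml: str, rom_dir: str) -> str:
    """Rewrite absolute paths under rom_dir as './...' relative paths."""
    root = ET.fromstring(gamelist_xml)
    base = rom_dir.rstrip("/") + "/"
    for game in root.iter("game"):
        for tag in PATH_TAGS:
            node = game.find(tag)
            if node is not None and node.text and node.text.startswith(base):
                node.text = "./" + node.text[len(base):]
    return ET.tostring(root, encoding="unicode")
```

With paths stored this way, renaming the roms folder from arcade to shooter, or moving the folder to another user's home directory, would not require touching the gamelist.xml.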

                                      We also put the images/videos in the directory media/emulationstation/,
                                      which emphasizes that the media is used by EmulationStation.
                                      So when running EmulationStation, only 1 media folder is seen instead of images/marquees/videos.
                                      I would like you to think about this too.
                                      For your folders it would mean:
                                      media/emulationstation/images
                                      media/emulationstation/marquees
                                      media/emulationstation/videos

                                      If both suggestions appeal to you, we would be applying roughly the same "standard" to the gamelists.
                                      That way we could join forces.

                                      What do you think ?
                                      Let me know.

                                      • kiroK
                                        kiro @Folly
                                        last edited by

                                        @Folly Hi, Sure it makes sense! I'm away this weekend but will put this as new features in the upcoming versions. I just need some time to understand exactly what's done.

                                        I believe that we could actually have the same gamelists created from my scraper, without the need to actually have 'predefined' gamelists.

                                        I'll definitely have a look.

                                        By the way, my scraper takes into account arcade systems such as 'Capcom classics', 'konami classics' and so forth. This is an example of my es_systems.cfg entry for one of these classics:

                                          <system>
                                            <name>atariclassics</name>
                                            <extension> .7z .cue .fba .iso .zip .7Z .CUE .FBA .ISO .ZIP</extension>
                                            <platform>arcade</platform>
                                            <theme>arcadeatariclassics</theme>
                                            <command>/opt/retropie/supplementary/runcommand/runcommand.sh 0 _SYS_ arcade %ROM%</command>
                                            <path>../roms/atariclassics</path>
                                            <fullname>Atari Classics</fullname>
                                          </system>
                                        

                                        The scraper will match the name tag 'atariclassics' in this example.
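
For readers who want to see which name tags their own setup exposes, the sketch below lists each system's `<name>` and `<path>` from es_systems.cfg content. It is only an illustration of reading the file; `system_paths` is a hypothetical helper, not how the scraper itself does its matching.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def system_paths(es_systems_xml: str) -> Dict[str, str]:
    """Map each <system>'s <name> to its <path> (empty string if missing)."""
    root = ET.fromstring(es_systems_xml)
    # iter() also yields the root, so a lone <system> fragment works too.
    return {
        system.findtext("name", "").strip(): system.findtext("path", "").strip()
        for system in root.iter("system")
    }
```

Passing in the entry above would give a single mapping from atariclassics to ../roms/atariclassics.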

                                        • F
                                          Folly @kiro
                                          last edited by

                                          @kiro

                                          Ok, nice to hear you agree.
                                          Would really be great if you could accomplish that.

                                          So if I can somehow add <platform>arcade</platform> to a category then it should be recognised, right?

                                          • kiroK
                                            kiro @Folly
                                            last edited by

                                            @Folly You do not need to add anything like 'platform'.... my scraper recognizes the roms based on their checksum, not their names, directories or anything. In principle, you could put all roms into a single directory and it would recognize them as long as their checksum is in the database.
                                            If the checksum is not in the database then the <name> tag will help the scraper try to figure out the game, but it is not mandatory.
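
Checksum-based recognition can be sketched as a hash of the file contents looked up in a table of known games. Everything specific here is an assumption: the thread does not say which checksum the scraper uses (SHA-1 is picked only for illustration), and the `known` dictionary stands in for the server-side database.

```python
import hashlib
from pathlib import Path
from typing import Dict, Optional


def file_sha1(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large ROMs are not loaded into memory at once."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identify(path: Path, known: Dict[str, str]) -> Optional[str]:
    """Return the game title for a ROM, or None if its checksum is unknown.

    'known' maps checksum -> title and is a stand-in for the real database.
    The <name>-based fallback is not shown; the thread does not describe it.
    """
    return known.get(file_sha1(path))
```

Because only the file contents are hashed, the result is the same whatever the file is called or whichever directory it sits in, which is why the roms could all live in a single folder.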

                                            To answer your question about the <platform>, I'm not sure why this tag is there, it is not taken into consideration by the scraper :-).
