Game statistics
-
@langest said in Game statistics:
But it would probably be best to cache all the results on disk so you don't have to recalculate it every time you restart the program.
Isn't Skyscraper already doing this? You could compare the timestamp with the file date and rely on Skyscraper's already built cache - of course, only for entries which are present there (i.e. not for roms added since the last scraping session).
-
@langest Ah, you mean the cache rom ids. Yes, they are a mix of sha1 sums of the actual rom data and sha1 sums of filenames, depending on whether the input is a script (or zipped, read on for explanation) or an actual rom. So in the case of .cue files, for instance, it is a sha1 of the filename. If the files are more than 50 megs, they are also id'd by the sha1 of their filenames as a speed optimization.
But please know that the cache rom ids are completely separate from the checksums I do when using the ScreenScraper module. In fact I was thinking of renaming the "sha1" attribute in db.xml to just "id" to avoid confusion. But then that would break backwards compatibility, and the db.xml files aren't meant to be edited by hand anyway. They should always be edited with --cache edit.
And there's reason behind the madness. I could just id the files by filenames. But some people like to have games in subfolders and potentially have two games with the same filename under the same platform even if the games are different. To avoid this I use the sha1 checksum instead. But of course this doesn't make sense for scripts, which change often. So for those I have to use the sha1 of the filename instead. I also always use the sha1 of the filename for zipped files, since people might unzip them and rezip them, which would also break the id for that game.
So yeah, there are some considerations behind the madness, just to let you guys know. I'm not saying it's the best way, but it's kind of locked in now, to avoid breaking people's caches, and it works well (although slow, as you point out, for the bigger roms).
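For anyone wanting to mirror these rules in an external tool, here is a rough Python sketch of the decision described above. It is not Skyscraper's code (Skyscraper is not written in Python). The suffix sets, the reading of "50 megs" as 50 MiB, and hashing only the base name of the file are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# Assumptions for illustration only; Skyscraper's real lists and limits may differ.
SCRIPT_SUFFIXES = {".cue"}       # the post only names .cue as an example
ZIPPED_SUFFIXES = {".zip"}       # hypothetical set of "zipped" inputs
SIZE_LIMIT = 50 * 1024 * 1024    # "50 megs", read here as 50 MiB


def sha1_of_name(path: Path) -> str:
    """SHA-1 of the file name only (the base name, by assumption)."""
    return hashlib.sha1(path.name.encode("utf-8")).hexdigest()


def sha1_of_data(path: Path) -> str:
    """SHA-1 of the full file contents, read in 1 MiB chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_id(path: Path) -> str:
    """Choose between name hash and data hash following the rules in the post."""
    suffix = path.suffix.lower()
    if suffix in SCRIPT_SUFFIXES or suffix in ZIPPED_SUFFIXES:
        return sha1_of_name(path)   # scripts and archives: contents may change
    if path.stat().st_size > SIZE_LIMIT:
        return sha1_of_name(path)   # large files: skip hashing the data
    return sha1_of_data(path)       # ordinary rom: hash the data itself
```

The name-hash branches are the ones where two different games with the same filename would end up with the same id, which is the trade-off the post accepts for scripts, archives and large files.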
-
@langest said in Game statistics:
I also figured out why using skyscraper data is "slow". It is because you might be reading and hashing a big file, a Dreamcast rom etc.
And the IO is what takes the most time. I have added a function cache to reduce this problem. But it would probably be best to cache all the results on disk so you don't have to recalculate it every time you restart the program.
I actually thought about doing this. Do a lookup if the file hasn't changed since last scrape. Something for 4.0 or 3.5 I guess.
-
@muldjord
That answers my question.
It seems this leaves two options for fixing the issue:
1. Implement the same hashing and calculation of all the ids as they are implemented in skyscraper so I can use the db.xml as a dictionary.
2. Extend skyscraper to cache rompath -> id/sha1sum in a file given an argument.
I can't see why 1 would be a useful option; that would be double work and I don't want to extend the stats tool to have scraper functionality.
2 makes a lot of sense to me. It opens up the db.xml data to be used with external tools and it would save time if you were to run skyscraper again. I imagine this wouldn't be too complicated to implement.
Is this something you would be able to implement in the near future or should I consider taking a look?
-
@langest said in Game statistics:
Is this something you would be able to implement in the near future or should I consider taking a look?
I'd go with option 1 in this case. I'm not currently looking for contributions and I'm eager to work on this feature myself as it's an interesting problem (the checksum lookups) with high optimization potential. But I am not able to give you an ETA.
EDIT: Just a thought: Is it not a bit much to make this rely so much on Skyscraper's db.xml (unless it's just one of several lookup methods you plan to implement)? Just my 5 cents.
-
@muldjord
Alright,
Maybe I could suggest adding the rom path to the Resource struct
https://github.com/muldjord/skyscraper/blob/225136b245c1fca4936062f1fb5430691beed283/src/cache.h#L40
You already seem to have it in the GameEntry's path member. Then, together with the timestamp, it would be easy to see whether you need to update the resource.
You probably can't save the path to sha1 mapping in the same resource nodes that the rest of the resources use, but it shouldn't be a problem to introduce a new node either in db.xml or in a separate file.
-
@langest I promise it will be well thought through and backwards compatible in the sense that it will convert old entries if needed on-the-fly. If anything this has made me realize that now is probably the time to look into this.
EDIT: A file location for the rom is not very useful. If you move your files to a different directory (or a different system entirely, or someone sends you their cache), they need to still be identifiable. This has to be contained to the file itself, so modification time makes sense for this purpose.
I might also change it to always only do the checksum on the first 512 k of data instead of the entire rom. This will make it quite a lot faster to begin with. And combined with the lookup table this will prove a significant optimization. And for anyone following this, I will of course make sure this will convert "old" cache ids on the fly, so it won't break anything.
-
now is probably the time to look into this
Glad to hear it. :)
checksum on the first 512 k of data instead of the entire rom
This sounds like a really good idea.
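To make the idea concrete, a prefix checksum could look roughly like this in Python. This is only a sketch of the technique, not what Skyscraper does or will do; `sha1_of_prefix` is a made-up name and "512 k" is read here as 512 KiB.

```python
import hashlib
from pathlib import Path

PREFIX_BYTES = 512 * 1024  # "512 k", read here as 512 KiB (assumption)


def sha1_of_prefix(path: Path, limit: int = PREFIX_BYTES) -> str:
    """SHA-1 over at most the first `limit` bytes of the file."""
    with path.open("rb") as handle:
        return hashlib.sha1(handle.read(limit)).hexdigest()
```

Files smaller than the limit are still hashed in full. The cost is that two files which only differ after the first 512 k would get the same checksum.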
-
@muldjord said in Game statistics:
Just a thought: Is it not a bit much to make this rely so much on Skyscraper's db.xml (unless it's just one of several lookup methods you plan to implement)? Just my 5 cents.
I implemented it as a separate module. You can easily replace it with something else to get the rom meta information. You just plug in a new get_title_info(som_path: str, system: str) -> title. The reason I depend on skyscraper is that I don't want to build a scraper, so I need to depend on at least one other scraper to get the meta info. And skyscraper is the one I am using on my system, so it makes sense for me to use it. I understand that the db.xml is not supposed to be part of an external api (in its current state) and that because of this it might seem a bit strange to depend on it.
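As a rough illustration of what "plugging in" a lookup can mean, here is a minimal Python sketch. Only the get_title_info name and its two parameters come from the post; the `TitleLookup` alias, `title_from_filename`, `report_title`, the `rom_path` parameter spelling and the choice of `str` as the title type are assumptions, not the actual layout of the stats tool.

```python
from pathlib import Path
from typing import Callable

# Assumed shape of the hook: (rom path, system name) -> display title.
TitleLookup = Callable[[str, str], str]


def title_from_filename(rom_path: str, system: str) -> str:
    """Hypothetical provider: use the rom file name without its extension."""
    return Path(rom_path).stem


def report_title(rom_path: str, system: str,
                 get_title_info: TitleLookup = title_from_filename) -> str:
    """Resolve a title through whichever provider was plugged in."""
    return get_title_info(rom_path, system)
```

Swapping the metadata source then only means passing a different function, e.g. `report_title(path, system, get_title_info=my_lookup)`, where `my_lookup` is any function with the same two parameters.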
You can use the tool without any meta information, but then the title will be whatever your rom name is. At least for arcade this could be somewhat confusing.
-
Why don't you use the info from gamelist.xml instead of Skyscraper's cache? This way it's the name that appears in ES and you don't have to hash the ROM.
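A minimal sketch of that lookup in Python, assuming the usual gamelist.xml layout of game entries with path and name child elements. The function name and the choice to key the result by file name are mine, and where the gamelist.xml lives on disk is left to the caller.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Dict


def titles_from_gamelist(gamelist_xml: str) -> Dict[str, str]:
    """Map each rom file name to the <name> stored in a gamelist.xml document."""
    titles: Dict[str, str] = {}
    root = ET.fromstring(gamelist_xml)
    for game in root.iter("game"):
        path = game.findtext("path")
        name = game.findtext("name")
        if path and name:  # skip entries missing either field
            titles[Path(path).name] = name
    return titles
```

For example, calling `titles_from_gamelist(Path(some_gamelist).read_text(encoding="utf-8"))` gives a dictionary that can be queried by rom file name without touching the rom itself.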
Regarding arcade game ROMs, EmulationStation comes with a list of ROM name -> game name mappings; it's in /opt/retropie/supplementary/emulationstation/resources/mamenames.xml. It uses the list to show the pretty name in gamelists for systems which are considered arcade-like.
-
@mitu
Because I forgot that this file should have all the info I need.
You're absolutely right. This is the way to go.
-
@langest I've been working all day to implement a "quick id" system into Skyscraper. It's functional right now, but untested. You were right about the necessity of the filepath for this system to work. I basically do a check on lastmodified and filepath. And if lastmodified is in the past or equal and filepath matches, I use the cached id from my quickid list. Otherwise it reverts to calculating the id from the file itself. More info soon as I test it a bit further. And for obvious reasons I'll move any further comments I have on this to the Skyscraper thread. :)
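In Python terms, the check described here could be sketched as follows. This is one reading of the description, not Skyscraper's implementation: `QuickId`, `lookup_id` and the injected `compute_id` callback are hypothetical names, and "in the past or equal" is interpreted as "the file's modification time is not newer than the one recorded with the cached id".

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class QuickId:
    last_modified: float  # modification time recorded when the id was computed
    cache_id: str


def lookup_id(path: str, mtime: float, table: Dict[str, QuickId],
              compute_id: Callable[[str], str]) -> str:
    """Return the cached id when the path matches and the file is not newer."""
    entry = table.get(path)
    if entry is not None and mtime <= entry.last_modified:
        return entry.cache_id          # quick path: no file IO or hashing
    new_id = compute_id(path)          # slow path: derive the id from the file
    table[path] = QuickId(mtime, new_id)
    return new_id
```

In a real tool `mtime` would come from something like `os.path.getmtime(path)`; it is passed in here so the function stays free of file access.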
-
Did the implementation using gamelist.xml; it was very easy and the program is lightning fast now that it doesn't do any file IO or hashing.
Not as interesting a problem to solve, but it works just as expected. (Might have some minor bugs.)
Thanks for the input @mitu and @muldjord .
I am still looking forward to the cache improvements of skyscraper. I'll make sure to stop by the skyscraper thread to see when it is finished.
-
I have implemented a bash menu, similar to the ones used in the rest of RetroPie.
I am also using dialog since I thought it would look consistent.
However, it seems like controller support doesn't work by default in the menu.
How are the controllers configured to work in retropie-setup.sh, for instance?
-
-
Thanks
-
@mitu
Do you think it would be possible to assume I am running on RetroPie and source the helpers into my script? Or do I need to copy the code?
-
@langest It's up to you - however, without a RetroPie installation the joy2key program might not be available to run.
-
@mitu
What would make the most sense to me would be to reuse the functionality already implemented in RetroPie, because I don't plan to use the dialog interface anywhere else.
The end goal of the TUI is to add it to the experimental programs in the RetroPie install script.
I tried to source the files, but I most likely did something wrong.
I haven't really done much bash programming before, so I am not sure how everything should be set up. Could you give me some pointers to get started?
Thanks
On a separate note, I have come up with some neat new features for the python program that I am going to add, such as bar charts for visualization and weekly/daily activity.
-
@administrators Maybe this thread should have been in the ideas and development sub-forum.