Manually generate Mame2003 gamelist?
-
@herb_fargus said in Manually generate Mame2003 gamelist?:
should essentially be the files stripped of their extensions
perfect.
-
Hi,
I have a little PHP script that parses history.dat to generate or update gamelist.xml. It's not a scraper, but it's been handy for populating the game info quickly without spending a long time scraping.
I just provide a snap directory and a history.dat file and run the script on the Pi; it takes about a minute for a large history.dat plus a large directory.
It might need tidying up a bit, with instructions, but I'm happy to post it.
-
@kixut That sounds very useful! I wouldn't mind giving it a go.
@caver01 I've also updated my sheets, filling in the content I have. The catver and history dat should get most of the rest? I also wasn't sure what the distinction between developer and publisher is in the metadata. Sometimes it's both, sometimes it's only one and not the other, and sometimes it may technically be neither... a bit confusing.
You'll also note my paths are relative, as I intend for this to be a package someone can dump in their rom folder and have it automatically scraped that way, since everyone should more or less have the same romset.
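For anyone wondering how relative paths like these end up pointing at real files, here is a minimal Python sketch. It assumes the frontend resolves a leading "./" against the system's rom directory; `resolve_gamelist_path` and the example directory are made up for illustration and are not part of any RetroPie tool.

```python
import posixpath

def resolve_gamelist_path(rom_dir, entry_path):
    """Resolve a gamelist.xml path against the system's rom directory.

    Assumption: relative entries ("./pacman.zip") are relative to rom_dir,
    while absolute entries are used unchanged.
    """
    if posixpath.isabs(entry_path):
        return entry_path
    return posixpath.normpath(posixpath.join(rom_dir, entry_path))

# Hypothetical rom directory, matching the layout used elsewhere in the thread.
rom_dir = "/home/pi/RetroPie/roms/mame-libretro"
print(resolve_gamelist_path(rom_dir, "./pacman.zip"))
print(resolve_gamelist_path(rom_dir, "./images/pacman.png"))
```

Because only the rom directory changes between installs, the same gamelist.xml should work for anyone who drops the package next to their roms.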
-
@herb_fargus said in Manually generate Mame2003 gamelist?:
@caver01 I've also updated my sheets, filling in the content I have. The catver and history dat should get most of the rest? I also wasn't sure what the distinction between developer and publisher is in the metadata. Sometimes it's both, sometimes it's only one and not the other, and sometimes it may technically be neither... a bit confusing.
You'll also note my paths are relative, as I intend for this to be a package someone can dump in their rom folder and have it automatically scraped that way, since everyone should more or less have the same romset.
Looks good to me! It's very close now. Did you see how I used a similar sheet to build out a new one with the XML text? Maybe there's an easier way, but I am happy to try. That PHP script sounds promising.
-
Hi,
I've added a few comments now and hope it helps someone. I haven't had time to retest, but let me know if anything needs explaining.
To use it: PHP needs to be installed first, the script needs to be on the Pi with a history.dat file in the same directory, and $system needs to be set in the source (to select the rom directory). Then run it with php -f file.php; see the script comments for more details.
Edit3: I would love to see the Japanese names of games displayed correctly, but either the character encoding is getting messed up somewhere in my script, or EmulationStation doesn't handle the character encoding or doesn't support the character set. I'm not sure at the moment.
Edit2: the comments imply that the script generates gamelist.xml, but it really updates an existing gamelist.xml file, so the file is expected to already exist. The script fills in the missing gamelist.xml info with data from the history.dat file.
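To illustrate the "fill in what is missing" idea from Edit2, here is a small Python sketch rather than the PHP itself. It is a stricter, fill-only variant: it never overwrites a value that is already present, whereas the script below also replaces some values that differ. `fill_missing` and the sample data are invented for the example.

```python
import xml.etree.ElementTree as ET

def fill_missing(game, info):
    """Fill child elements of <game> that are absent or empty.

    Returns True when at least one element was added or filled.
    """
    changed = False
    for tag, value in info.items():
        if not value:
            continue  # nothing useful to copy
        node = game.find(tag)
        if node is None:
            node = ET.SubElement(game, tag)
        if not (node.text or "").strip():
            node.text = value
            changed = True
    return changed

# Made-up gamelist fragment and made-up history data.
gamelist = ET.fromstring(
    "<gameList><game><path>./pacman.zip</path>"
    "<name>Pac-Man</name><publisher/></game></gameList>"
)
game = gamelist.find("game")
info = {"name": "PAC-MAN (history title)", "publisher": "Namco", "players": "2"}
fill_missing(game, info)
print(ET.tostring(gamelist, encoding="unicode"))
```

The existing name is left alone, while the empty publisher and the absent players element are filled.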
Edit: derail alert
On another note, a little while back I upgraded my Pimoroni Picade with the latest version of RetroPie and also wanted to sort my roms out. I wanted to do tasks such as filling in info from history.dat, moving clones into a subdirectory, or selecting games and moving a big selection into a subdirectory, plus various other rom management tasks. I thought a good solution might be a web app served from the Pi, so the whole thing can be managed remotely. I have a lot of experience in this, so I should be able to put something together quite quickly, but I haven't looked at what management solutions are already available. I don't want to reinvent the wheel, but I'd be interested to hear if anyone thinks there is a need for that type of solution.
<?php
/**************************************************************
  Script to populate gamelist.xml from history.dat for RetroPie

  By Garry Pankhurst 12/12/2016

  Populates:
    Title
    Description
    Publisher
    Date published
    Programmer
    Players
    Image file, if {{snap directory}}/{{gamerom}}.png exists
      that matches either the main rom or a clone
 ****************************************************************
  Instructions:
  1) Install PHP:  sudo apt-get install php5
  2) Change $system below to the system (rom directory) whose
     gamelist.xml file you want to update
  3) Set $listpath and $snappath, see comments below
  4) Make sure that you either exit EmulationStation or that
     "Save Metadata on Exit" is turned off (otherwise changes
     will be overwritten)
  5) Back up your gamelist.xml file
  6) Save this file as "thisfile.php" in the same place on the Pi
     where there is a history.dat file; from there, at the command
     prompt, run "php -f thisfile.php"
 ****************************************************************/

// uncomment a system
//$system = "mame";
//$system = "mame-advmame";
//$system = "fba";
//$system = "arcade";
$system = "mame-libretro";

// you don't need to change this:
$path = "/home/pi/RetroPie/roms/".$system;

// do you keep your gamelist with your roms?
$listpath = $path."/gamelist.xml";
// or here?
//$listpath = "/home/pi/.emulationstation/gamelists/".$system."/gamelist.xml";

// path to the snap directory from the roms directory - or use an absolute path
$snappath = "./images";

$cwd = getcwd();   // remember the starting directory (history.dat lives here)
if (!chdir($path)) {   // go to where the roms are
    echo "Cannot change to ".$path."\n";
    exit(1);
}

$gameslist = simplexml_load_file($listpath);   // load gamelist.xml
if ($gameslist === false) {
    echo "Cannot load ".$listpath." (this script updates an existing gamelist.xml)\n";
    exit(1);
}
$history = array();   // history.dat entries, indexed by rom name
$games = array();     // roms already listed in gamelist.xml

// strip characters that are not valid in XML from a history.dat string
function sss($string) {
    return preg_replace('/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u', ' ', $string);
}

// return the value of an "Attr : value" line, or null if $line is not one
function get_attr($attr, $line) {
    if (preg_match('/^'.$attr.' : /i', $line))
        return rtrim(preg_replace('/^'.$attr.' : /i', '', $line));
    return null;
}

function scrape_history() {
    global $history, $cwd, $snappath;

    echo "Loading history...\n";

    // open history.dat, go through each line and add every game as an
    // object to the $history associative (hash) array
    $fp = fopen($cwd."/history.dat", "r");
    if (!$fp) {
        echo "Cannot open history.dat\n";
        exit(1);
    }

    $line = fgets($fp);
    while (!feof($fp)) {
        // find $info
        while (!feof($fp) && !preg_match("/^\\\$info/", $line))
            $line = fgets($fp);
        if (feof($fp)) break;

        // array of zips, the first zip being the original,
        // the following are the clones
        $zips = preg_split("/[\s,]+/", preg_replace("/^\\\$info=/", "", $line), -1, PREG_SPLIT_NO_EMPTY);

        // build a new $game object from history.dat
        $game = new stdClass;

        // skip to "$bio"
        while (!feof($fp) && !preg_match("/^\\\$bio/", $line))
            $line = fgets($fp);
        if (feof($fp)) break;

        // first lines after the bio line:
        // get title, publisher and release date
        $line = fgets($fp); // blank
        $line = fgets($fp); // title
        $game->title = sss(rtrim(preg_replace("/\(c\).*/", "", $line)));
        $game->publisher = rtrim(rtrim(preg_replace("/.*\(c\).*[0-9] */", "", $line)), '.');

        $relyear = preg_replace("|.*\(c\).*[ /]([1-2][0-9][0-9?][0-9?])[\. ].*|", "\\1", $line);
        if ($relyear == $line)
            $relyear = preg_replace("|.*\(c\).*[ /]([1-2][0-9][0-9?][0-9?])$|", "\\1", $line);
        $relmonth = preg_replace("|.*\(c\).* ([0-9][0-9])/([1-2][0-9][0-9][0-9]) .*|", "\\1", $line);
        if ($relmonth == $line) {
            //echo "bad rel month [$line]\n";
            $relmonth = '01';   // no month given, default to January
        }
        if ($relyear == $line) {
            echo "bad rel year [".rtrim($line)."]\n";
        } else {
            $game->releasedate = rtrim($relyear).rtrim($relmonth).'01T000000';
        }

        $line = '';
        $bio = "";   // use just the rest of the bio as description
        $b = 0;
        while (!feof($fp) && !preg_match("/^\\\$end/", $line)) {
            // just the first 10 lines will do
            if ($b++ < 10) $bio .= $line;
            $line = fgets($fp);
            //if(!isset($game->gameid))
            //    $game->gameid = get_attr('Game ID', $line);
            if (!isset($game->players))
                $game->players = get_attr('Players', $line);
            // only store a programmer once a matching line is found;
            // storing an empty string here would block all later matches
            if (!isset($game->programmer)) {
                $programmer = get_attr('Programmer', $line);
                if ($programmer === null)
                    $programmer = get_attr('Programmers', $line);
                if ($programmer !== null)
                    $game->programmer = sss($programmer);
            }
        }
        $game->bio = trim(sss($bio));

        // find the first image in the set (parent or clone);
        // it is used for any clone that has no image of its own
        for ($i = 0; $i < count($zips); $i++) {
            if (!isset($game->image) && file_exists($snappath."/".$zips[$i].".png"))
                $game->image = $snappath."/".$zips[$i].".png";
        }

        // add a game entry to the history array for each zip
        for ($i = 0; $i < count($zips); $i++) {
            if (!$zips[$i]) continue;
            $g = clone $game;
            $g->clone = ($i > 0);
            // let's not store the bio for clones
            if ($g->clone) {
                unset($g->bio);
            }
            // override the set's image if a specific matching
            // clone image exists (uses $snappath, not a fixed ./images)
            if (file_exists($snappath."/".$zips[$i].".png"))
                $g->image = $snappath."/".$zips[$i].".png";
            if (preg_match('/\?\?/', $game->title))
                $g->title = $zips[$i];
            elseif ($i > 0)
                $g->title .= ' ('.$zips[$i].')';
            $history[$zips[$i]] = $g;
        }
        //print "loaded ".$game->title."\n";
        $line = fgets($fp);
    }
    fclose($fp);
}

function merge_history($game) {
    global $u, $history;

    // $u records whether this entry was changed
    $u = false;

    $name = basename((string)$game->path, ".zip");
    if (!isset($history[$name])) {
        print "not found ".$name."\n";
        return;
    }
    $gamehist = $history[$name];

    if ($game->name != $gamehist->title) { $game->name = $gamehist->title; $u = true; }
    // clones carry no bio, so only parents get a description
    if (!$gamehist->clone && $game->desc != $gamehist->bio) { $game->desc = $gamehist->bio; $u = true; }

    // update if missing or different; never overwrite with data
    // that history.dat did not provide
    if ($game->publisher != $gamehist->publisher) { $game->publisher = $gamehist->publisher; $u = true; }
    if (isset($gamehist->releasedate) && $game->releasedate != $gamehist->releasedate) { $game->releasedate = $gamehist->releasedate; $u = true; }
    if (isset($gamehist->players) && $game->players != $gamehist->players) { $game->players = $gamehist->players; $u = true; }
    if (isset($gamehist->programmer) && !$game->developer) { $game->developer = $gamehist->programmer; $u = true; }
    if (isset($gamehist->image) && $game->image != $gamehist->image) { $game->image = $gamehist->image; $u = true; }

    if ($u) {
        echo "-------------------------\n";
        //echo " Game ID = " . $gamehist->gameid . "\n";
        echo " Title = ".$gamehist->title."\n";
        echo " Publisher = ".$gamehist->publisher."\n";
        echo " Developer = ".(isset($gamehist->programmer) ? $gamehist->programmer : '')."\n";
        echo " Released = ".(isset($gamehist->releasedate) ? $gamehist->releasedate : '')."\n";
        echo " Players = ".(isset($gamehist->players) ? $gamehist->players : '')."\n";
        echo "-------------------------\n";
    }
}

// first build our history.dat file into a quickly accessible
// array of games (indexed by rom name)
scrape_history();

// then go through the existing gamelist.xml file;
// for each game that still exists,
// merge any missing information gathered from the history.dat file
$c = 0;
for ($i = 0; $i < count($gameslist->game); $i++) {
    $u = false;
    $game = $gameslist->game[$i];
    $name = basename((string)$game->path, ".zip");
    $games[$name.".zip"] = $i;
    echo "$i, $c) ".$name." : ".$game->name;
    // if the rom exists then merge any better data from history.dat
    if (isset($game->path) && file_exists((string)$game->path)) {
        merge_history($game);
        if ($u) {
            echo " - updated";
            $c++;
            // save every 50 games
            if (($c % 50) == 0) {
                echo "\nSaving ...";
                $gameslist->asXML($listpath);
            }
        }
    }
    echo "\n";
}

// we've checked whatever was in gamelist.xml;
// now look for any new roms in the rom directory
if ($dh = opendir($path)) {
    while (($file = readdir($dh)) !== false) {
        $i++;
        // match only names that end in .zip
        if (preg_match('/\.zip$/', $file) && !isset($games[$file])) {
            echo "$i, $c) insert : filename = ".$file."\n";
            $game = $gameslist->addChild('game');
            //$game->addChild('path', $path.'/'.$file);
            $game->addChild('path', './'.$file);
            merge_history($game);
            $c++;
            // save every 50
            if (($c % 50) == 0) {
                echo "Saving ...\n";
                $gameslist->asXML($listpath);
            }
        }
    }
    closedir($dh);
}

// final save, we're done
$gameslist->asXML($listpath);
?>
-
Still need to check out the PHP script; haven't gotten a free moment yet.
As an aside to parsing the XMLs, since I haven't really got the greatest scripting skills, here's a quick hack: download the sheet as a CSV, upload that CSV to http://www.convertcsv.com/csv-to-xml.htm, tweak a few settings, and it pukes out a functional gamelist.xml.
Once the spreadsheets are filled, should be simple enough to generate :)
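The same CSV-to-XML conversion can also be done offline. Below is a rough Python sketch, assuming the sheet's column headers already match the gamelist.xml tag names (path, name, publisher and so on); that naming, the function `csv_to_gamelist` and the sample rows are assumptions for the example, not the layout of the actual sheet.

```python
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_gamelist(csv_text):
    """Turn CSV rows into a <gameList> document, one <game> per row."""
    root = ET.Element("gameList")
    for row in csv.DictReader(io.StringIO(csv_text)):
        game = ET.SubElement(root, "game")
        for column, value in row.items():
            if column and value:  # skip empty cells and stray extra fields
                ET.SubElement(game, column).text = value
    return ET.tostring(root, encoding="unicode")

# Made-up sample rows; the second row has no publisher.
sample = (
    "path,name,publisher\n"
    "./pacman.zip,Pac-Man,Namco\n"
    "./galaga.zip,Galaga,\n"
)
print(csv_to_gamelist(sample))
```

Empty cells are skipped, so a game with a blank publisher simply has no publisher element.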
-
Very interesting web tool, will try it; thanks !
-
@herb_fargus @caver01
Wrote some VBA code in Excel to parse History.dat (0.182) and extract the descriptions of the "official" MAME 0.78 games.
Here's the resulting CSV file; the fields are: MAME 0.78 rom name, latest MAME name, game title and description. Total: 4714 roms. It should now be quite simple to integrate your Google sheet with this info (by VLOOKUP?) and make a "global" table to generate the gamelist.
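Outside a spreadsheet, the VLOOKUP step amounts to a keyed join on the rom name. A minimal Python sketch follows, with hypothetical field names ("rom", "description") and made-up rows standing in for the Google sheet and the CSV:

```python
def merge_by_rom(sheet_rows, history_rows):
    """Attach a description to each sheet row by looking up its rom name."""
    descriptions = {row["rom"]: row["description"] for row in history_rows}
    merged = []
    for row in sheet_rows:
        combined = dict(row)
        # Like VLOOKUP with a fallback: empty string when no match exists.
        combined["description"] = descriptions.get(row["rom"], "")
        merged.append(combined)
    return merged

# Made-up rows for illustration only.
sheet = [{"rom": "gamea", "genre": "Maze"}, {"rom": "gameb", "genre": "Shooter"}]
history = [{"rom": "gamea", "description": "A short description."}]
for row in merge_by_rom(sheet, history):
    print(row)
```

Rows without a match keep an empty description, which makes the gaps easy to spot later.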
-
@herb_fargus if useful, there is now a 0.78 catver.ini in the mame2003 repository: https://github.com/libretro/mame2003-libretro/tree/master/metadata
One potentially nice thing about this is that corrections or updates can be made to the catver.ini in the libretro repository and the changes could percolate down to frontends and users of the core. I can see this RetroPie effort helping 'upstream' improvements.
Speaking of which, it may also be possible to add a history.dat to the mame2003 repository. history.dat is the only other of the major MAME metadata files that I haven't pursued yet in my quest to add them to the various mame repos.
@UDb23 that's some nice work! (I recently created a simple AHK script to regenerate catver.ini files from one version to another if you ever need it)
Two questions:
- could your VBA be modified to produce a properly-formatted history.dat that we might try to add to the mame 2003 repository?
- do you have a way to list the titles that are missing records in your CSV so that they could be completed or at least tracked?
-
@markwkidd Brilliant work. How did you handle the file renaming changes through the versions? or did you use the hashes from the dat?
Also the retropie wiki has 4705 games listed but your list is 4723 (wiki could be wrong I'll have to look over the dats again)
-
@markwkidd Basically it starts from a list of roms and finds the corresponding descriptions in History.dat.
So if you need to extract additional or different roms it is very simple: just adding them to the initial list.
Note: History.dat contains current MAME rom names (vs. 0.78 names); therefore the initial list must have both the 0.78 rom name and the current (0.182) rom name (the latter is the one used to search History.dat's content).
What exactly is the "properly formatted" structure you'd like History.dat to be "converted" to?
Please note that it contains a huge amount of text. The descriptions are already very long.
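The two-name lookup described above could look roughly like this in Python (the actual code is VBA and is not shown in the thread, so this is only a sketch of the idea). The rom names are placeholders, and `describe_078_roms` is a made-up helper.

```python
def describe_078_roms(rom_pairs, history):
    """Look up descriptions for 0.78 roms via their current names.

    rom_pairs: list of (name_078, name_current) tuples.
    history:   dict mapping current rom name -> description.
    Returns (found, missing): descriptions keyed by 0.78 name, and the
    0.78 names for which History.dat had no record.
    """
    found, missing = {}, []
    for old_name, current_name in rom_pairs:
        if current_name in history:
            found[old_name] = history[current_name]
        else:
            missing.append(old_name)
    return found, missing

# Placeholder names, not real MAME renames.
pairs = [("oldname1", "newname1"), ("same2", "same2"), ("oldname3", "newname3")]
history = {"newname1": "First description.", "same2": "Second description."}
found, missing = describe_078_roms(pairs, history)
print(found)
print(missing)
```

The `missing` list doubles as a way to track which titles still lack a record.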
I'm currently fixing a minor issue with CR/LF that is not correctly saved in my CSV.
-
@markwkidd To handle the name changes I wrote some other code some months ago based on renameset.dat by Antopisa.
-
@UDb23 Game names and ZIP filenames commonly changed over time in the MAME sets. A game may have one filename and title in MAME 0.182 and a completely different one in MAME 0.78.
The .dat formats also changed throughout the years. For example, catver.ini and cheats.dat each went through at least three format changes between MAME 0.37b5 and now. Because I haven't been able to find an old history.dat from that era yet, I'm not sure if a modern history.dat can be read by the mame2003 core itself. I'll look into that though!
@herb_fargus you may be giving me credit for UDb23's work -- they are the one who created the awesome history.dat CSV ! : )
-
@UDb23 said in Manually generate Mame2003 gamelist?:
@markwkidd To handle the name changes I wrote some other code some months ago based on renameset.dat by Antopisa.
Nevermind my thoughts on name changes. Nice!
-
@herb_fargus said in Manually generate Mame2003 gamelist?:
@markwkidd Brilliant work. How did you handle the file renaming changes through the versions? or did you use the hashes from the dat?
Also the retropie wiki has 4705 games listed but your list is 4723 (wiki could be wrong I'll have to look over the dats again)
There should be a total of 4720 ZIP files in a MAME 0.78 set, of which 15 are BIOS. This is true of a set I have access to which scans as complete in ClrMamePro.
-
@markwkidd Ah, that's right: when I listed it I removed the BIOS files because I didn't consider them games. OK, sure. I'll need to look them both over and compare against the name changes just to verify it all.
@UDb23 looks great so far.
-
@herb_fargus Here's an improved version of the csv with the descriptions.
Basically, I got rid of all those leading (and trailing) non-readable characters (CR/LF etc.) that were present in the descriptions in History.dat.
BTW: I wonder how they came up with that "unstructured" history.dat file format. It's a little bit tricky to extract the descriptions without specific delimiters. ;-)
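For reference, here is a small Python sketch of that kind of extraction. It assumes the record layout the PHP script earlier in the thread relies on (a `$info=` line with comma-separated rom names, a `$bio` line, free text, then `$end`) and strips the leading and trailing blank lines and CR/LF from each description. `parse_history` and the sample record are made up.

```python
def parse_history(text):
    """Map each rom name to its description, with outer whitespace removed."""
    entries = {}
    names, bio_lines, in_bio = [], [], False
    for raw in text.splitlines():          # splitlines() handles CR/LF and LF
        line = raw.rstrip()
        if line.startswith("$info="):
            names = [n.strip() for n in line[len("$info="):].split(",") if n.strip()]
        elif line.startswith("$bio"):
            in_bio, bio_lines = True, []
        elif line.startswith("$end"):
            if in_bio:
                bio = "\n".join(bio_lines).strip()
                for name in names:
                    entries[name] = bio
            names, in_bio = [], False
        elif in_bio:
            bio_lines.append(line)
    return entries

# Made-up record in the assumed layout, with Windows line endings.
sample = (
    "$info=gamea,gameb,\r\n$bio\r\n\r\n"
    "Game A (c) 1985 Example Co.\r\n\r\n"
    "A short description.\r\n\r\n$end\r\n"
)
print(parse_history(sample))
```

Since nothing inside the bio text is delimited, anything finer-grained (title, year, publisher) still has to be picked out with pattern matching, as the PHP script does.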
-
Concerning number of ROMS:
my list includes 4714, so if ClrMamePro states 4720 it should be easy to find the 6 missing ones by comparing. Or maybe I just missed some of the BIOS files to be removed (actual game roms should be 4705 according to @markwkidd ).
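The comparison itself is a pair of set differences. A minimal Python sketch, using tiny made-up name lists in place of the real ClrMamePro listing and the CSV (with the thread's figures, 4720 ZIPs minus 15 BIOS files leaves 4705 games):

```python
def compare_romsets(expected, listed, bios=()):
    """Compare a full set listing against a rom list.

    Returns (missing, extra): games absent from the list, and list entries
    that are not games (for example BIOS files that were not removed).
    """
    expected_games = set(expected) - set(bios)
    listed = set(listed)
    missing = sorted(expected_games - listed)
    extra = sorted(listed - expected_games)
    return missing, extra

# Made-up names for illustration only.
full_set = ["bios1", "gamea", "gameb", "gamec"]
csv_list = ["bios1", "gamea", "gamec"]
missing, extra = compare_romsets(full_set, csv_list, bios=["bios1"])
print(missing)
print(extra)
```

Both lists are useful here: one shows games still to be added, the other shows BIOS entries still to be dropped.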