Making premade gamelist.xml with xtra media collections
-
So... I'm going to have to change some of the code above. When checking through all of the games, I noticed that 3 games now had no body text at all. It turns out that this was because I hadn't put a blank line between the single paragraph entry and the citation URL, and the code tells the script to skip the URL line as well as the line above and below it.
So this pointed out all 3 entries I had out of 2,116 that had this problem, right? Wrong. :(
This only told me which ones had this typo that were single paragraph entries. Without opening every document and verifying this is correct on all files, I really have no way of knowing if this was a problem on other entries that had more than one paragraph. I only caught these because they were single paragraph entries and there was no body text at all in EmulationStation.
So I'm just going to change that code to delete the URL citation line and the line below it. It doesn't hurt anything to have a kilobyte or two of extra blank lines at the end of all the collective game entries in the gamelist.xml just to be on the safe side in case I made this typo in other spots. :)
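A minimal sketch of that interim change, assuming citation lines always start with "http". The helper name strip_citation is made up for illustration and is not the code from the script. It drops each URL line plus the one line after it, and leaves the line above alone, so a paragraph that sits directly on top of the URL survives:

```shell
# strip_citation: hypothetical helper name (not from the original script).
# Drops every line that starts with "http" together with the single line
# that follows it. The line above the URL is always kept.
strip_citation() {
    awk '/^http/ { skip = 2 } skip > 0 { skip--; next } { print }' "$1"
}

# Example call (the file name is a placeholder):
# strip_citation "Some Game.txt"
```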
-
I'm really starting to figure this stuff out now. It took me several hours, but through a lot of trial and error I've completely rewritten the code for scraping the <desc> and <url> / <urlalt> tags, and I believe they now work perfectly. :)
I'll post the revised code here, along with the internal notes I keep inside the file that explain what everything is doing, in case I look at all of this a few months from now and have forgotten everything I've done.
# desc : the content below "______", stopping before URL(s) / Usable element in RetroPie
# NO_DESC_FLAG - If the "--no-desc" argument was used, the <desc>, <url>, and <urlalt> tags won't be filled.
if [[ "$NO_DESC_FLAG" == 0 ]]; then
    desc="$(sed '/^__________/,$!d' "$file" | tail -n +2 | tr -d '\r' | sed 's/&/&amp;/g' | grep -v ^http | sed '/^$/d' | sed '/^$/d;G')"
    # 1. sed '/^__________/,$!d' "$file" = Deletes everything before the first line that starts with 10 "_" characters
    #    and passes the rest down the pipe to the steps that follow.
    # 2. tail -n +2 = Starts copying text from the 1st line after the "_" line above. If that line is blank, it will be deleted by #6.
    # 3. tr -d '\r' = Strips carriage returns, so Windows (CR LF) line breaks become Linux (LF) line breaks.
    # 4. sed 's/&/&amp;/g' = Converts "&" into "&amp;" so the text is valid inside the XML.
    # 5. grep -v ^http = Deletes any lines that begin with a URL (http and/or https).
    #    Won't delete URLs in the middle of a line.
    # 6. sed '/^$/d' = Deletes all blank lines in the text that's being copied over.
    # 7. sed '/^$/d;G' = Double spaces the body output. (Adds uniform double spacing after all blank lines were removed.)
fi

# url : Synopsis cited link #1 - Citation link for info in all synopsis.txt files / Not usable in RetroPie (05/29/2018)
if [[ "$NO_DESC_FLAG" == 0 ]]; then
    url="$(sed '/^__________/,$!d' "$file" | grep -m1 ^http | tr -d '\r' | sed 's/&/&amp;/g')"
    # 1. sed '/^__________/,$!d' "$file" = Same as #1 above.
    # 2. grep -m1 ^http = Finds and writes only the first URL line. "-m1" stops grep after the first instance found.
fi

# urlalt: Synopsis cited link #2 (Typically only for Hacks and Translations) / Not usable in RetroPie (05/29/2018)
if [[ "$NO_DESC_FLAG" == 0 ]]; then
    urlalt="$(sed '/^__________/,$!d' "$file" | awk ' f && NR==f+1; /^http/ {f=NR}' | tr -d '\r' | sed 's/&/&amp;/g')"
    # 1. sed '/^__________/,$!d' "$file" = Same as #1 above.
    # 2. awk ' f && NR==f+1; /^http/ {f=NR}' = Searches for "http[s]" and writes the following line, even if it also begins with "http[s]",
    #    which in this case it always will if it exists.
fi
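To see what these three pipelines produce, here is a small sketch run against a made-up synopsis file. The sample text and the example.com URLs are hypothetical, and the sketch assumes the fourth step is meant to write the XML entity &amp;. In a sed replacement a bare & stands for the matched text, so 's/&/&amp;/g' does turn "&" into "&amp;".

```shell
# Hypothetical sample synopsis file with Windows (CR LF) line endings,
# made up only for this demonstration.
file="$(mktemp)"
printf '%s\r\n' \
    'Players: 2' \
    '__________' \
    '' \
    'A tale of Dungeons & Dragons.' \
    '' \
    'A second paragraph.' \
    'http://example.com/first' \
    'https://example.com/second' > "$file"

# Body text: everything after the "_" line, minus URL lines, double spaced.
desc="$(sed '/^__________/,$!d' "$file" | tail -n +2 | tr -d '\r' | sed 's/&/&amp;/g' | grep -v ^http | sed '/^$/d' | sed '/^$/d;G')"
# First URL line only.
url="$(sed '/^__________/,$!d' "$file" | grep -m1 ^http | tr -d '\r' | sed 's/&/&amp;/g')"
# The line directly after a URL line, which is the second citation.
urlalt="$(sed '/^__________/,$!d' "$file" | awk 'f && NR==f+1; /^http/ {f=NR}' | tr -d '\r' | sed 's/&/&amp;/g')"

printf '<desc>%s</desc>\n<url>%s</url>\n<urlalt>%s</urlalt>\n' "$desc" "$url" "$urlalt"
rm -f "$file"
```

The trailing blank line that the last sed adds is dropped by the command substitution, so the description ends right after its final paragraph.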
I've tested this on many different examples of my synopsis.txt files that cover a vast array of possibilities, and it doesn't seem to have any negative effects in any instance. Some more testing is required after a full re-run of the gamelist.xml, but I do believe that I've "perfected" the <desc> code and what I've been looking for here. Now there will be zero extra blank lines where they aren't wanted, and there should be enough versatility in this code to allow for quite a few accidental variances in my synopsis.txt files that would be fixed when run through this script.
Next step is to try to figure out the reverse code that it looks like Meleu had actually added, or at least started to add. I had no idea that it was already in there. I'm hoping that it was working or was really close to working and I can figure it out from there. :)
-
The Reversal code to convert the gamelist.xml to [synopsis].txt is working great now!
I had to add a bunch of tags and tweak a lot of the output, but I can now create an exact duplicate of any inputted [synopsis].txt file by running the Reverse code after the gamelist.xml is generated!
Actually, because of the code I put in both the regular script and the Reverse code, running the files through both of these processes can clean up things like leading blank lines in the descriptions, missing blank lines between the description and the citation URLs, and any accidental extra lines in between paragraphs. After running them back and forth they will all be completely standardized, which is great news for the XBox since it displays the txt files exactly as they are written.
Even better, I figured out how to create a Reverse/[system]/ directory and write them there to save on clutter. One day, when everything is completely tested and sure to work, I will just direct these files to be written over the old [synopsis].txt files that were used to generate the gamelist.xml in the first place, but at this point I haven't tested it enough to feel comfortable with that idea.
My big hurdle right now is trying to figure out how to generate [synopsis].txt files for the Folders. This is something that @meleu didn't figure out, and he even had notes in the script saying:
TODO: not sure what to do if the gamelist.xml has a <folder> entry.
With all I've done so far, I'm not confident I can figure this one out if Meleu didn't.
It's not imperative since there are only about 30 of them for the NES, and that's probably the system I have with the most sub-directories/rom categories. Some systems only have 5 or 6 of them. If I have to end up doing these manually, that won't be all that big a deal considering that in this case it's only about 30 files in nearly 2,150. It would just mean that I'd have to edit the txt files individually rather than editing them in the spreadsheet, which I actually might decide is the preferred method when doing any edits to the <desc> field anyhow.
I'm thinking that the spreadsheet editing process is really only going to be extremely beneficial for fields like <genre> and <publisher> and such, where I can alphabetize the results and fix anything that is a typo or doesn't conform to whatever standard(s) I decide on at that time.
-
Anybody here wouldn't know of a magical command that could change every instance of &amp; to & when I'm trying to reverse back to text files from the gamelist.xml, would they?
All 46 entries that had an & in the file name were ignored in the reversal process, so their [synopsis].txt files were not created at all.
Also, I can't seem to figure out how to convert all cases of &amp; back to plain & in this code either. The reversal code seems to rely almost exclusively on xmlstarlet, which seems to be much different than what I've been learning elsewhere in the code, and I'm not coming up with any searches online that are giving me any answers to these two questions.
-
@used2berx Do you have an example? I think xmlstarlet should be the first choice when extracting data from XML; the &amp; thing is one of the XML escape sequences that can possibly appear in a raw CDATA text section.
-
@mitu Hey mitu.
I did figure it out. It was xmlstarlet too.
@meleu had 6 lines that used the xmlstarlet command in the reverse code.
I spent quite a few hours here banging my head against the wall. I finally decided to try to create a copy of the gamelist.xml, replace all instances of &amp; with &, run the reverse code, and then delete the file and rename the copied file to the original name.
That worked out well until it didn't know what to do with the bare & characters anymore (a bare & is not valid XML) and I totally broke the code.
It turns out the solution was really simple. I just had to add -T to the xmlstarlet sel to force it to output the &amp; as text.
Example:
IFS=$'\n' names=($(xmlstarlet sel -t -v "/gameList/game/xtrasname" "$GAMELIST"))
Became:
IFS=$'\n' names=($(xmlstarlet sel -T -t -v "/gameList/game/xtrasname" "$GAMELIST"))
Putting this in all 6 instances solved both of my problems. Now it will write files with the & when it needs to, and it also converts every instance of &amp; to & from the gamelist.xml to the [synopsis].txt files. :)
It's going to take a few more days to re-run the reverse code for all the files now, but I've been occasionally checking a few things as it's going and so far, so good.
Thanks
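For text that has already been pulled out of the XML with the entities still in place, the same conversion can be sketched with plain sed. This is only a fallback idea, not a replacement for the -T switch: the function name unescape_xml_text is made up for illustration, and it only handles &amp;, &lt; and &gt;.

```shell
# unescape_xml_text: hypothetical helper (not part of the reversal script).
# Reads text on stdin and turns the three most common XML entities back
# into plain characters. &amp; is handled last so that "&amp;lt;" becomes
# the literal text "&lt;" rather than "<".
unescape_xml_text() {
    sed 's/&lt;/</g; s/&gt;/>/g; s/&amp;/\&/g'
}

# Example:
# printf '%s\n' 'AD&amp;D Pool of Radiance' | unescape_xml_text
```

The backslash in the last replacement matters: in sed, an unescaped & on the replacement side means "the text that was matched", so \& is needed to write a literal ampersand.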
-
@mitu Well.... nope....
It didn't work. My testing was flawed.
Although it seems to fix all written instances of the &amp; problem, it's still ignoring the files themselves. I'm assuming it must have something to do with the path variable that states where the file is located. I didn't think to test that when I thought everything was good. As an example, here's the path for one of the D&D games:
<path>/home/pi/RetroPie/roms/nes/(1) Licensed/(1_1) US Licensed/AD&amp;D Pool of Radiance.zip</path>
I'm not concerned with any other occurrence of this when it's pathing the images and videos and such. The reversal code is just writing information back to an individual text file per game that doesn't need or display any pathing info. The only reason I need this to be read correctly now is to get these synopsis.txt files generated.
I'm actually not even sure that this is the problem. I'm just spitballing right now.
-
Nevermind. I got it. :)
This pointed out an earlier flaw in the code that I wasn't even aware of since it didn't prevent creation of any entries in the gamelist.xml or any of the tags that RetroPie uses.
I have a new tag on there called <xtrasname>, which is the file name. It turns out that the way I had it coded, this tag wasn't being filled out for any game entry that had an & in the file name when the gamelist.xml was being generated. The reversal code was re-written to rely on that data to start the new [synopsis].txt generation, so if that field is blank then it will just skip to the next entry. That's why all 46 games with an & in the filename weren't being created.
I had to change the following code:
xtrasname="${file_name%.*}"
to read as this instead:
xtrasname="${file_name//&/&amp;}"
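A minimal sketch of escaping the & in a file name before it goes into <xtrasname>, assuming the tag should hold the name without its extension and with "&" written as the XML entity "&amp;". The helper name make_xtrasname is made up for illustration. It uses sed rather than the bash-only ${var//...} expansion so it behaves the same in any POSIX shell.

```shell
# make_xtrasname: hypothetical helper (not from the original script).
# Strips the last extension from a file name and writes "&" as "&amp;".
make_xtrasname() {
    printf '%s\n' "${1%.*}" | sed 's/&/\&amp;/g'
}

# Example:
# xtrasname="$(make_xtrasname 'AD&D Pool of Radiance.zip')"
```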
I'm going to re-run the gamelist.xml now, and when it's done I'll start the reversal code. If we end up with 2,116 files when it's done, then this all finally worked. :)
-
Awesome. All 2,116 synopsis files were successfully created with the reversal script!
I checked about 100-150 of them and they all look good so far. In fact, only 3 of them had any differences at all in Diffchecker from the original before the gamelist.xml was created. :)
I still want to check all of them to see if there are any examples of bad things happening, but the 4 differences that were found in the 3 files were all desired. They were the removal of an extra blank space at the end of a file, the capitalization of two tags that weren't capitalized, and the removal of a tag that was blank in the original because it was unused for that game.
Rather than check all 2,116 of them in Diffchecker which was taking forever and not really worth it because most of them were identical, I'm going to get a folder matching program and only test the ones that come up with any differences. I've got a good feeling that everything is perfect this time.
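One way to do that kind of folder matching without extra software is diff itself. This is a minimal sketch, assuming the original and regenerated files sit in two directory trees with matching file names. The helper name list_changed and the example paths are made up for illustration.

```shell
# list_changed: hypothetical helper. Prints one line for each file that
# differs between two directory trees (or exists in only one of them),
# and prints nothing for files that are identical.
list_changed() {
    diff -rq "$1" "$2"
}

# Example (placeholder paths):
# list_changed "synopsis/nes" "Reverse/nes"
```

Only the files named in the output then need a side-by-side look.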
Man.... this is going to save so much time in the future once I make a script that will automatically edit bad characters and other misc things that I may have missed when I put these things together. I spend so much time trying to make them perfect by hand, but there are still problems. Now I won't have to go back in and continually edit all of them manually anymore.
I should be ready to try out your script to convert the gamelist.xml to a spreadsheet. Your code will already convert it back when you're done editing?
Once I'm able to do that part, I can seriously go in and start editing fields like Genre to make sure that everything is represented properly. :)
-
@used2berx said in Making premade gamelist.xml with xtra media collections:
I should be ready to try out your script to convert the gamelist.xml to a spreadsheet. Your code will already convert it back when you're done editing?
No, my script only exports the gamelist.xml to Excel. To go back, you can export directly from Excel.
In fact, I think you can ditch my script altogether and import the XML directly into Excel, modify it, then export it back. Excel will generate an 'XML Schema' based on your document structure, when importing the data, then export it using the same Schema back to an XML file. See an example at
If you'd like me to take a look, paste a gamelist.xml file with just 1 entry and I'll try to import it/export it directly with Excel.
-
@mitu Awesome man. I'll check out that link before I try anything.
Here's an example of the gamelist.xml that does a fairly good job of showing almost everything. It's a translated game, so there are more fields filled out than a standard official release. Hacks would use some alternate tags. All of the tags that I currently have represented in any NES/FDS synopsis are included here though, even if they're not being utilized by this game. None of the games use all of the tags.
<game>
    <xtrasname>100 World Story</xtrasname>
    <path>/home/pi/RetroPie/roms/nes/(2) Translated/100 World Story.zip</path>
    <boxfront>/home/pi/RetroPie/Media/nes/Artwork/Box Front/100 World Story.jpg</boxfront>
    <cart>/home/pi/RetroPie/Media/nes/Artwork/Cart/100 World Story.png</cart>
    <title>/home/pi/RetroPie/Media/nes/Artwork/Titles/100 World Story.png</title>
    <action>/home/pi/RetroPie/Media/nes/Artwork/Action/100 World Story.png</action>
    <threedbox/>
    <xtrasmarquee/>
    <gamefaq/>
    <manual/>
    <vgmap/>
    <video/>
    <image>/home/pi/RetroPie/Media/nes/Artwork/Box Front/100 World Story.jpg</image>
    <marquee/>
    <name>100 World Story: Tales on a Watery Wilderness</name>
    <alternatetitle/>
    <originaltitle>Hyaku no Sekai no Monogatari: The Tales on a Watery Wilderness</originaltitle>
    <platform>nes</platform>
    <xtrasplatform>Nintendo Entertainment System</xtrasplatform>
    <region>Japan</region>
    <media>Cartridge</media>
    <controller>NES Gamepad</controller>
    <genre>Strategy / Tactics</genre>
    <gametype>Translated</gametype>
    <releasedate>19910101T000000</releasedate>
    <xtrasreleaseyear>1991</xtrasreleaseyear>
    <translationreleaseyear>2007</translationreleaseyear>
    <hackreleaseyear/>
    <developer>ASK</developer>
    <publisher>ASK</publisher>
    <license/>
    <programmer/>
    <musician/>
    <translatedby>AlanMidas</translatedby>
    <hackedby/>
    <version>BS.VQt</version>
    <players>4</players>
    <xtrasplayers>1 to 4 VS</xtrasplayers>
    <desc>This Famicom title by ASK combines the mechanics of a tabletop board game with the battle style and character progression of a role-playing game to create an interesting hybrid.
One to two players can embark on a great adventure to explore the realm of Yukiria and complete quests for its inhabitants, with multiple endings possible!</desc>
    <url>http://www.romhacking.net/trans/85/</url>
    <urlalt>https://gamesdb.launchbox-app.com/games/details/26139</urlalt>
</game>
Oh... and there is an option in @meleu's script that will only make the tags that are currently usable in RetroPie. I might have to tweak a few things with it, but once I get everything "perfect" I intend to make a gamelist.xml for the RetroPie that isn't bogged down with all of these extra tags that are currently unusable by Pi users.
Thanks again! :)
-
Here is a full list of all of the synopsis fixes that were made by creating a gamelist.xml for the RetroPie and using the new Reversal script to create [synopsis].txt files usable in Madmab Edition emulators, as well as how many instances were fixed:
PAGE SETUP / FORMATTING:
- All Windows based (CR LF) line breaks replaced with Linux based (LF) line breaks: 1,228 (This is a document count. The actual number of instances fixed is likely in the 30,000 to 40,000 range)
- Added double spaces to lines that should have had them: 38
- Corrected established order for "Original Title: " and "Alternate Title: " lines: 29
- REMOVED extra blank line in between Description and first URL: 1
- Line had a line break in the middle where it shouldn't have been: 2
- Two extra lines after Description when there was no URL: 12
- Extra spaces at end of lines: 5
- Blank line with only a single space in it. The original code kept that line because of the space, as well as adding a Double Space before and after it, leaving 3 blank lines between the end of Description and the beginning of the first URL: 1
- Added "Blank Space" under "_____________" that notes the beginning of the Game Description text. (NOTE: This looks much better in the XBox [synopsis].txt format than not having it, but this extra line is removed in the [synopsis].txt to gamelist.xml conversion code because this leading blank space does not look good on the RetroPie.): 1
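Several of the page setup fixes above (CR LF line breaks, spaces at the end of lines, a "blank" line holding a single space, runs of extra blank lines) can be sketched as one small filter. This is only an illustration under assumptions, not the code that produced the counts above. The name normalize_synopsis is made up, and the sketch squeezes every run of blank lines down to one and drops blank lines at the very start and end of the file.

```shell
# normalize_synopsis: hypothetical helper (not the actual conversion code).
#   1. tr   removes carriage returns (CR LF becomes LF)
#   2. sed  strips spaces and tabs from the end of every line
#   3. awk  collapses runs of blank lines into one and drops blank lines
#           at the very top and bottom of the file
normalize_synopsis() {
    tr -d '\r' < "$1" |
        sed 's/[[:space:]]*$//' |
        awk 'NF { if (blank && seen) print ""; blank = 0; seen = 1; print; next }
             { blank = 1 }'
}

# Example (placeholder file names):
# normalize_synopsis "Some Game.txt" > "Some Game.clean.txt"
```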
GRAMMAR:
- "Original (T)itle" Capitalized: 10
- "Hacked (B)y" Capitalized: 16
- "Translated (B)y" Capitalized: 6
UNUSED FIELDS REMOVED:
- REMOVED blank "Alternate Title: " line: 56
- REMOVED blank "Original Title: " line: 41
BLANK FIELDS THAT SHOULD HAVE HAD INFO (Usually "Unknown" except for Players info):
- Missing "Players: ": 5
- Missing "Release Year: ": 19
- Missing "Hack Release Year: ": 27
- Missing "Translated Release Year: ": 1
- Missing "Hacked By: ": 1
- Missing "Version: ": 2
MISSING TAGS OR INCORRECT TAGS, BUT INFO WAS THERE:
- Missing "Players: " Tag, so line and data were lost: 4
- Missing ":" in "Version: ", so line and data were lost: 2
- "Release Year: " misspelled "Relase Year: ", so line and data were lost: 1
- "Release Year: " misspelled "Releaese Year: ", so line and data were lost: 1
- "Alternate Title: " mislabeled as "Alternative Title: ", so line and data were lost: 3
- "Translated By: " written as "Translated y: ", so line and data were lost: 1
- "Publisher: " written as "Publisher:" without space. Somehow code re-wrote this as "Publisher: Publisher: " after the reversal script. (Assumed code would mimic this behavior for any tag missing the space before the info): 1
URL TAG FIXES:
- URLs that started with "www." were ignored by all code and just showed up in the Description rather than the <url> or <urlalt> tags. (Fixed this so that no matter what order the two links are in, or which combination of "www.", "http", or "https" they use, they will both print in separate URL tags, in the order they appear in the [synopsis].txt.)
PROBABLY NOT GOING TO BE "FIXED":
On VERY rare occasion there are more than two links in a synopsis file (For instance, if multiple hacks were applied to a single file). The code currently adds any more than two to the <urlalt> tag with a line break in between them. Most likely will not put in additional tags for the 0.0002% of games that would have this extra link or two.
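The URL behavior described above can be sketched roughly as follows. This is a guess at the shape of the fix, not the actual code: the helper names are made up, and a citation line is assumed to be any line after the "_" line that starts with "http://", "https://" or "www.".

```shell
# Hypothetical helpers (not from the original script).
# citation_lines prints every citation line below the "_" line, in file order.
citation_lines() {
    sed '/^__________/,$!d' "$1" | tr -d '\r' | grep -E '^(https?://|www\.)'
}

# First citation, intended for <url>.
first_url() {
    citation_lines "$1" | sed -n '1p'
}

# Every remaining citation, one per line, intended for <urlalt>.
alt_urls() {
    citation_lines "$1" | tail -n +2
}
```

For the description, the same pattern would replace the plain ^http test, for example grep -vE '^(https?://|www\.)', so that "www." lines stop leaking into the body text.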
In the next week I will create a new gamelist.xml file and run the reversal code again and compare the new output. I expect after doing this a second time with some code changes that I made today that the input [synopsis].txt files should be a 100% exact match to the output [synopsis].txt files.
The above fixes should then be applied to any synopsis files for other systems I work on in the future. Also, in the future, rather than do the crazy amount of editing I did by hand that could be handled with code like this, I will be adding more things to the existing code to automate all of that process.
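As one example of that kind of automation, the mislabeled tags listed earlier ("Relase Year: ", "Publisher:" without the space, and so on) could be caught before a gamelist.xml is ever generated. This is a minimal sketch under assumptions: the name check_tags is made up, the tag list only contains the labels mentioned in this post (the real files use more), and every non-blank line above the "_" line is assumed to have the form "Tag: value".

```shell
# check_tags: hypothetical helper. Prints every header line (above the "_"
# line) whose label is not in the known list or lacks the ": " separator.
check_tags() {
    awk '
        BEGIN {
            # Incomplete list: only the labels named in this post.
            n = split("Original Title|Alternate Title|Players|Release Year|Hack Release Year|Translated Release Year|Hacked By|Translated By|Version|Publisher", t, "|")
            for (i = 1; i <= n; i++) known[t[i]] = 1
        }
        { sub(/\r$/, "") }          # tolerate CR LF line endings
        /^__________/ { exit }      # stop at the start of the description
        NF == 0 { next }            # skip blank lines
        {
            pos = index($0, ": ")
            tag = (pos > 0) ? substr($0, 1, pos - 1) : ""
            if (!(tag in known)) print FILENAME ": " $0
        }
    ' "$1"
}

# Example (placeholder file name):
# check_tags "Some Game.txt"
```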
Man... I'm glad I finally started learning how to do this for myself. :)