What is the maximum number of ROMs, combined across all systems, that it can handle before startup time and/or crashes become a problem?

I haven't done any extensive testing, but I recall that a while ago a forum member tried to load more than 100k games and ES crashed due to insufficient memory. That was on an RPI 3, which had somewhat less than 800 MB of RAM available (the rest being taken by the GPU memory).

On an RPI 4, with more RAM available, things might be better. Of course, if you're not using an SBC, RAM may not be an issue.

Is it really necessary to go through the checks? E.g., if I symlink within a system to a specific folder on a large HDD, could it build a list only when I go into that particular folder?

Symlinking won't help much. You can already mount your ROM folders from a USB drive, and this would speed things up for the initial scan.

I suppose if I built a massive gamelist.xml and did not parse, would this help, or would it be just the same, as that's the way ES works?

It would help with the initial loading. Otherwise you'll need to wait for ES to re-scan the folders on start, and depending on how many games you have, that can take a while (not only on a Pi).
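
For anyone who wants to try the pre-built list route, here is a rough sketch of generating one with Python. It is not an official tool, just an illustration. The function names, the extension list and the example paths are my own assumptions, and it writes only the basic `path` and `name` entries for each game, with no scraped metadata. ES also has to be set up to rely on the list rather than re-scanning, which this script does not touch.

```python
import os
import xml.etree.ElementTree as ET


def build_gamelist(rom_dir, extensions=(".zip", ".7z")):
    """Return a <gameList> element with one <game> per matching file.

    The extension tuple is an assumption; adjust it to whatever the
    system in question actually uses.
    """
    root = ET.Element("gameList")
    for dirpath, dirnames, filenames in os.walk(rom_dir):
        dirnames.sort()  # walk sub-folders in a stable order
        for filename in sorted(filenames):
            if not filename.lower().endswith(extensions):
                continue
            full_path = os.path.join(dirpath, filename)
            # Store the path relative to the ROM folder, with forward slashes.
            rel_path = os.path.relpath(full_path, rom_dir).replace(os.sep, "/")
            game = ET.SubElement(root, "game")
            ET.SubElement(game, "path").text = "./" + rel_path
            # Use the file name without its extension as the display name.
            ET.SubElement(game, "name").text = os.path.splitext(filename)[0]
    return root


def write_gamelist(rom_dir, out_path, extensions=(".zip", ".7z")):
    """Write the generated list to out_path and return the number of games."""
    root = build_gamelist(rom_dir, extensions)
    ET.ElementTree(root).write(out_path, encoding="utf-8", xml_declaration=True)
    return len(root)
```

Usage would be something like `write_gamelist("/path/to/roms/snes", "/path/to/roms/snes/gamelist.xml")`, with both paths being placeholders for your own setup. Back up any existing gamelist.xml first, since this overwrites the file.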