It actually looks like some of the performance issues have been resolved for a couple of shaders in later versions of RetroPie. But some seemingly simple, old filters still perform terribly, like 2xSaI.
What I like to use in RetroArch is the Simple2x CPU filter plus bilinear filtering. That way the pixels still look like squares, but they are not razor sharp.
In RetroPie the closest thing is the sharp-bilinear shaders, but those are way too sharp for my taste.
Personally I do not like the look of the "advanced" CRT shaders. The phosphor and grille effects just look too messy to me. I prefer a cleaner look.
So I am looking for a much simpler scanline filter. Not with black scanlines, though, but with lines at maybe 50% of normal brightness.
Some time ago I changed crt-pi to look like simple scanlines.
I changed these parameters:
#pragma parameter SCANLINE_GAP_BRIGHTNESS "Scanline gap brightness" 0.50 0.0 1.0 0.01
#define MASK_TYPE 0
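To show what I mean by "simple scanlines at 50%", here is a rough Python sketch of the idea. This is not the crt-pi code (the real shader does more than this); the function name `apply_scanlines` and the assumption that every second output row is the "gap" are just made up for illustration.

```python
def apply_scanlines(rows, gap_brightness=0.5):
    """Dim every second row of an image.

    rows: list of rows, each row a list of (r, g, b) floats in 0..1.
    gap_brightness: 0.0 gives black scanlines, 1.0 gives no visible scanlines.
    """
    out = []
    for y, row in enumerate(rows):
        # Assumption: odd rows are the scanline gaps, even rows stay at full brightness.
        factor = gap_brightness if y % 2 == 1 else 1.0
        out.append([tuple(c * factor for c in px) for px in row])
    return out


# Tiny 2x4 test image filled with a single orange-ish colour.
image = [[(1.0, 0.5, 0.0)] * 2 for _ in range(4)]
result = apply_scanlines(image, gap_brightness=0.5)
```

With `gap_brightness=0.5` the odd rows come out at half the brightness of the even rows, which is the clean look I am after, with no mask on top.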
But I understand that zfast should perform better. I might try to do the same with zfast_crt_standard, although its code looks quite a bit different.