I think people often think too negatively about modifier keys - but this really only stems from the fact that most of the modifiers are in terrible locations on a standard keyboard.
For example, my backspace key is Extend-H (which on a "normal" keyboard would be the equivalent of LAlt-M). You might think that backspace, being quite a common key (unless you are extremely accurate, which sadly I'm not), should have pride of place in the primary layer. But actually Extend-H, despite being a two-key combo, is easier and more comfortable than the single default backspace key. I'd even go so far as to say it's probably better than over half of the single keys on the keyboard.
That poor modifier placement is pretty much ubiquitous, so the negative view of modifiers is understandable. Very few people have ever seen or tried a keyboard with a thumb cluster. I use "backspace a word" more than backspace, but I agree that both are more convenient on a layer than the default backspace location. I don't use keys that are far away from the home row, like backspace and the function keys.
That said, if you're not sold on Extend yet...
It's not that I'm not sold on Extend. I used a navigation/extend layer for a while even before I switched to Colemak. It was great when I was on Windows and using software designed to be used with a mouse. A navigation layer isn't nearly as useful for keyboard-driven programs where you can get to any visible location in a few keypresses and can navigate by semantic units. I still use some of the keys that would be on the extend layer (like backspace), but using a modifier with local rebindings (to more useful program-specific functionality) makes more sense for this case. With a global extend layer you can't map something like "Mode_switch + key" directly like you might with "control + key". Instead you have to indirectly remap whatever "Mode_switch + key" is bound to.
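To make the "indirect remap" point concrete, here is a rough Python sketch of the idea, not how any real keyboard stack is implemented. The names EXTEND_LAYER, os_translate, make_keymap and program_lookup are made up for illustration, as are the Left/Right entries and the command names. The assumption is that a global layer translates the key before the program ever sees it, so the program can only bind the translated key.

```python
# Hypothetical global extend layer, applied before any program sees the key.
# Only the h -> BackSpace entry comes from this thread; the rest is made up.
EXTEND_LAYER = {"h": "BackSpace", "j": "Left", "l": "Right"}


def os_translate(key, extend_held):
    """Return the key a program would receive after the global layer."""
    if extend_held:
        return EXTEND_LAYER.get(key, key)
    return key


def make_keymap():
    """Hypothetical per-program keymap, keyed by (modifier, key) as received."""
    return {
        ("Control", "h"): "show-help",             # direct: modifier + key
        (None, "BackSpace"): "delete-char-backward",
    }


def program_lookup(keymap, modifier, key):
    """Look up the command bound to what the program actually received."""
    return keymap.get((modifier, key), "unbound")


keymap = make_keymap()

# Control + h reaches the program as-is, so it can be bound directly.
print(program_lookup(keymap, "Control", "h"))

# Extend + h never reaches the program; it only ever sees BackSpace.
received = os_translate("h", extend_held=True)
print(received, "->", program_lookup(keymap, None, received))

# Giving Extend + h a program-specific meaning means rebinding BackSpace,
# which also changes what the physical backspace key does in that program.
keymap[(None, "BackSpace")] = "delete-word-backward"
print(program_lookup(keymap, None, os_translate("h", extend_held=True)))
```

The last step is the catch: in this model there is no ("Mode_switch", "h") entry to bind, only the key the layer produces.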
then putting Escape on CapsLock sounds reasonable to me.
What I was saying is that the two aren't mutually exclusive. The most common thing to do is to bind CapsLock to both escape and control (escape when tapped, control when held), but escape and Mode_switch (or whatever key you use for extend) works fine as well.
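For anyone unsure how one key can be both, here is a minimal Python sketch of the tap-versus-hold decision. It is only a model of the logic under my own assumptions, not the code of any particular remapping tool: the KeyEvent type, the resolve_dual_role function and the 200 ms timeout are all invented for the example.

```python
from dataclasses import dataclass

TAP_TIMEOUT_MS = 200  # assumed threshold; real tools make this configurable


@dataclass
class KeyEvent:
    key: str
    pressed: bool   # True for key down, False for key up
    time_ms: int


def resolve_dual_role(events, dual_key="CapsLock", tap="Escape",
                      hold="Mode_switch", timeout_ms=TAP_TIMEOUT_MS):
    """Turn raw key events into a list of resulting key actions.

    The dual-role key acts as `hold` for any key pressed while it is down,
    and as `tap` if it is pressed and released alone within the timeout.
    """
    out = []
    down_at = None            # time the dual-role key went down, if held
    used_as_modifier = False  # did another key get pressed while it was held?
    for ev in events:
        if ev.key == dual_key:
            if ev.pressed:
                down_at = ev.time_ms
                used_as_modifier = False
            else:
                quick = down_at is not None and ev.time_ms - down_at <= timeout_ms
                if quick and not used_as_modifier:
                    out.append(tap)
                down_at = None
        elif ev.pressed:
            if down_at is not None:
                used_as_modifier = True
                out.append(f"{hold}+{ev.key}")
            else:
                out.append(ev.key)
    return out


# Tapped alone: acts as Escape.
print(resolve_dual_role([KeyEvent("CapsLock", True, 0),
                         KeyEvent("CapsLock", False, 50)]))

# Held while pressing h: acts as the extend modifier.
print(resolve_dual_role([KeyEvent("CapsLock", True, 0),
                         KeyEvent("h", True, 100),
                         KeyEvent("h", False, 150),
                         KeyEvent("CapsLock", False, 300)]))
```

Swapping hold="Mode_switch" for hold="Control" gives the more common escape/control pairing; the decision logic is the same either way.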