Indeed, regardless of whether R or H is slightly more frequent, the two are very close, and only one of them can get a place in the top 8. I instinctively feel Colemak's choice of having R on the home row is correct. But the main thing is that whichever one loses out needs to get a next-best slot, which of course is the point of Mod DH.
I started looking at the Peter Norvig bigram data in a bit more detail, comparing it to the data I have been using previously (which I'll call the carpalx data). Interestingly, the ranking of same-finger bigrams by frequency is also noticeably different, although the overall totals are pretty much the same. I have excluded punctuation bigrams because the Norvig data doesn't have them.
With Norvig bigram frequencies, I find the most frequent Colemak same-finger bigrams are:
SC 0.1547%
UE 0.1475%
PT 0.1058%
NL 0.0638%
NK 0.0516%
KN 0.0514%
EU 0.0312%
DG 0.0310%
WR 0.0308%
YI 0.0288%
Whereas the carpalx data gives:
KN 0.1124%
UE 0.1110%
SC 0.1025%
NK 0.0941%
NL 0.0840%
PT 0.0727%
LK 0.0397%
YI 0.0393%
LM 0.0366%
WR 0.0329%
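For anyone wanting to reproduce lists like the two above, here is a minimal sketch of one way to do it. It is not necessarily how the figures in this post were produced. It assumes the standard Colemak finger assignments (letter keys only), skips doubled letters such as LL since those are same-key repeats rather than same-finger jumps, and expresses each bigram as a percentage of all bigram occurrences in the input. The names FINGERS and same_finger_bigrams are made up for illustration, and the input is assumed to be a plain dict of bigram counts.

```python
# Assumed standard Colemak finger assignment, letter keys only.
FINGERS = {
    "L-pinky": "qaz",
    "L-ring": "wrx",
    "L-middle": "fsc",
    "L-index": "pgtdvb",
    "R-index": "jlhnkm",
    "R-middle": "ue",
    "R-ring": "yi",
    "R-pinky": "o",
}
FINGER_OF = {ch: finger for finger, keys in FINGERS.items() for ch in keys}


def same_finger_bigrams(bigram_counts):
    """Return [(BIGRAM, percent), ...] sorted by descending frequency.

    bigram_counts maps two-letter strings to occurrence counts
    (hypothetical input format). Percentages are relative to the
    sum of all counts supplied.
    """
    total = sum(bigram_counts.values())
    found = {}
    for bigram, count in bigram_counts.items():
        if len(bigram) != 2:
            continue  # ignore malformed entries
        a, b = bigram.lower()
        if a == b:
            continue  # same key twice, not counted here (assumption)
        finger = FINGER_OF.get(a)
        if finger is not None and finger == FINGER_OF.get(b):
            found[bigram.upper()] = 100.0 * count / total
    return sorted(found.items(), key=lambda kv: kv[1], reverse=True)


# Toy counts for illustration only, not real corpus data.
toy = {"sc": 2, "th": 6, "ue": 1, "ll": 1}
for bigram, pct in same_finger_bigrams(toy):
    print(f"{bigram} {pct:.4f}%")
```

Which denominator you use (all bigrams, or only letter bigrams) shifts every percentage slightly, so that choice needs to be the same for both data sets before comparing them.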
The hand balance is also a little different (letter keys only):
Norvig: L: 48.34% R: 51.66%
carpalx: L: 46.96% R: 53.04%
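The hand balance can be worked out in a similar way from single-letter frequencies. Again this is only a sketch under assumptions: it uses the standard Colemak left/right split for the 26 letters, ignores everything else, and the names LEFT, RIGHT and hand_balance are invented for the example.

```python
# Assumed standard Colemak split, letter keys only.
LEFT = set("qwfpgarstdzxcvb")
RIGHT = set("jluyhneiokm")


def hand_balance(letter_counts):
    """Return (left_percent, right_percent) for a dict of letter counts.

    Keys that are not Colemak letter keys (punctuation, digits, etc.)
    are ignored, so the two percentages always sum to 100.
    """
    left = sum(n for ch, n in letter_counts.items() if ch.lower() in LEFT)
    right = sum(n for ch, n in letter_counts.items() if ch.lower() in RIGHT)
    total = left + right
    return 100.0 * left / total, 100.0 * right / total


# Toy counts for illustration only, not real corpus data.
l_pct, r_pct = hand_balance({"a": 3, "t": 2, "e": 5, ";": 4})
print(f"L: {l_pct:.2f}% R: {r_pct:.2f}%")
```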
Not sure there is any major conclusion to be drawn, but thought I'd post the info anyway.
Last edited by stevep99 (31-Jul-2015 14:20:21)