Ah, now for the fun bit. Look at this one:
This image shows a weighted average of key frequencies in the 4 biggest Western languages using Latin letters - English, Spanish, French and German. I used data from CodePad's page together with language usage data from KryssTal and some other stuff (links above). There were some choices to be made:
– I added the Portuguese speakers to the Spanish ones. They'll resent me for that I'm sure, but I felt that the letter frequencies in those two languages probably are closer to each other than to other languages thus hopefully justifying my action.
– I had a hard time figuring out the figure :) of 510 M English "speakers" in the world. Elsewhere on the same site it's stated that there are 300 M first-language, 300 M second-language and 100 M foreign-language users of English - but those numbers don't quite make sense to me. I mean, there must be what – 400+ M? – first-language English speakers living in the UK and US alone? My best guess then is that the (300+300+100) M figures are outdated and the 510 M figure represents a more current number of native English speakers.
– Therefore, for the image above (but not for the next one below!) I added a stipulated number of 500 M to the English-speaking figure, bringing it to 1010 M English "users" in total versus 643 M Spanish+Portuguese "users".
In the image below, I made a direct comparison of the Wikipedia numbers (top halves) with the weighted language figures (bottom halves) - this time not adding anything to the English usage (bottom halves) to make any differences all the more clear:
The most interesting features are in my opinion:
– There's a big difference in usage for a few letters, mainly because of Spanish and French being more Latin languages than English and German. Most of these letters are rare however, and thus won't matter much to a keyboard layout. This effect is replicated in the images through the contrast setting and the visual similarity of dark red tones. The really striking one is H.
– Colemak solves the H issue elegantly by keeping it on its strong-finger stretch so it's neither too well nor too badly off. In contrast, Dvorak has a far too good H placement if you happen to write mainly Spanish or French! The right hand then gets in pretty much the same trouble with D vs. H as the left hand does with U vs. I.
– The rest of the picture is surprisingly calm.
– This bodes very well for keyboard layouts optimized for English but used by other American or West European users (disregarding some minor/-ity languages), as long as the H issue is addressed. Caramba! :)
Everyone has different keyboard usage of course. I myself write maybe 70% English and 30% Norwegian these days. If you happen to use any combination of the 4 languages I studied here, you can fiddle with the usage numbers in my Excel book to make it represent your individual assessment of your own language balance. You just enter the percentages in the Usage row, deleting the real-world figures that are there now; use any scale you like as it's balanced by the sum automatically.
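For those who'd rather not open Excel, here's a rough Python sketch of what "balanced by the sum" means for the Usage row. To be clear, this is not the workbook's actual formula: the function name `weighted_letter_frequencies` is hypothetical, and the letter frequencies are made-up placeholders rather than the real per-language data.

```python
# Hypothetical sketch: names and numbers are placeholders, not the workbook's data.

def weighted_letter_frequencies(freqs_by_language, usage):
    """Blend per-language letter frequencies using usage weights on any scale."""
    total = sum(usage.values())
    if total <= 0:
        raise ValueError("usage weights must sum to a positive number")
    blended = {}
    for language, weight in usage.items():
        share = weight / total  # dividing by the sum makes the scale irrelevant
        for letter, freq in freqs_by_language[language].items():
            blended[letter] = blended.get(letter, 0.0) + share * freq
    return blended


# Made-up frequencies (percent) for two letters only, purely for illustration.
example_freqs = {
    "English": {"h": 6.0, "w": 2.0},
    "Spanish": {"h": 1.0, "w": 0.0},
}

# 70/30 and 7/3 describe the same language balance, so both calls agree.
print(weighted_letter_frequencies(example_freqs, {"English": 70, "Spanish": 30}))
print(weighted_letter_frequencies(example_freqs, {"English": 7, "Spanish": 3}))
```

Since every weight is divided by the total, it makes no difference whether you enter percentages, millions of speakers or just rough ratios.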
But it's also about what you type about. If you write about X-rays and X-boxes a lot, you'll use the X a lot, for instance. Not easy to adjust for that unless you measure it yourself.
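If you do want to measure it yourself, a minimal Python sketch could look like the one below. The function name `letter_frequencies` and the sample sentence are just illustrations of mine; in practice you'd feed it a big pile of your own writing. It counts letters only and lumps upper and lower case together, which is an assumption on my part about what matters for a layout.

```python
from collections import Counter


def letter_frequencies(text):
    """Return each letter's share of all letters in text, as a percentage."""
    # Keep letters only; digits, spaces and punctuation are ignored.
    letters = [ch for ch in text.lower() if ch.isalpha()]
    if not letters:
        return {}
    counts = Counter(letters)
    # most_common() lists the most frequent letters first.
    return {letter: 100.0 * n / len(letters) for letter, n in counts.most_common()}


# Placeholder text; replace it with a large sample of what you actually type.
sample = "X-rays and X-boxes need extra X keys"
for letter, pct in letter_frequencies(sample).items():
    print(f"{letter}: {pct:.1f}%")
```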
However, there is one interesting conclusion to draw with some certainty from this exercise, I think: that the Colemak layout should work well for you (as far as single-letter statistics are concerned) no matter what combination of English/Spanish/French/German you happen to be using! Not 100% optimally maybe, but apparently really well. That's reassuring. (As a matter of fact, it looks to me as if both H and W are more optimally placed for the weighted language average than for English on the Colemak!)
Of course, there are many other important measures (digraph rolling, same-finger, hand alternation etc.) and I haven't touched on those here. Piepgrass made a digraph table, but I haven't done anything with it, nor do I have the energy to do so any time soon I think.
*phew* Man, that was fun. :)
Last edited by DreymaR (10-Jul-2014 10:33:31)