I recently added a new ability to my analyzer: it can now load a different set of monogram and bigram frequencies. This makes it possible to analyze layouts for other languages, in which the letter and bigram frequencies may differ somewhat from English. You can try this for yourself on my updated layout analyzer, but the results are also summarized here:
Note: to generate these, I used frequency tables for the language in question, but the layout tested is just the standard Colemak(DH)/Dvorak/Qwerty layout, without the special characters or modifications (QWERTZ/AZERTY/etc) which some of these languages use. So it's not as comprehensive an analysis as it might be, but hopefully still a useful indicator.
Thanks for that update of the analyzer, Steve. I was looking for an alternative layout for English, German and Dutch, and a bit of French. Colemak is optimized for English only. For us in Europe a more robust layout would be great. There are some full-blown optimized layouts like Bone, Koy and AdNW which handle several languages pretty well (mostly optimized for English and German). But those layouts change the complete keyboard.
I like Colemak's idea of getting an 80 or 90% optimization while staying as close as possible to qwerty, especially since one can never optimize to 100%: the use case always differs somewhat from person to person. Even a single person handles different tasks, and for programming a different layout might be best than for writing in a native language. Keeping zxcv and qw, as well as having the option to learn in steps, is also a plus IMO.
While trying to create a layout with goals similar to Colemak's (and partly Minimak's), I came up with the following layout, which is pretty robust for several languages.
1 2 3 4 5 6 7 8 9 0 - =
q w d f y k l o u p [ ]
a r t s g ; n e i h '
z x c v b j m , . /
The performance for English is a tad behind Colemak (0.4% higher SFB, i.e. same-finger bigrams), but not by much. The layout also solves the problem some people have with the H in the middle position in vanilla Colemak.
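For anyone who wants to check SFB numbers by hand, here is a minimal sketch of how a same-finger-bigram percentage can be computed. It is not the analyzer's actual code. The finger assignment (standard touch typing, one finger per column, index fingers covering two columns), the rule that doubled letters do not count as SFBs, and the tiny TOY_COUNTS table are all assumptions made up for illustration; real results need a real bigram frequency table.

```python
# Finger for each column; columns past the tenth go to the right pinky.
# This standard touch-typing assignment is an assumption of the sketch.
FINGERS = ["LP", "LR", "LM", "LI", "LI", "RI", "RI", "RM", "RR", "RP"]

QWERTY = [
    "q w e r t y u i o p [ ]",
    "a s d f g h j k l ; '",
    "z x c v b n m , . /",
]
PROPOSED = [
    "q w d f y k l o u p [ ]",
    "a r t s g ; n e i h '",
    "z x c v b j m , . /",
]


def finger_map(rows):
    """Map every key of a layout to the finger that types it."""
    return {
        key: FINGERS[min(col, 9)]
        for row in rows
        for col, key in enumerate(row.split())
    }


def sfb_percent(rows, bigram_counts):
    """Share (in percent) of bigram occurrences typed with one finger."""
    fingers = finger_map(rows)
    total = same = 0
    for bigram, count in bigram_counts.items():
        first, second = bigram
        if first not in fingers or second not in fingers:
            continue  # characters missing from the layout are skipped
        total += count
        # Assumption: a doubled letter ("ll") is not counted as an SFB.
        if first != second and fingers[first] == fingers[second]:
            same += count
    return 100.0 * same / total if total else 0.0


# Hypothetical toy counts, NOT real language data.
TOY_COUNTS = {"th": 50, "ed": 30, "un": 20}

print(sfb_percent(QWERTY, TOY_COUNTS))    # 50.0 ("ed" and "un" share a finger)
print(sfb_percent(PROPOSED, TOY_COUNTS))  # 0.0
```

The toy table is chosen only so that the arithmetic is easy to follow; it says nothing about how the layouts compare on real text.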
Here are the results of the analyzer, rounded to two significant figures. I think a finer resolution would suggest a certainty that the results do not have, bearing in mind that it is not clear how to weight the different parameters to best match human perception.
Language   SF-bigrams   Score   Left/right hand (%)
English    2.1%         1.8     45 / 55
French     2.6%         1.7     45 / 55
Spanish    2.6%         1.7     47 / 53
German     2.9%         1.7     46 / 54
Danish     3.7%         1.8     47 / 53
Finnish    3.7%         1.8     40 / 60
Swedish    3.7%         1.8     50 / 50
Polish     4.6%         2.0     50 / 50
Almost all languages score better with this layout than with Colemak(-DH). The exception is Polish. I find the trade-off for English worthwhile and guess there are many people who could benefit from such a layout. I personally write roughly 40 or 50% in English and the rest in German and Dutch.
The layout can be adapted easily without sacrificing much performance, for example by swapping z and y (as on German keyboards). Using the German umlauts in their normal place would also be possible without a problem (ö would then move to the qwerty-h place).
If I counted correctly, there are 15 keys changed (11 change fingers and 2 change hands). That's a tiny bit fewer than Colemak. Also, most keys which change stay relatively close to their qwerty location.
I will very likely start using that layout myself. Maybe I'll call it MiniMax: getting the most out of it with a relatively minimal amount of effort. It also gives a nod to Minimak :-)
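The count can be double-checked with a few lines of Python. This is only a sketch: the finger assignment (standard touch typing, one finger per column, index fingers covering two columns) is my assumption, and the number row is left out because it is unchanged. Counting letters only, it gives 15 moved keys, 11 finger changes and 2 hand changes; the semicolon also moves, which makes 16 keys if punctuation is included.

```python
# Finger per column (assumed standard touch typing); extra columns -> right pinky.
FINGERS = ["LP", "LR", "LM", "LI", "LI", "RI", "RI", "RM", "RR", "RP"]

QWERTY = [
    "q w e r t y u i o p [ ]",
    "a s d f g h j k l ; '",
    "z x c v b n m , . /",
]
MINIMAX = [
    "q w d f y k l o u p [ ]",
    "a r t s g ; n e i h '",
    "z x c v b j m , . /",
]


def positions(rows):
    """Map each key to its (row, column) position."""
    return {
        key: (r, c)
        for r, row in enumerate(rows)
        for c, key in enumerate(row.split())
    }


def compare(base, new, letters_only=True):
    """Return (keys moved, keys changing finger, keys changing hand)."""
    base_pos, new_pos = positions(base), positions(new)

    def finger(pos):
        return FINGERS[min(pos[1], 9)]

    moved = [
        key for key in base_pos
        if base_pos[key] != new_pos[key]
        and (key.isalpha() or not letters_only)
    ]
    changed_finger = [
        k for k in moved if finger(base_pos[k]) != finger(new_pos[k])
    ]
    # The first character of the finger code is the hand ("L" or "R").
    changed_hand = [
        k for k in moved if finger(base_pos[k])[0] != finger(new_pos[k])[0]
    ]
    return len(moved), len(changed_finger), len(changed_hand)


print(compare(QWERTY, MINIMAX))                      # (15, 11, 2)
print(compare(QWERTY, MINIMAX, letters_only=False))  # (16, 12, 2)
```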
Oh, how does the analyzer handle special letters like the umlauts (öäü)? Are those letters skipped, are they taken into account when one adds them to the layout, or are they counted as the letters o, a and u? The last option would only make sense if one types them with a dead key, which is not the case for German, but is for Dutch... ;-)
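To make the difference between those options concrete, here is a small sketch of two of them in Python. It does not describe what the analyzer actually does; both function names are made up for illustration. The folding variant uses Unicode NFD decomposition from the standard library, which splits ö into o plus a combining diaeresis.

```python
import unicodedata


def fold_diacritics(text):
    """Count accented letters as their base letter (ö -> o, ä -> a, ü -> u)."""
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks that NFD split off from the base letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def skip_unknown(text, layout_keys):
    """Ignore every character that has no key on the layout."""
    return "".join(ch for ch in text if ch in layout_keys)


ASCII_LETTERS = set("abcdefghijklmnopqrstuvwxyz")

print(fold_diacritics("schön"))              # schon
print(skip_unknown("schön", ASCII_LETTERS))  # schn
```

Note that skipping a character also changes the bigrams: in the second case "h" and "n" become neighbours although they are never typed in sequence, so an analyzer would probably want to break the bigram chain at the skipped letter instead.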
Could you add Dutch to the analyzer as well? There are word lists available online for free and I can point you to one or send you the data.
The analyzer from Arne Babenhauserheide https://dariogoetz.github.io/keyboard_layout_optimizer/ uses a pretty detailed scoring system, trying to cover as many important aspects as possible. This makes it especially susceptible to the "weighting question". Nonetheless, I think it can give an additional view on how different layouts perform.
The results for some layouts are as follows, where the calculated costs (lower is better) are:
Typical English, News.
Neo 2.0 419
German (mixed text)
Neo 2.0 377
It is noticeable that in that model the layouts without strong restrictions, such as staying close to qwerty, come out best. The question is how much that difference is worth in daily usage and how much higher the price for learning is, and also whether those layouts are as robust across different use cases as the others (or even better!?)
The Colemak / MiniMax group needs roughly 70% of the qwerty effort, while the full-blown group (Bone, Koy) needs only about 55% of the qwerty effort in that model for German. For English the differences are smaller: about 70% and 65% of the qwerty effort, respectively. So for English I doubt that the difference is very noticeable; for German it might be!?
On the other hand, assuming the calculated costs correlate well with the effort a person perceives when typing a text, one could also argue that the Colemak / MiniMax layouts give similar, and much lower, effort than qwerty for both English and German. For German the qwerty effort is especially high, so the gain is already larger, but interestingly German could apparently be optimized even further than English!? I doubt that this calculation reflects reality, because English uses fewer letters and should, on an absolute scale, be less costly to type than German. So it makes sense that the effort for German on qwerty is higher on an absolute scale, but even after optimization the costs for German texts should still be higher than for English (yet Neo 2.0, for example, scores 377 for German versus 419 for English above). So the model is, with the current weighting factors, not fully plausible to me.
Counting SFBs and finger-travel effort is a good start for sure, but it misses important parameters. So IMO we see that a model is only as good as our knowledge of the topic, which in most cases is not as good as one might think at first.
Btw this relates to many other 'hot topics' as well where we are told that the models would tell some "truth"! I won't go deeper into that off-topic, but maybe someone should think about that more often or deeper!?
Last edited by rpnfan (04-Sep-2022 20:56:42)