Concerning German users (and others interested) - Experiences

stiflou
Member

Reputation: 0
Registered: 01-Sep-2014
Posts: 6

01-Sep-2014 20:57:14#1 Concerning German users (and others interested)

Hello,
I recently started taking an interest in alternative layouts, and Colemak of course looks like a serious candidate, especially given its international adoption (I have not tried it yet). However, for typing German I have noticed some awkward points (I know that a layout can hardly be perfect, especially one designed with English in mind):
- a “high” rate of same-finger digraphs (not counting repeated letters): about 2.8% (vs 1.1% in English). The main digraphs are SC (of course, 0.76%), “E,” (0.28%), HL, HN, EU, GT, HM, NK, UE (0.1-0.2% each), etc., with 1% on the right index alone;
- an overloaded right index, with 23% of the typing, and an overloaded right middle finger, with 21%;
- an imbalance between the hands (43% vs 57%, you can guess which is which);
- the awkward EIN trigraph, the second most common German trigraph (behind ICH), with 0.96% of the typing.
(To be precise, I ran my study on a German corpus of about 1 MB; results may vary slightly with a different one, but the main picture holds. I used a program that I wrote in Ada.)
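
For anyone who wants to reproduce this kind of figure, here is a minimal Python sketch of a same-finger digraph count for Colemak. It is not the Ada program mentioned above. The column-to-finger assignment, the choice to skip unmapped characters (spaces, digits, umlauts) and the choice of denominator (all mapped key pairs) are assumptions, so the percentages will not necessarily match the ones quoted.

```python
from collections import Counter

# Colemak rows, left to right (letters and main punctuation only).
COLEMAK_ROWS = [
    "qwfpgjluy;",
    "arstdhneio",
    "zxcvbkm,./",
]
# Assumed touch-typing finger for each of the ten columns
# (L/R = hand, P/R/M/I = pinky, ring, middle, index).
COLUMN_FINGER = ["LP", "LR", "LM", "LI", "LI", "RI", "RI", "RM", "RR", "RP"]

FINGER = {key: COLUMN_FINGER[col]
          for row in COLEMAK_ROWS
          for col, key in enumerate(row)}


def same_finger_rate(text, count_repeats=False):
    """Return (rate, Counter of same-finger digraphs) over adjacent key pairs."""
    keys = text.lower()
    pairs = Counter()
    total = 0
    for a, b in zip(keys, keys[1:]):
        if a not in FINGER or b not in FINGER:
            continue  # skip spaces, digits, umlauts and anything unmapped
        total += 1
        if a == b and not count_repeats:
            continue  # repeated letters are reported separately in the post
        if FINGER[a] == FINGER[b]:
            pairs[a + b] += 1
    rate = sum(pairs.values()) / total if total else 0.0
    return rate, pairs
```

Calling `same_finger_rate(open("corpus.txt", encoding="utf-8").read())` on a corpus file (the file name is hypothetical) returns the overall rate plus a counter whose `most_common()` lists the worst digraphs.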

I am not here to take Colemak down (it is still better balanced than QWERTY and has 4.2% fewer same-finger digraphs). I know there are German users, and I am just asking them (and anyone who reads this post) to share their experience, particularly on the points listed, and to discuss it.
I know there are some other posts on this issue and I have read them, but they mostly deal with SC (C can be typed with the index finger, but not on an orthogonal keyboard) or EIN. I am more focused on finger and hand balance, particularly the right index: isn't it too hard to have a quarter of the typing on one finger, with 1% of the typing being same-finger digraphs on it, plus 1.1% repeated letters?

Thank you for your answers :)

Last edited by stiflou (02-Sep-2014 10:08:35)


DreymaR
Member

Reputation: 220
From: Viken, Norway
Registered: 13-Dec-2006
Posts: 5,401

02-Sep-2014 06:12:29#2 Re: Concerning German users (and others interested)

Good points. Colemak was optimized for English; it seems to me to work surprisingly well for other Latin languages but not optimally for Germanic ones (Dutch isn't so nice according to a user). For my own language, Norwegian, I notice a few minor problems such as KJ, but they're nothing bad. With practice one learns to devise techniques for such road bumps; for instance, I write the KJ bigram using the middle finger on J, which works rather nicely!

My main point, though, is that I write a lot of English each day. And I certainly don't want to learn two layouts! So Colemak works well enough in Norwegian and perfectly in English, and that's what I want.

Last edited by DreymaR (02-Sep-2014 06:14:23)

*** Learn Colemak in 2–5 steps with Tarmak! ***
*** Check out my Big Bag of Keyboard Tricks for Win/Linux/TMK... ***


stiflou
Member

Reputation: 0
Registered: 01-Sep-2014
Posts: 6

02-Sep-2014 09:11:05#3 Re: Concerning German users (and others interested)

Thanks for your answer :)

Dutch is in fact pretty similar to German (I didn't study it because I don't need it). As for Latin languages, though, it is almost worse than for German, especially for French, mainly because of one finger, the right middle (the UE digraph is very common, as in QUE). In French about 2% of the typing is same-finger digraphs (not counting repeated letters) on this one finger, with UE, EU and “E,”. Meanwhile A is nearly the most used letter in Spanish and Portuguese, and ZA/AZ accounts for between 0.1 and 0.2% in these languages, on the left pinky. The layout is also quite unbalanced for French, which is maybe the worst common language to type with Colemak. I know that English is more and more used worldwide (more in Norway or Germany than in France); however, if I'm not working at a big multinational company but writing a book in German or French, the issue can be critical.
Yes, when I want to make up my mind, I work hard on it. (And I'm not talking about Dvorak here, but to reassure some of you, I find it worse.)

Any other experience with index overloading?

Last edited by stiflou (02-Sep-2014 10:08:16)


ghen
Member

Reputation: 23
From: Belgium
Registered: 26-Feb-2008
Posts: 487

02-Sep-2014 09:44:34#4 Re: Concerning German users (and others interested)

The problems with Dutch are mostly combinations with "J", which is much more common in Dutch than in either English or German.
je / jij / jou / jouw (you / your), ja (yes), ...


stiflou
Member

Reputation: 0
Registered: 01-Sep-2014
Posts: 6

02-Sep-2014 10:06:47#5 Re: Concerning German users (and others interested)

Bringing up Dutch is a good idea too, because as far as index overloading goes it is almost worse than German: the drop in H frequency is offset by the increase in J and K, which are less accessible.
I don't really understand why J combinations are so problematic, since it is just a hand roll, even if J is not well placed. But according to you, index overloading and hand imbalance are not a noticeable problem? (I don't know whether the H combinations that are problematic in German Colemak are also an issue in Dutch.)


stiflou
Member

Reputation: 0
Registered: 01-Sep-2014
Posts: 6

02-Sep-2014 20:11:10#6 Re: Concerning German users (and others interested)

Interesting, I would never have thought that index overloading would be a concern for Czech, because its letter frequencies really differ from those of German or Dutch and are, to my mind, quite acceptable; maybe it is indeed due to a “bad” typing technique (or short fingers, or a misjudgment on my side). As for “kolik”, words with K are “bad” examples for QWERTY, because having K on the “home row” is a joke for the common Anglo-Saxon or Latin languages (En, Fr, Es, Pt, De). Still, after a little training and a few tries, I found it almost faster to type with Colemak than with QWERTY (not more comfortable; that is truly tricky). Concerning JE (which concerns me because of French), the problem also comes from the non-orthogonality of the keyboard. I invite you to try a more ergonomic keyboard (TypeMatrix is a good and affordable example); it can be truly beneficial!
Thank you very much for your contribution :) (first time I've spoken to a Czech \o/)

Any other experiences? I still have no Germans ^^'

Last edited by stiflou (02-Sep-2014 20:15:04)


stiflou
Member

Reputation: 0
Registered: 01-Sep-2014
Posts: 6

02-Sep-2014 21:56:03#7 Re: Concerning German users (and others interested)

Yep, your signature, I'm stupid ><' (impressive!)
I half agree with you on the lack of alternation. I didn't mention it at the beginning because the Colemak community is pretty sensitive on this subject ^^ (inside joke), but placing the O, for instance, on the left hand would not change a lot. I concede that preventing this kind of issue is difficult when designing a layout, priority being of course given to digraphs, and English digraphs in particular, which are very different from those of other languages (like German, since it's the initial subject, or Czech); which other language uses TH?
The issue with Czech is of course its lack of international importance, which is difficult to change when it comes to standardization … This is the problem; it always has been and always will be. The advantage of everyone using the same alphabet is offset by the differences between our languages, which make standards awful to create. But as a Czech you are lucky, with ZXCV placing the V in a very good spot; the Poles are much less lucky … In Polish, why is Z the 2nd most common consonant and W the 3rd (K is not far behind)? Compared to English, it's almost completely the opposite. And what can I say, as a speaker of a Latin language (fortunately, I type in English and German too), about the position of H, which is so useless in Latin languages, like W, Y and K … Making a standard is so difficult because you have to judge the importance of a language (and of the country behind it) against several criteria, and too bad for small countries, and particularly for the people living in them. It's sad.
Sorry for the outburst ^^'
Your example is well chosen; it makes me think of “lamelle” (in French, or “mollement”, but these words are not common).


stiflou
Member

Reputation: 0
Registered: 01-Sep-2014
Posts: 6

02-Sep-2014 22:01:12#8 Re: Concerning German users (and others interested)

I hope some Germans will show up :)


pieter
Member

Reputation: 2
Registered: 25-Oct-2013
Posts: 136

12-Sep-2014 10:55:17#9 Re: Concerning German users (and others interested)

Hi stiflou, I'm not German but Dutch. I made my own layout (temporarily called Juli16):

. u o p y x c l b v
a i e n h m d r t s
: , ? k q f g w j z

Search this forum and you'll find more.

Either you stick to one of the existing layouts. In practice, this means either Dvorak or Colemak, since these are the only ones with (some) support in the market.
Or you make your own layout based on your demands. I know of two keyboard-generating tools: carpalx and mtgap. Google them and you'll find out more. Carpalx is written in Perl; mtgap in C. For me it was easier to compile and run mtgap, plus I think its analysis is better. Anyway, I used mtgap.

The 10 easy steps you must take are these:

1. Think about what you want to optimize: the mix, or the most used language (at the expense of being suboptimal in other languages). In other words, will the layout be a triathlete, who is good at everything, or a very fast specialized runner, cyclist or swimmer, who does the other sports less well? (I optimized for the mixed corpus, which also included English.)

2. Find or make a corpus that represents your use

3. Find the most used letters and digrams -> see my earlier posts. I used Lalop's python scripts.

4. Feed your keyboard generator with letter and digram frequencies

==> Your program will spit out layouts. Pick the best one. Now you have your keyboard.

5. Program your keyboard. In Linux, I did this by modifying an xkb file; in Windows by using PKL. In Windows, I think you can also use a Microsoft layout program (I'm no Windows expert).

6. Get Amphetype (I had to compile it myself...?).

7. Make lessons. I made lessons based on the most used digrams (see earlier). And I used "1000 most used words" lists for Dutch and English.

8. Practice

9. Practice

10. Practice more.
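
Steps 3 and 4 above need letter and digram frequencies. A minimal Python sketch of that counting follows; it is not Lalop's scripts, just an illustration. The function name and the choice to count n-grams within words only (so that no digram spans a space) are assumptions.

```python
from collections import Counter


def ngram_frequencies(text, n):
    """Relative frequencies of letter n-grams, counted within words only."""
    counts = Counter()
    for word in text.lower().split():
        # Keep letters only, so punctuation does not create spurious n-grams.
        letters = "".join(ch for ch in word if ch.isalpha())
        for i in range(len(letters) - n + 1):
            counts[letters[i:i + n]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() yields the n-grams sorted from most to least frequent.
    return {gram: c / total for gram, c in counts.most_common()}
```

With `n=1` this gives letter frequencies and with `n=2` digram frequencies; how to feed them to a particular generator depends on that tool's input format, which is not covered here.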

Good luck!


pieter
Member

Reputation: 2
Registered: 25-Oct-2013
Posts: 136

12-Sep-2014 11:25:36#10 Re: Concerning German users (and others interested)

This website tells me this:

GERMAN
Order Of Frequency Of Single Letters: E N R I S T U D A H G L O C M B Z F W K V P J Q X Y
Most Common Digraphs: en er ch de ge ei ie in ne be el te un st di un ue se au re he it ri tz
Most Common Trigraphs: ein ich den der ten cht sch che die ung gen und nen des ben rch
Common Two-Letter Words: ab, am, an, da, er, es, ob, so, wo, im, in, um, zu, du, ja ab

Using Juli16 for German:

- German and Dutch are similar, but the U is much more used in German. Its location may be suboptimal
- digraphs and trigraphs (my assessment). Easy = bold. Normal = normal, Hard = italic
Most Common Digraphs: en er ch de ge ei ie in ne be el te un st di un ue se au re he it ri tz
Most Common Trigraphs: ein ich den der ten cht sch che die ung gen und nen des ben rch
Common Two-Letter Words: ab, am, an, da, er, es, ob, so, wo, im, in, um, zu, du, ja ab

For Dutch, it is slightly better. Yet despite this, even for German, Juli16 scores better than Colemak, Dvorak and all other layouts in Patorjk. In Patorjk I compared Juli16 to Colemak, Dvorak and Balance 12 (which is a very good layout) for various languages. These are the winners:
For Italian: winner is Juli16
For German: winner is Juli16
For French: winner is Juli16
For Dutch: winner is Juli16
For Spanish: winner is Balance 12, #2 = Juli16, #3/#4 = Dvorak or Colemak
For English: winner is Balance 12, #2 = Juli16, #3/#4 = Dvorak or Colemak

In case you're interested, some stats for Juli16 on my corpus:
Left hand: 57% - Fingers: pinky 9% ring 10% middle 22% index 16%
Right hand: 42% - Fingers: pinky 8% ring 10% middle 12% index 14% (I am right-handed, by the way; the layout is a bit skewed to the left, but not problematically so).
Inward rolls: 7.5%; Outward rolls: 2.9%
Same hand: 36.4%; Same finger: 1.5%
Row change: 12.3%; Home jump: 0.7% (a home jump is, on a QWERTY board for instance, iec or mu)

For comparison, here is Colemak (on an English corpus!) :
Left hand: 46% - Fingers: pinky 8% ring 7% middle 11% index 18%
Right hand: 53% - Fingers: pinky 9% ring 10% middle 15% index 18%
Inward rolls: 4.6%; Outward rolls: 2.5%
Same hand: 42.6%; Same finger: 1.9%
Row change: 18.6%; Home jump: 1.3%

And here is Dvorak (on an English corpus!)
Left hand: 44% - Fingers: pinky 8% ring 8% middle 12% index 14%
Right hand: 55% - Fingers: pinky 11% ring 13% middle 13% index 16%
Inward rolls: 4.1%; Outward rolls: 1.3%
Same hand: 31.1%; Same finger: 3.2%
Row change: 14.4%; Home jump: 0.5%

So, my layout scores better on distance (not included here), has more rolls than Colemak and Dvorak, a lower row change than both, and lower same finger than both. The alternation (same hand stat) is between Dvorak and Colemak. Dvorak favors left-right, left-right (LRLR etc.); my layout favors LLRRLLRR etc. Dvorak does best on home jumps: very low. Mind that the Colemak and Dvorak stats were done with an English corpus (for which they are optimized), while my stats were done with an 85% Dutch corpus, for which my layout is optimized. When Colemak and Dvorak are "fed" Dutch text, they will score worse, and the gap with my optimized layout will be larger.
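
The hand and finger percentages quoted above come from a layout analyzer. As a rough illustration of how such numbers can be computed, here is a Python sketch. The column-to-finger assignment and the assumption that the three Juli16 rows are laid out left to right exactly as posted are mine, and characters that are not on the three rows are ignored, so it will not reproduce the analyzer's figures exactly.

```python
from collections import Counter

# Assumed finger for each of the ten columns of a three-row layout.
COLUMN_FINGER = ["L pinky", "L ring", "L middle", "L index", "L index",
                 "R index", "R index", "R middle", "R ring", "R pinky"]

# Juli16 rows as posted earlier in this thread, spaces removed.
JULI16_ROWS = [".uopyxclbv", "aienhmdrts", ":,?kqfgwjz"]


def finger_load(rows, text):
    """Share of mapped keystrokes per finger and per hand."""
    finger_of = {key: COLUMN_FINGER[col]
                 for row in rows
                 for col, key in enumerate(row)}
    hits = Counter(finger_of[ch] for ch in text.lower() if ch in finger_of)
    total = sum(hits.values())
    if total == 0:
        return {}, {}
    fingers = {f: hits[f] / total for f in COLUMN_FINGER if hits[f]}
    hands = {"L": 0.0, "R": 0.0}
    for finger, share in fingers.items():
        hands[finger[0]] += share  # first character of the name is the hand
    return fingers, hands
```

Passing the Colemak or Dvorak rows instead of `JULI16_ROWS`, with the same corpus, gives a like-for-like comparison rather than one across different corpora.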

One of the problems for me with Colemak is the placement of D, H, G and J => index finger overload in Dutch. Dvorak is better, but has the I and D, and to a lesser degree the G, wrong. Plus, LRLRLR is too much alternation for me. Juli16 has very nice short rolls in Dutch: en ne ie ei dr tr st ts bl ou

To be perfectly clear (I am on a Colemak forum after all ;-) : Dvorak and Colemak are both way better than Qwerty, also for Dutch and German. If you don't want to go the "custom layout" route, please try both Colemak and Dvorak and pick the one you like best.

Last edited by pieter (12-Sep-2014 11:28:38)


DreymaR
Member

Reputation: 220
From: Viken, Norway
Registered: 13-Dec-2006
Posts: 5,401

12-Sep-2014 11:31:58#11 Re: Concerning German users (and others interested)

Could a pan-Germanic layout be an option? That'd command a greater user base, but on the other hand it'd have trust issues with the optimization "freaks" who want only the best their preferred model can offer. I haven't looked into how similar the Nordic/Germanic languages are, N-gramwise.


lalop
Member

Reputation: 0
Registered: 04-Apr-2013
Posts: 538

12-Sep-2014 14:51:27#12 Re: Concerning German users (and others interested)

pieter said:

Dvorak favors left-right, left-right (LRLR etc), my layout favors LLRRLLRR etc.

I'm wondering how you determined this (I don't think my own analyses ever were able to detect such things).

Tarmak Transitional Layouts (QWERTY -> Colemak)
Amphetype (lalop edition)


pieter
Member

Reputation: 2
Registered: 25-Oct-2013
Posts: 136

12-Sep-2014 15:26:36#13 Re: Concerning German users (and others interested)

@lalop: for my own Juli16 layout:
a) based on feel. Admittedly, this is subjective...
b) based on the objectives of the mtgap model, as Michael Dickens writes (again, I admit this is a weak argument)
c) based on hand alternation stats. Dvorak has a same hand stat of 31% (see above). I interpret this (correct me if I'm wrong) to mean that for any given keystroke, the chance that the next stroke is on the same hand is 31%. LLRRLLRR would indicate that on average, 50% of all keystrokes are same-hand.
LRLRLRLR would have 0% samehand. LRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLRLLRRLRLRLRLRL are 70 key changes. 68 of those switch to the other hand, 2 stay on the same hand. This would imply a same hand% of 3% (right? )

I now see that I am wrong in two ways on this stat. First, LRLRL would mean 0% same hand, whereas my layout scores 36% same hand, which means that roughly 1 in 3 strokes is on the same hand; LLRRLLRR would be 50%. Second, this is a percentage over all keystrokes in the corpus. It's an average, and I am also interested in the spread (variance etc.).
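
The same-hand percentage of an L/R string, as interpreted above (the share of keystrokes followed by a stroke of the same hand), can be checked with a few lines of Python. This is a sketch of that interpretation only; the analyzers that produced the quoted stats may define it differently.

```python
def same_hand_rate(pattern):
    """Fraction of consecutive keystroke pairs typed with the same hand."""
    pairs = list(zip(pattern, pattern[1:]))
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)
```

`same_hand_rate("LRLRLRLR")` is 0.0. A short string such as "LLRRLLRR" gives 4/7 rather than exactly 50% because of the edge of the string; the value approaches 50% as the pattern is repeated.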

For Dvorak:
a) based on Dvorak's ideal of high alternation, achieved by putting all vowels on one hand. A typical word in languages like English, Dutch, French and German consists of a consonant, a vowel, a consonant (CVC). Look at the words "typical" or "German". Of course there are words like area (VCVV). Spanish has more words of a VCV structure (ave, olá, odio etc.), German has more consonants (schmerz, mehrfach), and Slavic languages have even more (like the Czech "Krkonošský národní park"), more like CCCVCCVCC :-)
b) based on subjective feel.

But, all in all, I admit I am jumping to conclusions here. This screams for a Python script to analyse a corpus. The script would need three inputs: lefthandlettersoflayoutX.txt and righthandlettersoflayoutX.txt plus corpus.txt. It would give two outputs: one LRLLRRLL text file, and one file with stats..... Let's see if I can find some time....
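
A possible shape for the script described above, as a hedged Python sketch: it takes the left-hand letters, the right-hand letters and the corpus as strings (reading them from the three text files is left out), and returns the L/R string plus a few stats. The function names and the choice of stats are assumptions; characters that belong to neither hand, including spaces, are simply dropped.

```python
def hand_string(corpus, left_letters, right_letters):
    """Map each corpus character to 'L' or 'R'; unmapped characters are dropped."""
    left = set(left_letters.lower())
    right = set(right_letters.lower())
    out = []
    for ch in corpus.lower():
        if ch in left:
            out.append("L")
        elif ch in right:
            out.append("R")
    return "".join(out)


def hand_stats(lr):
    """Basic statistics for an L/R string produced by hand_string()."""
    n = len(lr)
    pairs = max(n - 1, 0)
    same = sum(a == b for a, b in zip(lr, lr[1:]))
    return {
        "keystrokes": n,
        "left_share": lr.count("L") / n if n else 0.0,
        "right_share": lr.count("R") / n if n else 0.0,
        "same_hand": same / pairs if pairs else 0.0,
    }
```

For example, with the Dvorak letters split as left `"pyaoeuiqjkx"` and right `"fgcrldhtnsbmwvz"`, the word "typical" maps to `RLLLRLR`.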


lalop
Member

Reputation: 0
Registered: 04-Apr-2013
Posts: 538

12-Sep-2014 18:43:06#14 Re: Concerning German users (and others interested)

LLRLLRLLRLLRLLRLLR is an example of 33% same-hand, and yeah, LLRRLLRR would be 50%. (On the other hand, so are LLLLLLLRLRLRLRLRL and LLLLLLLLLLRLRLRLRL respectively. This is another reason same-hand percentage is not a very insightful measure.)

A good way to make a general analyzer might be to take patterns (e.g. LLRR, RRLL vs LRLR, RLRL) as input and just count the number of occurrences. I'm not 100% sure we'd be able to interpret this data as "one tends toward LLRRLLRR and the other toward LRLRLRLR", but it'd be a start.
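
A minimal Python sketch of such a pattern counter, assuming the text has already been converted to a string of L and R characters. Occurrences are counted with overlap, which is one possible convention and not necessarily the one an actual analyzer would use.

```python
from collections import Counter


def count_patterns(lr, patterns):
    """Count overlapping occurrences of each hand pattern in an L/R string."""
    counts = Counter({p: 0 for p in patterns})
    for p in patterns:
        for i in range(len(lr) - len(p) + 1):
            if lr[i:i + len(p)] == p:
                counts[p] += 1
    return counts
```

For instance, `count_patterns(lr, ["LLRR", "RRLL", "LRLR", "RLRL"])` returns one count per pattern; dividing each by the number of four-letter windows, `len(lr) - 3`, would turn the counts into shares.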

I haven't had time to update any of my analyzers, unfortunately. Feel free to try your own hand at one.

General thread point: still keeping an eye out for good (and preferably large, representative) sample texts/n-gram stats for non-English languages.


pieter
Member

Reputation: 2
Registered: 25-Oct-2013
Posts: 136

12-Sep-2014 20:32:10#15 Re: Concerning German users (and others interested)

lalop said:

General thread point: still keeping an eye out for good (and preferably large, representative) sample texts/n-gram stats for non-English languages.

Yes, the problem is that my corpus was partly made up of text from research reports that I wrote for clients, and from reports that students of mine wrote. In other words, I must anonymize it before I put it on Pastebin or wherever. I'll try to do that in the next week.


pieter
Member

Reputation: 2
Registered: 25-Oct-2013
Posts: 136

12-Sep-2014 21:04:28#16 Re: Concerning German users (and others interested)

Regarding the analysis, indeed: counting LRLR etc. patterns, and calculating percentages perhaps? I'll chew on this!
I was thinking of a different solution: giving letters a value. The value of a letter (the middle letter in each triple below) is calculated as a function of the letters that surround it. (I ignore the first and last letter of the text for now. Also: spaces don't count.)

LRL -> 2 (R is surrounded by 2 L letters)
LRR -> 1 (R is surrounded by 1 L letter)
RRL -> 1 (R is surrounded by 1 L letter)
RRR -> 0 (R is surrounded by 0 L letters)

same thing for the Ls.

Next, sum the numbers and divide by the total number of letters in the text. In a 1x1 text, every letter will have value 2. So, the mean value (total/#letters) is 2. So, 2 is maximum alternation.

In a 1x0 text (all Ls or all Rs), all letters will have the value 0. Mean is 0 as well. So 0 = no alternation at all.

In a 2x2 text, every letter will have an L and an R as neighbour, so every letter has value 1. So a mean of 1 corresponds to 2x2.

Let's take a 3x3 text. LLLRRRLLLRRRLLLRRRLLLRRR. Ignoring the starting and finishing letter, the values are 011011011011011 etc. Mean value is (0+1+1)/3 = 0.67.

Finally, 3x1x1x2. This one is irregular (no fixed rhythm) and imbalanced: the left hand has 3 + 1 = 4 strokes, the right hand has 1 + 2 = 3 strokes. LLLRLRRLLLRLRRLLLRLRRLLLRLRR. Ignoring the first letter, the values repeat as 0122111 0122111 etc., i.e. 8 per 7 letters, which gives 32 / 28 = 1.14.
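
The neighbour-counting metric described above can be sketched in Python as follows. The function name is made up, and the input is assumed to be a string of L and R characters with spaces already removed; the first and last letters are skipped, as in the description.

```python
def alternation_score(lr):
    """Mean number of opposite-hand neighbours over the interior letters (0 to 2)."""
    if len(lr) < 3:
        return 0.0  # no interior letters to score
    values = [
        (lr[i - 1] != lr[i]) + (lr[i + 1] != lr[i])
        for i in range(1, len(lr) - 1)
    ]
    return sum(values) / len(values)
```

A 1x1 text such as "LRLRL" scores 2.0 and a 2x2 text such as "LLRRLLRR" scores 1.0. Because each hand switch is counted once from each side, for long texts the score is close to twice the share of hand-switching pairs, so it carries much the same information as the same-hand percentage.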

I'm not happy with this yet, because irregularity does not show in the metric..... I'll think some more. Does anybody else have a better idea? Feel free to criticize, improve, or steal!


lalop
Member

Reputation: 0
Registered: 04-Apr-2013
Posts: 538

13-Sep-2014 03:55:22#17 Re: Concerning German users (and others interested)

Continuing back in my alternation thread to avoid too much derailing

Last edited by lalop (13-Sep-2014 04:09:13)
