
    Modifications for other languages

    • Started by SpeedMorph
    • 14 Replies:
    • Reputation: 0
    • Registered: 08-Mar-2008
    • Posts: 303

    I was looking at some common European languages to see if I could improve on Colemak. I found a couple of small changes.

    For German and French, switching U and O helps, especially in German, where O is rare.

    Switching Y and Z could help in German. That would put Y in its original position, plus Z is more common than Y. I'm not sure about IZ/ZI frequency though.

    In Spanish, switching T and D helps. D is a lot more common, plus it leaves T in its original column.

    Switching K and J could help in Spanish. K, H, and J are all rare, so the right index is a bit underused. Moving stuff around could probably fix that.

    Offline
    • 0
    • Reputation: 15
    • From: Belgium
    • Registered: 26-Feb-2008
    • Posts: 451

    I don't know what methods and corpuses you used for these findings, but could you make a similar analysis for Dutch? :-)

    Offline
    • 0
    • Reputation: 0
    • Registered: 08-Mar-2008
    • Posts: 303
    ghen said:

    I don't know what methods and corpuses you used for these findings, but could you make a similar analysis for Dutch? :-)

    I didn't make my own, I used the first website that came up when I searched "<language> letter frequency".

    Dutch would be a bit trickier. Here's my best shot.

    Dutch letter frequency: e n t a o i r d s l g h m v b w k u p c ij z j f é x

    The home keys should be e n t a o i r d. Is ST common in Dutch? If not, you can switch D and S. By the way, what's IJ? I found it here: http://www.cryptogram.org/cdb/words/frequency.html
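    The "top eight letters go on the home keys" reasoning above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the ranking is the one quoted in the post (that source counts "ij" as a single unit).

```python
from collections import Counter

def letter_ranking(text):
    """Return the letters of `text`, most frequent first."""
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    # Sort by descending count, then alphabetically so ties are deterministic.
    return [ch for ch, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]

def home_key_candidates(ranking, n=8):
    """Pick the n most frequent letters as candidates for the home keys."""
    return ranking[:n]

# Ranking quoted in the post, not recomputed from a corpus.
dutch_ranking = "e n t a o i r d s l g h m v b w k u p c ij z j f é x".split()

print(home_key_candidates(dutch_ranking))
```

    With your own corpus, `letter_ranking(corpus_text)` would replace the hard-coded list. Single-letter counts say nothing about digraphs such as ST, which is why that question is still open.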

    B is in a bad place. It's too common. Switching it with F would be an improvement.

    There are already a few things that I like better about Colemak in Dutch. I think in English, K is too rare for the spot it has. But in Dutch, it's just about perfect. Also C is rarer than in English, so that annoying spot down there is easier to reach.

    Considering how much more common G is than P, a G-P switch would help.

    q w b g p j l u y ;
    a r d t s h n e i o
    z x c v f k m , . /
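    To double-check which keys actually move, the layout above can be compared with standard Colemak. A minimal sketch; the helper names `moved_keys` and `swaps` are invented for this example.

```python
# Standard Colemak letter rows, then the Dutch variant proposed above.
COLEMAK = [
    "q w f p g j l u y ;".split(),
    "a r s t d h n e i o".split(),
    "z x c v b k m , . /".split(),
]
PROPOSED = [
    "q w b g p j l u y ;".split(),
    "a r d t s h n e i o".split(),
    "z x c v f k m , . /".split(),
]

def moved_keys(base, variant):
    """Map each key that changed position to the key now sitting in its place."""
    changes = {}
    for base_row, variant_row in zip(base, variant):
        for old, new in zip(base_row, variant_row):
            if old != new:
                changes[old] = new
    return changes

def swaps(base, variant):
    """Return the pairs of keys that simply traded places, sorted."""
    changes = moved_keys(base, variant)
    pairs = {tuple(sorted((a, b))) for a, b in changes.items() if changes.get(b) == a}
    return sorted(pairs)

print(swaps(COLEMAK, PROPOSED))  # [('b', 'f'), ('d', 's'), ('g', 'p')]
```

    So the variant is exactly the three swaps discussed: B-F, D-S and G-P. Everything else stays where Colemak has it.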

    Offline
    • 0
    • Reputation: 0
    • Registered: 17-Mar-2008
    • Posts: 192

    Letter frequency is only one objective to optimise for. Watch out for: same-finger, quality of di/trigraphs (rolls), hand balance and alternation. And lastly: You need to test any changes for what it subjectively feels like!
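    Two of these criteria, same-finger digraphs and hand balance, are easy to estimate roughly. The sketch below makes some assumptions that are not from this thread: one finger per column in the conventional touch-typing way (each index finger covers two columns), doubled letters not counted as same-finger digraphs, and a made-up sample string standing in for a real corpus.

```python
# Standard Colemak letter rows.
COLEMAK = [
    "q w f p g j l u y ;".split(),
    "a r s t d h n e i o".split(),
    "z x c v b k m , . /".split(),
]

# ASSUMPTION: conventional column-to-finger mapping (L/R hand + Pinky/Ring/Middle/Index).
FINGER_BY_COLUMN = ["LP", "LR", "LM", "LI", "LI", "RI", "RI", "RM", "RR", "RP"]

def finger_map(layout):
    """Map every key on the layout to the finger that types it."""
    return {key: FINGER_BY_COLUMN[col] for row in layout for col, key in enumerate(row)}

def same_finger_ratio(text, layout):
    """Fraction of adjacent key pairs typed with one finger (doubled letters skipped)."""
    fingers = finger_map(layout)
    text = text.lower()
    total = same = 0
    for a, b in zip(text, text[1:]):
        if a == b or a not in fingers or b not in fingers:
            continue  # spaces and unmapped characters break the pair
        total += 1
        if fingers[a] == fingers[b]:
            same += 1
    return same / total if total else 0.0

def left_hand_share(text, layout):
    """Fraction of mapped keystrokes that fall on the left hand."""
    fingers = finger_map(layout)
    typed = [fingers[ch] for ch in text.lower() if ch in fingers]
    return sum(f.startswith("L") for f in typed) / len(typed) if typed else 0.0

sample = "de kat zit op de mat"  # hypothetical sample, not a real corpus
print(same_finger_ratio(sample, COLEMAK), left_hand_share(sample, COLEMAK))
```

    Running both functions on Colemak and on a candidate variant with the same text gives a like-for-like comparison. Rolls, alternation and how a change feels in practice are not captured here.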

    Offline
    • 0
    • Reputation: 114
    • From: Oslo, Norway
    • Registered: 13-Dec-2006
    • Posts: 4,744

    I agree with Tomlu. Furthermore, nowadays more and more users are typing a substantial amount of English no matter where they're from. This creates a difficult situation for a would-be non-English optimizer.

    I'm leaning towards making one good English layout, leaving a few keys open for adding national characters and praying that the result is good enough for almost everyone. If we could reduce the layout jungle somewhat I envision a lot of benefits coming from it. I shudder at the thought of the Chinese factories making separate keyboard batches for Norway, Denmark, Sweden, the Faeroes, Iceland and who knows whether they have a Greenlandic one as well...

    *** Learn Colemak in 2–5 steps with Tarmak! ***
    *** Check out my Big Bag of Keyboard Tricks for Win/Linux/TMK... ***

    Offline
    • 0
    • Reputation: 0
    • Registered: 08-Mar-2008
    • Posts: 303
    tomlu said:

    Letter frequency is only one objective to optimise for. Watch out for: same-finger, quality of di/trigraphs (rolls), hand balance and alternation. And lastly: You need to test any changes for what it subjectively feels like!

    Well I agree. I didn't want to learn 5 different layouts though, especially now that I've forgotten Colemak.

    @ DreymaR: Most European languages have very similar letter frequencies. E, N, I, and R are all among the 8 most common letters in all the main European languages.

    Offline
    • 0
    • Reputation: 15
    • From: Belgium
    • Registered: 26-Feb-2008
    • Posts: 451
    SpeedMorph said:

    Dutch letter frequency: e n t a o i r d s l g h m v b w k u p c ij z j f é x

    The home keys should be e n t a o i r d. Is ST common in Dutch? If not, you can switch D and S. By the way, what's IJ? I found it here: http://www.cryptogram.org/cdb/words/frequency.html

    "ij" is a common ligature in Dutch, sometimes even considered a single character (like in the website you referred to), and pretty hard to type on Colemak.  j itself is also quite common, mostly in combination with vowels: "jij" (you), "ja" (yes), "je" (you, but with less focus).  I think it would be beneficial to swap j and y, as y is very uncommon in Dutch.  But then again, I probably type as much (if not more) English than Dutch, cf. DreymaR's remark.

    I see a lot of potential in your suggested d-s swap though.  According to your source, d is slightly more frequent than s (e.g. "de" means "the"), and "dt" is a common and annoying same-finger digraph on Colemak.  Also d and t are easily mixed up (quite impossible with Azerty or Qwerty), which leads to embarrassing so-called "dt-errors" in Dutch (since d, t and dt are the most common suffixes in verb conjugation).

    Offline
    • 0
    • Reputation: 114
    • From: Oslo, Norway
    • Registered: 13-Dec-2006
    • Posts: 4,744

    Did you see the work that user Checkit, myself and others did on letter frequencies? https://forum.colemak.com/viewtopic.php?id=128

    On that page we discussed languages a bit. My conclusion, as I remember it, was that languages have quite different letter frequencies for some letters - but overall SpeedMorph is right: The 11 or so most common letters tend to be the same in most Latin- or Germanic-family languages. I know from typing Norwegian that polygraphs are another matter, but that's complex to look into of course. Judging from single-letter frequencies alone, I found that Colemak was in fact a very decent layout for all the languages I studied. It even solved Spanish and Portuguese somewhat better than Dvorak does, as I remember it - and those are fairly common languages.

    My notes and tools for that project can be found here: http://folk.uio.no/obech/Files/Keyboard … encies.zip - there's an Excel sheet in there that you may find interesting.

    @ghen: As I see it, the question isn't so much whether the j is much more common than the y in Dutch, as whether the difference really is big enough to be worth the confusion and extra hassle involved with moving it. After using Norwegian Colemak for some years I've experienced a growing annoyance at the non-compliance with the "standard" US layout - although I do need the three extra Norwegian characters on my layout. I just feel that changing as little as possible is a very good thing when all is said and done, because of the potential synergy effects. Think for instance how much easier it'd be to program a keyboard-related application or help documentation if people's carets and tildes were in the same place and you only had to change a few bits here and there instead of huge chunks.

    Last edited by DreymaR (20-Aug-2008 10:06:06)

    Offline
    • 0
    • Reputation: 15
    • From: Belgium
    • Registered: 26-Feb-2008
    • Posts: 451

    DreymaR: yes, I agree with "better have one good international layout (with perhaps minor local tweaks per language) than a hundred ultra-optimised national layouts".  I was just looking for an indication on how "close to ideal" Colemak is for Dutch vs. for English. :-)

    The visual representations of key frequencies you referred to are indeed very nice, although I find the original black/grey/white images a lot clearer than the ugly "blackbody" versions.  Btw, they also appear on the Spanish Wikipedia page on Colemak (which still exists).

    Offline
    • 0
    • Reputation: 0
    • Registered: 08-Mar-2008
    • Posts: 303

    The J-Y switch would put Y in its original position, making it easier to learn.

    Yes, Spanish is very common, some sources say it's more common than English. (As a first language.)

    Offline
    • 0
    • Reputation: 15
    • From: Belgium
    • Registered: 26-Feb-2008
    • Posts: 451
    SpeedMorph said:

    The J-Y switch would put Y in its original position, making it easier to learn.

    Yes, that's a good argument!  Perhaps I should actually make this change to my layout.  Shai, what's your opinion on swapping j and y?

    Yes, Spanish is very common, some sources say it's more common than English. (As a first language.)

    Definitely: List of languages by number of native speakers (Wikipedia).

    Offline
    • 0
    • Reputation: 0
    • Registered: 17-Mar-2008
    • Posts: 192

    Swapping j and y would probably increase the same-finger ratio. There's a good reason why both index fingers only serve consonants.

    Offline
    • 0
    • Reputation: 0
    • Registered: 08-Mar-2008
    • Posts: 303
    tomlu said:

    Swapping j and y would probably increase the same-finger ratio. There's a good reason why both index fingers only serve consonants.

    Well actually, Y has surprisingly low same finger with most consonants. Y should not be paired with L, R, O, E, or A. (Edit: or M.) Everything else is fine. NY/YN is about as common as FT/TF or RL/LR. However, N has L on the same finger, and LY is to be avoided. Of course, these stats are for English. I'm not sure how it works out for other languages.

    Last edited by SpeedMorph (21-Aug-2008 16:19:07)
    Offline
    • 0
    • Reputation: 0
    • Registered: 17-Mar-2008
    • Posts: 192

    What do you know! Well, I never really thought Y was a vowel anyway... :)

    Last edited by tomlu (20-Aug-2008 22:09:04)
    Offline
    • 0
    • Shai
    • Administrator
    • Reputation: 7
    • Registered: 11-Dec-2005
    • Posts: 380

    If jij is common in Dutch, switching the Y and the J without any further changes wouldn't be great, as it would be a multiple same-finger combo on a weak finger. Also the word "my" would be very uncomfortable to type.
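    Both cases can be checked mechanically. A minimal sketch, assuming the conventional one-finger-per-column mapping with each index finger covering two columns; the helpers `swap_keys` and `same_finger_bigrams` are invented for this example.

```python
# Standard Colemak letter rows.
COLEMAK = [
    "q w f p g j l u y ;".split(),
    "a r s t d h n e i o".split(),
    "z x c v b k m , . /".split(),
]

# ASSUMPTION: conventional column-to-finger mapping (L/R hand + Pinky/Ring/Middle/Index).
FINGER_BY_COLUMN = ["LP", "LR", "LM", "LI", "LI", "RI", "RI", "RM", "RR", "RP"]

def swap_keys(layout, a, b):
    """Return a copy of `layout` with keys `a` and `b` exchanged."""
    exchange = {a: b, b: a}
    return [[exchange.get(key, key) for key in row] for row in layout]

def same_finger_bigrams(word, layout):
    """List the adjacent letter pairs in `word` that land on a single finger."""
    fingers = {key: FINGER_BY_COLUMN[col] for row in layout for col, key in enumerate(row)}
    return [
        a + b
        for a, b in zip(word, word[1:])
        if a != b and a in fingers and b in fingers and fingers[a] == fingers[b]
    ]

J_Y_SWAPPED = swap_keys(COLEMAK, "j", "y")

for word in ("jij", "my"):
    print(word, same_finger_bigrams(word, COLEMAK), same_finger_bigrams(word, J_Y_SWAPPED))
```

    Under that mapping, plain Colemak has no same-finger pairs in either word. After the swap, "jij" becomes two consecutive same-finger pairs ("ji", "ij") on the right ring finger, and "my" becomes a bottom-row to top-row jump on the right index finger.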

    Offline
    • 0