
    carpalx - recent experiences in evaluating/improving Colemak

    • Started by martin.krzywinski
    • 6 Replies:
    • Reputation: 0
    • Registered: 22-Aug-2009
    • Posts: 6

    Some of you may be familiar with my carpalx project

    http://mkweb.bcgsc.ca/carpalx

    which attempts to construct a model of typing effort, evaluate existing layouts using the model and try to find more effortless alternatives.

    I've revamped much of the model and analysis, so I wanted to drop a note here to invite you over to the project pages and to share a few comments about my recent experiences in evaluating and improving layouts. The recent release also includes PKL layouts.

    As most of you expect, Colemak is a great layout with arguably the best statistics out of the popular alternatives.

    http://mkweb.bcgsc.ca/carpalx/?popular_alternatives

    My model of typing effort actually had a hard time creating a layout that improved on Colemak in all three types of effort: finger distance, finger and row penalty, and stroke path. Initially, I wanted to build a justifiable model of typing effort and then, having created one, to find out what kind of layouts could be generated by asking for this effort to be minimized.

    The details about the model, and its components, can be found here

    http://mkweb.bcgsc.ca/carpalx/?typing_effort

    While many optimized layouts for English exist, fewer are available for non-English typists. The typing effort model can be applied to discover the best layout for non-English texts.

    One benefit of a generalized model is the ability to incrementally improve a layout. For those not yet ready to jump from the QWERTY ship, I suggest the partially optimized layouts, in which either 5 or 10 most efficient key swaps were made.

    http://mkweb.bcgsc.ca/carpalx/?partial_optimization
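
    The idea of applying only the few most effective key swaps can be sketched as a greedy search. This is a rough illustration under stated assumptions, not the carpalx procedure: `toy_effort`, `best_swaps`, and the row penalty values are hypothetical stand-ins for the real effort model.

```python
from itertools import combinations

# Hypothetical per-row penalties (home row cheapest); not carpalx's values.
ROW_PENALTY = {0: 1.0, 1: 0.0, 2: 2.0}

QWERTY = ["qwertyuiop", "asdfghjkl;", "zxcvbnm,./"]


def toy_effort(rows, freq):
    """Sum of letter frequency times the penalty of the row the letter sits on."""
    return sum(freq.get(ch, 0) * ROW_PENALTY[r]
               for r, row in enumerate(rows) for ch in row)


def swap(rows, a, b):
    """Return a new layout with the keys a and b exchanged."""
    table = str.maketrans(a + b, b + a)
    return [row.translate(table) for row in rows]


def best_swaps(rows, freq, k):
    """Greedily apply up to k swaps, each the one that lowers effort the most."""
    applied = []
    keys = "".join(rows)
    for _ in range(k):
        current = toy_effort(rows, freq)
        candidates = [(toy_effort(swap(rows, a, b), freq), a, b)
                      for a, b in combinations(keys, 2)]
        effort, a, b = min(candidates)
        if effort >= current:
            break  # no single swap helps any more
        rows = swap(rows, a, b)
        applied.append((a, b))
    return rows, applied
```

    Calling `best_swaps(QWERTY, freq, 5)` with a letter-frequency dictionary returns the modified layout and the list of swaps that were applied. A greedy search like this can stop short of the best possible set of k swaps, since each step only looks one swap ahead.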

    Of course the ultimate question for English typists is: can Colemak be improved? If so, by how much? And, if by little, is there any point in switching?

    These two layouts attempt to address this

    http://mkweb.bcgsc.ca/carpalx/?improving_colemak

    Both of these layouts (PBFMWJ, which rearranges ZXCV, and GYLMWP, which does not) have significantly lower hand asymmetry than Colemak (1-2% vs 6%), somewhat lessen the burden on the pinky (14-15% vs 16%), and make less use of the bottom row (5-7% vs 9%). The major component of the decrease in typing effort (about 5% lower than Colemak) is due to decreased use of the pinky and bottom row, both of which are penalized in the model.
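
    Statistics of this kind can be approximated with a few lines of code. The sketch below is a minimal illustration, assuming a 3x10 key grid and the usual touch-typing finger assignment by column; the `usage` function and `COLUMN_TO_FINGER` table are hypothetical names, and the exact definitions used by carpalx (for hand asymmetry in particular) may differ.

```python
from collections import Counter

# Assumption: columns 0-4 belong to the left hand, 5-9 to the right hand.
# Fingers are numbered 0 (left pinky) to 7 (right pinky); each index finger
# covers two columns.
COLUMN_TO_FINGER = [0, 1, 2, 3, 3, 4, 4, 5, 6, 7]

COLEMAK = ["qwfpgjluy;", "arstdhneio", "zxcvbkm,./"]


def usage(rows, text):
    """Return (left-hand share, pinky share, bottom-row share) of typed keys."""
    position = {ch: (r, c) for r, row in enumerate(rows) for c, ch in enumerate(row)}
    hits = Counter(ch for ch in text.lower() if ch in position)
    total = sum(hits.values())
    if total == 0:
        raise ValueError("text contains no keys from the layout")
    left = pinky = bottom = 0
    for ch, n in hits.items():
        r, c = position[ch]
        left += n * (c < 5)                           # key is on the left half
        pinky += n * (COLUMN_TO_FINGER[c] in (0, 7))  # key is typed by a pinky
        bottom += n * (r == 2)                        # key is on the bottom row
    return left / total, pinky / total, bottom / total
```

    One possible asymmetry measure is the absolute difference between the left-hand share and the right-hand share (one minus the left-hand share), though that is only one of several reasonable definitions.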

    The improvement introduced by these layouts is incremental, of course, since Colemak is already heavily optimized. For some, the improvement may not be seen as such, because typing preferences vary and minor details of layouts so close to the edge of optimum are arguable.

    If your desire is to lessen the use of the pinky, at the expense of slightly increasing finger travel distance, then the fully optimized layouts (these vary based on whether ; is remapped and whether zxcv is fixed) are ideal

    http://mkweb.bcgsc.ca/carpalx/?full_optimization

    If, on the other hand, you prefer shorter finger travel at the expense of pinky use then the two Colemak improvements suit nicely.

    In conclusion, while Colemak does not minimize any single typing parameter (e.g. Hallingstad's Arensito has the lowest finger travel distance of all the popular alternatives that I looked at), it does strike an excellent balance, with many parameters close to a global minimum. Whether it is worthwhile to incrementally minimize the parameters further in practice is debatable, though it's an interesting academic problem. However, applying the carpalx model to non-English texts should be fruitful in generating optimized layouts for other languages.

    • Reputation: 0
    • Registered: 08-Mar-2008
    • Posts: 303

    Carpalx is a good model, I think, and it is much better put together than any other alternative layout besides Colemak, but it has some serious flaws. Most importantly, it either completely ignores or greatly underrepresents the value of same finger usage. Any good layout should attempt to minimize this as much as possible. The improvements that your layouts gain are only at the expense of high same finger. If you look at Colemak, Capewell, Arensito or my layout (which are IMO the four best layouts), you'll notice that they all have very low same finger.

    This is something that is a matter of much controversy and that Shai would probably not agree with, but I think that your layouts do not emphasize finger rolls enough. If you look at ARENSITO for example, it has numerous rolls on the home row: RE/ER is very common, IS and TO are both common, and there are various other relatively common finger rolls that occur. Your layouts, however, do not do a good job of emphasizing this point.

    I do like your idea of emphasizing a combination of same hand/hand alternation with the two on one hand/one on the other. My own keyboard optimization algorithm does not use this, but only for efficiency purposes. It's about five times faster if I use digraphs instead of trigraphs.

    P.S. How does the program work? I cannot read Perl.
    P.P.S. Why Perl? Why not a lower-level language like C?

    • Reputation: 210
    • From: Viken, Norway
    • Registered: 13-Dec-2006
    • Posts: 5,345

    I agree with SpeedMorph: Same-finger is evil.

    Doesn't seem to me that a ringfinger-to-pinky roll (such as the TO roll in ARENSITO) is good though?

    Not trying to answer for Martin, but: Perl is a very good string-handling language and less hassle to write in than C for small applications - more to the point though, some people are better at perl than at C!  :)


    • Reputation: 0
    • Registered: 08-Mar-2008
    • Posts: 303
    DreymaR said:

    I agree with SpeedMorph: Same-finger is evil.

    Doesn't seem to me that a ringfinger-to-pinky roll (such as the TO roll in ARENSITO) is good though?

    I was mostly using that as an example. I don't like it too much either. However, I do like RE/ER and IS. Interestingly, on my layout version 2.0 the home row is OANTSERI, so completely by accident, the pinky doesn't have to do a lot of rolls.

    Not trying to answer for Martin, but: Perl is a very good string-handling language and less hassle to write in than C for small applications - more to the point though, some people are better at perl than at C!  :)

    String-handling, huh? Maybe that's why his program has a "worst word" function while it would be extraordinarily difficult to implement something like that into my (C) program.

    • Reputation: 0
    • Registered: 22-Aug-2009
    • Posts: 6
    SpeedMorph said:

    Most importantly, it either completely ignores or greatly underrepresents the value of same finger usage.
    ...
    If you look at Colemak, Capewell, Arensito or my layout (which are IMO the four best layouts), you'll notice that they all have very low same finger.

    The model does take same finger into account, though not explicitly. Same finger, as well as same hand and same row usage, are dealt with in the stroke path component.

    http://mkweb.bcgsc.ca/carpalx/?typing_e … troke_path

    This is also the component that tries to measure finger rolling. Unfortunately, this stroke path quantity may try to do too much, and spreads itself too thinly. It's likely that I need to revisit the parameter values for this component to make sure that clearly undesirable characteristics have meaningful penalties.

    Using my corpus, I find that Arensito has the highest finger alternation, with 94% of successive keystrokes using different fingers. However, this keyboard's use of the pinky and ring finger is very high. Colemak and Capewell both have this same quantity at 93% and my own optimized layouts at 91-92%.
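
    A figure like this can be estimated by walking over adjacent character pairs in a corpus. The following is a minimal sketch, not the carpalx calculation: it assumes a 3x10 grid with the usual column-to-finger assignment, skips pairs that include a character not on the grid (such as a space), and skips repeated keys. `finger_alternation` and `COLUMN_TO_FINGER` are hypothetical names.

```python
# Assumption: fingers are numbered 0 (left pinky) to 7 (right pinky),
# and each index finger covers two columns.
COLUMN_TO_FINGER = [0, 1, 2, 3, 3, 4, 4, 5, 6, 7]

COLEMAK = ["qwfpgjluy;", "arstdhneio", "zxcvbkm,./"]


def finger_alternation(rows, text):
    """Fraction of successive keystroke pairs typed with different fingers."""
    finger = {ch: COLUMN_TO_FINGER[c] for row in rows for c, ch in enumerate(row)}
    text = text.lower()
    # Keep only adjacent pairs where both keys are on the layout and differ.
    pairs = [(a, b) for a, b in zip(text, text[1:])
             if a in finger and b in finger and a != b]
    if not pairs:
        raise ValueError("text contains no usable key pairs")
    different = sum(finger[a] != finger[b] for a, b in pairs)
    return different / len(pairs)
```

    One minus this value is the same-finger rate under the same assumptions. Whether doubled letters should count as same-finger usage is a modelling choice, and different choices will shift the percentages.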

    I agree that avoiding same finger usage is a valuable characteristic. It's something that comes at a cost of other characteristics. Reducing same finger usage requires that work is balanced between all fingers, and this necessarily means assigning more work to the pinky and index fingers. I guess that in my model I have tried to reach a balance between reducing pinky use and same finger usage.

    P.S. How does the program work? I cannot read Perl.
    P.P.S. Why Perl? Why not a lower-level language like C?

    carpalx uses simulated annealing to find a layout with minimum effort. Since this is a stochastic process, many iterations are required to ensure that a global minimum has been found (strictly speaking, to increase the probability of this happening, since you cannot be assured you've found the best layout until you've examined them all). I used Perl because it is a language I use on a daily basis and am fluent in. I'm not concerned about speed, because I have a large cluster that I can run my code on (which I'd still have to use even if the app were coded in C).
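
    For readers who do not read Perl, the general shape of simulated annealing over key swaps can be sketched in Python. This is a generic textbook version under stated assumptions, not a port of carpalx: the objective `toy_effort`, the tables `FREQ` and `SLOT_COST`, and the cooling parameters are all made up for illustration.

```python
import math
import random

# Toy objective (assumption): frequent letters should sit on cheap slots.
FREQ = {"a": 6, "b": 5, "c": 4, "d": 3, "e": 2, "f": 1}
SLOT_COST = [5, 4, 3, 2, 1, 0]


def toy_effort(keys):
    """Frequency-weighted cost of the slot each key occupies."""
    return sum(FREQ[ch] * SLOT_COST[i] for i, ch in enumerate(keys))


def anneal(layout, effort, steps=20000, t0=1.0, cooling=0.9995, seed=0):
    """Minimize effort(layout) over key permutations by simulated annealing."""
    rng = random.Random(seed)
    current = list(layout)
    current_cost = effort(current)
    best, best_cost = current[:], current_cost
    temperature = t0
    for _ in range(steps):
        i, j = rng.sample(range(len(current)), 2)
        current[i], current[j] = current[j], current[i]      # propose a swap
        cost = effort(current)
        delta = cost - current_cost
        # Always accept improvements; accept a worse layout with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current_cost = cost
            if cost < best_cost:
                best, best_cost = current[:], cost
        else:
            current[i], current[j] = current[j], current[i]  # undo the swap
        temperature *= cooling                                # cool down
    return "".join(best), best_cost
```

    `anneal("abcdef", toy_effort)` returns the best arrangement seen and its cost. Accepting some worse layouts early on is what lets the search escape local minima; as the temperature falls, the process behaves more and more like plain descent.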

    • Reputation: 0
    • Registered: 08-Mar-2008
    • Posts: 303
    martin.krzywinski said:

    The model does take same finger into account, though not explicitly. Same finger, as well as same hand and same row usage, are dealt with in the stroke path component.

    http://mkweb.bcgsc.ca/carpalx/?typing_e … troke_path

    This is also the component that tries to measure finger rolling. Unfortunately, this stroke path quantity may try to do too much, and spreads itself too thinly. It's likely that I need to revisit the parameter values for this component to make sure that clearly undesirable characteristics have meaningful penalties.

    Using my corpus, I find that Arensito has the highest finger alternation, with 94% of successive keystrokes using different fingers. However, this keyboard's use of the pinky and ring finger is very high. Colemak and Capewell both have this same quantity at 93% and my own optimized layouts at 91-92%.

    The problem here appears to be a conflict of corpuses. By my corpus, Colemak and GYLMWP look like this:

    q w f p g  j l u y ;
    a r s t d  h n e i o
    z x c v b  k m , . '
    
    Fitness:       19.69
    Distance:      1900.91
    Inward rolls:  4.88%
    Outward rolls: 3.76%
    Same hand:     21.17%
    Same finger:   0.81%
    Row change:    9.03%
    Home jump:     0.50%
    To center:     3.72%
    
    g y l m w  p f u b ;
    r s n t d  h a e o i
    z x c v q  j k , . '
    
    Fitness:       19.75
    Distance:      1915.20
    Inward rolls:  3.19%
    Outward rolls: 4.07%
    Same hand:     16.51%
    Same finger:   1.99%
    Row change:    7.49%
    Home jump:     0.27%
    To center:     4.98%

    Fitness is highly subjective, and distance is somewhat subjective, but the rest are absolute as far as my corpus goes. Colemak does far better on same finger and somewhat better on distance, in exchange for worse same hand. I think that Colemak is more balanced. My corpus (see http://mtgap.bilfo.com/theory-of-letter-frequency.html for more info) contains the entire Carpalx corpus (thanks :P ) as well as a lot more stuff. I think that the Carpalx corpus is somewhat biased since it is entirely made up of books (mostly old ones) and Java, Perl, and Ruby code. My own corpus includes news articles, emails, websites, and C code.


    I agree that avoiding same finger usage is a valuable characteristic. It's something that comes at a cost of other characteristics. Reducing same finger usage requires that work is balanced between all fingers, and this necessarily means assigning more work to the pinky and index fingers. I guess that in my model I have tried to reach a balance between reducing pinky use and same finger usage.

    This is not entirely true. Reducing same finger can sometimes just mean using an uncommon key with a vowel, and this is especially easy on the pinkies. Colemak uses Q and Z with A on the left pinky, for instance, and this works very well.

    • Reputation: 0
    • Registered: 01-May-2009
    • Posts: 68

    I would like to know what typing data you have based the weights of the various metrics on. If they aren't based on real-life typing data, what good are they? Your estimate of how slow certain factors are should rest on how slow they actually _are_, and likewise for fast patterns. Real typing data may show that, depending on how it is done, some "distance" matters more than others, and may or may not be "worth" certain kinds of "same finger", for instance. I believe we must therefore look at real typing data rather than trying to "optimize" arbitrary balances between various invented factors assumed to correlate linearly with keyboard value.
