Engram layout

  • Started by binarybottle
  • 35 Replies:
  • Reputation: 2
  • From: NY, NY
  • Registered: 19-Mar-2021
  • Posts: 6

I have used the Colemak layout for over 10 years, and have enjoyed discussions on this forum about optimizing layouts.

For those who may be interested, I've optimized an English keyboard layout, called the Engram layout, distributed it via Keyman, am using it on my Ergodox, Kinesis, and standard keyboards, and have submitted a publication for peer review detailing the optimization strategy and layout.  Please see the preprint: https://www.preprints.org/manuscript/202103.0287/v1

https://engram.dev: Official website
https://github.com/binarybottle/engram: GitHub public repository for engram optimization software
https://keyman.com/keyboards/engram: Keyman distribution website
https://github.com/binarybottle/text_data: GitHub public repository for text data

I am currently including additional layouts and am looking into additional evaluation criteria for the final publication.

Cheers,
@rno
--
binarybottle.com

Last edited by binarybottle (19-Mar-2021 21:35:45)
Offline
  • 0
  • Shai
  • Administrator
  • Reputation: 36
  • Registered: 11-Dec-2005
  • Posts: 423

From a quick look, it seems to do very poorly on same-finger. Especially same-finger on weaker fingers, e.g. PI/KI on the ring finger and GH/HY/WR/RL on the little finger.

The strain caused by the LS/SL combo on the little finger on Dvorak was one of the reasons that drove me to create Colemak. The GH combo alone on Engram is twice as common as that. Also, in general, the little and ring fingers are way overworked.

Also, I don't like where the " is located (QWERTY Y position), as in some contexts it can be used often (speech, programming).

I've tried it briefly, and I find it to be even more straining than QWERTY due to the way it overloads the weaker fingers.

Moving rare keys such as punctuation makes it more difficult to learn, and thus less practical for people. The vast majority of people don't want to switch keyboard layout due to the huge pain involved in learning a new layout. Moving the punctuation makes the pain much greater, and with no clear benefit.

Also, the text_data corpus is very poor (quantity and quality), e.g.

@miamiiboii dead @ yu gettin on wen im leavin
@daniela_95616 hahaa!! i just realized "impune"
10 Favs and i post this revamp for @L7Kroonos ? https://t.co/thjyAUZS7B

I'm hoping to release a high-quality, cleaned-up corpus soon.

Last edited by Shai (19-Mar-2021 21:38:36)
Offline
  • 1
  • Reputation: 2
  • From: NY, NY
  • Registered: 19-Mar-2021
  • Posts: 6

Shai -- I appreciate that your initial impression is based on a quick look, but as I'm sure you can appreciate from the many spirited discussions on this forum and the hard work that many of our colleagues have put into optimizing key layouts, (1) there is a great difference of opinion about the use of the little finger, especially on the home row, and (2) a superficial impression does not often stand up to deeper scrutiny based on statistical analysis. I am currently adding different keyboard layout evaluations to compare 15 different keyboard layouts, Engram and Colemak among them. I will be sure to share the results with this forum.

While I don't feel it's fair to discount all the different publicly available text sources I used based on some of the tweet data, I do look forward to making use of your cleaned up text corpus when it is available.

Cheers,
@rno

Offline
  • 0
  • Shai
  • Administrator
  • Reputation: 36
  • Registered: 11-Dec-2005
  • Posts: 423
Offline
  • 1
  • Reputation: 2
  • From: NY, NY
  • Registered: 19-Mar-2021
  • Posts: 6

Thank you, Shai.

Offline
  • 0
  • Reputation: 117
  • From: UK
  • Registered: 14-Apr-2014
  • Posts: 979

Ah, another layout! They seem to be coming thick and fast these days. I've also only had a cursory look so far, and I would like to look at your model in more detail, but here are my initial thoughts: I like the systematic approach and I largely agree with the basic design principles being used. But what I find curious and interesting is that despite all that, the end result layout that has come out nonetheless seems to have some clear flaws - from my point-of-view at least. I think the clearest conclusion this demonstrates is just how subjective everything related to keyboards and layouts is.

I agree with Shai that the pinkies are the most obvious problem. GH on same finger - and pinky at that - is a significant flaw. But the L position would also be a dealbreaker for me. Pinkies do seem to divide opinion - at the one extreme are layouts like this and at the other are ones like BEAKL that are very anti-pinky. I'm somewhere in the middle and think the best compromise is to allow moderate pinky use on home-row keys, but they shouldn't need to move much, and they should definitely avoid SFBs. Colemak fits this criterion quite well.

Engram reminds me most of Halmak - another layout which in my view is somewhat flawed - but which your scoring system rates as the next best after Engram! So it just goes to show how personal preferences in the model parameters influence what makes a good layout. I think it would be beneficial to acknowledge this subjectivity in the paper.

Some other more minor quibbles are putting Q and Z outside the main block, as this reduces portability for people who like minimalist keyboards, and changing around all the symbols on the number row, which seems like a lot of effort for diminishing returns.

One final point: you say other layouts don't sufficiently take into account ergonomic factors such as finger lengths, strengths and movements, etc., and that you are designing an optimal layout without reference to what has gone before. But, somewhat ironically, you have fallen foul of this very principle in your standard-staggered design, which adopts the traditional fingering scheme. If the aim is to prioritize ergonomics over tradition, I'm surprised you haven't accounted for the Angle Mod in your analysis at all.

One other minor correction, Colemak-DH is from 2014, not 2017.

Last edited by stevep99 (20-Mar-2021 12:55:24)

Using Colemak-DH with Seniply.

Offline
  • 1
  • Reputation: 2
  • From: NY, NY
  • Registered: 19-Mar-2021
  • Posts: 6

stevep99 -- Thank you for the constructive feedback based on your initial impressions.  I agree that the quest for an optimal layout is a subjective one, given that people have different needs and preferences, particularly with respect to different forms of fatigue, strain, and injury. I will be sure to state this more explicitly in the article.

Regarding the use of the little finger, my goal was to maximize the number of inward rolls for high-frequency bigrams while simultaneously reducing the number of same-finger, high-frequency bigrams. Using all four fingers on the home row helps to maximize the number of inward rolls for high-frequency bigrams and minimize the number of same-finger bigrams by spreading out the effort across the fingers. But of course this necessarily results in greater use of the little finger. I have in my software an optional matrix that weights the fingers according to their relative strengths. When I have made use of it, I ended up with more same-finger bigrams (as expected), but perhaps more people would prefer this if it means reducing use of the little fingers.  What do you think?

The finger strengths that I am referring to are based on peak keyboard reaction forces (in newtons) from Table 4 of
"Keyboard Reaction Force and Finger Flexor Electromyograms during Computer Keyboard Work"
BJ Martin, TJ Armstrong, JA Foulke, S Natarajan, Human Factors, 1996, 38(4), 654-664:

middle     2.36
index      2.26
ring       2.02
little     1.84

Do these seem like reasonable strength values to use, or do you know of more relevant published measures?  Or perhaps a better gauge of predicting fatigue/strain would be relative speed?  Dvorak measured taps per second over 15-s intervals, and I also evaluate using a separate publication of inter-key speeds.  Again, what are your thoughts?

As for incorporating angle modifications for standard-staggered design, I primarily use ortholinear (Kinesis and Ergodox) keyboards myself, and am a bit wary of including penalties for this, as keyboards vary.  But let me reconsider this for my analysis.  If you have any suggestions for best practices, I would love to hear!

And thank you for the correction -- I have already made the change of date from 2017 to 2014 for the next round of review of the manuscript. 

I appreciate hearing from you, as it makes this endeavor more enjoyable and will benefit the final outcome and manuscript.

Cheers,
@rno

Offline
  • 1
  • Reputation: 1
  • Registered: 20-Mar-2021
  • Posts: 10

As a keyboard and keyboard layout enthusiast who greatly values scientific effort in attempts at optimized layouts, I felt the need to create an account here just to show my appreciation. But I am particularly intrigued by your layout, because I realized you used the same interkey timing measurements from İşeri and Ekşioğlu (2015) as I currently do in my attempt at layout optimization. The measurements of that study are obviously biased for multiple reasons, one of which being the right-handedness of the participants that you tried to correct; but I found the empirical data they present to be in line with what most people try to heuristically implement in their own projects. In fact, I used their data as-is to optimize a 30-key layout without any other constraints (other than keeping shift pairs together), and came up with this:

 z  ,  '  k  .   v  h  c  g  b
 o  a  e  i  y   m  n  s  t  d
 q  ;  j  u  x   r  l  f  p  w

Apart from overloading the right index finger, and with a questionable placement of the R key, I find this layout to be surprisingly sensible. I wanted to share this here because in my opinion empirical data should be prioritized in determining layout optimization criteria, and I think this layout kind of proves me right. So, my main criticism against your work is that the artificial constraints in your scoring model seem to convolute the empirical data. Other than that, I am greatly interested in your work, and I appreciate that you open sourced your project; I will follow next iterations of it with great interest.
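To make the "overloading right index finger" remark concrete, here is a minimal sketch that maps the layout above to fingers and measures the share of same-finger bigrams. The column-to-finger map is an assumption (traditional touch typing on a 3x10 grid, each index finger covering two columns), and the bigram counts are made up for illustration; substitute counts from a real corpus.

```python
from collections import Counter

# Layout rows as posted above (30 keys, 10 columns per row).
ROWS = [
    "z,'k.vhcgb",
    "oaeiymnstd",
    "q;juxrlfpw",
]

# Assumed column-to-finger map: columns 0-4 are the left hand, 5-9 the right;
# each index finger covers two columns.
FINGERS = ["L-little", "L-ring", "L-middle", "L-index", "L-index",
           "R-index", "R-index", "R-middle", "R-ring", "R-little"]

finger_of = {ch: FINGERS[col] for row in ROWS for col, ch in enumerate(row)}

def same_finger_share(bigram_counts):
    """Return the fraction of counted bigrams typed with one finger (repeated letters excluded)."""
    total = same = 0
    for (a, b), n in bigram_counts.items():
        if a not in finger_of or b not in finger_of:
            continue  # skip characters that are not on the 30-key grid
        total += n
        if a != b and finger_of[a] == finger_of[b]:
            same += n
    return same / total if total else 0.0

# Made-up counts for illustration only.
toy_counts = Counter({("t", "h"): 50, ("h", "e"): 40, ("r", "m"): 5, ("n", "l"): 5})
print(f"same-finger share: {same_finger_share(toy_counts):.2%}")

# Keys assigned to the right index finger under the assumed map.
print(sorted(ch for ch, f in finger_of.items() if f == "R-index"))
```

Under this assumed map the right index finger carries six keys (v, h, m, n, r, l), which is where the overloading comes from.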

Offline
  • 0
  • Reputation: 2
  • From: NY, NY
  • Registered: 19-Mar-2021
  • Posts: 6

Thank you very much, necdetaliozdur!  Just this morning, in fact, I generated a layout solely based on their inter-key timing data -- what a coincidence!  However, I'm not happy with the confounds in their data, and I'm not convinced that speed is a good proxy for comfort, so I am pursuing a modified version of my flow model to account for the underlying structural problems in keyboards that I believe underpin the timing data.  I am close, and will keep you updated!

Offline
  • 1
  • Reputation: 117
  • From: UK
  • Registered: 14-Apr-2014
  • Posts: 979
binarybottle said:

Regarding the use of the little finger, my goal was to maximize the number of inward rolls for high-frequency bigrams while simultaneously reducing the number of same-finger, high-frequency bigrams. Using all four fingers on the home row helps to maximize the number of inward rolls for high-frequency bigrams and minimize the number of same-finger bigrams by spreading out the effort across the fingers.

It's a legitimate aim I think, but my impression is that for most people in the community, there is a consensus that avoiding same-finger bigrams is a higher priority than inward vs outward rolls. Inward rolls are a bit nicer, but that is probably largely outweighed by the observation that rolls in either direction are great *on strong fingers* (Colemak ST/TS/EN/NE), whereas rolls on weaker fingers (AR/RA/IO/OI) are much less satisfactory. I'm not a huge fan of the Colemak AR despite it being an inward roll - but again, this might be personal preference to some extent.
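For readers unsure which of these bigrams count as inward or outward, here is a small sketch that classifies bigrams on the Colemak home row (A R S T D H N E I O). It assumes the traditional fingering, with the inner-column keys D and H typed by the index fingers; the names `HOME`, `HAND`, `RANK` and `classify` are illustrative only.

```python
# Colemak home row, left to right.
HOME = "arstdhneio"
HAND = "LLLLLRRRRR"
# Finger rank counted from the outer edge of each hand:
# 0 = little, 1 = ring, 2 = middle, 3 = index (index also covers the inner column).
RANK = [0, 1, 2, 3, 3, 3, 3, 2, 1, 0]

def classify(bigram):
    """Classify a two-letter home-row bigram by hand and finger movement."""
    a, b = (HOME.index(c) for c in bigram.lower())
    if HAND[a] != HAND[b]:
        return "alternate"
    if RANK[a] == RANK[b]:
        return "same finger"
    # Moving toward the index finger is an inward roll.
    return "inward roll" if RANK[b] > RANK[a] else "outward roll"

for bg in ["ST", "TS", "EN", "NE", "AR", "RA", "IO", "OI"]:
    print(bg, classify(bg))
```

This only labels direction; it says nothing about how comfortable a roll is on a given pair of fingers, which is the point being argued above.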

binarybottle said:

But of course this necessarily results in greater use of the little finger. I have in my software an optional matrix that weights the fingers according to their relative strengths. When I have made use of it, I ended up with more same-finger bigrams (as expected), but perhaps more people would prefer this if it means reducing use of the little fingers.

But there are layouts like Colemak with much lower same-finger utilization, and of course Colemak-DH which is similar but with reduced centre-column usage. Do these score relatively badly in your metrics because of a lower proportion of inwards rolls?

binarybottle said:

The finger strengths that I am referring to are based on peak keyboard reaction forces (in newtons) from Table 4 of
"Keyboard Reaction Force and Finger Flexor Electromyograms during Computer Keyboard Work"
middle     2.36
index      2.26
ring       2.02
little     1.84

I do appreciate the fact that you found some relevant research on this. This is useful data to consider, but I think there is a question over how to make use of it exactly. I mean, the little finger force is 22% lower than the middle finger according to those results, but does that mean the little finger should only have 22% less work? I'd say the difference overall between the fingers is greater than these results would imply. There could be other factors like the dexterity and stamina involved too, and also the fact that the ring and pinky fingers are connected and less independent. I also think the index finger is the best finger for typing, but that doesn't quite match with this force data.

In my own analyzer, I have used "effort values" (i.e. lower is better) of index finger=1.0 (baseline), middle=1.1, ring=1.3, pinky=1.6. But these values are entirely my own judgement and not based on a scientific paper, so I would welcome the chance to have a more solid and evidence-based foundation for relative finger strengths. You have done a great job of looking through the scientific literature though, so I wonder if you have any thoughts on how the research you have found can be used to inform the relative weightings of finger usage for the purposes of keyboard layout design?
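One possible way to put the two sets of numbers side by side is sketched below. The inverse-proportional mapping from force to effort is purely an assumption made for illustration, not something taken from the cited paper or from either analyzer.

```python
# Peak keyboard reaction forces (newtons) quoted above from Martin et al. (1996), Table 4.
force = {"index": 2.26, "middle": 2.36, "ring": 2.02, "little": 1.84}
# The hand-picked effort values given above (lower is better).
judged = {"index": 1.0, "middle": 1.1, "ring": 1.3, "little": 1.6}

# Relative shortfall of the little finger versus the middle finger.
shortfall = 1 - force["little"] / force["middle"]
print(f"little vs middle force: {shortfall:.0%} lower")

# Assumption: effort is inversely proportional to force, scaled so that index = 1.0.
derived = {finger: force["index"] / newtons for finger, newtons in force.items()}
for finger in force:
    print(f"{finger:<7} derived {derived[finger]:.2f}   judged {judged[finger]:.2f}")
```

Under that assumption the force data spans a much narrower range (roughly 0.96 to 1.23) than the judged values (1.0 to 1.6), and it ranks the middle finger ahead of the index finger, which is the mismatch described above.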

Last edited by stevep99 (22-Mar-2021 15:02:22)

Using Colemak-DH with Seniply.

Offline
  • 1
  • Reputation: 214
  • From: Viken, Norway
  • Registered: 13-Dec-2006
  • Posts: 5,368

Maybe some weighting based on these measured finger strengths and the timing data mentioned by necdetaliozdur could prove useful? I'd say that the rapidity of a finger may prove even more useful as data for keyboard effort modeling, but strength should play a part since a weaker finger will grow tired more quickly I think.

Last edited by DreymaR (21-Mar-2021 18:40:55)

*** Learn Colemak in 2–5 steps with Tarmak! ***
*** Check out my Big Bag of Keyboard Tricks for Win/Linux/TMK... ***

Offline
  • 0
  • Shai
  • Administrator
  • Reputation: 36
  • Registered: 11-Dec-2005
  • Posts: 423

Have you read the Colemak Design and the Design FAQ page?

Have you seen stevep99's typing effort grid article?

They discuss some of the aspects of layout design that you may have missed: typing effort grid, design constraints, design priorities, same finger, and the characteristics of each finger (strength, agility, autonomy, flexibility).

You might want to rethink your scoring priorities, because rolls should have very low impact on the scoring model, compared to other aspects.

Also, have you actually managed to attain high proficiency with your layout? Some of the layout design issues (specifically same finger) become significantly more problematic at higher typing speeds (80+ WPM).

Offline
  • 1
  • Reputation: 7
  • Registered: 18-Nov-2017
  • Posts: 67
binarybottle said:

stevep99
middle     2.36
index      2.26
ring       2.02
little     1.84

Do these seem like reasonable strength values to use, or do you know of more relevant published measures? @rno


Attached.

Attachments:
Offline
  • 0
  • Reputation: 7
  • Registered: 18-Nov-2017
  • Posts: 67
DreymaR said:

Maybe some weighting based on these measured finger strengths and the timing data mentioned by necdetaliozdur could prove useful? I'd say that the rapidity of a finger may prove even more useful as data for keyboard effort modeling, but strength should play a part since a weaker finger will grow tired more quickly I think.

Attached.

Attachments:
Offline
  • 0
  • Reputation: 7
  • Registered: 18-Nov-2017
  • Posts: 67

Hi

necdetaliozdur said:
 z  ,  '  k  .   v  h  c  g  b
 o  a  e  i  y   m  n  s  t  d
 q  ;  j  u  x   r  l  f  p  w

Preferred name?

Thanks, Ian

Offline
  • 0
  • Shai
  • Administrator
  • Reputation: 36
  • Registered: 11-Dec-2005
  • Posts: 423

iandoug, the sample size also looks really small, and the methodology seems flawed, so I don't think there's too much value in it.

For me the data looks extremely suspect. The hand split almost perfectly matches Dvorak (only P/Z swapped).

I wouldn't be surprised if the author used their own typing data to create these figures.

If you look at the hierarchy of evidence pyramid, this would be near the bottom.

Unfortunately, not all science is good science.

Also, it's focusing on the wrong metric. Strain/comfort is much more important than speed.

Offline
  • 0
  • Reputation: 214
  • From: Viken, Norway
  • Registered: 13-Dec-2006
  • Posts: 5,368

Iandoug, in your attached diagram the rareness of bigrams doesn't seem to matter much but same-hand bigrams are a few ms slower than alternating ones. However, for a proper analysis we'd need more detailed measurements. For instance, the IO bigram is probably a lot slower than the EN bigram even though both are same-hand. And whether you alternate once or several times over should matter quite a lot. At very high typing speeds it'd be a lot more comfy and less tiring to type RSTIEN over and over than to type TNTNTN over and over, I'm pretty sure of that!

I don't suppose we have any studies that go into this sort of detail, though?

Last edited by DreymaR (22-Mar-2021 13:30:05)


Offline
  • 0
  • Reputation: 7
  • Registered: 18-Nov-2017
  • Posts: 67
Shai said:

iandoug, the sample size also looks really small, and the methodology seems flawed, so I don't think there's too much value in it.

For me the data looks extremely suspect. The hand split almost perfectly matches Dvorak (only P/Z swapped).

For the record, Arno asked for my inputs a few weeks back and we have been communicating.

One of the first things I pointed out was the corpus.

If Arno doesn't mind open feedback, then also:
1. Ignoring upper case, and
2. space bar and
3. most punctuation

is going to produce a layout which is "best" on those metrics but not in the real world.

His actual layout generation is bigram-based, he is just cross-checking with the analyzers.

Using any analyzer which relies on input texts (eg KLA and forks) requires input texts that closely match English letter frequency. A similar idea applies to bigram-based analysis ... do your bigrams include all the chars on the keyboard, including non-letters, and upper/lower case? If not, then your results will be out.
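A toy illustration of how bigram counts can shift when case, spaces and punctuation are dropped. It assumes the text is reduced to lowercase a-z before counting, which may or may not be how any particular optimizer prepares its input; the sample string and the `bigrams` helper are made up for the example.

```python
from collections import Counter
import string

def bigrams(text, letters_only):
    """Count adjacent character pairs; optionally reduce the text to lowercase a-z first."""
    if letters_only:
        text = "".join(c for c in text.lower() if c in string.ascii_lowercase)
    return Counter(zip(text, text[1:]))

sample = "Hello, World! It's 9 a.m."  # toy text, for illustration only
full = bigrams(sample, letters_only=False)
reduced = bigrams(sample, letters_only=True)

# Letter pairs that only become adjacent because the characters between them were removed.
spurious = set(reduced) - {(a.lower(), b.lower()) for a, b in full}
print(len(full), len(reduced), sorted(spurious))
```

In the reduced text, "o" and "w" become neighbours even though a comma, a space and a Shift press sit between them when actually typing, so the counts describe keystroke sequences that never occur.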

For input texts, most don't match very well. I have attached one which is better than most, don't judge the content, just worry about the letter frequency. I have permission from the author to use his writings for these purposes. I've tested scads of things trying to find proper letter frequency. This is good for the first 15 chars, which covers the bulk of English. It's certainly way better than Alice Ch. 1 or the assorted word lists.

Shai, I only seem to be able to add one attachment per message ... is that how it is?

Cheers, Ian

Last edited by iandoug (22-Mar-2021 14:25:34)
Attachments:
Offline
  • 0
  • Shai
  • Administrator
  • Reputation: 36
  • Registered: 11-Dec-2005
  • Posts: 423
iandoug said:

Shai, I only seem to be able to add one attachment per message ... is that how it is?

Generally it's recommended to post files somewhere else, and then post a link. You can also ZIP the files.
Are you trying to post the same files as the ones linked earlier in GitHub? It's better to link to the source, as that will be up to date.

https://github.com/binarybottle/text_data

You may have missed it, but I've created a corpus which is much larger, cleaner and varied.
You can download the corpus here

Offline
  • 1
  • Reputation: 7
  • Registered: 18-Nov-2017
  • Posts: 67
Shai said:

Are you trying to post the same files as the ones linked earlier in GitHub? It's better to link to the source, as that will be up to date.

https://github.com/binarybottle/text_data

Yes, didn't know he had added it to his repo..

Shai said:

You may have missed it, but I've created a corpus which is much larger, cleaner and varied.
You can download the corpus here

I did see your post but then I saw the size ... it's too big to feed to KLA.

It's also not entirely clean ... my checker aborts when it finds chars not on a standard US keyboard, first problem is here:

The char '' (ord 194) is required at line 115818:118 of length 671

So I ran it through my cleaner, but still only get as far as

The char '' (ord 194) is required at line 247013:94 of length 678

Turned out to be the currency character.

Resorted to manual editing, it was faster.
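A check of this kind can be sketched in a few lines. This is not the checker used above; it is a minimal stand-in that assumes "standard US keyboard" means printable ASCII plus tab and newline, and it works on decoded text rather than raw bytes.

```python
import string

# Characters assumed typeable on a standard US keyboard: printable ASCII plus tab/newline.
ALLOWED = set(string.printable) - {"\x0b", "\x0c", "\r"}

def first_offender(lines):
    """Return (line_no, col_no, char) for the first disallowed char, or None; both 1-based."""
    for line_no, line in enumerate(lines, start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch not in ALLOWED:
                return line_no, col_no, ch
    return None

sample = ["plain ascii line\n", "price: 5\u00a4 each\n"]  # toy input, not from the corpus
hit = first_offender(sample)
if hit:
    line_no, col_no, ch = hit
    print(f"char {ch!r} (ord {ord(ch)}) at line {line_no}:{col_no}")
```

As a side note, 194 is 0xC2, the lead byte of the two-byte UTF-8 encoding for code points U+0080 to U+00BF, a range that includes several currency signs. That would fit a checker that reads the file byte by byte, though that is only a guess about the tool used above.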

Here's a link to a cleaner version.
195,314,036 bytes
https://www.keyboard-design.com//etc/webcorpus.zip

The zip includes an analysis, as well as analysis of Patrick's corpus, my suggested text, and another one made by my monkey. The analyzer can't tell the difference between monkey text and actual English because it's made from bigram analysis of my corpus. So it's "almost perfect" "English".

You want both Similarity and FreqMatch as close to 100 as possible (one from below, the other from above), and TopFifteen as close to 120 (perfect) as possible.

I have a bug somewhere in the analyzer so it doesn't handle the tab character properly.

The .csv file is tab-delimited with NO string delimiter. Works in LibreOffice, should work in Excel. You may need to first open a blank spreadsheet to put LibreOffice in the right mode.

Cheers, Ian

Offline
  • 0
  • Reputation: 1
  • Registered: 20-Mar-2021
  • Posts: 10
DreymaR said:

Maybe some weighting based on these measured finger strengths and the timing data mentioned by necdetaliozdur could prove useful? I'd say that the rapidity of a finger may prove even more useful as data for keyboard effort modeling, but strength should play a part since a weaker finger will grow tired more quickly I think.

Another article from the same author also presents the findings on finger strengths in terms of Borg's rating of perceived exertion

Finger       M   SD
Left little  3.3 2.1
Left ring    2.9 2.1
Left middle  2.3 1.9
Left index   2.0 1.8
Right index  1.6 1.8
Right middle 2.0 1.9
Right ring   2.3 1.9
Right little 2.7 2.1
Total        2.4 2.0

The relative strengths of fingers according to these results (if you normalize the values with respect to the index finger for each hand) are quite similar to what stevep99's been using in their analyzer. Personally, in my own hobbyist attempts of layout optimization so far, I've been using stevep99's weightings to account for finger strength.
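The per-hand normalization described here can be reproduced as follows, using only the means from the table and the effort values stevep99 quoted earlier in the thread (variable names are illustrative).

```python
# Mean Borg RPE per finger, copied from the table above (lower = less perceived exertion).
rpe = {
    "left":  {"little": 3.3, "ring": 2.9, "middle": 2.3, "index": 2.0},
    "right": {"little": 2.7, "ring": 2.3, "middle": 2.0, "index": 1.6},
}
# stevep99's effort values, as given earlier in the thread.
steve = {"index": 1.0, "middle": 1.1, "ring": 1.3, "little": 1.6}

# Divide each finger's mean by the index-finger mean of the same hand.
relative = {
    hand: {finger: value / vals["index"] for finger, value in vals.items()}
    for hand, vals in rpe.items()
}

for finger in ["index", "middle", "ring", "little"]:
    print(f"{finger:<7} left {relative['left'][finger]:.2f}  "
          f"right {relative['right'][finger]:.2f}  stevep99 {steve[finger]:.2f}")
```

The little-finger ratios come out near 1.65 to 1.7 on both hands, close to the 1.6 used in the analyzer; the middle and ring ratios run somewhat higher than 1.1 and 1.3.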

Edited to correct the reference.

Last edited by necdetaliozdur (25-Mar-2021 12:54:46)
Offline
  • 1
  • Reputation: 1
  • Registered: 20-Mar-2021
  • Posts: 10
iandoug said:

Hi

necdetaliozdur said:
 z  ,  '  k  .   v  h  c  g  b
 o  a  e  i  y   m  n  s  t  d
 q  ;  j  u  x   r  l  f  p  w

Preferred name?

Thanks, Ian

You can call it "Righty" I guess, because of the right-handedness of the layout and the jarring placement of the letter R. I couldn't think of any other fancy name for it (⌒_⌒;)

Offline
  • 0
  • Reputation: 1
  • Registered: 20-Mar-2021
  • Posts: 10
binarybottle said:

Just this morning, in fact, I generated a layout solely based on their inter-key timing data -- what a coincidence!  However, I'm not happy with the confounds in their data, and I'm not convinced that speed is a good proxy for comfort, so I am pursuing a modified version of my flow model to account for the underlying structural problems in keyboards that I believe underpin the timing data.

I kind of agree, typing speed doesn't necessarily imply comfort, but they are definitely correlated. The problem right now is that we only have info on typing speed in terms of bigrams, and they only tell a narrow portion of the story. For example, I am pretty sure the speed and comfort of the bigram AR in ARC, CAR, BAR or EAR (in Colemak) varies because of the preceding/following letter positions. Right now I am attempting to extract ngram (n>2) timing data from my own Typeracer history with the help of this article to see if my assumption is correct.
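As a rough idea of what n-gram timing extraction could look like, here is a sketch that works on a hypothetical keystroke log of (character, timestamp in ms) pairs. The log format and the numbers are invented for the example; they are not Typeracer's actual format or real measurements.

```python
from collections import defaultdict
from statistics import mean

def ngram_times(keystrokes, n=3):
    """Map each n-gram to the mean time (ms) from its first to its last keypress."""
    spans = defaultdict(list)
    for i in range(len(keystrokes) - n + 1):
        window = keystrokes[i:i + n]
        gram = "".join(ch for ch, _ in window)
        spans[gram].append(window[-1][1] - window[0][1])
    return {gram: mean(ts) for gram, ts in spans.items()}

# Hypothetical log of (character, timestamp in ms).
log = [("c", 0), ("a", 90), ("r", 210), (" ", 300), ("a", 380), ("r", 470), ("c", 640)]
times = ngram_times(log)
print(times["car"], times["arc"])
```

With enough real data, comparing trigrams that share the AR bigram (such as CAR versus ARC) would show whether the neighbouring key changes the timing.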

Offline
  • 0
  • Reputation: 117
  • From: UK
  • Registered: 14-Apr-2014
  • Posts: 979
necdetaliozdur said:
Finger       M   SD
Left little  3.3 2.1
Left ring    2.9 2.1
Left middle  2.3 1.9
Left index   2.0 1.8
Right index  1.6 1.8
Right middle 2.0 1.9
Right ring   2.3 1.9
Right little 2.7 2.1

So, just for fun, I normalized these values to right-index=1.0, and put them into my effort grid calculator. It's a bit strange because there is a very strong right hand bias - right pinky is deemed substantially better than left ring? Anyway, attached is the result I got. Make of it what you will!
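The single-baseline normalization can be checked with a few lines; this only rescales the quoted means and does not reproduce the effort grid calculator itself.

```python
# Mean Borg RPE values quoted above, keyed by (hand, finger).
rpe = {
    ("left", "little"): 3.3, ("left", "ring"): 2.9,
    ("left", "middle"): 2.3, ("left", "index"): 2.0,
    ("right", "index"): 1.6, ("right", "middle"): 2.0,
    ("right", "ring"): 2.3, ("right", "little"): 2.7,
}
base = rpe[("right", "index")]
effort = {key: value / base for key, value in rpe.items()}

# Rank fingers from least to most effort under this normalization.
for (hand, finger), e in sorted(effort.items(), key=lambda kv: kv[1]):
    print(f"{hand:<5} {finger:<6} {e:.2f}")
```

With right index as the baseline, the right little finger does come out with lower effort than the left ring finger, which is the right-hand bias noted above.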

Attachments:

Using Colemak-DH with Seniply.

Offline