Engram layout

  • Started by binarybottle
  • 35 Replies:
  • Reputation: 7
  • Registered: 18-Nov-2017
  • Posts: 67
necdetaliozdur said:

You can call it "Righty" I guess, because of the right-handedness of the layout and the jarring placement of the letter R. I couldn't think of any other fancy name for it (⌒_⌒;)

Thanks, will add to test suite.

Cheers, Ian

Offline
  • 0
  • Reputation: 7
  • Registered: 18-Nov-2017
  • Posts: 67
necdetaliozdur said:

I kind of agree: typing speed doesn't necessarily imply comfort, but they are definitely correlated. The problem is that right now we only have info on typing speed in terms of bigrams, and they tell only a narrow part of the story. For example, I am pretty sure the speed and comfort of the bigram AR in ARC, CAR, BAR or EAR (in Colemak) varies because of the preceding/following letter positions. Right now I am attempting to extract ngram (n>2) timing data from my own Typeracer history with the help of this article to see if my assumption is correct.

I've been testing layouts again, a layout from Snarfangel (YOP_UIAN Kinesis 1) did rather well so I took a look ... he's got characters on the bottom row of the Ergodox, I can't imagine that those will be easy to type. Middle fingers do not like to curl down that much.

I think there's something missing from the analyzers... some "comfort" measure.

For example, the AltGr layouts score well, even on ANSI, but putting your hand in "touch typing" position, with your right thumb on AltGr, must be uncomfortable, but the analyzers don't care ...

My take-away from the various papers that I've read is that measuring "effort" or strength or flexibility during typing is non-trivial: there are a multitude of different factors interfering with each other. So we try to make sane approximations instead, and rely on the "scientific process" to steer us to the better solutions.

The paper I linked above points out that *in general* there are dramatic differences between men and women (typing... :-) ) on top of differences between others of the same gender. This stuff *is complicated* ... :-)

Cheers, Ian

Offline
  • 1
  • Reputation: 7
  • Registered: 18-Nov-2017
  • Posts: 67
Shai said:

You may have missed it, but I've created a corpus which is much larger, cleaner and varied.
You can download the corpus here


FWIW, last year in September I compiled an English corpus and a code corpus. I posted some analysis to Den's site before it disappeared, along with my breathless prose.

Anyway, Arno's efforts motivated me to finally write it all up and do some other analysis, which is published here, along with some useful files. Get the 1.0.1 and -101 versions.

https://zenodo.org/record/4644104

During the process I wrote Shakespeare's Monkey and Shakespeare's Coder ... little programs that use the English and code bigram frequencies (all 97 characters on ANSI) to produce "English" or "code" that things like KLA think are perfect English or code, even though we can see they're not.

But they work great in KLA, and certainly better than the likes of Alice. It also effectively solves "how to test typing code" since the code samples are multi-language.

They should also be suitable fodder for analysis engines working with bigrams, since they are nothing but chained bigrams.
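
The "chained bigrams" idea can be illustrated with a minimal sketch. This is not the actual Shakespeare's Monkey code, which is not shown in the thread; the function names and the tiny sample text are assumptions made for illustration. Each next character is drawn in proportion to how often it follows the previous one, so every adjacent pair in the output is a bigram that occurs in the source.

```python
import random
from collections import Counter, defaultdict

def bigram_counts(text):
    """Count adjacent character pairs in text."""
    return Counter(zip(text, text[1:]))

def build_chain(counts):
    """Map each first character to its followers and their frequencies."""
    chain = defaultdict(lambda: ([], []))
    for (first, second), n in counts.items():
        followers, weights = chain[first]
        followers.append(second)
        weights.append(n)
    return chain

def generate(chain, start, length, seed=None):
    """Chain bigrams: pick each next character by bigram frequency."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < length:
        followers, weights = chain.get(out[-1], ([], []))
        if not followers:  # dead end: no bigram starts with this character
            break
        out.append(rng.choices(followers, weights=weights)[0])
    return "".join(out)

# Hypothetical stand-in for a real corpus
sample = "the quick brown fox jumps over the lazy dog and then the other dog"
chain = build_chain(bigram_counts(sample))
text = generate(chain, "t", 40, seed=1)
print(text)
```

With a real corpus the same loop would be fed the published bigram frequency table instead of counts taken from a sample string.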

There's a whole bunch of "useful" spreadsheets in the zip file. Enjoy. :-)

I would not recommend them for cryptanalysis, but they are good for keyboard layouts.

Cheers, Ian

Offline
  • 1
  • Reputation: 214
  • From: Viken, Norway
  • Registered: 13-Dec-2006
  • Posts: 5,368
iandoug said:

For example, the AltGr layouts score well, even on ANSI, but putting your hand in "touch typing" position, with your right thumb on AltGr, must be uncomfortable, but the analyzers don't care ...

It's worse: Today's analyzers generally think that if you put something on the home position of a layer, no matter how convoluted the means to reaching that layer, it's the bee's knees. They have no respect for the chording-vs-sequencing issue either (see XahLee's article on sequencing), and I don't think they even know what a dead key is? But as long as you don't put anything important on dead keys you may exempt them from analysis I guess. It'd still be interesting to assess the difference between having your locale letters on AltGr mappings vs dead keys, but today's analysis tools aren't up to the task from what I gather.

On a side note: The Wide mod makes AltGr a lot more comfortable. It still depends a bit on your keyboard design, but a Wide config certainly helps.

Last edited by DreymaR (31-Mar-2021 09:32:14)

*** Learn Colemak in 2–5 steps with Tarmak! ***
*** Check out my Big Bag of Keyboard Tricks for Win/Linux/TMK... ***

Offline
  • 0
  • Reputation: 7
  • Registered: 18-Nov-2017
  • Posts: 67

@DreymaR

KLA and cousins (well, Den's forks, and I assume Steve's as well) blithely ignore any characters in the text that they can't find on the keyboard, which has led to unrealistically good scores on occasion, mine included. :-) I have a PHP program that checks layouts for completeness.

KLA knows normal, shift, AltGr, and shift-AltGr layers, and Den's last fork may handle a numpad layer as well. I still need to figure out what he did there.

I want to put in a check that it should decline to analyse a layout that is missing required characters.
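
A completeness check of this kind might look like the following sketch. It is written in Python rather than PHP and is not the program mentioned above; the `layers` dictionary and both function names are hypothetical. The idea is simply to collect every character the layout can produce across its layers and refuse to analyse a text that needs anything else.

```python
def missing_characters(text, layout_chars):
    """Return the characters in text that the layout cannot produce, sorted."""
    return sorted(set(text) - set(layout_chars))

def check_layout(text, layout_chars):
    """Raise instead of silently skipping characters the layout lacks."""
    missing = missing_characters(text, layout_chars)
    if missing:
        raise ValueError(f"layout is missing required characters: {missing!r}")
    return True

# Hypothetical two-layer layout; a real one would also list AltGr layers.
layers = {
    "normal": "abcdefghijklmnopqrstuvwxyz ,.",
    "shift": "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
}
layout_chars = set().union(*layers.values())

print(missing_characters("Hello, world!", layout_chars))
```

Calling `check_layout` before scoring would make an incomplete layout fail loudly rather than earn a flattering score.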

They have no concept of a dead key or Linux Compose key ... probably a foreign concept to average US ANSI user :-)

Patrick's scoring (and I assume Steve's as well, unless he followed what Den and I did, which was to add VERTICAL distance) does reward layouts that constantly require multiple keypresses for a single character. I think Den also added some penalty for needing two or more fingers per char.

Cheers, Ian

(BTW I've added quartz-glyph to my collection... thanks..)

Offline
  • 0
  • Reputation: 214
  • From: Viken, Norway
  • Registered: 13-Dec-2006
  • Posts: 5,368
iandoug said:

(BTW I've added quartz-glyph to my collection... thanks..)

☆*✧:.。. o(⁎≧▽≦)o .。.:✧*☆

It's based on a perfect pangram. Therefore it's a perfect layout!

Offline
  • 0
  • Reputation: 214
  • From: Viken, Norway
  • Registered: 13-Dec-2006
  • Posts: 5,368

Well, the only letters that go with E without creating too many ugly same-finger bigrams are U and O. So these are the choices, basically. Another tack is to put punctuation on the same finger as E, like the ISRT layout does. But even Colemak's E-comma SFB is a bit too common for the perfectionists.
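
To make the same-finger bigram (SFB) comparison concrete, here is a rough sketch of how the rate could be measured for a given finger assignment. The finger maps and sample text below are toy assumptions, not the real Engram or Colemak assignments, and doubled letters are not counted as SFBs here, which is a convention rather than a rule.

```python
def same_finger_bigram_rate(text, finger_of):
    """Fraction of adjacent character pairs typed by one finger.

    Pairs containing a character with no finger assignment are skipped,
    and repeated characters (e.g. "ee") are not counted as SFBs.
    """
    total = 0
    same = 0
    for a, b in zip(text, text[1:]):
        if a not in finger_of or b not in finger_of:
            continue
        total += 1
        if a != b and finger_of[a] == finger_of[b]:
            same += 1
    return same / total if total else 0.0

# Toy assignments: which vowel shares a finger with E (finger numbers are arbitrary)
e_with_u = {"e": 1, "u": 1, "o": 2, "a": 3}
e_with_o = {"e": 1, "o": 1, "u": 2, "a": 3}

sample = "people rescue queue video reunion euro"
print(same_finger_bigram_rate(sample, e_with_u))
print(same_finger_bigram_rate(sample, e_with_o))
```

Run against a full corpus and a complete finger map, the same function gives the figure that layout analyzers usually report as the SFB percentage.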

Last edited by DreymaR (30-Apr-2021 13:03:06)

Offline
  • 0
  • Reputation: 7
  • Registered: 18-Nov-2017
  • Posts: 67
USER39 said:

I've started using Engram out of curiosity yesterday and the position of the "o" over the "e" and the "u" over the "a" is very questionable for me... what was your reason to put those letters on the same fingers?

The best letters to put with E are Q and O, otherwise punctuation.
The problem is that the best letters to go with A are O and Q, or punctuation. So you either run out of punctuation (if A or E are on Index) or you sacrifice a good spot to punctuation.

Engram has a version 2, 2.5 and 3 now.

Attached is my mod to version 1. It still has some issues but is better in other respects. Later versions of Engram have better same-finger metrics.
It swaps the A-E vowels. AU is very common in layouts but AO is better.

Cheers, Ian

Attachments:
Offline
  • 0
  • Reputation: 2
  • From: NY, NY
  • Registered: 19-Mar-2021
  • Posts: 6

Hello, everyone!  Thank you all for your helpful feedback.  I am happy to report that after thinking about your comments, coding, analyzing, and evaluating for an additional six weeks, I have settled on a greatly improved Engram layout which I find much easier and more enjoyable to type with (I am typing with it right now). I believe it addresses many of the concerns that you all have raised in this thread.

The protocol to generate optimized layouts (https://github.com/binarybottle/engram) is cleaner and more rigorous, with new validations.  For text-based measures, I now include Shai's and Ian's text corpuses (https://github.com/binarybottle/text_data). I also generated a layout using the exact same protocol but optimized based solely on published interkey speed data to indirectly demonstrate that potentially faster-to-type layouts are not necessarily more comfortable to type.    

The resulting Engram layout has fewer consecutive same-finger bigrams, reduces the load on the little fingers, and though the protocol doesn't explicitly address the peculiarities of staggered keyboards, the Engram layout happens to have some of the lowest-frequency letters on the left bottom row, which mitigates some of the problems that arise with asymmetric staggered keyboards.

For those who would like to give it a closer look or try it out, please see the updated documentation at https://engram.dev.  I will revise the publication currently under review accordingly once I get it back from the reviewers.  Thanks again for your earlier feedback and I look forward to hearing your thoughts. 

Cheers,
@rno
--
binarybottle.com

Offline
  • 0
  • Reputation: 1
  • Registered: 20-Mar-2021
  • Posts: 10
iandoug said:

There's a whole bunch of "useful" spreadsheets in the zip file. Enjoy. :-)

Hi Ian,

I wanted to use your corpus for some layout optimization work but I realized that some of the entries in the spreadsheets (especially in english-bigrams) show some errors, probably due to Excel being unable to recognize some of the Unicode characters. I thought you may want to know.

The ngram frequencies of your corpus seem to be the closest to Norvig's infamous analysis of Google Books ngrams among the corpora I've seen so far. I tried to do an updated version of the Google Books ngram analysis myself, using the most recent 2020 dataset and including punctuation, but I haven't had time to process 23GB of text so far :)

Best

Offline
  • 0
  • Reputation: 7
  • Registered: 18-Nov-2017
  • Posts: 67
necdetaliozdur said:

I wanted to use your corpus for some layout optimization work but I realized that some of the entries in the spreadsheets (especially in english-bigrams) show some errors, probably due to Excel being unable to recognize some of the Unicode characters. I thought you may want to know.

Yes, sorry about that. It was reported to me a while back (and is rather embarrassing). It's caused by spreadsheets assuming that entries starting with = or + etc. are formulas.

Discovered the workaround the other day. Will try to fix and upload new version by tomorrow.
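
The workaround itself is not described in the thread. One common approach, shown here purely as an assumption, is to prefix formula-like cells with an apostrophe when writing the file and to strip it again when reading the data back in a program. The function names and sample rows are hypothetical. Cells that already start with an apostrophe are prefixed too, so the round trip stays lossless.

```python
import csv
import io

# A cell starting with one of these may be parsed as a formula; the
# apostrophe is included so that unprotect() can undo protect() exactly.
RISKY_STARTS = ("=", "+", "-", "@", "'")

def protect(cell):
    """Prefix risky cells with an apostrophe so they are kept as text."""
    return "'" + cell if cell.startswith(RISKY_STARTS) else cell

def unprotect(cell):
    """Undo protect() when reading the data back programmatically."""
    return cell[1:] if cell.startswith("'") else cell

# Hypothetical bigram rows: (bigram, count)
rows = [("=a", 12), ("+1", 7), ("th", 300), ("'s", 45)]

buf = io.StringIO(newline="")
writer = csv.writer(buf)
for bigram, count in rows:
    writer.writerow([protect(bigram), count])

buf.seek(0)
restored = [(unprotect(b), int(n)) for b, n in csv.reader(buf)]
print(restored == rows)
```

How the apostrophe is displayed depends on the spreadsheet application, so anyone consuming the file in code should apply `unprotect` rather than rely on what the spreadsheet shows.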

Thanks, Ian

Last edited by iandoug (24-Aug-2021 12:33:04)
Offline
  • 0