
    Layout Analyzer support for double-quote character

    • Started by stevep99
    • 6 Replies:
    • Reputation: 117
    • From: UK
    • Registered: 14-Apr-2014
    • Posts: 978

    I have updated my layout analyzer to version 1.19:

    - Support added for double-quote character.
    - To use, add the character to the input layout (top-left) panel, and add values for "effort" and "finger" in the corresponding position in the configuration (top-right) panel.
    - To declare two symbols on the same key, enter them with no space between them in the top-left panel.
    - The full list of supported symbols in the default frequency file (from most common to least common) is:

    E T A O I N H S R D L U M W C F Y G , P B . V K " ' ; ? J X Q : Z / 

    Example: For US-ANSI keyboard with apostrophe and double-quote

    Layout (top-left box):

    q w f p b j l u y ;
    a r s t g k n e i o '"
    x c d v z m h , . /

    Configuration (top-right box):

    effort:
    3.5 2.4 2.0 2.2 3.4 3.8 2.2 2.0 2.4 3.5
    1.5 1.2 1.0 1.0 2.9 2.9 1.0 1.0 1.2 1.5 3.5
    2.8 2.5 1.7 2.6 4.0 2.6 1.7 2.5 2.8 3.5
    penalties:
    2.5 2.5 3.5 #same-finger
    0.5 1.0 1.5 #pinky-ring
    0.1 0.2 0.3 #ring-middle
    fingers:
    0 1 2 3 3 6 6 7 8 9
    0 1 2 3 3 6 6 7 8 9 9
    1 2 3 3 3 6 6 7 8 9
    type:
    std
    Last edited by stevep99 (20-Feb-2020 16:36:01)
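
The positional pairing described above (each key in the layout panel lines up with an effort value and a finger number in the same position) can be sketched roughly as follows. This is only an illustration and not the analyzer's actual code; the function name `parse_layout` and the returned dictionary shape are assumptions made for the example.

```python
def parse_layout(layout_text, effort_text, fingers_text):
    """Map each symbol to the (effort, finger) of the key it sits on.

    Hypothetical helper: the three arguments hold the rows of the layout,
    "effort:" and "fingers:" sections, one row per line.
    """
    keys = {}
    rows = zip(layout_text.strip().splitlines(),
               effort_text.strip().splitlines(),
               fingers_text.strip().splitlines())
    for layout_row, effort_row, finger_row in rows:
        tokens = layout_row.split()
        efforts = [float(v) for v in effort_row.split()]
        fingers = [int(v) for v in finger_row.split()]
        if not (len(tokens) == len(efforts) == len(fingers)):
            raise ValueError(f"row length mismatch: {layout_row!r}")
        for token, effort, finger in zip(tokens, efforts, fingers):
            # A token such as '" declares two symbols on the same key.
            for symbol in token:
                keys[symbol] = (effort, finger)
    return keys


# Home row from the example above; the last key carries both ' and ".
home_row = parse_layout(
    "a r s t g k n e i o '\"",
    "1.5 1.2 1.0 1.0 2.9 2.9 1.0 1.0 1.2 1.5 3.5",
    "0 1 2 3 3 6 6 7 8 9 9",
)
print(home_row['"'])  # (3.5, 9)
```

Both symbols on the shared key end up with the same effort and finger, which is why the configuration needs only one value per key position.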

    Using Colemak-DH with Seniply.

    • Reputation: 214
    • From: Viken, Norway
    • Registered: 13-Dec-2006
    • Posts: 5,361

    That's lovely and useful! Thanks a bunch! However, the hyphen-minus is still sorely lacking. It's way more common than the slash, and at least on par with or more common than semicolon/colon. I know that Colemak and QWERTY and many other layouts have it on the top row which isn't usually in your analyses, but some layouts bring it closer to the home position precisely because its frequency warrants it.

    You forgot to add the colon and question mark to your ANSI analysis above, or if you didn't I think you should flex your new muscles with them too! I guess the <> glyphs are wayyy too rare to care about.  (=ʘᆽʘ=)ʃ

    Colemak ANSI layout (top-left box):

    q  w  f  p  b  j  l  u  y  ;:
    a  r  s  t  g  k  n  e  i  o  '"
    x  c  d  v  z  m  h  ,  .  /?
    Last edited by DreymaR (21-Feb-2020 13:50:18)

    *** Learn Colemak in 2–5 steps with Tarmak! ***
    *** Check out my Big Bag of Keyboard Tricks for Win/Linux/TMK... ***

    • Reputation: 117
    • From: UK
    • Registered: 14-Apr-2014
    • Posts: 978

    I did wonder about minus/hyphen shortly after I did that update. But then I figured I might also want e.g. plus, equals and brackets too, especially if you wanted to analyse, for example, computer code. On the other hand, there's a need not to get out of hand with which characters are included: although it's not a big deal when considering single characters, for bigrams there are n² possibilities, which makes the needed bigram data file grow rapidly.
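
To put rough numbers on the n² growth, here is a small sketch. It counts ordered pairs including doubled characters; the function name `bigram_count` is made up for the example, and 94 is simply the number of printable ASCII characters excluding space, used as an upper-end comparison.

```python
def bigram_count(n_symbols):
    """Number of ordered symbol pairs (doubles included) for n symbols."""
    return n_symbols ** 2


# 34 = symbols in the list in the first post, 35 = one more symbol added,
# 94 = printable ASCII without space (an illustrative upper bound).
for n in (34, 35, 94):
    print(n, bigram_count(n))
```

Going from 34 to 35 symbols adds 69 possible bigrams, while covering all 94 printable ASCII characters would mean 8836.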

    Last edited by stevep99 (21-Feb-2020 14:37:49)

    • Reputation: 214
    • From: Viken, Norway
    • Registered: 13-Dec-2006
    • Posts: 5,361

    I don't think you really need <>=+[]{} as they're very rare indeed (by my guesstimate, =+ lies around the frequency of Zz – at 0.07 % usage). But hyphen-minus, like the quote key, lies between Kk and Xx/Jj in frequency (0.3–0.8 % as far as I can see) which is better than for instance ;: ?/ Qq Zz.

    Your results are quite well in accordance with the other sources I used for the analysis in this post:
    https://forum.colemak.com/topic/2526-q-rotation/#p22919

    • Reputation: 117
    • From: UK
    • Registered: 14-Apr-2014
    • Posts: 978

    So... I've had a bit of a brainwave regarding which characters to support...

    Up until now, the analyzer has been restricted to a limited subset of characters, basically the ones usually found in the main text area. This was done because most alternative layouts tend to move only those keys, and including a lot of characters would have made bigram calculations unwieldy.

    But... what occurred to me was that there is no fundamental reason why I can't use arbitrary characters for the letter frequency analysis part, and only use the limited set of characters when considering bigrams. This would effectively allow any input layout, including whatever numbers and symbols you wanted, and the analyser would work! The only limitation would be that you wouldn't get bigram penalties for unusual combinations, e.g. "1q", but I don't consider that to be a big deal.
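
A minimal sketch of that split, counting single-character frequencies for anything while restricting bigrams to a fixed symbol set, might look like this. It is not the analyzer's real implementation: the names `BIGRAM_SYMBOLS` and `count_frequencies` are invented here, and lowercasing the text and skipping whitespace are assumptions.

```python
from collections import Counter

# Hypothetical restricted set: 26 letters plus , . " ' ; ? : / -
BIGRAM_SYMBOLS = set("etaoinhsrdlumwcfyg,pb.vk\"';?jxq:z/-")


def count_frequencies(text, bigram_symbols=BIGRAM_SYMBOLS):
    """Return (unigram counts, bigram counts) for the given text.

    Every non-whitespace character is counted as a unigram, but a bigram
    is counted only when both of its characters are in bigram_symbols.
    """
    text = text.lower()  # assumption: case is ignored
    unigrams = Counter(ch for ch in text if not ch.isspace())
    bigrams = Counter(
        a + b
        for a, b in zip(text, text[1:])
        if a in bigram_symbols and b in bigram_symbols
    )
    return unigrams, bigrams


unigrams, bigrams = count_frequencies("q1 it's")
print(unigrams["1"])   # the digit still counts towards key usage
print(bigrams["q1"])   # but "q1" contributes no bigram
print(bigrams["t'"])   # pairs inside the restricted set are kept
```

With this arrangement the bigram table stays the same size no matter how many extra symbols appear in the layout.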

    So, I had some fun implementing this over the weekend. For completeness I have added hyphen/minus to the list as you suggested, so there are now 35 characters supported in the bigram analysis. But you can also now have *any* character in the layout itself, including the number row if you so desire.

    I also added a heatmap feature, which I think is quite nice, albeit not as pretty as patorjk's.

    The new frequency results show most of the extra characters are hardly used - the numbers etc. have very low frequency - but I guess you'd expect that as the source material is books. I am planning to do some frequency analysis of text from other sources for comparison.

    Example input:

    1! 2" 3£ 4$ 5% 6^ 7& 8* 9( 0) -_ =+
    q w f p b j l u y ;: [{ ]} 
    a r s t g k n e i o '@
    z x c d v m h ,< .> /?
    effort:
    5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 
    3.5 2.4 2.0 2.2 3.4 3.8 2.2 2.0 2.4 3.5 5.0 5.0 
    1.5 1.2 1.0 1.0 2.9 2.9 1.0 1.0 1.2 1.5 4.0
    3.5 2.8 2.5 1.7 2.6 2.6 1.7 2.5 2.8 3.5
    penalties:
    2.5 2.5 3.5 #same-finger
    0.5 1.0 1.5 #pinky-ring
    0.1 0.2 0.3 #ring-middle
    fingers:
    0 1 2 3 3 6 6 7 8 9 9 9
    0 1 2 3 3 6 6 7 8 9 9 9
    0 1 2 3 3 6 6 7 8 9 9
    0 1 2 3 3 6 6 7 8 9
    type:
    angle
    Last edited by stevep99 (24-Feb-2020 16:02:16)

    • Reputation: 214
    • From: Viken, Norway
    • Registered: 13-Dec-2006
    • Posts: 5,361

    Great work, SteveP!

    In your book analysis, did you account for curly/guillemet quotes (‚‘’‹› „“”«») and en/em dashes (– —) that would usually be typed as straight quotes and hyphens? I think it'd be most correct to search and replace those before analysis, for the purpose of finding frequencies pertaining to typing.
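
As a rough illustration of such a search-and-replace step, the sketch below maps the typographic characters to what would normally be typed. The table `TYPED_EQUIVALENTS` and the function `normalize_for_typing` are hypothetical names, and the choice to treat single guillemets as apostrophes and double guillemets as double quotes is an assumption.

```python
# Assumed mapping from typographic characters to their typed equivalents.
TYPED_EQUIVALENTS = str.maketrans({
    "‘": "'", "’": "'", "‚": "'", "‹": "'", "›": "'",
    "“": '"', "”": '"', "„": '"', "«": '"', "»": '"',
    "–": "-", "—": "-",
})


def normalize_for_typing(text):
    """Replace curly quotes, guillemets and dashes before counting."""
    return text.translate(TYPED_EQUIVALENTS)


print(normalize_for_typing("“Don’t” – wait"))
```

Running this before the frequency count would fold those characters into the apostrophe, double-quote and hyphen totals.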

    Last edited by DreymaR (24-Feb-2020 18:38:46)

    • Reputation: 117
    • From: UK
    • Registered: 14-Apr-2014
    • Posts: 978

    I don't do any preprocessing of the input, but it seems those characters are not present in the source texts anyway.
