    THinkiNG abOUt digraPHs!

    • Started by DreymaR
    • 7 Replies:
    • Reputation: 111
    • From: Oslo, Norway
    • Registered: 13-Dec-2006
    • Posts: 4,707

    I've been thinking a bit about some particular letter patterns that actually represent single entities. These include TH, QU, GH and probably OU/PH/NG – any others? That sets them a bit apart from the rest. Just a th-ou-gh-t... (the gh in that word is mostly silent now but used to be pronounced).

    Incidentally, the reason we type TH and GH as letter pairs, from what I've been told, is that the Dutch typesetters who first came to England in the late 15th century to print the Bible and Arthur myth for William Caxton didn't have lead type for the English Thorn (Þ/þ) and Yogh (Ȝ/ȝ) runes! Also, they got paid per letter so they certainly didn't mind making digraphs/bigrams out of the missing letters...

    In Icelandic they still have þ/ð letters for unvoiced/voiced TH, and in some languages you can use a γ/ɣ or similar for GH. Ph/ph can be written as Φ/φ. By now, that won't happen in English of course!    ლ( ʘ▽ʘ)ლ~?!

    But maybe people who wish to improve their keystroke stats could take those on?! There isn't much chance of getting more keys on the keyboard, but with layers maybe it's an interesting thought – same number of strokes but the possibility of using home position and thumbs more. I'm thinking along the lines of mapping Th/th to AltGr+T/t etc.
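
    A minimal sketch of that layer idea in Python, assuming a made-up table and function names (`ALTGR_LAYER`, `emit`, `type_keys`); it only models which text a key press would emit, not any real layout format:

```python
# Hypothetical AltGr layer: (base key, shift held) -> emitted text.
# Shift gives the "Th" form, following the Th/th on AltGr+T/t idea above.
ALTGR_LAYER = {
    ("t", False): "th",
    ("t", True): "Th",
    ("g", False): "gh",
    ("g", True): "Gh",
    ("q", False): "qu",
    ("q", True): "Qu",
}


def emit(key: str, altgr: bool = False, shift: bool = False) -> str:
    """Return the text produced by one key press under the given modifiers."""
    if altgr:
        mapped = ALTGR_LAYER.get((key, shift))
        if mapped is not None:
            return mapped
    # No layer entry (or no AltGr): fall back to the plain letter.
    return key.upper() if shift else key


def type_keys(presses) -> str:
    """Concatenate the output of a sequence of (key, altgr, shift) presses."""
    return "".join(emit(key, altgr, shift) for key, altgr, shift in presses)
```

    With this table, `type_keys([("t", True, True), ("e", False, False)])` emits "The" from two key positions plus modifiers.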

    Last edited by DreymaR (03-Dec-2019 09:56:36)

    *** Learn Colemak in 2–5 steps with Tarmak! ***
    *** Check out my Big Bag of Keyboard Tricks for Win/Linux/TMK... ***

    • Reputation: 15
    • From: Belgium
    • Registered: 26-Feb-2008
    • Posts: 451

    A key for "TH" has been suggested here before, but I doubt it would work in practice.  We are so used to typing letters (graphemes), not sounds (phonemes).  How long would it take you to adjust to typing "the" as th-e instead of t-h-e?  Or is that because I'm not a native English speaker?  In my language, Dutch, "ij" is sometimes considered a single letter, and handwritten as ÿ, but I would never type it that way.

    Another issue would be capitalisation: should Shift+"th" produce "TH" or "Th"?  You don't want to depend on autocorrect, do you?

    For comparison: Serbian is a bi-alphabetical language (Latin and Cyrillic are both in use today). Its Latin alphabet has the digraphs lj, nj, dž, which correspond to the single Cyrillic characters љ, њ, џ and are collated as single characters in both alphabets, but the Latin keyboard (QWERTZ) has no digraph keys.  (The corresponding keys instead have Q, W, X, which are not used in the language and have no Cyrillic counterpart.)

    • Reputation: 15
    • Registered: 12-Sep-2016
    • Posts: 45

    I am also Dutch and actually have IJ bound to Colemak's semicolon position. The J position is very awkward for common sequences such as "ijk" and "ijn". Moving IJ to the semicolon improves it a lot. Learning it was also very easy as it is just one sound and it is basically handled as a single letter. Furthermore, in this case it is also very clear that Shift+"ij" should be mapped to "IJ", as it should be capitalized as a whole: Ij never occurs.

    Create advanced keyboard layouts in various formats using my Keyboard Layout Files Creator!

    • Reputation: 111
    • From: Oslo, Norway
    • Registered: 13-Dec-2006
    • Posts: 4,707

    Yeah, I don't think it seems too tempting for me but we had a chat at the Discord and it got me thinking.

    I'd think that Shift+T would produce Th because that's what you'd use in normal text. I had the same problem with my dead key table entries for the ligatures ij/IJ, lj/Lj/LJ, nj/Nj/NJ, dz/Dz/DZ and dž/Dž/DŽ; I resorted to using the AltGr mappings for the relevant keys to release the all-caps versions (so, e.g., l→lj, L→Lj and ł→LJ). It's a hack, since it's only on my eD layout that the AltGr-mapped glyphs are connected to their base letters.
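
    As a rough illustration of that three-way case mapping, here is a sketch in Python. The table and function names (`LJ_DEAD_KEY`, `release`) are made up for this example and do not reflect any actual dead key file format; only the l/L/ł triple comes from the post:

```python
# Hypothetical dead-key table for the lj family, following the
# l -> lj, L -> Lj, ł -> LJ scheme described above.
LJ_DEAD_KEY = {
    "l": "\u01C9",  # ǉ, lowercase digraph glyph
    "L": "\u01C8",  # ǈ, titlecase digraph glyph
    "ł": "\u01C7",  # Ǉ, all-caps glyph (ł is assumed to sit on AltGr+l)
}


def release(next_key: str) -> str:
    """Return what the dead key releases; unmapped keys pass through unchanged."""
    return LJ_DEAD_KEY.get(next_key, next_key)
```

    The third entry is what makes it a hack: it only works where the AltGr glyph happens to be tied to the same base letter.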

    • Reputation: 15
    • From: Belgium
    • Registered: 26-Feb-2008
    • Posts: 451

    But now you're talking about (dead) keys to produce single Unicode glyphs, not two consecutive characters?

    Unicode has glyphs for digraphs like lj/Lj/LJ, as well as purely stylistic ligatures like ff/fi/ffi/..., but mainly for round-trip compatibility with existing (legacy) encodings.  It recommends against using them though, since it obfuscates text and makes search (string matching) harder.  Official position is that ligatures should be rendered by fonts or higher-level software, not in text encoding.  For me that extends to keyboard layouts as well.
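
    The search problem can be seen with Python's standard `unicodedata` module. This is a small sketch, assuming NFKC normalization followed by case folding is an acceptable way to compare strings; the helper names `fold_compat` and `contains` are made up for the example:

```python
import unicodedata


def fold_compat(text: str) -> str:
    """Expand compatibility characters (digraph glyphs, ligatures), then casefold."""
    return unicodedata.normalize("NFKC", text).casefold()


def contains(haystack: str, needle: str) -> bool:
    """Substring search that treats the single glyph U+01C9 like the sequence 'lj'."""
    return fold_compat(needle) in fold_compat(haystack)


# A naive search misses text written with the single-glyph digraph.
glyph_text = "\u01C9ubav"          # starts with the one-character lj glyph
print("ljubav" in glyph_text)       # plain substring test
print(contains(glyph_text, "ljubav"))  # normalized comparison
```

    The plain test fails while the normalized one succeeds, which is the extra work that text containing such glyphs pushes onto every search.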

    That does not oppose your suggestion to develop methods to enter proper character sequences like "th" though.

    • Reputation: 0
    • Registered: 21-Jun-2017
    • Posts: 3

    I don’t think putting those digraphs on the AltGr layer is a good idea. It surely only works reasonably well when your right thumb is agile enough, your space bar is short and/or you are using a wide mod. But even then it’s probably a waste of space. At least G, H, N, O, T and U should be in comfortable positions anyway, and bigrams of those letters should be easy to type.

    Still, I think there is some value to the idea of giving entities like that more attention when designing a keyboard layout than their bare bigram frequency would suggest. I certainly did that when placing P (I’m not using Colemak). I chose a layout that fared slightly worse in my calculations, because it did feel better. The difference probably lay in the SP, PH and also PF bigrams, which can be entities in German and/or English.

    Generally, I’d rather reduce the number of keys needed to write text. I actually got rid of the keys for Ä, Ö, Ü and ß. Instead my ISO key is now a multi-purpose dead key (◌) that allows me to comfortably write these letters (and more) as if they were simple bigrams (◌A etc.). But that may be just me, I just like nice bigrams. Certainly more so than holding down modifier keys. My shift keys are sticky, so uppercase letters also have that simple bigram feel.

    Last edited by Torben (03-Dec-2019 18:23:45)

    • Reputation: 60
    • From: UK
    • Registered: 14-Apr-2014
    • Posts: 696

    It's an interesting idea, but I also don't think it would work well, as I suspect most people don't think of "th" as being a single entity, and certainly most people are unaware of the historic developments that led to it.

    However, the idea does bring to mind the Welsh alphabet, which still has some letter pairs that join to form a single entity. So I can imagine it might make some sense for a native Welsh typist to use AltGr+C for Ch, for example.

    Last edited by stevep99 (04-Dec-2019 14:33:46)

    Using Colemak Mod-DH with some additional ergonomic keyboard mods.

    • Reputation: 111
    • From: Oslo, Norway
    • Registered: 13-Dec-2006
    • Posts: 4,707

    I also don't think it'd be worth it for the majority of typers. It was just a discussion at the Discord that got me thinking. The leap from a letter-by-letter layout to proper stenography, which is hella fast but a nightmare to learn from what I gather, is overmuch for most non-professional typists. So maybe some ambitious souls could map some n-grams and save some effort...?

    But for bigrams the answer is pretty much a given I guess: You'd need two strokes or an overlong stretch to save two strokes and that amounts to a zero sum game – unless you can benefit substantially from an ergonomic advantage of the first stroke (the modifier or dead key). Possibly, the bracket keys which are often used for locale letters like æøåüéèñ etc, might hold TH and another digraph... which would make them pinky stretches but with the Wide mod they'd be half-long index stretches instead so maybe? That'd of course only appeal to staunch text typists who don't use brackets as much.

    Or, could the answer be to look towards the most used n>2-grams? Typing THE/AND/ING with two strokes each sounds just a bit nice, and those may be common enough for the dedicated soul to care about. But after those, the tri+gram frequencies drop off rather sharply.
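
    To put rough numbers on the bigram versus trigram argument, here is a small Python sketch. It assumes every shortcut costs two strokes (modifier or dead key plus one key) and counts n-grams only inside words; `ngram_counts` and `strokes_saved` are made-up helper names, and the sample sentence is arbitrary:

```python
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count letter n-grams inside words, ignoring case and non-letters."""
    counts = Counter()
    cleaned = "".join(c if c.isalpha() else " " for c in text.lower())
    for word in cleaned.split():
        for i in range(len(word) - n + 1):
            counts[word[i:i + n]] += 1
    return counts


def strokes_saved(text: str, ngram: str, cost: int = 2) -> int:
    """Strokes saved if each occurrence of ngram takes `cost` presses instead of len(ngram)."""
    return ngram_counts(text, len(ngram))[ngram] * (len(ngram) - cost)


sample = "The thing and the other thing"
print(strokes_saved(sample, "th"))   # bigram at two strokes: nothing saved
print(strokes_saved(sample, "the"))  # trigram at two strokes: one saved per hit
```

    Under these assumptions a bigram shortcut never saves strokes, so any gain has to come from ergonomics, while a trigram shortcut saves one stroke per occurrence.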

    Well. I don't know. I don't think this is the sort of thing I personally would do unless I were convinced of a substantial benefit which I am not. For one, it makes you even more dependent on your typing software and I'm already too dependent on it...    ̄(=⌒ᆺ⌒=) ̄

    Last edited by DreymaR (06-Dec-2019 10:11:32)
