
    A few text analysis scripts

    • Started by erw
    • 4 Replies:
    • Reputation: 2
    • From: Aalborg, Denmark
    • Registered: 18-Feb-2011
    • Posts: 166

    When I started learning Colemak I made some Python scripts to analyze text and see how big an advantage Colemak would be.

    Similar stuff has probably been made before, but it was fun to make these, and I figured that maybe someone would find them useful, so here goes:

    https://bitbucket.org/erw/colemak/src

    Note that you need Python 3.x to run them.

    Last edited by erw (17-May-2011 18:23:11)
    • Reputation: 214
    • From: Viken, Norway
    • Registered: 13-Dec-2006
    • Posts: 5,361

    Cool, thanks for sharing!  :)

    *** Learn Colemak in 2–5 steps with Tarmak! ***
    *** Check out my Big Bag of Keyboard Tricks for Win/Linux/TMK... ***

    • Reputation: 0
    • Registered: 06-Jul-2011
    • Posts: 22

    I am also a Python programmer and have thought about writing a password generator whose passwords are typed with the same keys on both QWERTY and Colemak.

    Unfortunately the Colemak program that I have does not work on login windows :(

    • Reputation: 2
    • From: Aalborg, Denmark
    • Registered: 18-Feb-2011
    • Posts: 166

    Well, you could do a basic one in one line (after "import random"):

    ''.join(random.choice('qQwWaAzZxXcCvVbBhHmM,<.>/?') for _ in range(10))

    ...and also add the digits and special chars.
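For anyone wondering where that character set comes from, here is a minimal sketch that derives it by comparing the two layouts key by key. The row strings, the SHIFTED table and the function name basic_password are illustrative assumptions (a US layout without the number row), not part of the scripts linked above.

```python
import random

# Unshifted main-block keys, row by row, for each layout (number row omitted).
QWERTY = "qwertyuiop" "asdfghjkl;" "zxcvbnm,./"
COLEMAK = "qwfpgjluy;" "arstdhneio" "zxcvbkm,./"

# Keys that produce the same character at the same position in both layouts.
shared = "".join(q for q, c in zip(QWERTY, COLEMAK) if q == c)

# Shifted counterparts: uppercase letters, plus US-layout shifted punctuation.
SHIFTED = {",": "<", ".": ">", "/": "?"}
alphabet = shared + "".join(SHIFTED.get(ch, ch.upper()) for ch in shared)

def basic_password(length=10):
    """Pick `length` characters uniformly from the shared alphabet."""
    return "".join(random.choice(alphabet) for _ in range(length))

print(shared)            # the unshifted keys common to both layouts
print(basic_password())  # a different password on every run
```

Colemak leaves the number row untouched, so appending string.digits to the alphabet should be safe as well.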

    You could of course make it generate easier-to-type passwords, for example by alternating hands and limiting the number of shifted chars (or by including opposite shift key presses in the alternation), but each restriction also reduces the number of possible passwords and thus the security. Online cracking sweeps will in all probability not know that you use Colemak and will only use a dictionary anyway. Local cracking attempts (which are more likely if you use it for a Windows login password) or targeted online attacks may well know that you use Colemak, or even that you use a compatible password, which means you shouldn't make it too short.

    If you use only lowercase letters, no numbers or symbols, and a uniformly random distribution (which a random number generator will approximate but a human never will), a QWERTY/Colemak compatible password would need to be about 41% longer (log(26)/log(10) ≈ 1.41) than a pure QWERTY or Colemak password to have the same strength against a brute force attack. If you use both cases, numbers and special chars, it would only need to be about 11% longer (log(2*(26+10+8))/log(2*(10+10+8)) ≈ 1.11).
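The arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation: the name length_factor is made up for illustration, and the alphabet sizes are the ones quoted in the post.

```python
import math

def length_factor(full_size, shared_size):
    """How many times longer a password over the smaller alphabet must be
    to match the brute-force search space of the larger alphabet."""
    return math.log(full_size) / math.log(shared_size)

# Lowercase only: 26 letters versus the 10 letters shared by both layouts.
lower = length_factor(26, 10)

# Both cases plus digits and special chars, using the counts from the post.
mixed = length_factor(2 * (26 + 10 + 8), 2 * (10 + 10 + 8))

print(f"lowercase only: {(lower - 1) * 100:.0f}% longer")
print(f"mixed:          {(mixed - 1) * 100:.0f}% longer")

# Example: length needed to match a 10-character lowercase-only password.
print(math.ceil(10 * lower))
```

The factor comes from equating the two search spaces: shared_size ** n = full_size ** m gives n / m = log(full_size) / log(shared_size).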

    In any case, remember to limit, or if at all possible avoid, password reuse. A password manager with a strong master password comes to the rescue, for web-related passwords at least; it won't do much good for a Windows login password.

    Some would also argue that since the Mersenne Twister (the random number generator behind Python's random module) is mathematically predictable, it would be wise to use a cryptographically strong RNG, for example from the OpenSSL library (M2Crypto is a good Python wrapper). But again, it depends on the size of your tin foil hat...
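As a sketch of the cryptographically strong variant: instead of OpenSSL/M2Crypto as suggested above, this swaps in the standard library's secrets module (Python 3.6 and later), which draws from the operating system's CSPRNG. The function name secure_password and the default length are illustrative assumptions.

```python
import secrets

# Same QWERTY/Colemak-compatible character set as the one-liner above.
ALPHABET = "qQwWaAzZxXcCvVbBhHmM,<.>/?"

def secure_password(length=14):
    """Build a password using the OS-backed CSPRNG, not the Mersenne Twister."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(secure_password())  # a different password on every run
```

On older Python 3 versions, random.SystemRandom().choice gives the same kind of OS-backed randomness.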

    [/Rant]

    P.S. Can you tell I'm majoring in computer security? :-)

    • Reputation: 2
    • From: Aalborg, Denmark
    • Registered: 18-Feb-2011
    • Posts: 166

    I added a version of my charfreq script in Haskell since I'm learning that. Not really Colemak-related news but I just wanted to share it :-P

    https://bitbucket.org/erw/colemak/src
