    I'm looking for text analysis program (Cyrillic)

    • Started by pafkata90
    • 11 Replies:
    • Reputation: 1
    • From: Sofia, Bulgaria
    • Registered: 05-Mar-2011
    • Posts: 387

    Can someone give me a link to a website or a program that can analyse text with Cyrillic letters? All of the ones I find on the internet can analyse only Latin text. I'm working on a layout for my own language (Bulgarian), because the official layout is based on rules that are, in my opinion, not very good. It places the most frequent letters on the index and middle fingers regardless of the row, and takes the pressure off the ring finger and the pinkie. Yeah, that's why I've got a letter with a frequency of use of 0.02% on the home row... So far I've found only the frequencies of the letters, but nothing else.
    I'd appreciate any help :)

    PS: At the moment I'm using a phonetic Colemak layout, and it's actually fairly good. It has almost 70% home row usage :D. It's better than the official layout in some respects. I wouldn't go as far as saying it's better overall, but hey... That's why I'm trying to do something here.

    Last edited by pafkata90 (11-Aug-2011 00:12:04)
    • Reputation: 2
    • From: Aalborg, Denmark
    • Registered: 18-Feb-2011
    • Posts: 166

    Well, 70% sure sounds a lot better than 0.02%!

    What kind of analysis would you like to do besides letter frequency?

    One lazy approach would be to substitute the Cyrillic letters for Latin ones, use a good Latin text analysis program and then substitute back.
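
    A rough Python sketch of that substitution idea. The mapping is made up for illustration: Bulgarian has 30 letters, so four of them borrow digits as placeholders. Any one-to-one mapping would do, as long as the corpus itself contains no Latin letters or digits that would collide on the way back.

```python
# Hypothetical one-to-one mapping: 30 Bulgarian letters -> 26 Latin letters + 4 digits.
CYRILLIC = "абвгдежзийклмнопрстуфхцчшщъьюя"
PLACEHOLDERS = "abcdefghijklmnopqrstuvwxyz0123"

TO_LATIN = str.maketrans(CYRILLIC, PLACEHOLDERS)
TO_CYRILLIC = str.maketrans(PLACEHOLDERS, CYRILLIC)


def to_latin(text: str) -> str:
    """Lowercase the text and replace each Cyrillic letter with its placeholder."""
    return text.lower().translate(TO_LATIN)


def to_cyrillic(text: str) -> str:
    """Map placeholders (e.g. in the analyser's output) back to Cyrillic letters."""
    return text.translate(TO_CYRILLIC)
```

    The idea would be to run `to_latin` on the corpus, feed the result to the Latin-only analyser, and pass its letter or pair tables through `to_cyrillic`.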

    Otherwise, if nothing else shows up, we could get together for that beer and cook something up. I've already done some text analysis programs :-)

    • Reputation: 1
    • From: Sofia, Bulgaria
    • Registered: 05-Mar-2011
    • Posts: 387

    It would be very helpful to see what the most common pairs of letters are. It had already occurred to me to convert the text to Latin and run some analysis, but I couldn't find a text analyser that gives me the most common pairs. Maybe something's just up with me today.
    I've seen the thread you showed me but I don't know how to run these. Maybe we do have to meet for a beer and talk it through :D
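
    For the letter pairs, a minimal sketch in Python 3, whose strings are Unicode, so Cyrillic needs no special handling. The function name and sample text are only for illustration; a real corpus would be read from a file opened with `encoding="utf-8"`.

```python
from collections import Counter


def letter_pairs(text: str) -> Counter:
    """Count adjacent letter pairs; pairs never span spaces or punctuation."""
    text = text.lower()
    pairs = Counter()
    for a, b in zip(text, text[1:]):
        if a.isalpha() and b.isalpha():
            pairs[a + b] += 1
    return pairs


sample = "на баба ти хвърчилото"
for pair, count in letter_pairs(sample).most_common(3):
    print(pair, count)
```

    Triples would work the same way with `zip(text, text[1:], text[2:])`.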

    The official layout has great hand alternation, but I'm afraid it might have too many row jumps, because fairly frequently used letters are on the top and bottom rows on the right side. Much less frequent letters sit in more comfortable positions than the bottom row, like D and H.

    Last edited by pafkata90 (11-Aug-2011 19:30:57)
    • Reputation: 2
    • From: Aalborg, Denmark
    • Registered: 18-Feb-2011
    • Posts: 166

    If anyone is near Aalborg (northern Denmark), or feels like travelling there :-), we'll be at Studenterhuset on Wednesday the 17th at 4 o'clock for a Colemak meetup :-)

    • Reputation: 0
    • Registered: 27-Apr-2011
    • Posts: 17

    http://www.sttmedia.com/wordcreator

    I used this FREE one to analyse my corpus of 11 000 000 letters in French, and to find bigrams and trigrams as well.
    It works with Greek and Russian.

    I also use it to create syllables that I paste into Klavaro to improve my typing skills.

    Last edited by BvoFRak (12-Aug-2011 12:12:56)
    • Reputation: 1
    • From: Sofia, Bulgaria
    • Registered: 05-Mar-2011
    • Posts: 387

    Exactly what I was looking for, BvoFRak. A thousand thanks!

    • Reputation: 1
    • From: Sofia, Bulgaria
    • Registered: 05-Mar-2011
    • Posts: 387

    Hey again. Can any of you help me out a bit with a program that can compare hand alternation and same finger ratio between two layouts? Finger travel distance would be nice too. I'm just using my own Excel spreadsheets, but I don't know how to do that kind of analysis.
    PS: Again - for Cyrillic.
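
    A minimal Python sketch of such a comparison, under some loud assumptions: standard touch-typing finger assignment by column, layouts given as row strings, and double letters left out of the count. The two layouts below are made up purely to show the mechanics and are not real Bulgarian layouts. Travel distance is not covered.

```python
# Finger per column for standard touch typing (an assumption);
# columns past the tenth are given to the right pinky.
FINGERS = ["L-pinky", "L-ring", "L-middle", "L-index", "L-index",
           "R-index", "R-index", "R-middle", "R-ring", "R-pinky"]


def key_map(rows):
    """Map each character to (hand, finger) from a list of row strings."""
    keys = {}
    for row in rows:
        for col, ch in enumerate(row):
            finger = FINGERS[min(col, len(FINGERS) - 1)]
            keys[ch] = (finger[0], finger)
    return keys


def layout_stats(rows, text):
    """Return hand alternation and same-finger ratios over adjacent letter pairs."""
    keys = key_map(rows)
    text = text.lower()
    total = alternation = same_finger = 0
    for a, b in zip(text, text[1:]):
        if a not in keys or b not in keys or a == b:
            continue  # skip spaces, punctuation, unmapped characters, double letters
        total += 1
        if keys[a][0] != keys[b][0]:
            alternation += 1      # the two letters are typed by different hands
        elif keys[a][1] == keys[b][1]:
            same_finger += 1      # same hand and same finger, different keys
    if total == 0:
        return {"hand_alternation": 0.0, "same_finger": 0.0}
    return {"hand_alternation": alternation / total,
            "same_finger": same_finger / total}


# Two made-up single-row layouts, only to show the comparison.
layout_a = ["абвгдежзий"]
layout_b = ["аебжвзгидй"]
sample = "жива вежда едва види"
for name, rows in (("A", layout_a), ("B", layout_b)):
    print(name, layout_stats(rows, sample))
```

    With real layouts, each one would get three or four row strings and the sample would be the whole corpus.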

    Last edited by pafkata90 (30-Aug-2011 08:25:19)
    • Reputation: 23
    • From: Belgium
    • Registered: 26-Feb-2008
    • Posts: 480

    Hi,

    I used the Carpalx software (Perl) when I created my Rulemak keyboard layout. I tweaked it a little bit to cope with 8-bit characters, and then it can digest KOI8-R just fine. :-)
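
    This does not show the Carpalx tweak itself. It is only a sketch of getting a UTF-8 corpus into KOI8-R, one byte per letter, assuming Python and its built-in `koi8_r` codec. The function names and file paths are placeholders. KOI8-R covers the Russian alphabet, which includes every Bulgarian letter.

```python
def to_koi8r(text: str) -> bytes:
    """Encode text as KOI8-R; characters outside the code page become '?'."""
    return text.encode("koi8_r", errors="replace")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read a UTF-8 text file and write the same text as KOI8-R bytes."""
    with open(src_path, encoding="utf-8") as src:
        text = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(to_koi8r(text))
```

    Something like `convert_file("corpus-utf8.txt", "corpus-koi8r.txt")` would then produce an 8-bit corpus.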

    • Reputation: 1
    • From: Sofia, Bulgaria
    • Registered: 05-Mar-2011
    • Posts: 387

    Hm... I didn't know that such a topic existed in this forum. I've more or less been through all the things discussed there on my own :). Thanks. I'll try the Carpalx software. If anyone's curious, I'll share my Bulgarian-optimized nameless layout :D.

    • Reputation: 0
    • Registered: 27-Apr-2011
    • Posts: 17

    I don't know if it will work with Cyrillic, but I analysed my data with Excel 2007:
    Clavier.xlsx

    • Reputation: 1
    • From: Sofia, Bulgaria
    • Registered: 05-Mar-2011
    • Posts: 387

    That's an awesome spreadsheet, thank you. I'll try to find my way in all that French :)

    Edit: Just to let you know - I finished the layout I've been aiming for. It's got 80% hand alternation, 72% home row usage, 8.6% bottom row usage, much shorter finger travel distance, 0.37% row jumps, a 1.9% same finger ratio and quite a lot of inward rolls. I'm using it at the moment and I'm quite happy with it. So thanks to everybody who helped :)

    Last edited by pafkata90 (06-Sep-2011 14:31:19)
    • Reputation: 0
    • Registered: 27-Apr-2011
    • Posts: 17

    You're welcome!

    80% hand alternation; impressive!
