    Voice Recognition

    • Started by keyboard samurai
    • 11 Replies:
    • Reputation: 2
    • From: Houston, Texas
    • Registered: 03-Jan-2007
    • Posts: 358

    I have heard people say there is no point in switching away from QWERTY
    because voice recognition is already here (they cite glowing reviews of
    Dragon NaturallySpeaking 9), so I found this quote interesting.


    Performance Optimization of Virtual Keyboards said:

    Voice Recognition.

    Speech has been expected to be a compelling alternative to typing. Despite
    the progress made in speech recognition technology, however, a recent study
    by Karat, Halverson, Horn, and Karat (1999) showed that the effective speed
    of text entry by continuous speech recognition was still far lower than that
    of the keyboard (13.6 vs. 32.5 corrected words per minute [wpm] for
    transcription and 7.8 vs. 19.0 corrected wpm for composition). Furthermore,
    the study also revealed many human-factors issues that had not been well
    understood. For example, many users found it “harder to talk and think than
    type and think” and considered the keyboard to be more “natural” than speech
    for text entry.

    http://www.almaden.ibm.com/u/zhai/paper … Galley.pdf
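
    To put the quoted figures in perspective, here is a rough sketch that works out how much faster the keyboard was in each task. The numbers are taken straight from the passage above; the names `rates` and `keyboard_advantage` are just made up for this illustration.

```python
# Corrected words per minute as quoted above (Karat et al., 1999).
rates = {
    "transcription": {"speech": 13.6, "keyboard": 32.5},
    "composition": {"speech": 7.8, "keyboard": 19.0},
}


def keyboard_advantage(task):
    """Return how many times faster the keyboard was than speech for a task."""
    r = rates[task]
    return r["keyboard"] / r["speech"]


if __name__ == "__main__":
    for task in rates:
        print(f"{task}: keyboard is {keyboard_advantage(task):.2f}x faster")
```

    In both tasks the keyboard comes out a bit more than twice as fast as speech.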

    so what do you think?

    all keyboards are obsolete?

    Last edited by keyboard samurai (21-Jun-2007 05:05:14)
    • Reputation: 0
    • From: Köln, Germany
    • Registered: 01-Apr-2007
    • Posts: 264

    Definitely not. I don't believe that voice recognition will rival the keyboard in the near future, and probably never will. I find it difficult to concentrate when talking to a PC, and it feels rather unnatural and ridiculous. If you need a short pause for thinking, you just stop typing. Speech recognition, however, pushes you to keep talking continuously, without any "umms" and "mmhs".
    I'm not saying accurate recognition of speech is/will be technologically impossible, but I think it's rather inconvenient and unnatural, at least for text input.

    • Reputation: 0
    • From: NYC
    • Registered: 02-Feb-2007
    • Posts: 104

    We can't always talk to it... there are times when we can't speak out loud. Now, if a computer could read minds and transfer thoughts to the PC, that would be awesome, but unfortunately that's just sci-fi for now. Another problem I see is keys like _-*&~`... which are easier to type than to pronounce. Also, how would it tell a key apart from a word, for example Tab (the key) vs. "tab" (the word), or a key combination from a single key?

    Last edited by AGK (21-Jun-2007 17:17:53)
    • Reputation: 214
    • From: Viken, Norway
    • Registered: 13-Dec-2006
    • Posts: 5,362

    I think we'll always want a silent method of input because talking loudly is socially stressful (even subvocalizing). Some time in the future I think that it'll become more of a mindlink thing, but that's far off. I work with functional brain studies (fMRI) myself and have read about MEG some so I have a bit of an idea about this. Not in the next five years I believe. Maybe not in the next twenty for all I know, although there's a lot of research going on at least.

    I also believe that we'll always be using our hands. They're such wonderful instruments. Those keyboards where you can run macros and effects by moving your fingers in certain patterns are interesting.

    Last edited by DreymaR (21-Jun-2007 17:43:47)

    *** Learn Colemak in 2–5 steps with Tarmak! ***
    *** Check out my Big Bag of Keyboard Tricks for Win/Linux/TMK... ***

    • Reputation: 0
    • Registered: 21-Dec-2006
    • Posts: 17

    I do not like Voice Recognition because it is not perfect. Just see the next link:

    Microsoft Windows Vista Speech "Wreck Ignition"

    Last edited by R2D2! (22-Jun-2007 00:30:22)

    —R2D2! // Ilhuıtemoc δ

    • Reputation: 2
    • From: Houston, Texas
    • Registered: 03-Jan-2007
    • Posts: 358

    YOW! That video is hilarious!

    It would be nice to see someone attempt the same kind of demonstration with Dragon NaturallySpeaking for comparison, since it appears to be the gold standard.

    • Reputation: 0
    • Registered: 09-May-2007
    • Posts: 79

    First off, I say Dragon whatever doesn't count, as it's proprietary.  Sphinx, however, would, in my opinion.  Also, recognition of patterns in general is something which the human brain excels at, and in general, requires a lot of processing power when done using other kinds of devices, including, of course, microprocessors.  Keyboards require none.  They are an extremely simple peripheral, relative to the device using them for input.

    The more accurate a recognition app gets, the more processing power it is likely to require...  As of now, though, I don't use one on a regular basis, so I guess I'm probably just putting noise into the signal.  Sorry.  If anyone here does use one, it would be nice to have my theory confirmed or disproved.

    If anyone has issue with my "Dragon whatever doesn't count..."- it's because everyone has a keyboard if they have a PC- but not $1600 to blow on software, at this point in time.  In addition to that, I just, well, don't like proprietary software...  I digress.

    As always, I'm sorry if I rant a little (just in case), but my brain really loves to type a lot of words when I'm using Colemak.  Thanks again, Shai.

    • Reputation: 0
    • Registered: 04-Apr-2013
    • Posts: 538

    Proof of concept for programming via voice. 

    The greatest advantage (aside from the obvious ones, of course) is the large number of "quick sounds", like "oink", "boink", etc. This allows for many more primitive commands to be mapped, which would otherwise require many keypresses on the keyboard.
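
    As a rough illustration of the idea, a minimal sketch of such a mapping might look like this. Only the sounds "oink" and "boink" come from the talk; the command names, `SOUND_COMMANDS` and `interpret` are hypothetical and are not taken from Dragon or from the speaker's setup.

```python
# Hypothetical vocabulary: the command names are invented for illustration
# and do not reflect any real voice-coding setup.
SOUND_COMMANDS = {
    "oink": "delete_line",
    "boink": "duplicate_line",
}


def interpret(utterance):
    """Split an already-recognized utterance into commands and literal text.

    Words found in SOUND_COMMANDS become ("command", name) entries;
    every other word is passed through as ("text", word).
    """
    actions = []
    for word in utterance.lower().split():
        if word in SOUND_COMMANDS:
            actions.append(("command", SOUND_COMMANDS[word]))
        else:
            actions.append(("text", word))
    return actions


if __name__ == "__main__":
    print(interpret("oink total boink"))
```

    Each quick sound stands in for one whole editor action, so a single syllable replaces what would be a multi-key chord on the keyboard.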

    Error correction seems a much smaller deal than you'd think at first, partially because programmers tend to be reasonable about variable names, etc.

    General disadvantages for voice coding:

    • Requires learning of an essentially new language based on monosyllabic words.  Though, as voice recognition becomes more common, I predict that these will become standard in the next few decades.

    • Still inefficient for arbitrary words - better hope yours are in the dictionary.  I doubt this will ever be solved.

    Disadvantages of the current setup:

    • It's Dragon (which, as ethana suggested, "doesn't count").  The speaker was forced to run Windows in a virtual machine in order to run the program - it doesn't get much more proprietary than that.  Even the speaker's customizations don't seem to be available as of yet.

    • Very expensive - he recommends a $300 microphone, among other things.

    Last edited by lalop (13-Aug-2013 10:36:57)
    • Reputation: 7
    • Registered: 21-Apr-2010
    • Posts: 818

    @lalop, great link.  Interesting to read a post from 2007 also.

    Voice recognition is getting wider exposure now, thanks to the voice features on smart phones.

    You raise the issue of cost, but I'm sure costs will go down, as the software and hardware are already a part of commodity devices.  How can you use Siri in a pub or at the roadside with background noise?  Perhaps you can't.  I'm guessing that's a problem that needs to be cracked to make the thing usable.

    I like the idea of eschewing the keyboard myself.   However being stuck in Windows doesn't sound like fun to me.  Needs must and all.  Would be great if there was an open source alternative.

    It still doesn't look like the easiest thing to do, still a little unnatural, but impressive nonetheless.  Duct tape gets you far!

    Last edited by pinkyache (14-Aug-2013 11:36:37)

    --
    Physicians deafen our ears with the Honorificabilitudinitatibus of their heavenly Panacaea, their sovereign Guiacum.

    • Reputation: 0
    • Registered: 04-Apr-2013
    • Posts: 538

    Maybe I was being overly optimistic in suggesting voice recognition would become more common - certainly not fast enough to get economies of scale.  The real limiting factor is the learning curve.  We all know how badly humans react to that.  I'm guessing this would be a last resort even for people suffering from RSI.

    What I do suspect would happen is that we'd get language-creep from lobbying VR users.  We already see similar lexicon in paredit's "slurp" and "barf", so it's not out of the question.  Slowly, the barriers to entry would erode as monosyllabic words become more and more standard.  Though, by then, we'd probably have thought-recognition and all this would be irrelevant.  The median programming tool moves at a glacial pace.

    • Reputation: 7
    • Registered: 21-Apr-2010
    • Posts: 818

    It's got to be easier than learning how to touch type surely?


    • Reputation: 0
    • Registered: 04-Apr-2013
    • Posts: 538

    I wouldn't think so.  Touch-typing is rather straightforward from the home-row position, and the rest is remembering some 26 keys you're probably already familiar with.  But, not having tried voice recognition yet, I don't know for sure.
