
Re: [libreoffice-accessibility] Re: [libreoffice-users] Re: Sphinx - voice translation - Linux



On 12/2/2014 6:50 PM, Tom Davies wrote:
Hi :)
I am wondering if anyone here uses any speech recognition packages or has
any idea about which ones work well.  I think the preference is for
something working well on Linux but it might be interesting to hear about
others.

I've been using speech recognition since 1994. I currently use NaturallySpeaking 13 and, if you want to do anything other than simple dictation, it's the tool of choice. Sadly, the company has its head up its hind end with regard to how it treats disabled people. They don't hesitate to tell you about their Section 508 compliance, but they don't mention that it applies only if you're working in Microsoft Word.

If you want to work on the Macintosh, they do have a product for straight dictation but very little support for creating your own speech user interfaces. People keep trying to use the Microsoft Windows speech recognition environment, but they keep coming back to NaturallySpeaking because of its recognition accuracy and greater ease of use.

A few folks try to do things with Google speech recognition, but it's really aimed at a limited number of dictation tasks that you correct by hand on a mobile device. I put this in the maybe-someday category, especially since Google, like Microsoft and Nuance, doesn't talk to disabled users who really understand how speech changes the system.

The biggest problem with using speech recognition is not the accuracy. It's not even finding a good microphone, which is an expensive process in both time and money. It's the social aspect. Think about how annoying it is to listen to someone talking on their phone, hearing only half the conversation and all of the disjointed, chaotic forms of speech that humans use when communicating with other humans. Now imagine something 3 to 4 times worse: the sound of a person talking to a computer, correcting mistakes, and giving commands in a way that makes no sense to someone who knows the native language.

I've been wanting speech recognition on Linux for a very long time, and I'm not the only one. Back around 2000, I outlined the idea that to put speech recognition on other platforms, you should have a dedicated box that does nothing but run the recognition engine, language and acoustic processing, etc., with a client-side wedge that communicates the local context back to the recognition environment so that the appropriate actions can be activated by a user-supplied grammar.

The primary advantage of this technique is that you would control all of the hardware, audio channel, and performance issues in a box dedicated to speech recognition (which makes it damned inconvenient for everything else), and then you could control anything else at very low cost. You wouldn't have to worry about porting between different operating systems or hardware architectures. You would have just one platform to control. In theory, today we could be talking about a mobile phone. A running NaturallySpeaking takes up roughly 1 to 1.5 GB, well within the range of a high-end smartphone.

At the time, it was possible but not practical, as it would require a dedicated laptop or other machine to run the recognition engine. Today, the only technological barriers to implementing this model are delivering a clean audio stream to a virtual machine and having enough hooks in the target OS UI to let you do user interface automation and state detection.
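
A minimal sketch of the split described above, purely for illustration: a client-side wedge reports which application has focus, and the recognition box uses that context to pick a user-supplied grammar and decide whether an utterance is a command or plain dictation. Everything here (the class names, the JSON message shape, the "writer" grammar and its action string) is a made-up assumption for the example, not how NaturallySpeaking or any existing wedge actually works. The recognized text is passed in as a string; no real recognition engine is involved.

```python
import json
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ContextUpdate:
    """Hypothetical message sent by the client-side wedge when focus changes."""
    application: str
    widget: str

    def to_wire(self) -> str:
        # Serialize to JSON so the message could travel over any transport.
        return json.dumps({"type": "context",
                           "application": self.application,
                           "widget": self.widget})


@dataclass
class RecognitionBox:
    """Stand-in for the dedicated recognition machine (assumption)."""
    # User-supplied grammars: application name -> {spoken phrase -> action}.
    grammars: Dict[str, Dict[str, str]] = field(default_factory=dict)
    active_app: str = ""

    def receive(self, wire: str) -> None:
        # Track the local context reported by the wedge.
        msg = json.loads(wire)
        if msg["type"] == "context":
            self.active_app = msg["application"]

    def interpret(self, utterance: str) -> dict:
        # Look the phrase up in the grammar for the focused application;
        # anything that is not a known command is treated as dictation.
        grammar = self.grammars.get(self.active_app, {})
        action = grammar.get(utterance.lower())
        if action is not None:
            return {"type": "command", "action": action}
        return {"type": "dictation", "text": utterance}


box = RecognitionBox(grammars={"writer": {"bold that": "format.bold"}})

box.receive(ContextUpdate("writer", "document").to_wire())
print(box.interpret("Bold that"))

box.receive(ContextUpdate("terminal", "shell").to_wire())
print(box.interpret("Bold that"))
```

The same phrase comes back as a command while "writer" has focus and as plain dictation once focus moves to an application with no grammar, which is the point of sending the local context back to the recognition side.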

This is not a terribly difficult project if you have hands that work well enough to write code freely. However, as is usual for most accessibility projects, 99% of the barriers to implementation are political, social, or resource-related rather than technical. So I pick away at it in my spare time. I make painfully slow progress and, after 15 years of thinking about it, I have the start of an implementation for the KVM virtual machine environment. I'm currently hung up on a problem with using the host USB audio device inside the virtual machine.

So tread lightly in your desire for speech recognition. If I weren't disabled, I probably wouldn't use it. 10 years ago I would've said otherwise. 10 to 15 years ago, more applications were accessible by speech recognition. Today, with all the different toolkits that pay no heed to the needs of speech recognition users, it's even less usable.

And I think that's enough for now. I will end with this: LO's accessibility for speech recognition users is on a par with most other Windows applications, i.e. almost nil. For some reason I don't understand, Select-and-Say capability does work sometimes, so there is that small blessing. But for the most part, the features of LO are not accessible and cannot be made accessible with the current system.

--- eric

