On 12/2/2014 6:50 PM, Tom Davies wrote:
Hi :)
I am wondering if anyone here uses any speech recognition packages or has
any idea about which ones work well. I think the preference is for
something working well on Linux but it might be interesting to hear about
others.
I've been using speech recognition since 1994. I currently use
NaturallySpeaking 13 and, if you want to do anything other than simple
dictation, it's the tool of choice. Sadly, the company has its head up
its hind end with regard to how it treats disabled people. They
don't hesitate to tell you about their Section 508 compliance, but they
don't mention that it applies only if you're working in Microsoft Word.
If you want to work on the Macintosh, they do have a product for straight
dictation, but it has very little support for creating your own speech user
interfaces. People keep trying to use the Microsoft Windows speech
recognition environment, but they keep coming back to NaturallySpeaking
because of its recognition accuracy and greater ease of use.
A few folks try to do things with Google speech recognition, but
it's really aimed at a limited set of dictation tasks that you correct
by hand on a mobile device. I put this in the maybe-someday category,
especially since Google, like Microsoft and Nuance, doesn't talk to
disabled users who really understand how speech changes the system.
The biggest problem with using speech recognition is not the accuracy.
It's not even finding a good microphone, which is an expensive process
in terms of both time and money. It's the social aspect. Think about how
annoying it is to listen to someone talking on their phone, hearing only
half the conversation and all of the disjointed, chaotic forms of speech
that humans use when communicating with other humans. Now imagine
something 3 to 4 times worse: the sound of a person talking
to a computer, correcting mistakes, and giving commands in a way that
makes no sense to someone who knows the native language.
I've been wanting speech recognition on Linux for a very long time, and
I'm not the only one. Back around 2000, I outlined the idea that, to
put speech recognition on other platforms, you should have a dedicated
box that does nothing but run the recognition engine (language and
acoustic processing, etc.), with a client-side wedge that makes it possible
to communicate the local context back to the recognition environment so
that the appropriate actions can be activated by a user-supplied grammar.
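As a rough illustration of that split (a sketch only: the grammar table, the JSON message shape, and the function names wedge_report and engine_dispatch are all made up for this example and do not come from NaturallySpeaking or any real product), the wedge reports the local context, and the recognition box uses it to choose which user-supplied grammar applies:

```python
import json

# Hypothetical user-supplied grammars, keyed by the application context
# that the client-side wedge reports. All names here are illustrative.
GRAMMARS = {
    "editor": {
        "save file": {"action": "keystroke", "keys": "ctrl+s"},
        "select line": {"action": "keystroke", "keys": "home shift+end"},
    },
    "terminal": {
        "list files": {"action": "type", "text": "ls -l\n"},
    },
}


def wedge_report(app_name, focused_widget):
    """Client side: describe the local context as a JSON message."""
    return json.dumps({"context": app_name, "widget": focused_widget})


def engine_dispatch(context_message, utterance):
    """Recognition box: pick the grammar for the reported context and
    map the recognized utterance to an action for the wedge to perform.
    Anything outside the active grammar is treated as plain dictation."""
    context = json.loads(context_message)["context"]
    grammar = GRAMMARS.get(context, {})
    action = grammar.get(utterance.lower().strip())
    if action is None:
        action = {"action": "type", "text": utterance}
    return json.dumps(action)


if __name__ == "__main__":
    msg = wedge_report("editor", "text_area")
    print(engine_dispatch(msg, "Save file"))    # a command in the editor grammar
    print(engine_dispatch(msg, "hello world"))  # falls through to dictation
```

In a real setup the two functions would sit on opposite ends of a network connection; they are called directly here only to keep the example self-contained.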
The primary advantage of this technique is that you would control all
of the hardware, audio channel, and performance issues in a box dedicated
to speech recognition (which makes it damned inconvenient for
everything else), and then you could control anything else at very
low cost. You wouldn't have to worry about porting between different
operating systems or hardware architectures. You would have just one
platform to control. In theory, today that box could be a mobile
phone. A running NaturallySpeaking takes up roughly 1 to 1.5 GB, well
within the range of a high-end smartphone.
At the time, it was possible but not practical, as it would require a
dedicated laptop or machine to run the recognition engine. Today, the
only technological barriers to implementing this model are delivering a
clean audio stream to a virtual machine and having enough hooks in
the target OS UI to let you do user interface automation and state
detection.
This is not a terribly difficult project if you have hands that work well
enough to write code freely. However, as is usual for most accessibility
projects, 99% of the barriers to implementation are
political/social/resource rather than technical. So I pick away at it
in my spare time. I make painfully slow progress and, after 15 years of
thinking about it, I have the start of an implementation for the KVM
virtual machine environment. I'm currently hung up on a problem
with using the host USB audio device inside the virtual machine.
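For anyone following along, the usual starting point for handing a host USB device to a KVM guest is QEMU's usb-host device. The sketch below only builds the command line and does not run it. The helper name kvm_usb_audio_args, the disk image name, the memory size, and the vendor/product IDs are placeholders chosen for this example (real IDs come from lsusb on the host), and this is not a fix for the problem described above:

```python
def kvm_usb_audio_args(disk_image, vendor_id, product_id, memory_mb=2048):
    """Build a QEMU/KVM argument list that passes one host USB device
    (for example a USB headset) through to the guest.

    vendor_id and product_id are integers as reported by `lsusb`;
    the default memory size is an assumption, not a recommendation.
    """
    return [
        "qemu-system-x86_64",
        "-enable-kvm",                 # use KVM hardware virtualization
        "-m", str(memory_mb),          # guest RAM in MiB
        "-usb",                        # give the guest a USB controller
        "-device",
        f"usb-host,vendorid=0x{vendor_id:04x},productid=0x{product_id:04x}",
        "-hda", disk_image,            # guest disk image (placeholder name)
    ]


if __name__ == "__main__":
    # Placeholder IDs; substitute the values for your own device.
    print(" ".join(kvm_usb_audio_args("guest.img", 0x0D8C, 0x0014)))
```

Passing the whole USB device through means the guest's own driver talks to the microphone directly, which is one way to avoid resampling in an emulated sound card, at the cost of the host losing access to the device while the guest is running.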
So tread lightly in your desire for speech recognition. If I weren't
disabled, I probably wouldn't use it. 10 years ago I would've said
otherwise. 10 to 15 years ago, more applications were accessible by
speech recognition. Today, with all the different toolkits that pay no
heed to the needs of speech recognition users, it's even less
usable.
And I think that's enough for now. I will end with this: LO's
accessibility for speech recognition users is on a par with most other
Windows applications, i.e. almost nil. For some reason I don't
understand, the Select-and-Say capability does work sometimes, so there is
that small blessing, but for the most part the features of LO
are not accessible and cannot be made accessible with the current system.
--- eric
--
To unsubscribe e-mail to: accessibility+unsubscribe@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/accessibility/
All messages sent to this list will be publicly archived and cannot be deleted