NASA research on subvocalization
Subvocalization is that little voice in your brain that says the words. Research
on subvocalization is conflicting. The issue is whether a reader can actually
avoid subvocalization and still understand what the eyes see. Currently the consensus
seems to be that the reader must subvocalize at least faintly. If you want to experiment,
try humming (hmmmm) like a bee while you read a couple of paragraphs.
The danger here is that the mind wanders. Have you ever read a page, reached
the bottom line, and suddenly realized that you don't remember a thing you read?
As your eyes moved across the lines, you were thinking about something else. It is
similar to a person who can type a copy of a business letter while talking to you at
the same time: the text seems to go in the eyes and out the fingers without registering
in the brain. The best strategy here may be to subvocalize only the key words.
See online: "subvocal speech," a somewhat fluffy article on
NASA's research on subvocalization analysis that does an OK job of explaining
what some of this means. Also read "subvocal speech recognition system."
NASA has developed a system to recognize subvocal speech: using sensors on your
throat, it can monitor nerve signals from the brain. Even if you are not making
any sound (reading in your head or speaking to yourself), it seems that the brain
is still sending signals to your tongue and vocal cords. Hence what you are saying
to yourself can be recorded.
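To make the idea concrete, here is a minimal sketch (in Python, purely illustrative) of the kind of pipeline this implies: take a window of throat-sensor samples, extract a couple of crude features, and match them against per-word templates. The feature choice, word list, and templates are all made-up stand-ins, not NASA's actual algorithm:

    import numpy as np

    WORDS = ["stop", "go", "alpha", "omega"]  # hypothetical tiny vocabulary

    def features(window: np.ndarray) -> np.ndarray:
        """Two crude features: signal energy (RMS) and zero-crossing count."""
        rms = np.sqrt(np.mean(window ** 2))
        zero_crossings = np.count_nonzero(np.diff(np.sign(window)))
        return np.array([rms, zero_crossings])

    # Made-up per-word templates, as if learned from labeled training windows.
    rng = np.random.default_rng(0)
    templates = {w: features(rng.normal(scale=1 + i, size=512))
                 for i, w in enumerate(WORDS)}

    def classify(window: np.ndarray) -> str:
        """Nearest-template match on the feature vector."""
        f = features(window)
        return min(templates, key=lambda w: np.linalg.norm(f - templates[w]))

    print(classify(rng.normal(scale=2.0, size=512)))  # fake sensor window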
Speed readers try
to stop subvocalization, but as a fictional device for a multitasking interface,
it's extremely useful. NASA seems to agree on the interface perspective and
has developed a simple subvocalization interface tool.
Just think about the possibilities: silent speech interfaces that could type
on your PDA or laptop as you think, controlling your computer without your hands,
silently, or even someone recording what you are thinking using hidden sensors.
The idea is to detect a person's "whispers" as a way to enable private speech
input. It's called
subvocalization. NASA has figured out how to do it.
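As a toy illustration of what that "type as you think" loop might look like, here is a sketch where the recognizer is a hypothetical stub; a real system would plug in the sensor pipeline above:

    import sys
    import time

    def recognize_subvocal():
        """Hypothetical stand-in for a subvocal recognizer: yields detected words."""
        for word in ["open", "mail", "reply", "yes"]:
            time.sleep(0.2)  # pretend recognition latency
            yield word

    for word in recognize_subvocal():
        # A real interface would inject keystrokes into the focused app;
        # here we just echo to stdout.
        sys.stdout.write(word + " ")
        sys.stdout.flush()
    print()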
A lot of people have been trying to get this right from different angles, and
it has a lot of implications. For starters, it is likely to make speech recognition
quite a bit more efficient, at least on the false-match side. (I think people
will still have to learn to use voice recognition software. In conversation
between people, there is a lot of nearly unconscious filtering going on: ignoring
misstatements, correcting grammar, interpreting pauses, tone, and inflection, and
drawing quite high-level inferences from what is said. People are very used to
relying on these cues, and working with very literal-minded software, even when
it is accurate, will take some learning too.)
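One mechanical analogue of that human filtering, just to illustrate the "false-match side": reject hypotheses below a confidence threshold instead of accepting the best guess literally. The (word, confidence) stream and the threshold here are invented for the example:

    hypotheses = [("delete", 0.93), ("the", 0.88), ("file", 0.41), ("vile", 0.39)]
    CONFIDENCE_FLOOR = 0.6  # arbitrary threshold for the sketch

    accepted = [word for word, conf in hypotheses if conf >= CONFIDENCE_FLOOR]
    print(" ".join(accepted))  # the low-confidence "file"/"vile" ambiguity is dropped, not guessed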
That said, there is an enormous amount to be said about this research. If it
is commercially useful, not only will voice recognition take off, but interrogation
- whether what one normally thinks of as such, or just a chat with your boss - could
become extremely invasive. The required physical contact will go away: surface
deformation sensors using lasers are becoming quite good, and all of the money going
into facial recognition will help with targeting.
If this all works out, I wonder if we'll start seeing training classes in the
art of thinking without subvocalizing.
The technology could improve the adoption rate of speech recognition systems
as well. As mentioned in the article, since the recognition is done over
"silent signals," noise, traditionally a significant problem for Speech-To-Text
(STT) systems, wouldn't be such a big factor for recognition accuracy anymore;
such a technology could bring huge improvements on that front and drive adoption.
It could also tremendously improve the lives of handicapped people.
Moreover, subvocalization, apart from reducing the number of loud cell phone users,
would also make it more socially acceptable to "talk" to your computer, which has
also been seen as an obstacle to STT adoption: people tend to feel embarrassed
trying to get their computer to understand what they're saying in front of
coworkers, for example.
MacDevCenter has an
article on speech recognition and synthesis in Mac OS. It has a short paragraph
on why human-computer interaction via the speech medium isn't more successful, but
it fails to mention that without subvocalization, it's fairly difficult to interact
with your computer by talking to it when you're not the only person in the room…
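For what it's worth, the Mac OS speech stack that article covers is easy to poke at: macOS ships a "say" command backed by its built-in synthesizer, so a tiny Python wrapper (macOS only) is enough to try it:

    import subprocess

    def speak(text: str) -> None:
        """Synthesize text aloud via macOS's built-in "say" command."""
        subprocess.run(["say", text], check=True)

    speak("Speech synthesis has shipped with the Mac for a long time.")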
I also wonder if the technology would help people with speech disabilities. I
don't think it would do much good for people who were born mute, since they
probably don't know how to subvocalize (but I don't really know anything about that,
so…). If STT accuracy is near perfect, you could imagine coupling the STT
system to a Text-To-Speech (TTS) system that synthesizes the subvocalized words, thus
giving back the gift of speech to those who lost it. Will this work for people
with speech defects? I don't know, but I think the system has lots of potential.
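The plumbing for that prosthetic idea is conceptually just two stages wired together; in this sketch both are hypothetical stubs standing in for real subvocal STT and TTS components:

    def subvocal_to_text() -> str:
        """Hypothetical subvocal recognizer output."""
        return "hello, can you hear me"

    def synthesize(text: str) -> None:
        """Hypothetical TTS stage; a real system would drive an audio synthesizer."""
        print(f"[speaker] {text}")

    synthesize(subvocal_to_text())  # silent words in, audible words out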
Obviously, the technology is still very much in its infancy, but according to
NASA (the developer of the technology), the team
plans to build a dictionary of English words recognizable by speech recognition
software. Next, we need speaker implants as the counterpart to subvocalization
technology: a speaker system that would vibrate the jawbones, or maybe the eardrums
directly, to let you listen to things in perfect confidentiality!
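A dictionary of recognizable words suggests constrained decoding: snap each noisy hypothesis to the closest dictionary entry. Here is a toy version using Python's standard-library fuzzy matcher; the vocabulary is made up for the example:

    import difflib

    DICTIONARY = ["navigate", "forward", "backward", "select", "cancel"]

    def snap_to_dictionary(hypothesis: str) -> str:
        """Return the closest dictionary word, or the raw hypothesis if nothing is close."""
        matches = difflib.get_close_matches(hypothesis, DICTIONARY, n=1, cutoff=0.6)
        return matches[0] if matches else hypothesis

    print(snap_to_dictionary("forwrd"))  # -> forward
    print(snap_to_dictionary("selct"))   # -> select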