Has continuous speech dictation finally arrived? After testing IBM's ViaVoice 98, the latest revision of the product, I'm beginning to believe it has at least begun to become useful.
Over the past few years, I've tested a number of voice recognition dictation programs from IBM, Dragon Systems, and others. In their early days, the hardware and resource requirements were extensive. IBM's first product for the PC, for example, required not only a huge amount of memory in the days when it cost $50 or more per megabyte, but came with a circuit board you had to install in your system as well (and it promptly began fighting with other components).
Early dictation packages also required what is known as "discrete" speech. In order to use them, you had to leave noticeable gaps between words, which meant adopting an unnatural speech rhythm. Then you had to train yourself not to put gaps between syllables as well (that's the one that always tripped me up) and teach yourself to think that way, too. The result, for me at least, was slower than the speed at which I could type.
When continuous speech dictation arrived in late 1997, it wasn't well-received. If you saw a review in virtually any publication, you'll recall that the reaction was consistent. Although Dragon Systems' NaturallySpeaking was judged marginally better than IBM's ViaVoice Gold, both left a lot to be desired. Despite long "enrollment" sessions spent reading into the computer, neither product was terribly accurate. Most reviewers' favourite lead was a garbled sentence that bore virtually no relation to what they'd said.
That was then and this is now.
One difference between ViaVoice 98 and the earlier Gold version is that the programmers acknowledge that you cannot run the program well without training it first. I suspect this was more of a marketing decision in the first release because ViaVoice Gold did run marginally better when trained (but still not well enough to be anything other than frustrating). ViaVoice 98 does give you the option of training later, but strongly recommends you take the time before starting to use it. Once we got through the program's first problem, it paid off.
Not without flaws
I don't want to mislead you. ViaVoice 98 is not without some flaws. While processing the voice training files, it encountered a newer version of Microsoft's C++ runtime library file than it expected, and promptly threw up its hands in disgust. Ironically, this is less likely to happen if you're using Windows 95 than if you've upgraded to Windows 98. After a call to the ViaVoice help centre, located somewhere in Europe (fortunately a toll-free call), and the initially distressing suggestion from "Otto" that ViaVoice may not work in my system, we were able to fix the problem by reinstalling the earlier version of the library file. The best news was that I didn't have to repeat the hour-long training session. The files were still present and processing picked up where it left off.
Once past the problem, the first thing I tried, naturally, was to stress the program. ViaVoice 98 Executive edition (see details on others below) allows three dictation modes. You can dictate into Speech Pad, a mini word processor that comes with the program. You can also dictate directly into Microsoft Word if you have at least 48 MB of RAM. The hardest task is the one that allows you to dictate directly into any Windows program that accepts text (such as another word processor, Word Pad, Notepad, an E-mail or spreadsheet program, and so on). I normally use WordPerfect to write columns, so that's what I tried first. Interesting results ensued.
To begin with, the degree of accuracy was, compared to other products I've tried, astounding. It wasn't totally error-free and not all proper nouns (such as Corel, for example) were capitalized, but highlighting the incorrect word and saying, "Correct this" instantly brought up a small training/correction window where you can spell out the correct word or select it from a multiple choice list. ViaVoice 98 rarely made the same mistake twice and that, all by itself, was a significant improvement.
The Executive edition also includes a navigation, command, and control vocabulary you can use while dictating or to start programs from the desktop (as well as other commands). The program will switch seamlessly between dictation and command modes. For example, editing commands, such as "select next sentence" or "select three words left" may be issued in the middle of a dictation and the program will detect that you are giving it instructions, rather than intending to have the words appear on screen. Can it make mistakes and put commands in the text or mistake dictated text for commands? Yes, it can; so there's an alternate method of switching from dictation to command mode.
ViaVoice 98 provides an option to use an attention word to make the switch. And here we run into what one IBM spokesperson acknowledged was something the company plans to fix in the next revision. The current command word is "computer" and you can't change it. So, for example, if I wished to pull down a menu to change a font, I could say, "Computer Format Font," and have the dialogue appear. Unfortunately, for someone who writes about computers for a living, this isn't optimum. I'd prefer to nominate another word (such as Pendragon, for example - the name of the computer I use to do most of my work).
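To make the attention-word idea concrete, here is a minimal sketch in Python of how such a switch might work. It is not IBM's implementation. The function name route_utterance and the configurable attention_word parameter are my own assumptions for illustration; ViaVoice 98 itself fixes the word at "computer".

```python
# Hypothetical sketch of attention-word routing; not ViaVoice's actual code.
DEFAULT_ATTENTION_WORD = "computer"  # the fixed word in ViaVoice 98


def route_utterance(utterance, attention_word=DEFAULT_ATTENTION_WORD):
    """Classify a recognized utterance as a command or as dictated text.

    If the first word matches the attention word (ignoring case and a
    trailing comma or period), the remainder is treated as a command.
    Otherwise the whole utterance is treated as dictation.
    """
    text = utterance.strip()
    words = text.split()
    if words and words[0].lower().strip(",.") == attention_word.lower():
        return ("command", " ".join(words[1:]))
    return ("dictation", text)


if __name__ == "__main__":
    print(route_utterance("Computer Format Font"))
    print(route_utterance("My computer is running slowly"))
    print(route_utterance("Pendragon Format Font", attention_word="Pendragon"))
```

In this sketch only the first word triggers command mode, so "computer" can still be dictated in the middle of a sentence. A sentence that begins with the word would be taken as a command, which is exactly why being able to nominate a different word would help.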
ViaVoice 98 also has a couple of other habits I find mildly annoying, such as putting two spaces after a sentence and two returns after a paragraph. These default settings cannot be changed, either, and IBM has also acknowledged that these settings should be more flexible.
Despite these easily fixable quibbles, I've found the program to be fairly well-behaved. It does use a lot of resources, and other applications slow down if it's running. The text that appears on screen also lags slightly behind what you're saying, so I've found it's not a good idea to watch the screen while dictating. I either get distracted or fall into the same speech pattern I once used, many, many years ago, to dictate to a secretary, which slows me down.
In WordPerfect, whenever you go back and put corrections into a paragraph, ViaVoice 98 issues a code that causes WordPerfect to insert a large drop capital letter at the beginning of the first sentence. Then it resists all attempts to remove the drop cap, which can be annoying. However, in Word and other applications in which I've tried it, there have been few mishaps.