"GUI VUI IDEs" - just wanted to see how many capital letters I could squeeze into a title!
Seriously, though, I've been looking at graphical user interfaces that allow you to create voice user interfaces. It sounds like a contradiction in terms, but there are (or were) several out there. Nuance, when it was still Nuance, had a basic GUI for developing VoiceXML applications. Cisco bought Audium's offering, and there's VoiceObjects as well. Even Eclipse has got in on the act with the Voice Tools Project.
Before I go on, maybe I should confess that I've written most of my VoiceXML code using nothing more than XMLWriter or UltraEdit. Recently I've moved to Eclipse, as it allows me to develop Java, VoiceXML and JSPs under one roof. In particular it helps with the integration of the VoiceXML front-ends with the Java back-ends - especially when new back-end builds are continually being delivered! This approach very much treats VoiceXML as just another web front-end. The core application and logic are Java, and VoiceXML happens to be one of the front-ends. Obviously, using Eclipse to write the xHTML for the web interface of the same application is no problem. So the emphasis is largely on developing, integrating, deploying and testing web applications (be they xHTML, VoiceXML or WAP) in a unified environment.
Then along come GUIs - even in Eclipse! Now, I've seen GUIs for IVR before - and in many ways they are fine for DTMF applications, which generally have a rather linear flow. However, I've always suspected that they may not be ideal for speech recognition applications, or even for complicated DTMF applications.
Maybe I'm just a control freak - but I like to write pure VoiceXML myself. For a start, VoiceXML is easy. It's so easy that I have difficulty understanding why a GUI is even needed. Sure, a GUI ensures that you can't forget a '/>' at the end of a tag - but so does a good parser. I'm also not sure that GUIs actually offer a better overview of the call flow - they seem to mix design and realisation into one.
Generally we use a dialog design document to specify how the call flow works. It describes each question and answer (dialog state) that the caller is expected to handle, as well as the movement between these states. The call flow itself resembles the State Chart XML (SCXML) that is currently being specified. However, although the dialog states resemble VoiceXML forms or fields, they may actually be implemented using a combination of forms, blocks and fields - in particular if there is a lot of interface customisation because of user profiling. If an interface is complicated, then it's going to be complicated whether it's pure VoiceXML or a web of connections between elements in a GUI.
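To make the parser point concrete, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. The VoiceXML document, the form id `main_menu`, the field name `choice` and the grammar file `menu.grxml` are all made up for illustration, and the helper `check_well_formed` is a hypothetical name. Note that this only checks XML well-formedness; it does not validate against the VoiceXML schema.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written VoiceXML document. The prompt text, ids and the
# grammar file name are illustrative assumptions, not from a real app.
GOOD_VXML = """<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="main_menu">
    <field name="choice">
      <prompt>Would you like sales or support?</prompt>
      <grammar src="menu.grxml" type="application/srgs+xml"/>
      <filled>
        <prompt>You said <value expr="choice"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
"""

# The same document with the '/>' on <grammar> mistyped as '>'.
BAD_VXML = GOOD_VXML.replace('srgs+xml"/>', 'srgs+xml">')


def check_well_formed(source: str):
    """Return None if the XML is well-formed, else the parser's message."""
    try:
        # Parse from bytes so the encoding declaration is honoured.
        ET.fromstring(source.encode("utf-8"))
    except ET.ParseError as err:
        return str(err)
    return None


if __name__ == "__main__":
    for label, doc in (("good", GOOD_VXML), ("bad", BAD_VXML)):
        problem = check_well_formed(doc)
        print(label, "->", "well-formed" if problem is None else problem)
```

The forgotten '/>' is reported by the parser together with a line and column, which is roughly the safety net a GUI offers for this kind of slip.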
I suppose that's my real point. What are GUIs trying to do? Make it easier for beginners to write applications? Make it easier to get an overview of an application? Ensure consistency in the VoiceXML code? I'm probably missing something here, so feel free to comment and point it out to me!
I know I'm going to sound like a cranky old schoolteacher who thinks everyone should learn Latin (which I do ;-), but I think if you want to start with VoiceXML then write some real VoiceXML! First, you'll find out that it's dead easy, and second, that it's great!