beSpokn

 


Even trial balloons have architectural considerations.


My “architectural considerations”, such as they are, have led me to build spoken word applications using just three tools: text messaging, email and speech-to-text.

I am pretty sure that my guiding principle here comes from the old adage “keep it simple, stupid”, and grows out of my belief that a lot of the things we see and use, software included, are over-embellished. How many of you, for instance, use Word, and use all of its features? You know that you could probably design a full Sunday edition of the New York Times as a Word document, right? But who does that?

So, for those of you who are technically inclined: my initial interest is not in wiring up a cloud data resource to a client application that runs on a smartphone. My interest lies simply in leveraging the built-in functions of that phone, specifically the speech-to-text rendering engine, and plugging its output straight into an engine that translates it more or less directly into database elements. I call this a “spoken word interface”. Anything else dilutes the focus and complicates the execution of functions.

In that context, the client application is garbage. It’s just stuff that gets in the way, stuff that gives the designer, or the business, an opportunity to do some “branding” work or some other “work” that helps to monetize a campaign. Like the “free” list-keeping apps that tell you where to find a bargain on dish soap twenty miles away… So, for me: no app. No “container” for functions. And, by the way, no need to “access” proprietary speech-to-text functions from that container application. Building the container and embedding those proprietary functions would entail licensing. Who wants to go there? Using speech-to-text with your text messaging, on the other hand, is free and will likely always be.

The goal is to provide access to database functions with as little fanfare as possible, with a bare minimum of effort by the user, with a low bar for learning, and with a graduated path to using and leveraging increasingly sophisticated functions. At the end of the day, a list-keeping application (first in the list) is a database function: you put a list item into the database, extract a set of list items from the database, or delete one or more list items from the database. That’s it. Now speak to it!
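To make the “that’s it” concrete, here is a minimal sketch of those three database functions in Python, using the standard sqlite3 module. The table layout and the function names (open_store, add_item, get_items, delete_item) are my own assumptions for illustration; they are not taken from any existing beSpokn code.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical list store; in-memory by default."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "list_name TEXT NOT NULL, item TEXT NOT NULL)"
    )
    return conn


def add_item(conn, list_name, item):
    """Put one list item into the database."""
    conn.execute(
        "INSERT INTO items (list_name, item) VALUES (?, ?)",
        (list_name, item),
    )
    conn.commit()


def get_items(conn, list_name):
    """Extract the items of one list, in the order they were added."""
    rows = conn.execute(
        "SELECT item FROM items WHERE list_name = ? ORDER BY rowid",
        (list_name,),
    )
    return [row[0] for row in rows]


def delete_item(conn, list_name, item):
    """Delete matching items from one list; returns how many were removed."""
    cur = conn.execute(
        "DELETE FROM items WHERE list_name = ? AND item = ?",
        (list_name, item),
    )
    conn.commit()
    return cur.rowcount
```

Everything else (the phone, the speech-to-text engine, the text message) only has to deliver a list name and an item to one of these three calls.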

My premise is that a lot can be accomplished with spoken word interfaces that do not have to be “conversational”, even though conversational seems to be where everybody’s headed. Drop that requirement and you have significantly lowered the bar for designing natural language interpreters, and for everything else that comes along with the deconstruction of messages.
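As a rough illustration of how low that bar can get, here is a sketch of a non-conversational interpreter in Python. It assumes a tiny fixed grammar of my own invention (“add X to Y”, “remove X from Y”, “show Y”) and that the transcribed text arrives as a plain string; a real deployment would have to cope with whatever the phone’s speech-to-text actually produces.

```python
import re

# Hypothetical command grammar, one regular expression per verb.
PATTERNS = [
    ("add", re.compile(r"^add (?P<item>.+) to (?P<list>.+)$")),
    ("remove", re.compile(r"^remove (?P<item>.+) from (?P<list>.+)$")),
    ("show", re.compile(r"^show (?P<list>.+)$")),
]


def parse(message):
    """Turn one transcribed message into a command dict, or None."""
    # Normalize case, trailing punctuation and runs of whitespace.
    text = re.sub(r"[.!?]+$", "", message.strip().lower())
    text = re.sub(r"\s+", " ", text)
    for verb, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return {"verb": verb, **match.groupdict()}
    return None  # not a command we know; no attempt at conversation
```

For example, parse("Add milk to groceries.") yields a command with the verb “add”, the item “milk” and the list “groceries”, while anything outside the grammar simply returns None instead of starting a dialogue.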

VeeDee