ETEAM Blog

Alexa, How Can I Build an Interactive Voice App?

Our interest in capturing and transmitting the human voice is hardly a new phenomenon. As early as the turn of the millennium, we were already experimenting with the limits of voice recognition. As humans, we seek to communicate with others and to be understood. Sound, one of the original modes of transmitting a message, is one pillar in our ability to communicate with other people in a meaningful way.

In the digital age, our use of sound has evolved and been rediscovered in the form of voice recognition apps and software. There has been a massive resurgence in the interest in sound as a means to bridge communication between humans and machines.

Many of us have noticed the growing market for digital assistants. Browsing the local shop or online retailers, we can find a number of options, including Amazon’s Alexa and Google Home. Sales of such devices have been growing rapidly as people find novel ways to add technology to their lives for greater comfort and convenience.

Any serious app developer should have at least a basic working knowledge of Voice App architecture and not miss out on this future wave!

Primary challenges for developers

At first glance, creating a voice app is a fairly straightforward process. The coding required for fulfillment is not, in fact, overly complex. Below you can find an example where the user asks:

“What’s the temperature like in Kyiv tomorrow?”

[Image: Dialogflow agent and fulfillment]
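As a rough illustration of the fulfillment side of this example, here is a minimal TypeScript sketch. It assumes a Dialogflow-style webhook, trimmed to the few fields used, in which the agent has already matched an intent and extracted its parameters. The intent name `GetTemperature`, the parameter names `city` and `date`, and the `lookupTemperature` helper with its made-up forecast are all hypothetical. In a real agent the date would typically arrive as a resolved date rather than the word "tomorrow".

```typescript
// Trimmed-down Dialogflow v2 webhook request: only the fields used below.
interface WebhookRequest {
  queryResult: {
    intent: { displayName: string };
    parameters: { [name: string]: string | undefined };
  };
}

// Trimmed-down Dialogflow v2 webhook response.
interface WebhookResponse {
  fulfillmentText: string;
}

// Hypothetical stand-in for a real weather service; the value is made up.
function lookupTemperature(city: string, date: string): number | undefined {
  const fakeForecasts: { [key: string]: number | undefined } = {
    "Kyiv|tomorrow": 21,
  };
  return fakeForecasts[`${city}|${date}`];
}

// Fulfillment: turn the matched intent and its parameters into a spoken reply.
function fulfill(request: WebhookRequest): WebhookResponse {
  const { intent, parameters } = request.queryResult;
  if (intent.displayName !== "GetTemperature") {
    return { fulfillmentText: "Sorry, I can't help with that yet." };
  }
  const city = parameters["city"] ?? "your city";
  const date = parameters["date"] ?? "today";
  const degrees = lookupTemperature(city, date);
  if (degrees === undefined) {
    return { fulfillmentText: `I don't have a forecast for ${city} ${date}.` };
  }
  return {
    fulfillmentText: `In ${city} ${date} it will be about ${degrees} degrees Celsius.`,
  };
}

// What the agent might send after matching the question about Kyiv.
const kyivRequest: WebhookRequest = {
  queryResult: {
    intent: { displayName: "GetTemperature" },
    parameters: { city: "Kyiv", date: "tomorrow" },
  },
};
const kyivReply = fulfill(kyivRequest);
```

The handler itself is short; most of the effort in a real project goes into configuring the agent so that the right intent and parameters reach it.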


However, upon further examination we find that there is a large number of platforms to choose from. This can lead to confusion when agent configurations and fulfillment code are hosted separately. It can also become frustrating to keep track of all the necessary configuration changes.


[Image: Platforms for building voice apps]


In addition, multiple levels of integration will become necessary for dual functionality as Google Assistant and Amazon Alexa do not use identical platforms. This is not ideal, as we would want to have an app available to users on both platforms to maximize the size of our user base.
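To make the difference concrete, the sketch below shows how the same user request might arrive in two different shapes and be normalized into one. The request interfaces are trimmed-down versions of the Alexa and Dialogflow payloads, keeping only the fields used. The `VoiceInput` type and the `normalize` function are hypothetical names chosen for this illustration, not part of either platform.

```typescript
// Trimmed-down Alexa request: intent name and slot values live under `request`.
interface AlexaRequest {
  request: {
    type: string;
    intent?: {
      name: string;
      slots?: { [slot: string]: { name: string; value?: string } };
    };
  };
}

// Trimmed-down Dialogflow request: intent and parameters live under `queryResult`.
interface DialogflowRequest {
  queryResult: {
    intent: { displayName: string };
    parameters: { [name: string]: string | undefined };
  };
}

// Hypothetical platform-neutral shape that the app logic would work with.
interface VoiceInput {
  platform: "alexa" | "googleAssistant";
  intent: string;
  inputs: { [name: string]: string };
}

function fromAlexa(body: AlexaRequest): VoiceInput {
  const inputs: { [name: string]: string } = {};
  const slots = body.request.intent?.slots ?? {};
  for (const key of Object.keys(slots)) {
    const value = slots[key]?.value;
    if (value !== undefined) inputs[key] = value;
  }
  // Requests without an intent (for example a launch) fall back to the request type.
  const intent = body.request.intent?.name ?? body.request.type;
  return { platform: "alexa", intent, inputs };
}

function fromDialogflow(body: DialogflowRequest): VoiceInput {
  const inputs: { [name: string]: string } = {};
  const parameters = body.queryResult.parameters;
  for (const key of Object.keys(parameters)) {
    const value = parameters[key];
    if (value !== undefined) inputs[key] = value;
  }
  return { platform: "googleAssistant", intent: body.queryResult.intent.displayName, inputs };
}

// Pick the adapter based on which top-level field is present.
function normalize(body: AlexaRequest | DialogflowRequest): VoiceInput {
  return "request" in body ? fromAlexa(body) : fromDialogflow(body);
}
```

Once both payloads are reduced to the same shape, the rest of the app logic no longer needs to know which assistant the user is talking to.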



Jovo to the rescue!

Fortunately, there is a simple solution to the aforementioned issues. Jovo is an open-source framework that allows you to build voice apps for both Amazon Alexa and Google Assistant with only one code base.

Using the Jovo CLI, you will also be able to:

1. Quickly create and update platform-specific interaction and language models.

2. Quickly import existing Alexa Skills or Google Actions without having to create two completely separate language models.

3. Access Jovo’s tools, integrations and plugins, which should help to make your life as a voice app developer much easier.


[Image: Jovo CLI]


By using the Jovo CLI, you can easily create new projects and new language models, which you’ll be able to run locally for testing and prototyping. Jovo’s Webhook feature can save you a lot of time by allowing you to debug and develop prototypes locally or on a web hosting service.

Jovo’s Universal Language Model

One of Jovo’s most obvious benefits is its ability to work across both the Amazon and Google platforms. The Jovo Language Model will save you a lot of time by removing the need to manually update your language models on both Alexa (Amazon) and Dialogflow (Google).


[Image: Jovo Language Model]


Jovo currently has built-in support for both Alexa Skills and Google Actions. The Jovo Language Model can be converted into the format of either platform using the Jovo CLI. This gives a voice app developer the tremendous convenience of maintaining a single language model file.
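The idea of converting one model into two platform formats can be sketched as follows. This is a simplified illustration, not the Jovo CLI’s actual implementation: the neutral model loosely follows the shape of a Jovo language model file, the Alexa output is a trimmed-down interaction model, and `DialogflowIntent` is a hypothetical, simplified stand-in for a Dialogflow intent definition. The intent, phrases, and slot and entity type names are example values.

```typescript
// Platform-neutral model, loosely following the shape of a Jovo language model file.
interface ModelInput {
  name: string;
  type: { alexa: string; dialogflow: string };
}
interface ModelIntent {
  name: string;
  phrases: string[];
  inputs?: ModelInput[];
}
interface LanguageModel {
  invocation: string;
  intents: ModelIntent[];
}

// Trimmed-down Alexa interaction model.
interface AlexaModel {
  interactionModel: {
    languageModel: {
      invocationName: string;
      intents: { name: string; samples: string[]; slots: { name: string; type: string }[] }[];
    };
  };
}

// Hypothetical, simplified stand-in for a Dialogflow intent definition.
interface DialogflowIntent {
  displayName: string;
  trainingPhrases: string[];
  parameters: { name: string; entityType: string }[];
}

function toAlexa(model: LanguageModel): AlexaModel {
  return {
    interactionModel: {
      languageModel: {
        invocationName: model.invocation,
        intents: model.intents.map((intent) => ({
          name: intent.name,
          samples: intent.phrases,
          slots: (intent.inputs ?? []).map((input) => ({ name: input.name, type: input.type.alexa })),
        })),
      },
    },
  };
}

function toDialogflow(model: LanguageModel): DialogflowIntent[] {
  return model.intents.map((intent) => ({
    displayName: intent.name,
    trainingPhrases: intent.phrases,
    parameters: (intent.inputs ?? []).map((input) => ({
      name: input.name,
      entityType: input.type.dialogflow,
    })),
  }));
}

// One source of truth: edit this model, then regenerate both platform formats.
const weatherModel: LanguageModel = {
  invocation: "weather helper",
  intents: [
    {
      name: "GetTemperature",
      phrases: ["what's the temperature like in {city} tomorrow", "how warm is it in {city}"],
      inputs: [{ name: "city", type: { alexa: "AMAZON.City", dialogflow: "@sys.geo-city" } }],
    },
  ],
};
const alexaModel = toAlexa(weatherModel);
const dialogflowIntents = toDialogflow(weatherModel);
```

Adding a phrase or an intent then means touching only `weatherModel`; both platform outputs are derived from it.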

This is a huge blessing to all voice app developers as it will save the enormous headache of having to make individual changes to enable functionality on both platforms. Developers can then focus on creating unique features and improving the user experience rather than spending countless hours on conversions and re-configurations between platforms.

Jovo intents and state machines

[Image: Jovo State Machine]

Jovo helps to simplify intent routing and state management, so you can get started more quickly and keep refining your app logic as your voice app evolves.
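For readers new to the idea, state-based intent routing can be sketched in a few lines. This is a framework-free illustration of the concept, not Jovo’s API: the `route` function, the handler maps, and the state and intent names are all hypothetical. The assumption here is that an intent is first looked up among the handlers of the current state, then among the global handlers, and finally falls back to a catch-all.

```typescript
// The session remembers which state the conversation is in (null = no state).
interface Session {
  state: string | null;
}

// A handler returns what to say and, optionally, the state for the next turn.
interface Reply {
  speech: string;
  nextState?: string;
}
type Handler = () => Reply;
interface HandlerMap {
  [intent: string]: Handler | undefined;
}

// Catch-all used when neither the current state nor the global map knows the intent.
const unhandled: Handler = () => ({ speech: "Sorry, I didn't get that." });

// Intents that are valid regardless of state.
const globalHandlers: HandlerMap = {
  LAUNCH: () => ({ speech: "Do you want tomorrow's forecast?", nextState: "ConfirmForecast" }),
};

// Intents that only make sense inside a particular state.
const stateHandlers: { [state: string]: HandlerMap | undefined } = {
  ConfirmForecast: {
    YesIntent: () => ({ speech: "Which city?", nextState: "AskCity" }),
    NoIntent: () => ({ speech: "Okay, goodbye." }),
  },
};

// Route an intent: state-scoped handler first, then global, then the catch-all.
function route(session: Session, intent: string): Reply {
  const scoped = session.state !== null ? stateHandlers[session.state] : undefined;
  const handler: Handler = scoped?.[intent] ?? globalHandlers[intent] ?? unhandled;
  const reply = handler();
  // In this sketch the state is cleared unless the handler sets a new one.
  session.state = reply.nextState ?? null;
  return reply;
}
```

With this layout, a "yes" right after launch reaches `YesIntent` inside `ConfirmForecast`, while the same "yes" with no active state falls through to the catch-all.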

Limitations of voice apps




Voice apps have the potential to be a truly revolutionary way to improve productivity and add convenience to our daily lives. However, as useful as voice apps can be in our busy lives and chaotic schedules, it’s important to point out that there are a few limitations.

Voice app developers frequently ask whether it’s possible to keep the microphone on at all times. Unfortunately, this option is usually not available to third-party developers. We strongly recommend that you review the documentation on the functions currently available for the platform(s) you ultimately choose to work with.

For developers experienced with non-voice apps, the process of publishing and updating voice apps may seem a bit slow in comparison. Updating and testing voice apps involves a longer waiting period for both Alexa Skills and Google Actions; currently, waiting times can be up to several hours. Hopefully this will improve in the not too distant future, as voice apps gain popularity and the Amazon and Google review processes receive more support.

Finally, getting your voice app into the world will depend on your own marketing strategy. At the moment the voice app market is not very crowded, so it could be an ideal time to test the waters. As more competitors enter the market, it will become increasingly important to find strategies to build brand awareness and optimize for searches in the marketplace.

Contact Us

If you are looking for further guidance on how to fully integrate your voice app with Amazon Alexa or Google Assistant, please feel free to reach out to us. Our experts will be more than happy to help you reach your objectives.

