This isn’t a new hack, but it works well when you have short texts (1-3 sentences) that you want to convert to speech.
After all, when returning a response to the Google Assistant, you can use a subset of the Speech Synthesis Markup Language (SSML).
Why?
Because it makes your agent’s responses seem more lifelike.
How?
Open your terminal and try something like this:
curl "http://www.google.com/speech-api/v1/synthesize?lang=en-us&text=actions+on+google+rock" -o aog-rock-hack.mp3
That’s it.
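If you would rather script this than type the curl command by hand, here is a minimal Python sketch of the same request. The endpoint and the `lang` and `text` parameters come straight from the curl example; the function names `build_tts_url` and `save_speech` are made up for illustration, and since the endpoint is unofficial it may change or stop responding.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the curl example; unofficial, so treat it as an assumption.
SYNTHESIZE_URL = "http://www.google.com/speech-api/v1/synthesize"


def build_tts_url(text: str, lang: str = "en-us") -> str:
    """Return the synthesize URL with `lang` and `text` percent-encoded."""
    # urlencode turns spaces into '+' and escapes characters such as '&'.
    return f"{SYNTHESIZE_URL}?{urlencode({'lang': lang, 'text': text})}"


def save_speech(text: str, path: str, lang: str = "en-us") -> None:
    """Download the synthesized audio to `path` (performs a network request)."""
    with urlopen(build_tts_url(text, lang)) as response, open(path, "wb") as out:
        out.write(response.read())


# Hypothetical usage, equivalent to the curl command:
# save_speech("actions on google rock", "aog-rock-hack.mp3")
```

Building the query string with `urlencode` means you don’t have to replace spaces with `+` yourself, and punctuation in the text won’t break the URL.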
If you want to get fancy and use SSML, the request supports that as well:
https://www.google.com/speech-api/v1/synthesize?lang=es-US&ssml=<speak>Hey<prosody%20rate="slow">how+are+you+doing+this+morning</prosody></speak>
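The SSML URL above contains raw `<`, `>` and `"` characters, which a browser will usually fix up for you but a script or curl may not. Here is a small sketch of encoding the `ssml` parameter safely. The parameter names are taken from the URL above; `build_ssml_url` is a hypothetical helper, not part of any Google library.

```python
from urllib.parse import urlencode

# Endpoint copied from the SSML example; unofficial, so treat it as an assumption.
SYNTHESIZE_URL = "https://www.google.com/speech-api/v1/synthesize"


def build_ssml_url(ssml: str, lang: str) -> str:
    """Return the synthesize URL with the SSML document percent-encoded."""
    # '<', '>', '"', '=' and '/' are escaped; spaces become '+'.
    return f"{SYNTHESIZE_URL}?{urlencode({'lang': lang, 'ssml': ssml})}"


ssml = '<speak>Hey<prosody rate="slow">how are you doing this morning</prosody></speak>'
url = build_ssml_url(ssml, lang="es-US")
print(url)
```

You can pass the printed URL to `curl -o` exactly as in the first example.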
Later, you can use the resulting MP3 file in your Action on Google like this:
<speak>
  Here are <say-as interpret-as="characters">SSML</say-as> samples.
  I can pause <break time="3s"/>.
  I can play a sound
  <audio src="https://www.example.com/MY_MP3_FILE.mp3">
    didn't get your MP3 audio file
  </audio>.
  I can speak in cardinals. Your number is
  <say-as interpret-as="cardinal">
    10
  </say-as>.
  Or I can speak in ordinals. You are
  <say-as interpret-as="ordinal">
    10
  </say-as> in line.
  Or I can even speak in digits. The digits for ten are
  <say-as interpret-as="characters">
    10
  </say-as>.
  I can also substitute phrases, like the
  <sub alias="World Wide Web Consortium">
    W3C
  </sub>.
  Finally, I can speak a paragraph with two sentences.
  <p>
    <s>This is sentence one.</s>
    <s>This is sentence two.</s>
  </p>
</speak>
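If your fulfillment code builds responses dynamically, the `<audio>` part of the sample above can be generated instead of hand-written. This is only a sketch: `audio_ssml` is a hypothetical helper, and it assumes you have already uploaded the MP3 somewhere your Action can reach over HTTPS.

```python
from xml.sax.saxutils import escape, quoteattr


def audio_ssml(mp3_url: str, fallback_text: str) -> str:
    """Wrap an MP3 URL in SSML; the fallback text is used if the file cannot be played."""
    # quoteattr adds the surrounding quotes and escapes '&', '<' and '>' in the URL;
    # escape does the same for the fallback text between the tags.
    return (
        f"<speak><audio src={quoteattr(mp3_url)}>"
        f"{escape(fallback_text)}</audio></speak>"
    )


print(audio_ssml("https://www.example.com/MY_MP3_FILE.mp3",
                 "didn't get your MP3 audio file"))
```

Escaping matters mostly when the URL carries query parameters, because a bare `&` would make the SSML invalid XML.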
For more hacks and tips, check these slides on VUI Design or the official docs on SSML.