Python Text-to-Speech "Speaking addresses with SSML" tutorial #2255
Merged
55 commits
c61affc
initial tts ssml commit. TODO readme, tests
48bf7ce
added formatting comments to src
15ee2f4
working on tutorial files
433eae4
fixed readme
18415d7
fixed block comment syntax error
dda7f83
made 4-space tabs
6032dd2
more stylistic spaces and newline at end of file
c207d72
more spaces
3cb7601
fixed block comments
1acc7d4
fixed ambiguous variable name
6c3bc46
still fixing style issues
705191e
more formatting changes re: PR review
befe2e6
renamed tests file to align with GCP convention
1c09aa2
refactored tests to unit tests
e0c2846
removing obsolete testing files
7b6e3b9
fixing Linter bugs
7e7b428
fixing spacing Linter bugs
429837c
per Linter, removed excessive blank lines
527c57f
nailed down final two blank line linter bugs
46006be
fixed all too-long lines except for tests
a9e2c39
fixed newline typo in special character audio test case
1cd4ca9
Linter should be good to go. Fixed final lines-too-long error
871faa1
more Linter errors :)
ff54dfb
line continuation indent fix for Linter
454b2ba
experimenting with line continuation indent to make Linter happy
a55c485
fun with line continuation indentation
01c4876
removing outdated example.mp3 file
9c30bc2
organizing repo. adding resources directory
fb1dc6e
removing pesky example.mp3 file
69d6abf
adding both tagged and un-tagged examples
eec805b
fixing copyright spacing
0c2413e
auto-generated readme
21af343
whoops. forgot to add new readme.
48c7ac9
generalizing HTML ampersand replacement
648a9df
standardizing main
13cf929
consolidating example directories
0e2d519
cleaning up repo & refactored tests
76d5333
fixed indentation in tests
0de4025
cleaning up resources
5d8d148
cleaning up
81004b4
updated ampersand in resource files
e7d7462
added html python3 support
0f12ca2
removed trailing whitespace
b1ff0f3
Refactored tests for pytest
6024dcc
Refactored tests for pytest
a8dd416
Lint
ab4cbb1
removing unneccessary test
7c565f4
removed test docstrings
dac796f
consolidated test ssml
235989c
deleted test mp3 file
cf38eaf
actually consolidating example ssml
f6a3679
more consolidation
fb210d4
consistency check for example ssmls
735bea4
removing tagged examples
2bc55bc
removed None checker in ssml_to_audio
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
.. This file is automatically generated. Do not edit this file directly. | ||
|
||
Google Cloud Text-to-Speech API 'Speaking Addresses with SSML' Tutorial Python Samples | ||
====================================================================================== | ||
|
||
.. image:: https://gstatic.com/cloudssh/images/open-btn.png | ||
   :target: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=/README.rst | ||
|
||
|
||
This directory contains samples for Google Cloud Text-to-Speech API 'Speaking Addresses with SSML' Tutorial. The `Google Cloud Text-to-Speech API 'Speaking Addresses with SSML' Tutorial`_ shows how to use Speech Synthesis Markup Language (SSML) to speak a text file of addresses. You can embed SSML commands in a string of text to personalize synthetic audio from the Cloud Text-to-Speech API. | ||
|
||
|
||
|
||
|
||
.. _Google Cloud Text-to-Speech API 'Speaking Addresses with SSML' Tutorial: https://cloud.google.com/text-to-speech/docs/ssml-tutorial | ||
|
||
Setup | ||
------------------------------------------------------------------------------- | ||
|
||
|
||
Authentication | ||
++++++++++++++ | ||
|
||
This sample requires you to have authentication set up. Refer to the | ||
`Authentication Getting Started Guide`_ for instructions on setting up | ||
credentials for applications. | ||
|
||
.. _Authentication Getting Started Guide: | ||
    https://cloud.google.com/docs/authentication/getting-started | ||
|
||
Install Dependencies | ||
++++++++++++++++++++ | ||
|
||
#. Clone python-docs-samples and change directory to the sample directory you want to use. | ||
|
||
    .. code-block:: bash | ||
|
||
        $ git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git | ||
|
||
#. Install `pip`_ and `virtualenv`_ if you do not already have them. You may want to refer to the `Python Development Environment Setup Guide`_ for Google Cloud Platform for instructions. | ||
|
||
   .. _Python Development Environment Setup Guide: | ||
       https://cloud.google.com/python/setup | ||
|
||
#. Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+. | ||
|
||
    .. code-block:: bash | ||
|
||
        $ virtualenv env | ||
        $ source env/bin/activate | ||
|
||
#. Install the dependencies needed to run the samples. | ||
|
||
    .. code-block:: bash | ||
|
||
        $ pip install -r requirements.txt | ||
|
||
.. _pip: https://pip.pypa.io/ | ||
.. _virtualenv: https://virtualenv.pypa.io/ | ||
|
||
Samples | ||
------------------------------------------------------------------------------- | ||
|
||
Speaking addresses with SSML Tutorial | ||
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ | ||
|
||
.. image:: https://gstatic.com/cloudssh/images/open-btn.png | ||
   :target: https://console.cloud.google.com/cloudshell/open?git_repo=https://github.com/GoogleCloudPlatform/python-docs-samples&page=editor&open_in_editor=/tts.py,/README.rst | ||
|
||
|
||
|
||
|
||
To run this sample: | ||
|
||
.. code-block:: bash | ||
|
||
    $ python tts.py | ||
|
||
|
||
|
||
|
||
The client library | ||
------------------------------------------------------------------------------- | ||
|
||
This sample uses the `Google Cloud Client Library for Python`_. | ||
You can read the documentation for more details on API usage and use GitHub | ||
to `browse the source`_ and `report issues`_. | ||
|
||
.. _Google Cloud Client Library for Python: | ||
    https://googlecloudplatform.github.io/google-cloud-python/ | ||
.. _browse the source: | ||
    https://github.com/GoogleCloudPlatform/google-cloud-python | ||
.. _report issues: | ||
    https://github.com/GoogleCloudPlatform/google-cloud-python/issues | ||
|
||
|
||
.. _Google Cloud SDK: https://cloud.google.com/sdk/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
|
||
# This file is used to generate README.rst | ||
|
||
product: | ||
  name: Google Cloud Text-to-Speech API 'Speaking Addresses with SSML' Tutorial | ||
  short_name: Cloud TTS API SSML Addresses Tutorial | ||
  url: https://cloud.google.com/text-to-speech/docs/ssml-tutorial | ||
  description: > | ||
    The `Google Cloud Text-to-Speech API 'Speaking Addresses with SSML' Tutorial`_ shows how to use Speech Synthesis Markup Language (SSML) to speak a text file of addresses. You can embed SSML commands in a string of text to personalize synthetic audio from the Cloud Text-to-Speech API. | ||
|
||
setup: | ||
- auth | ||
- install_deps | ||
|
||
samples: | ||
- name: Speaking addresses with SSML Tutorial | ||
  file: tts.py | ||
|
||
cloud_client_library: true |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
google-cloud-texttospeech==0.4.0 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
<speak>123 Street Ln, Small Town, IL 12345 USA | ||
<break time="2s"/>1 Jenny St &amp; Number St, Tutone City, CA 86753 | ||
<break time="2s"/>1 Piazza del Fibonacci, 12358 Pisa, Italy | ||
<break time="2s"/></speak> |
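SSML is an XML application, so an ampersand inside an address must be written as the entity `&amp;`; a bare `&` would not parse. A minimal sketch of that check, assuming Python 3 and using the standard library's XML parser as a stand-in for the API's own validation (which may be stricter). `is_well_formed` is a hypothetical helper, not part of this PR:

```python
import xml.etree.ElementTree as ET
from html import escape

RAW_ADDRESS = '1 Jenny St & Number St, Tutone City, CA 86753'
TEMPLATE = '<speak>{}<break time="2s"/></speak>'


def is_well_formed(ssml):
    """Return True if the string parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(ssml)
    except ET.ParseError:
        return False
    return True


# The raw address contains a bare '&', which XML parsers reject.
unescaped = TEMPLATE.format(RAW_ADDRESS)
# quote=False leaves quotes alone, matching cgi.escape's default.
escaped = TEMPLATE.format(escape(RAW_ADDRESS, quote=False))

print(is_well_formed(unescaped))
print(is_well_formed(escaped))
```

Well-formedness is only a necessary condition here; it says nothing about whether the service accepts a given tag or attribute.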
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
123 Street Ln, Small Town, IL 12345 USA | ||
1 Jenny St & Number St, Tutone City, CA 86753 | ||
1 Piazza del Fibonacci, 12358 Pisa, Italy |
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,116 @@ | ||
# Copyright 2019 Google LLC | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
|
||
# [START tts_ssml_address_imports] | ||
from google.cloud import texttospeech | ||
|
||
# For Python 3, instead use: | ||
# import html | ||
import cgi | ||
# [END tts_ssml_address_imports] | ||
|
||
|
||
# [START tts_ssml_address_audio] | ||
def ssml_to_audio(ssml_text, outfile): | ||
    # Generates synthetic audio from SSML text. | ||
    # | ||
    # Given a string of SSML text and an output file name, this function | ||
    # calls the Text-to-Speech API. The API returns a synthetic audio | ||
    # version of the text, formatted according to the SSML commands. This | ||
    # function saves the synthetic audio to the designated output file. | ||
    # | ||
    # Args: | ||
    # ssml_text: string of SSML text | ||
    # outfile: string name of file under which to save audio output | ||
    # | ||
    # Returns: | ||
    # nothing | ||
|
||
    # Instantiates a client | ||
    client = texttospeech.TextToSpeechClient() | ||
|
||
    # Sets the text input to be synthesized | ||
    synthesis_input = texttospeech.types.SynthesisInput(ssml=ssml_text) | ||
|
||
    # Builds the voice request, selects the language code ("en-US") and | ||
    # the SSML voice gender ("MALE") | ||
    voice = texttospeech.types.VoiceSelectionParams( | ||
        language_code='en-US', | ||
        ssml_gender=texttospeech.enums.SsmlVoiceGender.MALE) | ||
|
||
    # Selects the type of audio file to return | ||
    audio_config = texttospeech.types.AudioConfig( | ||
        audio_encoding=texttospeech.enums.AudioEncoding.MP3) | ||
|
||
    # Performs the text-to-speech request on the text input with the selected | ||
    # voice parameters and audio file type | ||
    response = client.synthesize_speech(synthesis_input, voice, audio_config) | ||
|
||
    # Writes the synthetic audio to the output file. | ||
    with open(outfile, 'wb') as out: | ||
        out.write(response.audio_content) | ||
        print('Audio content written to file ' + outfile) | ||
# [END tts_ssml_address_audio] | ||
|
||
|
||
# [START tts_ssml_address_ssml] | ||
def text_to_ssml(inputfile): | ||
    # Generates SSML text from plaintext. | ||
    # Given an input filename, this function converts the contents of the text | ||
    # file into a string of formatted SSML text. This function formats the SSML | ||
    # string so that, when synthesized, the synthetic audio will pause for two | ||
    # seconds between each line of the text file. This function also handles | ||
    # special text characters which might interfere with SSML commands. | ||
    # | ||
    # Args: | ||
    # inputfile: string name of plaintext file | ||
    # | ||
    # Returns: | ||
    # A string of SSML text based on plaintext input | ||
|
||
    # Parses lines of input file | ||
    with open(inputfile, 'r') as f: | ||
        raw_lines = f.read() | ||
|
||
    # Replace special characters with HTML Ampersand Character Codes | ||
    # These Codes prevent the API from confusing text with | ||
    # SSML commands | ||
    # For example, '<' --> '&lt;' and '&' --> '&amp;' | ||
|
||
    # For Python 3, instead use: | ||
    # escaped_lines = html.escape(raw_lines) | ||
    escaped_lines = cgi.escape(raw_lines) | ||
|
||
    # Convert plaintext to SSML | ||
    # Wait two seconds between each address | ||
    ssml = '<speak>{}</speak>'.format( | ||
        escaped_lines.replace('\n', '\n<break time="2s"/>')) | ||
|
||
    # Return the concatenated string of ssml script | ||
    return ssml | ||
# [END tts_ssml_address_ssml] | ||
|
||
|
||
# [START tts_ssml_address_test] | ||
def main(): | ||
    # test example address file | ||
    plaintext = 'resources/example.txt' | ||
    ssml_text = text_to_ssml(plaintext) | ||
    ssml_to_audio(ssml_text, 'resources/example.mp3') | ||
# [END tts_ssml_address_test] | ||
|
||
|
||
if __name__ == '__main__': | ||
    main() |
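`cgi.escape` was deprecated in Python 3.2 and removed in Python 3.8, so on current interpreters the escaping step in `text_to_ssml` needs `html.escape`, as the comments in the file suggest. A minimal Python 3 sketch of that step, working on a string rather than a file; `lines_to_ssml` is a hypothetical name, not part of this PR:

```python
import html


def lines_to_ssml(raw_text):
    """Escape plaintext and wrap it in <speak>, pausing 2s after each line."""
    # quote=False mirrors cgi.escape's default of leaving quotes untouched;
    # '&', '<' and '>' are still replaced with their entities.
    escaped = html.escape(raw_text, quote=False)
    # Insert a two-second break after every newline, as text_to_ssml does.
    return '<speak>{}</speak>'.format(
        escaped.replace('\n', '\n<break time="2s"/>'))


sample = '1 Jenny St & Number St, Tutone City, CA 86753\n'
result = lines_to_ssml(sample)
print(result)
```

Note that `html.escape` defaults to `quote=True`, which would also rewrite `"` and `'`; passing `quote=False` keeps the output identical to what `cgi.escape` produced for the same input.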
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
# Copyright 2019 Google LLC | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
from tts import text_to_ssml | ||
from tts import ssml_to_audio | ||
|
||
import filecmp | ||
import os | ||
|
||
|
||
def test_text_to_ssml(capsys): | ||
|
||
    # Read expected SSML output from resources | ||
    with open('resources/example.ssml', 'r') as f: | ||
        expected_ssml = f.read() | ||
|
||
    # Assert plaintext converted to SSML | ||
    input_text = 'resources/example.txt' | ||
    tested_ssml = text_to_ssml(input_text) | ||
    assert expected_ssml == tested_ssml | ||
|
||
|
||
def test_ssml_to_audio(capsys): | ||
|
||
    # Read SSML input from resources | ||
    with open('resources/example.ssml', 'r') as f: | ||
        input_ssml = f.read() | ||
|
||
    # Assert audio file generated | ||
    ssml_to_audio(input_ssml, 'test_example.mp3') | ||
    assert os.path.isfile('test_example.mp3') | ||
|
||
    # Assert audio file generated correctly | ||
    assert filecmp.cmp('test_example.mp3', | ||
                       'resources/expected_example.mp3', | ||
                       shallow=True) | ||
    out, err = capsys.readouterr() | ||
|
||
    # Delete test file | ||
    os.remove("test_example.mp3") | ||
|
||
    # Assert success message printed | ||
    assert "Audio content written to file test_example.mp3" in out |
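The audio assertion above calls `filecmp.cmp` with `shallow=True`. In that mode the standard library first compares `os.stat()` signatures (file type, size, modification time) and only reads the file contents when the signatures differ. A minimal sketch of a strict byte-for-byte comparison, using throwaway files in place of real MP3 output; `same_bytes` and `demo` are hypothetical names, not part of this PR:

```python
import filecmp
import os
import tempfile


def same_bytes(path_a, path_b):
    """Return True only if both files hold identical bytes."""
    # shallow=False skips the os.stat() shortcut and always reads contents.
    return filecmp.cmp(path_a, path_b, shallow=False)


def demo():
    """Compare equal and unequal payloads written to temporary files."""
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, 'a.mp3')
        b = os.path.join(tmp, 'b.mp3')
        c = os.path.join(tmp, 'c.mp3')
        for path, payload in ((a, b'abc'), (b, b'abc'), (c, b'abd')):
            with open(path, 'wb') as f:
                f.write(payload)
        # The temporary directory is removed even if a comparison raises.
        return same_bytes(a, b), same_bytes(a, c)


print(demo())
```

Writing the test output under a temporary directory also avoids leaving `test_example.mp3` behind when an earlier assertion fails before the `os.remove` call.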