Title: | R Client for the Microsoft Cognitive Services 'Text-to-Speech' REST API |
---|---|
Description: | R Client for the Microsoft Cognitive Services 'Text-to-Speech' REST API, including voice synthesis. A valid account must be registered at the Microsoft Cognitive Services website <https://azure.microsoft.com/services/cognitive-services/> in order to obtain a (free) API key. Without an API key, this package will not work properly. |
Authors: | John Muschelli [aut, cre] |
Maintainer: | John Muschelli <[email protected]> |
License: | GPL-3 |
Version: | 0.6.3 |
Built: | 2025-01-10 04:31:56 UTC |
Source: | https://github.com/jhudsl/mscstts |
Create SSML for Passing to Synthesize
ms_create_ssml(
  script,
  voice = NULL,
  gender = c("Female", "Male"),
  language = "en-US",
  escape = FALSE
)

ms_voice_info(voice, token = NULL, api_key = NULL, region = NULL)
script |
A character vector of lines to be spoken |
voice |
full voice name, usually from
|
gender |
Sex of the Speaker |
language |
Language to be spoken,
must be from |
escape |
Should non-standard characters be substituted? Should not
be used if |
token |
An authentication token, base-64 encoded usually from
|
api_key |
Microsoft Cognitive Services API key, if token is not provided. |
region |
Subscription region for your key. See https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/overview |
A character string of the text and SSML markup
ms_create_ssml("hey I really like things & dogs", escape = TRUE)
ms_create_ssml("hey I really like things")
ms_create_ssml('hey I <emphasis level="strong">really like</emphasis> things')
ms_create_ssml('hey I <emphasis level="strong">really like</emphasis> things', escape = TRUE)
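To illustrate what character substitution means for SSML, here is a minimal base-R sketch of XML escaping. It is not the package's implementation: escape_xml_sketch() is a hypothetical helper, and the exact set of characters that ms_create_ssml substitutes when escape = TRUE is an assumption.

```r
# Hypothetical helper (not part of mscstts): replace characters that are
# special in XML with their entity references.
escape_xml_sketch <- function(x) {
  x <- gsub("&", "&amp;", x, fixed = TRUE)  # must run first, before new "&" are added
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  x <- gsub(">", "&gt;", x, fixed = TRUE)
  x
}

plain  <- escape_xml_sketch("hey I really like things & dogs")
markup <- escape_xml_sketch('hey I <emphasis level="strong">really like</emphasis> things')

print(plain)
print(markup)
```

In this sketch the second result no longer contains any tags, because the emphasis markup has been turned into literal text. That is the likely reason escaping is a poor fit for scripts that already carry SSML markup.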
Determines whether option(ms_tts_key) is set.
If it is not, the function stops with an error (unless error = FALSE); if it is, the value is returned.
ms_get_tts_key(api_key = NULL, error = TRUE)

ms_have_tts_key(api_key = NULL)

ms_set_tts_key(api_key)

ms_valid_tts_key(api_key = NULL, region = ms_regions())
api_key |
Microsoft Cognitive Services API key |
error |
Should the function error if |
region |
Subscription region for your key. See https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/overview |
API key
You can either set the API key using
option(ms_tts_key)
or have it accessible by
api_key = Sys.getenv("MS_TTS_API_KEY"), or
api_key = Sys.getenv("MS_TTS_API_KEY1"), or
api_key = Sys.getenv("MS_TTS_API_KEY2")
res = ms_get_tts_key(error = FALSE)
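The lookup described in the note above can be sketched in base R. This is an illustration only: find_key_sketch() is a hypothetical helper, and the order shown (option first, then each environment variable in turn) is an assumption rather than the package's documented behavior.

```r
# Hypothetical helper (not part of mscstts): look for a key in the option
# first, then in each environment variable, returning NULL if none is set.
find_key_sketch <- function(
    env_vars = c("MS_TTS_API_KEY", "MS_TTS_API_KEY1", "MS_TTS_API_KEY2")) {
  key <- getOption("ms_tts_key")
  if (!is.null(key) && nzchar(key)) {
    return(key)
  }
  for (v in env_vars) {
    key <- Sys.getenv(v, unset = "")
    if (nzchar(key)) {
      return(key)
    }
  }
  NULL
}

# Demonstration with a placeholder value; no real key is used.
old <- options(ms_tts_key = "not-a-real-key")
found <- find_key_sketch()
options(old)  # restore the previous option value
```

Returning NULL instead of raising an error mirrors the spirit of calling ms_get_tts_key(error = FALSE), where the caller decides what to do when no key is available.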
Get Microsoft Text To Speech (TTS) or Cognitive Services Token from API Key
ms_get_tts_token(api_key = NULL, region = ms_regions())

ms_auth_url(region = ms_regions())

ms_expired_token(token = NULL)
api_key |
Microsoft Cognitive Services API key |
region |
Subscription region for your key. See https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/overview#reference-docs |
token |
An authentication of class |
A list containing the request and the token
if (ms_valid_tts_key()) {
  token = ms_get_tts_token()
}
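Because tokens are time-limited, a caller may want to check a token's age before reusing it. The sketch below shows one way such a check could work in base R. token_expired_sketch() is a hypothetical helper, not the logic of ms_expired_token; the "timestamp" attribute name is borrowed from the print example in this manual, and the 10-minute lifetime is an assumption.

```r
# Hypothetical helper (not part of mscstts): treat a token as expired when
# its "timestamp" attribute is missing or older than `max_age_secs`.
token_expired_sketch <- function(token, max_age_secs = 600, now = Sys.time()) {
  stamp <- attr(token, "timestamp")
  if (is.null(stamp)) {
    return(TRUE)  # no timestamp: treat the token as unusable
  }
  age <- as.numeric(difftime(now, stamp, units = "secs"))
  age > max_age_secs
}

fresh <- structure("asdf", timestamp = Sys.time())
stale <- structure("asdf", timestamp = Sys.time() - 3600)

fresh_expired <- token_expired_sketch(fresh)
stale_expired <- token_expired_sketch(stale)
```

Treating a token without a timestamp as expired is a conservative choice: it forces a new request rather than risking a rejected call.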
Language and Gender to Microsoft Voice Name
ms_language_to_ms_name(language = "en-US", gender = c("Female", "Male"))

ms_validate_language_gender(language = "en-US", gender = c("Female", "Male"))
language |
A language code, see |
gender |
Either Male or Female; not all languages support both genders |
A character string of the name of the voice
ms_language_to_ms_name()
ms_validate_language_gender()
List Voices
ms_list_voices(token = NULL, api_key = NULL, region = NULL, ...)
token |
An authentication token, base-64 encoded usually from
|
api_key |
Microsoft Cognitive Services API key, if token is not provided. |
region |
Subscription region for your key. See https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/overview |
... |
Additional arguments to send to |
A data.frame of the names and their long names.
## Not run: 
if (ms_have_tts_key()) {
  res = ms_list_voices()
}
## End(Not run)
## Not run: 
res = ms_list_voices(region = "eastus")
## End(Not run)
Names of Microsoft Locales
ms_locale_names
A data.frame with 2 columns:
the 5-character language code, separated by a hyphen, also referred to as a locale
the name of the language
Locales for Microsoft Translate
ms_locales()

ms_locale_df()

ms_languages()

ms_language_codes()
A list of Locales and the Speech Voices
This is generated from the JSON output of ms_list_voices.
ms_locales()
ms_locale_df()
Detailed Names of Microsoft Locales and Voices
ms_locales_df
A data.frame with 8 columns:
full name, which includes "Microsoft Server Speech Text to Speech Voice"
short display name
local name
short name, shorter than the local name, but includes the language code as well
gender of the voice
the 5-character language code, separated by a hyphen, also referred to as a locale
sample rate (in Hz) of the voice
voice type, Standard or Neural
Read Synthesized output
ms_read_synthesis(output)
output |
List from |
A Wave Object
The tuneR package cannot read all of the output types available here.
Synthesize Speech with Microsoft Text To Speech (TTS) or Cognitive Services
ms_synthesize(
  script,
  token = NULL,
  api_key = NULL,
  gender = c("Female", "Male"),
  language = "en-US",
  voice = NULL,
  output_format = c("raw-16khz-16bit-mono-pcm", "raw-8khz-8bit-mono-mulaw",
    "riff-8khz-8bit-mono-alaw", "riff-8khz-8bit-mono-mulaw",
    "riff-16khz-16bit-mono-pcm", "audio-16khz-128kbitrate-mono-mp3",
    "audio-16khz-64kbitrate-mono-mp3", "audio-16khz-32kbitrate-mono-mp3",
    "raw-24khz-16bit-mono-pcm", "riff-24khz-16bit-mono-pcm",
    "audio-24khz-160kbitrate-mono-mp3", "audio-24khz-96kbitrate-mono-mp3",
    "audio-24khz-48kbitrate-mono-mp3"),
  escape = FALSE,
  region = NULL,
  api = c("tts", "bing"),
  ...
)

ms_region(region = ms_regions())

ms_regions()

ms_set_region(region = ms_regions())

ms_synthesize_api_url(api = c("tts", "bing"), region = NULL)
script |
A character vector of lines to be spoken |
token |
An authentication token, base-64 encoded usually from
|
api_key |
Microsoft Cognitive Services API key, if token is not provided. |
gender |
Sex of the Speaker |
language |
Language to be spoken,
must be from |
voice |
full voice name, usually from
|
output_format |
Format of the output, see https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-migrate-from-bing-speech for more information |
escape |
Should non-standard characters be substituted? Should not
be used if |
region |
Subscription region for your key. See https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/overview#reference-docs |
api |
which API to authorize on, either
|
... |
Additional arguments to send to |
A list of the request, content, token, and 'SSML'.
The content is likely binary, and its format depends on the 'output_format' chosen. For example, if the 'output_format' is one of the MP3 formats, see the example below.
## Not run: 
if (ms_have_tts_key()) {
  res = ms_synthesize(
    script = "hey, how are you doing? I'm doing pretty good",
    output_format = "audio-16khz-128kbitrate-mono-mp3")
  tmp <- tempfile(fileext = ".mp3")
  writeBin(res$content, con = tmp)
  mp3 = tuneR::readMP3(tmp)
}
## End(Not run)
ms_regions()
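Since the content's format follows the 'output_format' string, it can help to derive the temporary file's extension from that string instead of hard-coding ".mp3". The following is a minimal sketch under stated assumptions: format_extension_sketch() is a hypothetical helper, not a package function, and the mapping of the "riff-" prefix to a WAV container and the "raw-" prefix to headerless audio is inferred from the format names.

```r
# Hypothetical helper (not part of mscstts): guess a file extension from the
# prefix and suffix of an output_format string.
format_extension_sketch <- function(output_format) {
  if (grepl("-mp3$", output_format)) {
    ".mp3"
  } else if (grepl("^riff-", output_format)) {
    ".wav"  # assumed: RIFF container, so a WAV header is present
  } else {
    ".raw"  # assumed: headerless audio that needs rate and encoding to decode
  }
}

exts <- vapply(
  c("audio-16khz-128kbitrate-mono-mp3",
    "riff-16khz-16bit-mono-pcm",
    "raw-24khz-16bit-mono-pcm"),
  format_extension_sketch,
  character(1)
)
print(exts)
```

The result could then be passed to tempfile(fileext = ...) before writing res$content with writeBin().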
Print method for token
## S3 method for class 'token'
print(x, reveal = FALSE, ...)

reveal(x, ...)
x |
an object used to select a method. |
reveal |
Should the token be revealed |
... |
further arguments passed to or from other methods |
x = "asdf"
class(x) = "token"
print(x)
attr(x, "timestamp") = Sys.time()
print(x)
print(x, reveal = TRUE)
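The idea of a print method that hides a secret unless asked can be sketched with a small S3 method. This is an illustration, not the package's code: the class name demo_secret and the "<hidden>" placeholder are hypothetical, chosen so that the sketch does not clash with the package's own 'token' class.

```r
# Hypothetical S3 print method: show a placeholder unless reveal = TRUE.
print.demo_secret <- function(x, reveal = FALSE, ...) {
  shown <- if (reveal) unclass(x) else "<hidden>"
  cat("demo_secret:", shown, "\n")
  invisible(x)  # print methods conventionally return their input invisibly
}

s <- structure("asdf", class = "demo_secret")

# capture.output() collects what each call would print to the console.
masked   <- capture.output(print(s))
revealed <- capture.output(print(s, reveal = TRUE))
```

Masking by default means a token printed by accident, for instance in a log or a rendered report, does not expose its value.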