Amazon Transcribe FAQs – Amazon Web Services (AWS) Flashcards
Amazon Transcribe service is designed to handle a wide range of speech and acoustic characteristics, including variations in volume, pitch, and speaking rate. The quality and content of the audio signal (including but not limited to factors such as background noise, overlapping speakers, accented speech, or switches between languages within a single audio file) may affect the accuracy of service output. We are constantly updating the service to improve its ability to accommodate additional acoustic variation and content types.
Q: What else should I know before using Amazon Transcribe service?
Amazon Transcribe supports both 16 kHz and 8 kHz audio streams, and multiple audio encodings, including WAV, MP3, MP4, and FLAC.
Q: What kind of inputs does Amazon Transcribe support?
Yes. Amazon Transcribe enables users to open a bidirectional stream over HTTP/2. Users can send an audio stream to the service while receiving a text stream in return, in real time.
Q: Does Amazon Transcribe support real-time transcriptions?
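To make the streaming answer above a little more concrete, here is a minimal Python sketch of one way a client might slice an audio buffer into small chunks before sending them over the stream. The function name `pcm_chunks` and the 100 ms chunk length are illustrative assumptions, not values taken from the Amazon Transcribe documentation. The actual sending would be handled by whichever SDK you use.

```python
def pcm_chunks(pcm, sample_rate, chunk_ms=100, channels=1):
    """Yield consecutive slices of 16-bit PCM audio, each about chunk_ms long.

    pcm_chunks and the 100 ms default are hypothetical choices for illustration.
    """
    bytes_per_frame = 2 * channels  # 16-bit samples are 2 bytes each
    frames_per_chunk = sample_rate * chunk_ms // 1000
    if frames_per_chunk <= 0:
        raise ValueError("chunk_ms is too small for this sample rate")
    step = frames_per_chunk * bytes_per_frame
    for offset in range(0, len(pcm), step):
        # The final slice may be shorter than the others.
        yield pcm[offset:offset + step]
```

Each yielded chunk could then be handed to the SDK's streaming call in order. Slicing on frame boundaries keeps every chunk aligned to whole samples.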
Streaming transcription currently supports 16-bit Linear PCM encoding.
Q: What encoding does real-time transcription support?
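As a rough illustration of what 16-bit linear PCM means in practice, the sketch below converts floating-point samples in the range -1.0 to 1.0 into signed 16-bit integers packed as bytes. The function name `floats_to_pcm16` is hypothetical, and the little-endian byte order is an assumption that you should confirm against the streaming documentation for your SDK.

```python
import struct


def floats_to_pcm16(samples):
    """Pack floats in [-1.0, 1.0] as signed 16-bit little-endian PCM bytes.

    Values outside the range are clipped rather than wrapped.
    Little-endian byte order is an assumption made for this sketch.
    """
    ints = []
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))
        ints.append(int(round(clipped * 32767)))
    return struct.pack("<%dh" % len(ints), *ints)
```

For example, `floats_to_pcm16([0.0, 1.0, -1.0])` produces six bytes, two per sample.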
For information on language support, please refer to this documentation page.
Q: What languages does Amazon Transcribe support?
For the batch service, each API call is limited to 4 hours (or 2 GB) of audio. The streaming service can accommodate open connections up to 4 hours long.
Q: Are there size restrictions on the audio content that Amazon Transcribe can process?
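A simple pre-flight check against the batch limits quoted above might look like the following sketch. The helper name `batch_limit_problems` is hypothetical, and reading "2 GB" as 2 × 10^9 bytes is an assumption. The service may count bytes differently.

```python
MAX_BATCH_SECONDS = 4 * 60 * 60   # 4 hours, as quoted in the answer above
MAX_BATCH_BYTES = 2 * 1000 ** 3   # assumption: "2 GB" read as 2 * 10^9 bytes


def batch_limit_problems(duration_seconds, size_bytes):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if duration_seconds > MAX_BATCH_SECONDS:
        problems.append("audio is longer than 4 hours")
    if size_bytes > MAX_BATCH_BYTES:
        problems.append("file is larger than 2 GB")
    return problems
```

Running a check like this before uploading can save a failed job, for instance by prompting you to split a long recording first.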
Amazon Transcribe batch service supports .NET, Go, Java, JavaScript, PHP, Python, and Ruby. Amazon Transcribe real-time service supports the Java, Ruby, and C++ SDKs. Additional SDK support is coming. For more details, visit the Resources page.
Q: What programming languages does Amazon Transcribe support?
The speech recognition output depends on a number of factors in addition to custom vocabulary entries, so there can be no assurance that if a term is included in the custom vocabulary, it will be correctly recognized. However, the most frequent reason is that a custom word lacks the correct pronunciation. If you haven’t provided a pronunciation for your custom word, please try to create one. If you already have provided one, double-check its correctness, or include other pronunciation variants if necessary. This can be done by creating multiple entries in the custom vocabulary file that differ in the pronunciation field.
Q: My custom vocabulary words are not being recognized! What can I do?
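The answer above suggests adding several entries for the same word that differ only in the pronunciation field. A minimal sketch of building such a table in Python follows. The column order (Phrase, IPA, SoundsLike, DisplayAs) and the tab-separated layout are assumptions based on the field names used in this FAQ, so check them against the custom vocabulary documentation. The function name `vocabulary_table` and the second pronunciation variant are purely illustrative.

```python
def vocabulary_table(entries):
    """Build a tab-separated vocabulary table from 4-field rows.

    entries: iterable of (phrase, ipa, sounds_like, display_as) strings.
    The header names and their order are assumptions for this sketch.
    """
    lines = ["\t".join(["Phrase", "IPA", "SoundsLike", "DisplayAs"])]
    for row in entries:
        if len(row) != 4:
            raise ValueError("each entry needs exactly four fields")
        if any("\t" in field for field in row):
            raise ValueError("fields must not contain TAB characters")
        lines.append("\t".join(row))
    return "\n".join(lines) + "\n"


# Two entries for one word, differing only in the pronunciation hint.
# "et-ee-en" is a made-up variant used only to show the idea.
table = vocabulary_table([
    ("Etienne", "", "eh-tee-en", "Etienne"),
    ("Etienne", "", "et-ee-en", "Etienne"),
])
```

The resulting text could be saved to a file and supplied as the vocabulary table.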
IPA allows for more precise pronunciations. You should provide IPA pronunciations if you are able to generate IPA (e.g., from a lexicon that has IPA pronunciations or an online converter tool).
Q: There are two ways of giving pronunciations, IPA or SoundsLike fields in the custom vocabulary table. Which one is better?
Several standard dictionaries, such as the Oxford English Dictionary or the Cambridge Dictionary (including their online versions) provide pronunciations in IPA. There are also online converters (e.g. easypronunciation.com or tophonetics.com for English) — however, note that in most cases these tools are based on underlying dictionaries and may not generate correct IPA for some words, such as proper names. Amazon Transcribe does not endorse any third-party tools.
Q: I'd like to use IPA but I'm not a linguistic expert. Is there an online tool I can use?
You should use the IPA standard that is appropriate for the audio files you will be processing — e.g., if you are expecting to process audio from British English speakers, use the British English pronunciation standard. The set of allowed IPA symbols may differ for the different languages and dialects supported by Amazon Transcribe; please make sure that your pronunciations contain only the allowed characters. Details on the IPA character sets can be found in the documentation: https://docs.aws.amazon.com/transcribe/latest/dg/how-vocabulary.html#charsets
Q: Do I need to use different IPA standards for different accents of the same language (e.g., US English versus British English)?
You can break a word or phrase down into smaller pieces and provide a pronunciation for each piece using the standard orthography of the language to mimic the way that the word sounds. For example, in English you can provide pronunciation hints for the phrase Los-Angeles like this: loss-ann-gel-es. The hint for the word Etienne would look like this: eh-tee-en. You separate each part of the hint with a hyphen (-). You can use any of the allowed characters for the input language.
Q: How can I provide the pronunciation using SoundsLike field in the custom vocabulary table?
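The hyphen-separated hints described above can be assembled with a small helper such as the sketch below. The name `sounds_like_hint` is hypothetical. The helper only checks that the pieces are non-empty and free of hyphens and whitespace, and it does not validate the characters allowed for a given language.

```python
def sounds_like_hint(pieces):
    """Join pronunciation pieces with hyphens, e.g. loss-ann-gel-es."""
    cleaned = [piece.strip() for piece in pieces]
    if not cleaned:
        raise ValueError("at least one piece is required")
    for piece in cleaned:
        if not piece or "-" in piece or any(ch.isspace() for ch in piece):
            raise ValueError("invalid piece: %r" % piece)
    return "-".join(cleaned)
```

For instance, `sounds_like_hint(["eh", "tee", "en"])` gives the hint for Etienne quoted in the answer.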
If you use an acronym containing periods, the spelling pronunciation will be generated internally. If you do not use periods, please provide the pronunciation in the pronunciation field. For some acronyms, it is not obvious whether they have a spelling pronunciation or a word-like pronunciation (e.g., NATO is often pronounced ‘n eɪ t oʊ’ (nay-toh) rather than ‘ɛn eɪ ti oʊ’ (N. A. T. O.)).
Q: How do two different ways of providing acronyms (with periods and without periods but with pronunciations) work?
You can find sample input formats and examples in the documentation: https://docs.aws.amazon.com/transcribe/latest/dg/how-vocabulary.html.
Q: Where can I find examples of how to use custom pronunciations?
The system will use the pronunciation you provide; this should increase the likelihood of the word being recognized correctly if the pronunciation is correct and matches what was spoken. If you are not certain you are generating correct IPA, please run a comparison by processing your audio files with a vocabulary that contains your IPA pronunciations, and with a vocabulary that only contains the words (and, optionally, display-as forms). If you do not provide any pronunciations the service will use an approximation, which may or may not work better than your input.
Q: What happens if I use the wrong IPA? If I am uncertain, am I better off not inputting any IPA?
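One way to run the comparison suggested above is to score each transcript against a hand-checked reference using word error rate (WER). This metric is not named in the FAQ. It is offered here only as one reasonable option, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] holds the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)
```

Scoring the transcript produced with your IPA pronunciations and the one produced without them against the same reference shows which vocabulary works better for your audio. A lower value is better.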
Yes. While phrases may only use a restricted set of characters for the specific language, UTF-8 characters apart from \t (TAB) are permitted in the DisplayAs column.
Q: When using DisplayAs forms, can I display character sets unrelated to the original language being transcribed (e.g., output “Street” as “街道”)?
No, it is only available for batch APIs at this time.
Q: Is Automatic content redaction available with both batch and streaming APIs for Transcribe?
US-English (en-US) is supported at this time.
Q: What languages are supported for Automatic content redaction?
No, this feature does not remove sensitive personal information from the source audio. You can however redact personal information from the source audio yourself using the start and end timestamps that are provided in the redacted transcripts for each instance of an identified PII utterance.
Q: Does Automatic content redaction also redact sensitive personal information from the source audio?
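As a sketch of the do-it-yourself approach mentioned above, the following Python function replaces the audio between given start and end times with silence. It assumes the audio is already available as raw 16-bit PCM bytes and that the timestamps have been read from the redacted transcript and converted to seconds as floats. The function name `silence_spans` is hypothetical.

```python
import math


def silence_spans(pcm, sample_rate, spans, channels=1):
    """Return a copy of 16-bit PCM audio with each (start_s, end_s) span zeroed."""
    bytes_per_frame = 2 * channels
    out = bytearray(pcm)
    total_frames = len(out) // bytes_per_frame
    for start_s, end_s in spans:
        # Round outward so the whole utterance is covered.
        first = max(0, math.floor(start_s * sample_rate))
        last = min(total_frames, math.ceil(end_s * sample_rate))
        if last > first:
            begin = first * bytes_per_frame
            end = last * bytes_per_frame
            out[begin:end] = bytes(end - begin)  # zero bytes are silence
    return bytes(out)
```

The same idea applies to compressed formats, but the audio would first need to be decoded to PCM and re-encoded afterwards.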
No, Automatic content redaction only works with audio files as input.
Q: Can I use Automatic content redaction for redacting personal information from the existing text transcripts?
Automatic content redaction is designed to identify and remove personally identifiable information (PII), but due to the predictive nature of machine learning, it may not identify and remove all instances of PII in a transcript generated by the service. You should review any output provided by Automatic content redaction to ensure it meets your needs.
Q: What else should I know before using Automatic content redaction?
Refer to the Amazon Transcribe Pricing page to learn more.
Q: What does it cost?
Please refer to the AWS Global Infrastructure Region Table.
Q: What AWS regions are available for Amazon Transcribe?
Yes. You can use available Delete APIs to delete data and other artifacts associated with transcription jobs. If you have issues doing so, contact AWS support.
Q: Can I delete data and artifacts associated with transcription jobs stored by Amazon Transcribe?
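For example, with the AWS SDK for Python (boto3), transcription jobs can be removed through the Transcribe client's `delete_transcription_job` call. The sketch below wraps that call in a small helper. The helper name `delete_transcription_jobs` and the job names in the usage note are hypothetical, and the client is passed in so that the block itself has no dependency on boto3.

```python
def delete_transcription_jobs(client, job_names):
    """Delete each named transcription job and return the names that were deleted.

    client is expected to behave like boto3.client("transcribe").
    Any exception raised by the client is left to propagate to the caller.
    """
    deleted = []
    for name in job_names:
        client.delete_transcription_job(TranscriptionJobName=name)
        deleted.append(name)
    return deleted
```

With boto3 installed and credentials configured, usage might look like `delete_transcription_jobs(boto3.client("transcribe"), ["my-job-1", "my-job-2"])`.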
Only authorized employees will have access to your content that is processed by Amazon Transcribe. Your trust, privacy, and the security of your content are our highest priority, and we implement appropriate and sophisticated technical and physical controls, including encryption at rest and in transit, designed to prevent unauthorized access to, or disclosure of, your content and ensure that our use complies with our commitments to you. Please see https://aws.amazon.com/compliance/data-privacy-faq/ for more information.
Q: Who has access to my content that is processed and stored by Amazon Transcribe?
You always retain ownership of your content, and we will only use your content with your consent.
Q: Do I still own my content that is processed and stored by Amazon Transcribe?
Yes, subject to your compliance with the Amazon Transcribe Service Terms, including your obligation to provide any required notices and obtain any required verifiable parental consent under COPPA, you may use Amazon Transcribe in connection with websites, programs, or other applications that are directed or targeted, in whole or in part, to children under age 13.
Q: Can I use Amazon Transcribe in connection with websites, programs or other applications that are directed or targeted to children under age 13 and subject to the Children’s Online Privacy Protection Act (COPPA)?
For information about the requirements of COPPA and guidance for determining whether your website, program, or other application is subject to COPPA, please refer directly to the resources provided and maintained by the United States Federal Trade Commission. This site also contains information regarding how to determine whether a service is directed or targeted, in whole or in part, to children under age 13.
Q: How do I determine whether my website, program, or application is subject to COPPA?
Amazon Transcribe Medical uses advanced machine learning models to accurately transcribe medical speech into text. Transcribe Medical can generate text transcripts that support a variety of use cases, ranging from clinical documentation workflows and drug safety monitoring (pharmacovigilance) to subtitling for telemedicine and even contact center analytics in the healthcare and life sciences domains.
Q. What can I do with Amazon Transcribe Medical?
No, you don’t need any ASR or machine learning expertise to use Amazon Transcribe Medical. You only need to call Transcribe Medical’s API, and the service will handle the required machine learning in the backend to transcribe medical speech to text.
Q. Do I need to be an expert in automatic speech recognition (ASR) to use Amazon Transcribe Medical?
You can get started with Amazon Transcribe Medical from the AWS Management console or by using the SDK. Please refer to this technical documentation page for details.
Q. How do I get started with Amazon Transcribe Medical?
Amazon Transcribe Medical currently supports medical transcription in US English.
Q. Which languages does Amazon Transcribe Medical support?
Amazon Transcribe Medical supports transcription for primary care, covering specialties such as family medicine, internal medicine, pediatrics, and OB-GYN.
Q. Which medical specialties does Amazon Transcribe Medical support?
Amazon Transcribe Medical is currently available in US East (N. Virginia), US East (Ohio), US West (Oregon), Canada (Central), EU (Ireland), and Asia Pacific (Sydney).
Q. In which AWS regions is Amazon Transcribe Medical available?
Refer to the Amazon Transcribe Medical pricing page to learn more about pricing details.
Q. How is Amazon Transcribe Medical priced?
Yes.
Q. Is Amazon Transcribe Medical HIPAA eligible?
Amazon Transcribe Medical does not use content processed by the service for any reason other than to provide and maintain the service. Content processed by the service is not used to develop or improve the quality of Amazon Transcribe Medical or any other Amazon machine-learning/artificial-intelligence technologies.
Q. Is the content processed by Amazon Transcribe Medical used for any purpose other than to provide the service?
Yes, Amazon Transcribe Medical uses machine learning and is continuously being trained to make it better for customer use cases. Amazon Transcribe Medical does not store or use customer data used with the service to train the models.
Q. Does Amazon Transcribe Medical learn over time?
Amazon Transcribe Medical is not a substitute for professional medical advice, diagnosis, or treatment. You and your end users are responsible for exercising your and their own discretion, experience, and judgment in determining the correctness, completeness, timeliness, and suitability of any information provided by Amazon Transcribe Medical. You and your end users are solely responsible for any decisions, advice, actions, and/or inactions based on the use of Amazon Transcribe Medical. You are responsible for reviewing any output provided by Amazon Transcribe Medical to ensure it meets your needs.
Q. What else should I know before using the Amazon Transcribe Medical service?