AudioEncoding(value)

Audio encoding of the audio content sent in the conversational query
request. Refer to the `Cloud Speech API
documentation <https://cloud.google.com/speech-to-text/docs/basics>`__
for more details.
Values:
AUDIO_ENCODING_UNSPECIFIED (0):
Not specified.
AUDIO_ENCODING_LINEAR_16 (1):
Uncompressed 16-bit signed little-endian
samples (Linear PCM).
AUDIO_ENCODING_FLAC (2):
`FLAC <https://xiph.org/flac/documentation.html>`__
(Free Lossless Audio Codec) is the recommended encoding
because it is lossless (therefore recognition is not
compromised) and requires only about half the bandwidth of
``LINEAR16``. ``FLAC`` stream encoding supports 16-bit and
24-bit samples; however, not all fields in ``STREAMINFO`` are
supported.
AUDIO_ENCODING_MULAW (3):
8-bit samples that compand 14-bit audio
samples using G.711 PCMU/mu-law.
AUDIO_ENCODING_AMR (4):
Adaptive Multi-Rate Narrowband codec. ``sample_rate_hertz`` must be 8000.
AUDIO_ENCODING_AMR_WB (5):
Adaptive Multi-Rate Wideband codec. ``sample_rate_hertz`` must be 16000.
AUDIO_ENCODING_OGG_OPUS (6):
Opus-encoded audio frames in an Ogg container
(`OggOpus <https://wiki.xiph.org/OggOpus>`__). ``sample_rate_hertz`` must be 16000.
AUDIO_ENCODING_SPEEX_WITH_HEADER_BYTE (7):
Although the use of lossy encodings is not recommended, if a
very low bitrate encoding is required, ``OGG_OPUS`` is
highly preferred over Speex encoding. The
`Speex <https://speex.org/>`__ encoding supported by
Dialogflow API has a header byte in each block, as in MIME
type ``audio/x-speex-with-header-byte``. It is a variant of
the RTP Speex encoding defined in `RFC
5574 <https://tools.ietf.org/html/rfc5574>`__. The stream is
a sequence of blocks, one block per RTP packet. Each block
starts with a byte containing the length of the block, in
bytes, followed by one or more frames of Speex data, padded
to an integral number of bytes (octets) as specified in RFC 5574.
In other words, each RTP header is replaced with a
single byte containing the block length. Only Speex wideband
is supported. ``sample_rate_hertz`` must be 16000.
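
The sample-rate constraints listed above can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions, not part of the library: the ``AudioEncoding`` class here is a standard-library stand-in that mirrors the names and numeric values listed above, and ``REQUIRED_SAMPLE_RATE_HERTZ`` and ``check_sample_rate`` are hypothetical helper names introduced only for illustration.

```python
from enum import IntEnum
from typing import Optional


class AudioEncoding(IntEnum):
    """Stand-in mirroring the values documented above (not the library class)."""

    AUDIO_ENCODING_UNSPECIFIED = 0
    AUDIO_ENCODING_LINEAR_16 = 1
    AUDIO_ENCODING_FLAC = 2
    AUDIO_ENCODING_MULAW = 3
    AUDIO_ENCODING_AMR = 4
    AUDIO_ENCODING_AMR_WB = 5
    AUDIO_ENCODING_OGG_OPUS = 6
    AUDIO_ENCODING_SPEEX_WITH_HEADER_BYTE = 7


# Hypothetical lookup table: only the encodings whose entries above state a
# fixed sample_rate_hertz are listed. Other encodings are left unconstrained
# here because the text above does not give a required rate for them.
REQUIRED_SAMPLE_RATE_HERTZ = {
    AudioEncoding.AUDIO_ENCODING_AMR: 8000,
    AudioEncoding.AUDIO_ENCODING_AMR_WB: 16000,
    AudioEncoding.AUDIO_ENCODING_OGG_OPUS: 16000,
    AudioEncoding.AUDIO_ENCODING_SPEEX_WITH_HEADER_BYTE: 16000,
}


def check_sample_rate(encoding: AudioEncoding, sample_rate_hertz: int) -> Optional[str]:
    """Return an error message if the rate violates a documented constraint, else None."""
    required = REQUIRED_SAMPLE_RATE_HERTZ.get(encoding)
    if required is not None and sample_rate_hertz != required:
        return (
            f"{encoding.name} requires sample_rate_hertz={required}, "
            f"got {sample_rate_hertz}"
        )
    return None


if __name__ == "__main__":
    # AMR narrowband at 16 kHz violates the documented 8000 Hz requirement.
    print(check_sample_rate(AudioEncoding.AUDIO_ENCODING_AMR, 16000))
    # FLAC has no fixed rate in the list above, so no error is reported.
    print(check_sample_rate(AudioEncoding.AUDIO_ENCODING_FLAC, 44100))
```

Keeping the constraints in a single table means a newly documented requirement needs only one new entry. The server remains the authority on what it accepts, so a check like this is a convenience for catching obvious mismatches early.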