public sealed class SpeechTranscriptionConfig : IMessage<SpeechTranscriptionConfig>, IEquatable<SpeechTranscriptionConfig>, IDeepCloneable<SpeechTranscriptionConfig>, IBufferMessage, IMessage
Reference documentation and code samples for the Google Cloud Video Intelligence v1 API class SpeechTranscriptionConfig.
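As a rough illustration of how an instance might be built, the sketch below uses a C# object initializer. It assumes the Google.Cloud.VideoIntelligence.V1 NuGet package is referenced; the class name `RepeatedFieldConfigSketch`, the language tag, the track index, and the phrase are all illustrative choices, not values taken from this page. The get-only repeated fields (`AudioTracks`, `SpeechContexts`) have no setter, so they are filled with collection initializers rather than assigned.

```csharp
using Google.Cloud.VideoIntelligence.V1;

public static class RepeatedFieldConfigSketch
{
    // Builds a config using example values only.
    public static SpeechTranscriptionConfig Build()
    {
        return new SpeechTranscriptionConfig
        {
            LanguageCode = "en-US",
            // Get-only repeated fields: add elements, do not assign.
            AudioTracks = { 0 },
            SpeechContexts =
            {
                // The phrase is a hypothetical recognition hint.
                new SpeechContext { Phrases = { "Video Intelligence" } }
            }
        };
    }
}
```

The copy constructor, `new SpeechTranscriptionConfig(other)`, can be used in the same way when a second config should start from an existing one.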
public int DiarizationSpeakerCount { get; set; }
Optional. If set, specifies the estimated number of speakers in the
conversation. If not set, defaults to 2. Ignored unless
EnableSpeakerDiarization (enable_speaker_diarization) is set to true.
public bool EnableAutomaticPunctuation { get; set; }
Optional. If true, adds punctuation to recognition result hypotheses.
This feature is available only in select languages; setting it for
requests in other languages has no effect. The default, false,
adds no punctuation to result hypotheses. Note: this is currently
offered as an experimental service, complimentary to all users. In the
future it may be available exclusively as a premium feature.
public bool EnableSpeakerDiarization { get; set; }
Optional. If true, enables speaker detection for each recognized word in
the top alternative of the recognition result, reported through the
speaker_tag field of WordInfo.
Note: when this is true, every consecutive response contains all the words
from the beginning of the audio for the top alternative, because the
speaker tags improve as the models learn to identify the speakers in the
conversation over time.
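A minimal sketch of how the two diarization settings combine, assuming the Google.Cloud.VideoIntelligence.V1 NuGet package is referenced. The class name `DiarizationConfigSketch`, the language tag, and the speaker count of 3 are illustrative values, not recommendations.

```csharp
using Google.Cloud.VideoIntelligence.V1;

public static class DiarizationConfigSketch
{
    // Builds a config that asks for per-word speaker tags.
    // "en-US" and the speaker count of 3 are example values only.
    public static SpeechTranscriptionConfig Build()
    {
        return new SpeechTranscriptionConfig
        {
            LanguageCode = "en-US",
            EnableSpeakerDiarization = true,
            // Ignored unless EnableSpeakerDiarization is true;
            // the documented default when unset is 2.
            DiarizationSpeakerCount = 3
        };
    }
}
```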
public bool EnableWordConfidence { get; set; }
Optional. If true, the top result includes a list of words and the
confidence for those words. If false, no word-level confidence
information is returned. The default is false.
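One way to read the word-level values back might look like the sketch below. It assumes the Google.Cloud.VideoIntelligence.V1 package, assumes the first entry of `Alternatives` is the top result, and assumes the `SpeechTranscription` came from a request that set `EnableWordConfidence = true`. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;
using Google.Cloud.VideoIntelligence.V1;

public static class WordConfidenceSketch
{
    // Prints each word of the first alternative with its confidence.
    public static void PrintWords(SpeechTranscription transcription)
    {
        SpeechRecognitionAlternative top = transcription.Alternatives.FirstOrDefault();
        if (top == null)
        {
            return; // No hypothesis was returned for this transcription.
        }

        foreach (WordInfo word in top.Words)
        {
            // Confidence is only meaningful when word confidence was requested.
            Console.WriteLine($"{word.Word}: {word.Confidence:F2}");
        }
    }
}
```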
public bool FilterProfanity { get; set; }
Optional. If set to true, the server attempts to filter out
profanities, replacing all but the initial character in each filtered word
with asterisks, e.g. "f***". If set to false or omitted, profanities
are not filtered out.
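The masking shape described above can be pictured with a small local helper. This is only a sketch of the output format: the filtering itself happens on the server, which also decides which words are filtered, and `ProfanityMaskSketch.Mask` is a hypothetical name that is not part of the library.

```csharp
using System;

public static class ProfanityMaskSketch
{
    // Keeps the first character and replaces every later character with '*'.
    // Illustrates the output shape only; it does not detect profanity.
    public static string Mask(string word)
    {
        if (string.IsNullOrEmpty(word))
        {
            return word;
        }
        return word[0] + new string('*', word.Length - 1);
    }
}
```

Under this sketch a four-letter word keeps its first letter followed by three asterisks, matching the "f***" pattern in the description.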
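public string LanguageCode { get; set; }
Required. The language of the supplied audio as a
BCP-47 (https://www.rfc-editor.org/rfc/bcp/bcp47.txt) language tag.
Example: "en-US".
See Language Support (https://cloud.google.com/speech/docs/languages)
for a list of the currently supported language codes.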
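Because the language code is the only required setting, a request might be assembled as in the sketch below. It assumes the Google.Cloud.VideoIntelligence.V1 package; `TranscriptionRequestSketch` is a hypothetical name, and both parameters are placeholders supplied by the caller.

```csharp
using Google.Cloud.VideoIntelligence.V1;

public static class TranscriptionRequestSketch
{
    // Builds an annotation request that asks for speech transcription.
    // inputUri is a caller-supplied placeholder, e.g. a Cloud Storage URI.
    public static AnnotateVideoRequest Build(string inputUri, string languageCode)
    {
        return new AnnotateVideoRequest
        {
            InputUri = inputUri,
            Features = { Feature.SpeechTranscription },
            VideoContext = new VideoContext
            {
                SpeechTranscriptionConfig = new SpeechTranscriptionConfig
                {
                    // Required: a BCP-47 tag such as "en-US".
                    LanguageCode = languageCode
                }
            }
        };
    }
}
```

The resulting request would typically be passed to `VideoIntelligenceServiceClient.AnnotateVideo`, which starts a long-running operation.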
public int MaxAlternatives { get; set; }
Optional. Maximum number of recognition hypotheses to be returned.
Specifically, the maximum number of SpeechRecognitionAlternative messages
within each SpeechTranscription. The server may return fewer than
max_alternatives. Valid values are 0-30. A value of 0 or 1 returns
a maximum of one. If omitted, a maximum of one is returned.
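The value rules above can be summarized in a small helper. This is a sketch, not library behavior: `MaxAlternativesSketch.EffectiveMaximum` is a hypothetical name, and rejecting out-of-range values on the client is an assumption, since the description does not say how the server treats them.

```csharp
using System;

public static class MaxAlternativesSketch
{
    // Returns the maximum number of alternatives implied by the documented
    // rules for a requested value: 0 and 1 both mean "at most one".
    public static int EffectiveMaximum(int requested)
    {
        // Assumption of this sketch: fail fast outside the documented 0-30 range.
        if (requested < 0 || requested > 30)
        {
            throw new ArgumentOutOfRangeException(nameof(requested), "Valid values are 0-30.");
        }
        return Math.Max(1, requested);
    }
}
```

The server may still return fewer alternatives than this upper bound.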
# Google Cloud Video Intelligence v1 API - Class SpeechTranscriptionConfig (3.4.0)

Config for SPEECH_TRANSCRIPTION.

Inheritance
-----------

[object](https://learn.microsoft.com/dotnet/api/system.object) \> SpeechTranscriptionConfig

Implements
----------

[IMessage](https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.IMessage-1.html)\<[SpeechTranscriptionConfig](/dotnet/docs/reference/Google.Cloud.VideoIntelligence.V1/latest/Google.Cloud.VideoIntelligence.V1.SpeechTranscriptionConfig)\>, [IEquatable](https://learn.microsoft.com/dotnet/api/system.iequatable-1)\<[SpeechTranscriptionConfig](/dotnet/docs/reference/Google.Cloud.VideoIntelligence.V1/latest/Google.Cloud.VideoIntelligence.V1.SpeechTranscriptionConfig)\>, [IDeepCloneable](https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.IDeepCloneable-1.html)\<[SpeechTranscriptionConfig](/dotnet/docs/reference/Google.Cloud.VideoIntelligence.V1/latest/Google.Cloud.VideoIntelligence.V1.SpeechTranscriptionConfig)\>, [IBufferMessage](https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.IBufferMessage.html), [IMessage](https://cloud.google.com/dotnet/docs/reference/Google.Protobuf/latest/Google.Protobuf.IMessage.html)

Inherited Members
-----------------

[object.GetHashCode()](https://learn.microsoft.com/dotnet/api/system.object.gethashcode)
[object.GetType()](https://learn.microsoft.com/dotnet/api/system.object.gettype)
[object.ToString()](https://learn.microsoft.com/dotnet/api/system.object.tostring)

Namespace
---------

[Google.Cloud.VideoIntelligence.V1](/dotnet/docs/reference/Google.Cloud.VideoIntelligence.V1/latest/Google.Cloud.VideoIntelligence.V1)

Assembly
--------

Google.Cloud.VideoIntelligence.V1.dll

Constructors
------------

### SpeechTranscriptionConfig()

    public SpeechTranscriptionConfig()

### SpeechTranscriptionConfig(SpeechTranscriptionConfig)

    public SpeechTranscriptionConfig(SpeechTranscriptionConfig other)

Properties
----------

    public RepeatedField<int> AudioTracks { get; }
    public int DiarizationSpeakerCount { get; set; }
    public bool EnableAutomaticPunctuation { get; set; }
    public bool EnableSpeakerDiarization { get; set; }
    public bool EnableWordConfidence { get; set; }
    public bool FilterProfanity { get; set; }
    public string LanguageCode { get; set; }
    public int MaxAlternatives { get; set; }
    public RepeatedField<SpeechContext> SpeechContexts { get; }

### AudioTracks

    public RepeatedField<int> AudioTracks { get; }

Optional. For file formats, such as MXF or MKV, supporting multiple audio
tracks, specify up to two tracks. Default: track 0.

### SpeechContexts

    public RepeatedField<SpeechContext> SpeechContexts { get; }

Optional. A means to provide context to assist the speech recognition.

Last updated 2025-08-07 UTC.