
Parse transcription data

Agora uses Protocol Buffers (Protobuf) to serialize transcription data. Developed by Google, Protobuf is a language-neutral, platform-neutral mechanism for serializing structured data. You describe the data structure once in a .proto file, and the Protobuf compiler generates source code in multiple programming languages that reads and writes the data consistently across platforms. For more information about Google protocol buffers, see protobuf.dev.

Prerequisites

To follow this procedure:

  • Have a valid Agora Account.

  • Have a valid Agora project with an app ID and a temporary token or a token server. For details, see Agora account management.

  • Have a computer with access to the internet. If your network has a firewall, follow the steps in Firewall requirements.

  • Join an RTC channel as a host and start streaming.

  • Make sure Real-Time STT is enabled for your app.
  • Install the Protobuf package to generate code classes for displaying transcription text.

Use Protobuf to parse transcription data

Protobuf enables you to generate source code in your preferred coding language, based on the structure specified in the .proto file. Agora provides the following Protobuf template for parsing data:


syntax = "proto3";

package agora.audio2text;
option java_package = "io.agora.rtc.audio2text";
option java_outer_classname = "Audio2TextProtobuffer";

message Text {
  int32 vendor = 1;
  int32 version = 2;
  int32 seqnum = 3;
  int32 uid = 4;
  int32 flag = 5;
  int64 time = 6;
  int32 lang = 7;
  int32 starttime = 8;
  int32 offtime = 9;
  repeated Word words = 10;
}
message Word {
  string text = 1;
  int32 start_ms = 2;
  int32 duration_ms = 3;
  bool is_final = 4;
  double confidence = 5;
}
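
To make the message shapes concrete, the following minimal Python sketch mirrors a subset of the two messages as plain dataclasses and separates settled text from text that may still change. The dataclasses and the `split_caption` helper are hypothetical stand-ins for illustration, not the classes that protoc generates. Two points are assumptions rather than documented behavior: that `is_final` marks text that will not be revised, as the field name suggests, and that word texts can be concatenated without a separator.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Word:
    # Mirrors the Word message in the template above.
    text: str = ""
    start_ms: int = 0
    duration_ms: int = 0
    is_final: bool = False
    confidence: float = 0.0


@dataclass
class Text:
    # Subset of the Text message: only the fields this sketch needs.
    uid: int = 0
    words: List[Word] = field(default_factory=list)


def split_caption(message: Text) -> Tuple[str, str]:
    """Return (final_text, interim_text) for one transcription message.

    Assumption: words flagged is_final are settled, all others may change.
    """
    final_text = "".join(w.text for w in message.words if w.is_final)
    interim_text = "".join(w.text for w in message.words if not w.is_final)
    return final_text, interim_text
```

A caller could render the first string as committed caption text and the second in a lighter style until a later message replaces it.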

To parse and display the text in your client:

  1. Copy the Protobuf template to a local .proto file.

  2. In your file, edit the following properties to match your project:

    • package : The source code package namespace.
    • option : The desired language options.
  3. Run the protoc protocol compiler on your .proto file to generate the classes that you use to work with the transcription message types.

    Invoke the protoc compiler as follows, keeping only the output options for the languages you need:


    protoc --proto_path=IMPORT_PATH --cpp_out=DST_DIR --java_out=DST_DIR --python_out=DST_DIR --go_out=DST_DIR --ruby_out=DST_DIR --objc_out=DST_DIR --csharp_out=DST_DIR path/to/file.proto

    Agora also provides Protobuf sample code to parse and display transcription text. To obtain the sample code, contact support@agora.io.

  4. Use the Protobuf class to read transcription text.

    When transcription text is available, your Video SDK app receives the onStreamMessage callback. Use the generated Protobuf class in your app to parse the byte data returned by the callback. Refer to the onStreamMessage callback in the API reference for your platform.
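
To illustrate what reading the byte data amounts to, here is a minimal sketch in Python that decodes a `Text` payload by hand, following the field numbers in the template above. In a real app you would call the parsing method of the class that protoc generates (for example, `ParseFromString` in Python) instead of this code. The function names `parse_text` and `parse_word` and the dictionary output are hypothetical, chosen only for this example, and the sketch does no validation of truncated input.

```python
import struct

# Field numbers taken from the .proto template; field 10 (words) is handled separately.
TEXT_FIELDS = {1: "vendor", 2: "version", 3: "seqnum", 4: "uid", 5: "flag",
               6: "time", 7: "lang", 8: "starttime", 9: "offtime"}


def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def iter_fields(buf):
    """Yield (field_number, wire_type, raw_value) for each field in buf."""
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:                      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:                    # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:                    # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:                    # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield number, wire_type, value


def to_signed(value):
    """Interpret a 64-bit varint as a signed int32/int64 value."""
    return value - (1 << 64) if value >= (1 << 63) else value


def parse_word(buf):
    word = {"text": "", "start_ms": 0, "duration_ms": 0,
            "is_final": False, "confidence": 0.0}
    for number, wire_type, value in iter_fields(buf):
        if number == 1 and wire_type == 2:
            word["text"] = bytes(value).decode("utf-8")
        elif number == 2 and wire_type == 0:
            word["start_ms"] = to_signed(value)
        elif number == 3 and wire_type == 0:
            word["duration_ms"] = to_signed(value)
        elif number == 4 and wire_type == 0:
            word["is_final"] = bool(value)
        elif number == 5 and wire_type == 1:
            word["confidence"] = struct.unpack("<d", value)[0]
    return word


def parse_text(buf):
    """Decode the bytes of one stream message into a dict shaped like Text."""
    text = {name: 0 for name in TEXT_FIELDS.values()}
    text["words"] = []
    for number, wire_type, value in iter_fields(buf):
        if number in TEXT_FIELDS and wire_type == 0:
            text[TEXT_FIELDS[number]] = to_signed(value)
        elif number == 10 and wire_type == 2:   # repeated Word words = 10
            text["words"].append(parse_word(value))
    return text
```

Fields that are absent from the payload keep their proto3 defaults (0, empty string, false), which is also how a generated class behaves.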

Demo project

Check out the demo project on GitHub.
