Core concepts

Agora Cloud Recording enables you to record video and voice calls or streams in the cloud for storage or on-demand viewing. Cloud Recording works with Voice Calling, Video Calling, Broadcast Streaming, and Interactive Live Streaming.

This page introduces the key processes and concepts you need to know to use Cloud Recording.

Using the Agora Console

To use Agora Cloud Recording, create a project in the Agora Console first.

Agora Console

Agora Console provides an intuitive interface for developers to query and manage their Agora account. After registering an Agora Account, you use the Agora Console to perform the following tasks:

  • Manage the account
  • Create and configure Agora projects and services
  • Get an App ID
  • Manage members and roles
  • Check call quality and usage
  • Check bills and make payments
  • Access product resources

Agora also provides RESTful APIs that you use to implement features such as creating a project and fetching usage numbers programmatically.

Agora Account Management

See Agora account management for details on how to manage all aspects of your Agora account.

General concepts

Agora uses the following basic concepts:

App ID

The App ID is a unique key generated by Agora's platform to identify each project. Each project in your account is assigned its own unique App ID. The App ID is critical for connecting users within your app. It's used to initialize the Agora Engine in your app, and as one of the required keys to create authentication tokens for secure communication. Retrieve your App ID using the Agora Console.

Agora uses the App ID to identify each app and provide billing and other statistical data services.

App IDs are stored on the front-end client and do not provide access control. Projects using only an App ID allow any user with the App ID to join voice and video streams.

For applications requiring access controls, such as those in production environments, choose an App ID + Token mechanism for user authentication when creating a new project. Without an authentication token, your environment is open to anyone with access to your App ID.

App Certificate

An App Certificate is a unique key generated by the Agora Console to secure projects through token authentication. It is required, along with the App ID, to generate a token that proves authorization between your systems and Agora's network. App Certificates are used to generate Video SDK or Signaling authentication tokens.

Store App Certificates securely in your backend systems. If your App Certificate is compromised, or if security compliance requires it, you can invalidate certificates and create new ones through the Agora Console.

Tokens

A token is a dynamic key generated using the App ID, App Certificate, user ID, and expiration timestamp. Tokens authenticate and secure access to Agora's services, ensuring only authorized users can join a channel and participate in real-time communication.

Tokens are generated on your server and passed to the client for use in the Video SDK or Signaling. The token generation process involves digitally signing the App ID, App Certificate, user ID, and expiration timestamp using a specific algorithm, preventing tampering or forgery.
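The signing idea can be illustrated with a minimal Python sketch. This is not Agora's actual token format or algorithm; the function names, the `uid:expiry:signature` layout, and the choice of HMAC-SHA256 are assumptions made purely for illustration. In a real project, generate tokens with Agora's official token builder libraries.

```python
import hashlib
import hmac


def sign_token(app_id: str, app_certificate: str, uid: int, expire_ts: int) -> str:
    """Return '<uid>:<expire_ts>:<hex signature>' (illustrative layout only)."""
    # The App Certificate acts as the secret key; it never appears in the token.
    message = f"{app_id}:{uid}:{expire_ts}".encode()
    signature = hmac.new(app_certificate.encode(), message, hashlib.sha256).hexdigest()
    return f"{uid}:{expire_ts}:{signature}"


def verify_token(token: str, app_id: str, app_certificate: str, now: int) -> bool:
    """Check the signature and expiry of a token produced by sign_token."""
    try:
        uid_str, expire_str, signature = token.split(":")
        uid, expire_ts = int(uid_str), int(expire_str)
    except ValueError:
        return False  # malformed token
    # Recompute the signature from the claimed fields and compare in constant time.
    expected = sign_token(app_id, app_certificate, uid, expire_ts).rsplit(":", 1)[1]
    return hmac.compare_digest(signature, expected) and now < expire_ts
```

Because the signature covers the user ID and the expiration timestamp, changing either field in the token invalidates it, which is the tamper resistance described above.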

For testing and during development, use the Agora Console to generate temporary tokens. For production environments, implement a token server as part of your security infrastructure to control access to your channels.

For information on setting up a token server for generating and managing tokens, refer to the guide on Secure authentication with tokens.

Channel

In Agora's platform, a channel is a way of grouping users together and is identified by a unique channel name. Users who connect to the same channel can communicate with each other. A channel is created when the first user joins and ceases to exist when the last user leaves.

Channels are created by calling the methods for transmitting real-time data. Agora uses different channels to transmit different types of data:

  • The Video SDK channel is used for transmitting audio or video data.
  • The Signaling channel is used for transmitting messaging or signaling data.

These channels are independent of each other.

Additional services provided by Agora, such as Cloud Recording and Real-Time Speech-To-Text, join the Video SDK channel to provide real-time recording, transmission acceleration, media playback, and content moderation.

User ID

In Agora's platform, the user ID (UID) is an integer that uniquely identifies each user within a specific channel. When joining a channel, you can either assign a specific UID to the user, or pass 0 or null to let Agora's platform generate and assign a UID automatically. If two users attempt to join the same channel with the same UID, it can lead to unexpected behavior.

The UID is used by Agora's services and components to identify and manage users within a channel. Developers should ensure that UIDs are properly assigned to prevent conflicts.
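One way to avoid UID conflicts is to track assignments on your own server, for example alongside token generation. The sketch below is hypothetical application-side bookkeeping, not an Agora API: the `ChannelUidRegistry` class and its methods are invented for illustration, and the 32-bit unsigned range for UIDs is an assumption.

```python
import random

MAX_UID = 2**32 - 1  # assumption: UIDs are 32-bit unsigned integers


class ChannelUidRegistry:
    """Tracks which UIDs are in use in each channel (hypothetical helper)."""

    def __init__(self) -> None:
        self._in_use: dict[str, set[int]] = {}

    def join(self, channel: str, uid: int = 0) -> int:
        """Reserve uid in channel; uid=0 means 'pick a free one for me'."""
        if not 0 <= uid <= MAX_UID:
            raise ValueError(f"UID out of range: {uid}")
        taken = self._in_use.setdefault(channel, set())
        if uid == 0:
            # Draw random UIDs until one is free in this channel.
            uid = random.randint(1, MAX_UID)
            while uid in taken:
                uid = random.randint(1, MAX_UID)
        elif uid in taken:
            raise ValueError(f"UID {uid} is already in use in channel {channel!r}")
        taken.add(uid)
        return uid

    def leave(self, channel: str, uid: int) -> None:
        """Release uid so it can be reused in channel."""
        self._in_use.get(channel, set()).discard(uid)
```

Uniqueness is enforced per channel only, matching the scope described above: the same UID can be used in two different channels at the same time.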

Agora SD-RTN™

Agora's core engagement services are powered by its Software-Defined Real-time Network (SD-RTN™), which is accessible and available anytime, anywhere around the world. Unlike traditional networks, the software-defined network is not confined by device, phone numbers, or a telecommunication provider's coverage area. Agora SD-RTN™ has data centers globally, covering over 200 countries and regions. The network delivers sub-second latency and high availability of real-time video and audio anywhere on the globe. With Agora SD-RTN™, Agora can deliver live user engagement experiences in the form of real-time communication (RTC) with the following advantages:

  • Unmatched quality of service
  • High availability and accessibility
  • True scalability
  • Low cost

Cloud recording concepts

Recording modes

Agora Cloud Recording supports three recording modes:

  • Individual recording
  • Composite recording
  • Web page recording

After the recording is complete, the recorded content is uploaded as a TS file to the third-party cloud storage you specified. An M3U8 file is also generated to serve as an index for the corresponding TS file.

The working principles of different recording modes and the types of files generated by Cloud Recording are as follows:

Individual recording

In individual recording, the recording service records the audio and video streams of each UID in the channel separately. After the recording is complete, the recording service generates the corresponding audio and video files for each UID.

For example, if there are 3 UIDs in the channel and each UID sends audio and video, then in the individual recording mode, 3 audio files and 3 video files are generated.

Composite recording

In composite recording, the recording service combines the audio and video of multiple UIDs in the channel into a single audio and video file.

For example, if there are 3 UIDs in the channel and each sends audio and video, the composite recording mode generates one recording file that includes the audio and video of all UIDs.
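The difference between the two modes can be expressed as a small Python sketch that counts the recordings to expect for a given set of publishers. The function name, the mode strings, and the `"mixed"` key are hypothetical and used only for this illustration; the counts refer to logical recordings, not to the individual files in storage.

```python
from collections import Counter


def expected_recordings(mode: str, publishers: dict[int, set[str]]) -> dict[str, int]:
    """Count the recordings to expect for the given mode.

    publishers maps each UID to the stream kinds it sends: "audio", "video".
    """
    if mode == "individual":
        # One recording per UID per stream kind.
        return dict(Counter(kind for kinds in publishers.values() for kind in kinds))
    if mode == "composite":
        # All streams are combined into a single recording.
        return {"mixed": 1} if any(publishers.values()) else {}
    raise ValueError(f"unsupported mode: {mode}")


channel = {1: {"audio", "video"}, 2: {"audio", "video"}, 3: {"audio", "video"}}
print(expected_recordings("individual", channel))  # 3 audio and 3 video recordings
print(expected_recordings("composite", channel))   # {'mixed': 1}
```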

Web page recording

In web page recording, the recording service combines the page content and audio of a specified web page into an audio and video file.

Web page recording is commonly used in the following scenarios:

  • In online classrooms, to record the teacher and student audio and video along with courseware, whiteboard, and other visuals.
  • In video conferences, to capture participants' audio and video, as well as whiteboard, PPT, and other visuals.

Transcoding and non-transcoding modes

In individual recording, audio transcoding and non-transcoding modes have different use cases and characteristics.

Individual recording with transcoding: This mode is used in scenarios where unified audio encoding parameters are needed to ensure consistent recording file formats and parameters for easier post-processing and playback. It is commonly used in cases requiring high compatibility and standardized output, such as wide player support and standardized storage.

Individual recording without transcoding: This mode is used when the original audio encoding parameters must be preserved to maintain the sound quality and performance. It is often used in scenarios with high demands for real-time performance and original sound quality, such as high-fidelity audio recording.

Feature | Individual recording with transcoding | Individual recording without transcoding
Transcoding during audio encoding | Yes | No
Raw audio data | The sampling rate, number of channels, and bitrate are fixed at 48 kHz, mono, and 48 Kbps respectively. | The bitrate, sampling rate, and number of channels are determined by the audio encoding parameters of the streaming end AudioProfile.
Audio encoding format | LC-AAC | Determined by the configuration of the source end AudioProfile.
Generated recording files | Each UID generates an audio file in M3U8 format and multiple audio files in TS format. | Same as individual recording with transcoding. If the user stops streaming by calling muteLocalAudioStream or leaveChannel, audio recording stops immediately, and no 15 seconds of silent data is recorded.
Player compatibility | The recorded file can be played by any mainstream player that supports the HLS protocol. | The audio encoding format is determined by the configuration of the streaming end AudioProfile. Different audio encoding formats have different compatibility.

Delayed transcoding

Delayed transcoding is designed for audio-only recording scenarios. When you enable this mode, the recording service merges and transcodes the audio files of all users in the specified channel into an MP3, M4A, or AAC file within 24 hours after the recording ends (or up to 48 hours in special cases) and uploads it to the specified third-party cloud storage.

Delayed audio mixing

Delayed audio mixing is used for individual audio recording scenarios. To obtain a mixed recording file of all users in the channel after recording, you enable the delayed audio mixing feature when starting individual audio recording without transcoding. Once enabled, the recording service merges and transcodes the audio files of all users in the specified channel into an MP3, M4A, or AAC file within 24 hours after the recording is complete (or up to 48 hours in special cases) and uploads it to the specified third-party cloud storage.

Slicing

Slicing involves cutting audio and video data according to specific rules during the recording process to generate multiple recording files. After slicing, several slice files (such as TS or WebM files) are created, along with M3U8 files that store the indexes of these slice files.