How SGPlayer, a VR and RTMP player, works

Introduction

SGPlayer is a media player framework built on AVPlayer and FFmpeg. It supports panoramic video and live streaming over RTMP, RTSP and other protocols, and it runs on three platforms: iOS, macOS and tvOS. This article explains how its key modules work, using diagrams and commentary.
Project address: GitHub – SGPlayer

Motivation

For video playback, Apple's AVPlayer offers very good performance and resource control; when there are no special requirements, it should be the first choice. But with the rise of VR and live streaming, AVPlayer alone often can no longer meet the demand. For performance reasons it cannot be abandoned entirely either, since it still has a clear advantage for on-demand playback. Most existing open-source projects have a fairly narrow focus and do not cover AVPlayer, live streaming and VR together. As a result, three players are needed: AVPlayer for on-demand video, a separate player for live streaming, and another separate player for VR. Dealing with three different sets of interfaces and callback events is enough to drive anyone crazy. SGPlayer greatly simplifies this.

Architecture and playback flow

[Figure: SGPlayer playback flow chart]

The figure shows SGPlayer's playback flow and its main components. The components in the figure are briefly introduced below.


SGPlayer is an abstract player shell. It has no playback capability of its own and serves only as the point of interaction with the outside world. The actual playback is done by the internal SGAVPlayer and SGFFPlayer, and the picture is drawn by the internal SGDisplayView.


SGPlayerDecoder selects the playback core. It dynamically chooses SGAVPlayer or SGFFPlayer according to the resource type, and you can change its configuration parameters to customize the selection strategy.
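
A selection strategy of this kind can be sketched in a few lines. The sketch below is in Python for brevity and is not SGPlayer's API: the function name select_core, the ffmpeg_schemes parameter and the rule itself (live-stream URL schemes go to the FFmpeg-based core, everything else to the AVPlayer-based core) are assumptions made for illustration.

```python
from urllib.parse import urlparse

# Hypothetical default: URL schemes assumed to need the FFmpeg-based core.
DEFAULT_FFMPEG_SCHEMES = frozenset({"rtmp", "rtsp"})


def select_core(url, ffmpeg_schemes=DEFAULT_FFMPEG_SCHEMES):
    """Return the name of the playback core to use for `url`.

    Passing a different `ffmpeg_schemes` set stands in for changing the
    configuration parameters to customize the strategy.
    """
    scheme = urlparse(url).scheme.lower()
    return "SGFFPlayer" if scheme in ffmpeg_schemes else "SGAVPlayer"
```

Keeping the rule in a parameter rather than hard-coding it is what lets a caller override the default choice for a given resource type.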


SGAVPlayer is a wrapper around AVPlayer. It outputs video to SGDisplayView, which displays it according to the video type (panoramic or flat). Audio is handled by the system and needs no extra work.


SGFFPlayer is a wrapper around FFmpeg and supports nearly all mainstream video formats. It outputs video frames to SGDisplayView and audio to SGAudioManager, which then plays it using Audio Unit.


SGDisplayView is responsible for rendering video images. It does not draw the picture itself; it only acts as the parent view of the drawing layer. The actual drawing is done by the internal AVPlayerLayer or SGGLViewController, chosen according to the rules in the table below.

              Flat                  Panoramic
SGAVPlayer    AVPlayerLayer         SGGLViewController
SGFFPlayer    SGGLViewController    SGGLViewController
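
The table above amounts to a lookup keyed by playback core and video type (flat, i.e. plane, or panoramic). A minimal Python sketch follows; the dictionary and the function name render_layer are illustrative names, not part of SGPlayer.

```python
# Rendering-layer choice, transcribed from the table above.
# Keys are (playback core, video type).
RENDER_LAYER = {
    ("SGAVPlayer", "flat"): "AVPlayerLayer",
    ("SGAVPlayer", "panoramic"): "SGGLViewController",
    ("SGFFPlayer", "flat"): "SGGLViewController",
    ("SGFFPlayer", "panoramic"): "SGGLViewController",
}


def render_layer(core, video_type):
    """Return the layer that draws the picture for this core and video type."""
    return RENDER_LAYER[(core, video_type)]
```

Only one combination, SGAVPlayer with flat video, uses AVPlayerLayer; every other case is drawn by SGGLViewController.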


SGAudioManager is responsible for sound playback and audio event handling. Internally it uses an AUGraph with a mixing layer, through which the output volume and similar settings can be adjusted.


With the role of each component understood, let's walk through the full playback flow again:

  • SGPlayer receives a playback request.
  • SGPlayerDecoder dispatches it to SGAVPlayer or SGFFPlayer according to the resource type.
  • If SGAVPlayer is used, the video is output to the AVPlayerLayer or the SGGLViewController inside SGDisplayView, depending on the video type.
  • If SGFFPlayer is used, the video frames are output to SGDisplayView and the audio is output to SGAudioManager.

The abstract SGPlayer hides SGAVPlayer and SGFFPlayer, which do the actual playback. Whatever the resource type, callers see only one unified set of interfaces and callbacks. The differences between the playback cores are absorbed internally, which keeps the cost of use as low as possible.

How panoramic images work

A panoramic image, like a flat image, is by nature a 2D image; the difference lies in the surface it is displayed on. For a flat image the display model is a rectangle, and each pixel of the image simply maps to the corresponding point of the rectangle. For a panoramic image the display model is a sphere, and each pixel of the image maps to a corresponding position on the sphere. The drawing process differs little between the two; only the mapping rules and the presentation differ slightly.

Mapping rules

[Figure: panoramic image mapping rules]

Pasting a flat image onto a sphere is much like making a globe. Taking the figure above as an example, every pixel of the picture on the left has a corresponding position on the sphere on the right. A few key correspondences are listed below.

  • All points on the line AB correspond to the point J; likewise, all points on the line CD correspond to the point K.
  • Points on the line MN correspond to points on the equator.
  • Points on the lines AC and BD correspond to points on the front half of the green meridian.
  • Points on the line EF correspond to points on the back half of the green meridian.
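
These correspondences are consistent with an equirectangular layout, in which the horizontal image coordinate becomes longitude and the vertical one becomes latitude. The Python sketch below assumes that layout, normalized image coordinates u and v in [0, 1] with v = 0 at the top edge, and a unit sphere with the y axis pointing up. The function name and the axis conventions are illustrative choices, not taken from SGGLViewController.

```python
import math


def image_to_sphere(u, v):
    """Map normalized image coordinates to a point (x, y, z) on the unit sphere.

    u runs left to right and v runs top to bottom, both in [0, 1].
    """
    lon = 2.0 * math.pi * u        # longitude: one full turn across the image width
    lat = math.pi * (0.5 - v)      # latitude: +pi/2 at the top edge, -pi/2 at the bottom
    x = math.cos(lat) * math.cos(lon)
    y = math.sin(lat)              # y is the up axis
    z = math.cos(lat) * math.sin(lon)
    return (x, y, z)
```

Under these assumptions the whole top edge (v = 0) collapses to the pole (0, 1, 0) and the whole bottom edge to (0, -1, 0); the middle row (v = 0.5) lands on the equator; the left and right edges (u = 0 and u = 1) land on the same half meridian, and the vertical center line (u = 0.5) lands on the opposite half.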

Presentation mode

[Figure: viewing angle for a panoramic image]

This figure shows how a panoramic image is presented. Unlike a flat image, a panoramic image is viewed from the center of the sphere, with the image lying on the sphere's surface. What finally appears on the screen is the projection of the curved surface ABCD onto the plane ABCD.


Implementing this part involves a lot of OpenGL, so some OpenGL background is needed. Distortion correction and dispersion correction are also required so that the image is reproduced faithfully. For the specific implementation, see SGGLViewController.

SGFFPlayer operation flow

[Figure: SGFFPlayer operation flow]

The figure shows the SGFFPlayer flow chart. The components in the figure are briefly introduced below.

Thread model

SGFFPlayer has four threads, corresponding to the four blue circles in the figure.

  • Data read – Read Packet Loop
  • Video decoding – Video Decode Loop
  • Video rendering – Video Display Loop
  • Audio playback – Audio Playback Loop

The thread control conditions are omitted from the figure. The four threads cooperate to complete the whole playback process.


SGVideoDecoder is the video decoder. It can be configured for synchronous or asynchronous decoding, and for whether hardware decoding is enabled. The figure above shows asynchronous decoding; the default decoding mode for each case is listed in the table below.

                     Flat            Panoramic
Software decoding    asynchronous    synchronous
Hardware decoding    asynchronous    asynchronous

  • Synchronous decoding: the video packet is decoded immediately on receipt, and the result is stored in the video frame queue.
  • Asynchronous decoding: on receipt, the video packet is only stored in the video packet queue. A separate decoding thread later takes the packet out, decodes it, and stores the result in the video frame queue.


SGAudioDecoder is the audio decoder. It decodes synchronously: each audio packet is decoded immediately on receipt, and the result is stored in the audio frame queue.

Data queues: SGFFPacketQueue and SGFFFrameQueue

  • SGFFPacketQueue is the packet queue, used to manage packets (AVPacket) before decoding.
  • SGFFFrameQueue is the frame queue, used to manage decoded frames (SGFFVideoFrame or SGFFAudioFrame).

Both support synchronous and asynchronous access. Synchronous access is implemented with a condition variable (NSCondition): when the queue does not hold enough data, the calling thread blocks until a new element is added to the queue.
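
The blocking behavior can be illustrated with a small Python sketch, using threading.Condition in the role NSCondition plays in SGPlayer. The class and method names are hypothetical, and returning None from the non-blocking getter on an empty queue is an assumption about what asynchronous access means here.

```python
import threading
from collections import deque


class BlockingQueue:
    """FIFO queue with a blocking getter and a non-blocking getter."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify()  # wake one consumer waiting in get_sync

    def get_sync(self):
        """Block the calling thread until an item is available, then return it."""
        with self._cond:
            while not self._items:  # re-check after every wakeup
                self._cond.wait()
            return self._items.popleft()

    def get_async(self):
        """Return the next item, or None at once if the queue is empty."""
        with self._cond:
            return self._items.popleft() if self._items else None
```

The wait sits inside a loop because a woken thread must confirm that an item is really there before taking it; another consumer may have got to it first.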

Frame reuse pool: SGFFFramePool

This part is not shown in the figure above, but it avoids some unnecessary performance overhead. Audio and video frames are numerous: a one-minute video contains thousands of frames. Creating a new object for every frame would waste resources. A frame (SGFFFrame) created through SGFFFramePool is not released as soon as it has been used; it is kept and reused the next time a frame is needed, so that only a minimal number of frames is ever created.
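
A minimal sketch of the reuse idea, in Python. The Frame and FramePool classes and their methods are stand-ins invented for illustration and do not mirror SGFFFramePool's actual interface; the sketch is also not thread-safe, which a real pool shared between decoding and rendering threads would need to be.

```python
class Frame:
    """Stand-in for a decoded frame; `data` is a buffer that can be refilled."""

    def __init__(self):
        self.data = bytearray()


class FramePool:
    """Hands out frames, preferring ones that were used before and given back."""

    def __init__(self):
        self._unused = []
        self.created = 0  # number of Frame objects ever allocated

    def acquire(self):
        if self._unused:
            return self._unused.pop()  # reuse an existing frame
        self.created += 1
        return Frame()                 # allocate only when nothing is available

    def release(self, frame):
        frame.data.clear()             # reset contents before the next use
        self._unused.append(frame)
```

With this scheme the number of allocated frames is bounded by the largest number of frames in use at the same moment, not by the length of the video.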

Audio-video synchronization

There are three common synchronization methods:

  1. Audio clock
  2. Video clock
  3. Independent timer

SGFFPlayer prefers the audio clock. When the video has no audio track, it synchronizes against the video clock instead.
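
The fallback rule is small enough to state as code. In the Python sketch below, choose_master_clock encodes the rule from the text. The second function, video_frame_delay, is an assumption about how a master clock is typically consumed (wait until the frame's presentation time, and show late frames at once); it is not taken from SGFFPlayer. Both names are hypothetical.

```python
def choose_master_clock(has_audio_track):
    """Prefer the audio clock; fall back to the video clock when there is no audio."""
    return "audio" if has_audio_track else "video"


def video_frame_delay(frame_pts, master_time):
    """Seconds to wait before showing a frame (assumed policy, see lead-in).

    frame_pts is the frame's presentation time and master_time is the current
    reading of the master clock, both in seconds. A frame that is already due
    or late gets a delay of 0.
    """
    return max(0.0, frame_pts - master_time)
```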


With the role of each component understood, let's walk through the whole flow, taking asynchronous video decoding as an example:

  • The data read thread reads packets and dispatches each one to the audio decoder or the video decoder according to its type.
  • For an audio packet, the audio decoder decodes it immediately on receipt and stores the decoded audio frame in the audio frame queue.
  • For a video packet, because the video decoder is decoding asynchronously, the packet is only put into the video packet queue, where it waits for the video decoding thread to take it.
  • The video decoding thread loops, taking video packets from the video packet queue, decoding them, and storing the decoded video frames in the video frame queue.
  • The audio playback thread loops, taking audio frames from the audio frame queue and playing them.
  • The video display thread loops, taking video frames from the video frame queue and drawing them.
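
The steps above can be sketched as four cooperating threads joined by queues. This Python toy uses the standard queue.Queue, which already blocks on an empty queue, in place of SGFFPacketQueue and SGFFFrameQueue. Decoding is faked with string formatting, and audio-video synchronization, flow control and error handling are left out. All names are illustrative.

```python
import queue
import threading

END = None  # sentinel placed on a queue to mark the end of the stream


def run_pipeline(packets):
    """Run the four loops over `packets`, an iterable of (kind, payload) pairs
    where kind is "audio" or "video". Returns (played_audio, shown_video)."""
    video_packets = queue.Queue()
    video_frames = queue.Queue()
    audio_frames = queue.Queue()
    played, shown = [], []

    def decode(payload):
        return "frame(" + payload + ")"  # stand-in for real decoding

    def read_loop():
        for kind, payload in packets:
            if kind == "audio":
                audio_frames.put(decode(payload))  # audio is decoded right here
            else:
                video_packets.put(payload)         # video waits for the decode thread
        video_packets.put(END)
        audio_frames.put(END)

    def video_decode_loop():
        while (packet := video_packets.get()) is not END:
            video_frames.put(decode(packet))
        video_frames.put(END)

    def video_display_loop():
        while (frame := video_frames.get()) is not END:
            shown.append(frame)   # a real player would draw the frame

    def audio_playback_loop():
        while (frame := audio_frames.get()) is not END:
            played.append(frame)  # a real player would hand it to the audio output

    loops = (read_loop, video_decode_loop, video_display_loop, audio_playback_loop)
    threads = [threading.Thread(target=loop) for loop in loops]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return played, shown
```

Because each queue has a single producer and a single consumer, frames come out in the order their packets were read, even though the four loops run concurrently.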

At this point the way SGFFPlayer operates should be clear: with the appropriate conditions controlling each stage, the playback function is complete.


That concludes the description of how SGPlayer works. Since this article focuses on theory, it does not walk through SGPlayer's own source code; interested readers can find all of it on GitHub. I hope this helps.