Updated: February 27, 2024
Preparing Files For Broadcasting On iOS Devices
Previously, in Part 1, we talked about how developers struggled to make broadcasting from an iOS device work. Now that we have direct access to the compressed data, we have the freedom we dreamt about, and we came up with a qualitatively better method of preparing frames for subsequent broadcasting. In general, the process can be split into three steps.
Step 1. Video capturing. The camera captures the video and the device creates CMSampleBuffer packages with all the media samples and data.
Step 2. Video compressing. The received data is compressed with the help of VideoToolbox, which produces CMSampleBuffer packages containing the compressed data.
Step 3. Converting into NALUs to optimize online streaming.
Let’s Go Through The Process Step By Step
Numerous documents and examples explain how the first step is done, so we only want to draw your attention to one fact: during this step we receive a stream of CMSampleBuffers containing uncompressed CVPixelBuffer data.
During the second stage we need to create and configure a VTCompressionSessionRef. To compress an input frame we call the VTCompressionSessionEncodeFrame function, passing it the CVPixelBuffer taken from the captured CMSampleBuffer together with its presentation timestamp. When the operation finishes, the encoder invokes the callback function that we set up during the initialization of the VTCompressionSessionRef. As a result, we receive a new CMSampleBuffer that now contains compressed data: it is still a CMSampleBuffer stream, but each buffer holds a CMBlockBuffer with compressed video.
The next step requires us to convert the stream of CMSampleBuffers into a stream of NALUs (Network Abstraction Layer Units), which is how the output of an H.264 encoder is usually handled. An H.264 stream can be packaged in two different formats: Annex B and AVCC. Annex B is the format most commonly used for streaming; Apple calls it Elementary Stream. iOS media libraries work with H.264 streams in the AVCC format (MPEG-4 stream). Regardless of the format, there are 19 defined types of NALUs, and each NALU carries one of two kinds of data: VCL (Video Coding Layer) or non-VCL metadata. Each NALU can be easily parsed and processed because it starts with a one-byte descriptive header that includes its type. The core difference between the Annex B and AVCC formats lies in how NALUs are delimited within the video stream.
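To make the descriptive header more concrete, here is a minimal sketch of decoding the first byte of a NALU. It is written in Python purely for illustration (on iOS the same bit operations would be done in Swift or Objective-C), and the names `NaluHeader`, `parse_nalu_header`, and `NAL_TYPE_NAMES` are our own, not part of any Apple API. The table lists only a handful of the defined types.

```python
from typing import NamedTuple

# A few well-known nal_unit_type values from the H.264 specification.
# This table is intentionally incomplete.
NAL_TYPE_NAMES = {
    1: "non-IDR slice",
    5: "IDR slice",
    6: "SEI",
    7: "SPS",
    8: "PPS",
    9: "access unit delimiter",
}


class NaluHeader(NamedTuple):
    forbidden_zero_bit: int  # 1 bit, must be 0 in a valid stream
    nal_ref_idc: int         # 2 bits, importance of the NALU as a reference
    nal_unit_type: int       # 5 bits, identifies the payload type

    @property
    def is_vcl(self) -> bool:
        # Types 1 through 5 carry coded slice data (VCL);
        # everything else is non-VCL metadata.
        return 1 <= self.nal_unit_type <= 5


def parse_nalu_header(first_byte: int) -> NaluHeader:
    """Split the one-byte NALU header into its three bit fields."""
    return NaluHeader(
        forbidden_zero_bit=(first_byte >> 7) & 0x01,
        nal_ref_idc=(first_byte >> 5) & 0x03,
        nal_unit_type=first_byte & 0x1F,
    )
```

For example, a NALU that starts with the byte 0x67 decodes to type 7 (SPS), and one that starts with 0x65 decodes to type 5 (an IDR slice, that is, VCL data).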
Two Formats – One Way
An Annex B NALU doesn’t carry its own size; instead, it is preceded by a start code. This code is usually 0x000001 or 0x00000001 (3 or 4 bytes). Scanning for start codes allows splitting the whole stream into individual NALUs.
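The splitting by start code can be sketched as follows. This is an illustrative Python helper under our own name `split_annex_b`, not code from any library. It looks for the 3-byte pattern 00 00 01 and strips trailing zero bytes from each NALU, which covers the 4-byte start code as well, because a NALU itself never ends with a zero byte.

```python
from typing import List


def split_annex_b(stream: bytes) -> List[bytes]:
    """Return the NALUs found between Annex B start codes."""
    nalus = []
    start = None  # index of the first byte of the NALU being read
    i = 0
    while i + 3 <= len(stream):
        if stream[i] == 0 and stream[i + 1] == 0 and stream[i + 2] == 1:
            if start is not None:
                # Zero bytes right before a start code belong to the
                # start code (4-byte form), not to the previous NALU.
                nalus.append(stream[start:i].rstrip(b"\x00"))
            start = i + 3
            i += 3
        else:
            i += 1
    if start is not None:
        # The last NALU runs to the end of the stream.
        nalus.append(stream[start:].rstrip(b"\x00"))
    return nalus
```

A real receiver would usually process the stream incrementally instead of holding it all in memory; the sketch only shows the delimiting rule.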
AVCC defines the size of each NALU with a length header that precedes the NALU itself. The header is usually 4 bytes long, but it can also be 1 or 2 bytes; the actual size is declared in the stream's format description.
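Reading length-prefixed NALUs might look like the sketch below. It is plain Python for illustration, and `split_avcc` and its `length_size` parameter are hypothetical names. We assume the caller already knows the header size from the format description.

```python
from typing import List


def split_avcc(buffer: bytes, length_size: int = 4) -> List[bytes]:
    """Return the NALUs stored in an AVCC (length-prefixed) buffer."""
    if length_size not in (1, 2, 4):
        raise ValueError("AVCC length header must be 1, 2 or 4 bytes")
    nalus = []
    offset = 0
    while offset < len(buffer):
        if offset + length_size > len(buffer):
            raise ValueError("truncated length header")
        # The length header is an unsigned big-endian integer.
        nalu_length = int.from_bytes(buffer[offset:offset + length_size], "big")
        offset += length_size
        if offset + nalu_length > len(buffer):
            raise ValueError("truncated NALU")
        nalus.append(buffer[offset:offset + nalu_length])
        offset += nalu_length
    return nalus
```

Because every NALU announces its own length, the parser can jump from header to header without scanning the payload, which is the practical advantage of AVCC over Annex B for stored media.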
Each CMSampleBuffer package with compressed data contains the following:
- PTS (CMTime) – presentation time stamp
- Format description (CMVideoFormatDescription) that describes the format of the media, including the H.264 parameter sets
- Block Buffer (CMBlockBuffer) that contains parts of or a whole compressed frame
A CMSampleBuffer stream is a stream of I-, B-, and P-frames. Each sample buffer may contain one or multiple NALUs in the AVCC format. The Annex B stream used for transmission is a sequence of SPS (Sequence Parameter Set), PPS (Picture Parameter Set), I-frame, B-frame, and P-frame NALUs. The number of P- and B-frames may vary.
SPS and PPS contain the parameters the receiver needs for decoding and must precede each I-frame.
For each NALU, the length header is replaced with a start code, and the NALU is then added to the output stream. To correctly determine the length and number of NALUs in the CMBlockBuffer data, we read the length header that precedes each NALU.
The length is stored in big-endian format, so on little-endian iOS devices we need to swap its bytes to get the correct NALU length. When a CMSampleBuffer contains an I-frame, we build the SPS and PPS NALUs from the format description of that sample buffer and put them before the other NALUs from the CMBlockBuffer of the same CMSampleBuffer.
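The conversion described above can be sketched at the byte level as follows. This is not the actual iOS implementation: it is a Python illustration in which `avcc_sample_to_annex_b` is a hypothetical helper, `block` stands for the bytes of the CMBlockBuffer, and `sps` and `pps` stand for the parameter sets already extracted from the format description.

```python
START_CODE = b"\x00\x00\x00\x01"


def avcc_sample_to_annex_b(block: bytes, is_keyframe: bool,
                           sps: bytes, pps: bytes,
                           length_size: int = 4) -> bytes:
    """Rewrite one AVCC sample as Annex B, adding SPS/PPS before I-frames."""
    out = bytearray()
    if is_keyframe:
        # Parameter sets go first so a receiver can start decoding here.
        out += START_CODE + sps
        out += START_CODE + pps
    offset = 0
    while offset + length_size <= len(block):
        # Read the big-endian length header; int.from_bytes handles the
        # byte order regardless of the host CPU.
        nalu_length = int.from_bytes(block[offset:offset + length_size], "big")
        offset += length_size
        nalu = block[offset:offset + nalu_length]
        if len(nalu) != nalu_length:
            raise ValueError("truncated NALU")
        # Replace the length header with a start code.
        out += START_CODE + nalu
        offset += nalu_length
    return bytes(out)
```

The sketch always emits the 4-byte start code for simplicity; the 3-byte form would be equally valid for most NALUs.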
All of the above results in an H.264 Annex B stream that is ready to be broadcast and displayed on other devices!