After learning that the hardware H.264 decoder is available to developers in iOS 8, I want to use it. There is a nice introduction, 'Direct Access to Video Encoding and Decoding', from WWDC 2014. You can take a look here.

Based on Case 1 there, I started developing an application that receives an H.264 RTP/UDP stream via GStreamer, sinks it into an 'appsink' element to get direct access to the NAL units, and converts them into CMSampleBuffers that my AVSampleBufferDisplayLayer can then display.

The interesting piece of code doing all that is the following:

//  GStreamerBackend.m

#import "GStreamerBackend.h"

NSString * const naluTypesStrings[] = {
    @"Unspecified (non-VCL)",
    @"Coded slice of a non-IDR picture (VCL)",
    @"Coded slice data partition A (VCL)",
    @"Coded slice data partition B (VCL)",
    @"Coded slice data partition C (VCL)",
    @"Coded slice of an IDR picture (VCL)",
    @"Supplemental enhancement information (SEI) (non-VCL)",
    @"Sequence parameter set (non-VCL)",
    @"Picture parameter set (non-VCL)",
    @"Access unit delimiter (non-VCL)",
    @"End of sequence (non-VCL)",
    @"End of stream (non-VCL)",
    @"Filler data (non-VCL)",
    @"Sequence parameter set extension (non-VCL)",
    @"Prefix NAL unit (non-VCL)",
    @"Subset sequence parameter set (non-VCL)",
    @"Reserved (non-VCL)",
    @"Reserved (non-VCL)",
    @"Reserved (non-VCL)",
    @"Coded slice of an auxiliary coded picture without partitioning (non-VCL)",
    @"Coded slice extension (non-VCL)",
    @"Coded slice extension for depth view components (non-VCL)",
    @"Reserved (non-VCL)",
    @"Reserved (non-VCL)",
    @"Unspecified (non-VCL)",
    @"Unspecified (non-VCL)",
    @"Unspecified (non-VCL)",
    @"Unspecified (non-VCL)",
    @"Unspecified (non-VCL)",
    @"Unspecified (non-VCL)",
    @"Unspecified (non-VCL)",
    @"Unspecified (non-VCL)",
};

static GstFlowReturn new_sample(GstAppSink *sink, gpointer user_data)
{
    GStreamerBackend *backend = (__bridge GStreamerBackend *)(user_data);
    GstSample *sample = gst_app_sink_pull_sample(sink);
    GstBuffer *buffer = gst_sample_get_buffer(sample);
    GstMemory *memory = gst_buffer_get_all_memory(buffer);

    GstMapInfo info;
    gst_memory_map (memory, &info, GST_MAP_READ);

    // Find the 0x01 that terminates the 3- or 4-byte start code.
    int startCodeIndex = 0;
    for (int i = 0; i < 5; i++) {
        if (info.data[i] == 0x01) {
            startCodeIndex = i;
            break;
        }
    }
    // The NAL unit type is the low 5 bits of the byte after the start code.
    int nalu_type = ((uint8_t)info.data[startCodeIndex + 1] & 0x1F);
    NSLog(@"NALU with Type \"%@\" received.", naluTypesStrings[nalu_type]);
    if(backend.searchForSPSAndPPS) {
        if (nalu_type == 7)
            backend.spsData = [NSData dataWithBytes:&(info.data[startCodeIndex + 1]) length: info.size - 4];

        if (nalu_type == 8)
            backend.ppsData = [NSData dataWithBytes:&(info.data[startCodeIndex + 1]) length: info.size - 4];

        if (backend.spsData != nil && backend.ppsData != nil) {
            const uint8_t* const parameterSetPointers[2] = { (const uint8_t*)[backend.spsData bytes], (const uint8_t*)[backend.ppsData bytes] };
            const size_t parameterSetSizes[2] = { [backend.spsData length], [backend.ppsData length] };

            CMVideoFormatDescriptionRef videoFormatDescr;
            OSStatus status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault, 2, parameterSetPointers, parameterSetSizes, 4, &videoFormatDescr);
            [backend setVideoFormatDescr:videoFormatDescr];
            [backend setSearchForSPSAndPPS:false];
            NSLog(@"Found all data for CMVideoFormatDescription. Creation: %@.", (status == noErr) ? @"successfully." : @"failed.");
        }
    }
    if (nalu_type == 1 || nalu_type == 5) {
        CMBlockBufferRef videoBlock = NULL;
        OSStatus status = CMBlockBufferCreateWithMemoryBlock(NULL, info.data, info.size, kCFAllocatorNull, NULL, 0, info.size, 0, &videoBlock);
        NSLog(@"BlockBufferCreation: %@", (status == kCMBlockBufferNoErr) ? @"successfully." : @"failed.");
        // Overwrite the 4-byte start code with a big-endian length header.
        const uint8_t sourceBytes[] = {(uint8_t)(info.size >> 24), (uint8_t)(info.size >> 16), (uint8_t)(info.size >> 8), (uint8_t)info.size};
        status = CMBlockBufferReplaceDataBytes(sourceBytes, videoBlock, 0, 4);
        NSLog(@"BlockBufferReplace: %@", (status == kCMBlockBufferNoErr) ? @"successfully." : @"failed.");

        CMSampleBufferRef sbRef = NULL;
        const size_t sampleSizeArray[] = {info.size};

        status = CMSampleBufferCreate(kCFAllocatorDefault, videoBlock, true, NULL, NULL, backend.videoFormatDescr, 1, 0, NULL, 1, sampleSizeArray, &sbRef);
        NSLog(@"SampleBufferCreate: %@", (status == noErr) ? @"successfully." : @"failed.");

        CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sbRef, YES);
        CFMutableDictionaryRef dict = (CFMutableDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
        CFDictionarySetValue(dict, kCMSampleAttachmentKey_DisplayImmediately, kCFBooleanTrue);

        NSLog(@"Error: %@, Status:%@", backend.displayLayer.error, (backend.displayLayer.status == AVQueuedSampleBufferRenderingStatusUnknown)?@"unknown":((backend.displayLayer.status == AVQueuedSampleBufferRenderingStatusRendering)?@"rendering":@"failed"));
        [backend.displayLayer enqueueSampleBuffer:sbRef];
        [backend.displayLayer setNeedsDisplay];
    }

    gst_memory_unmap(memory, &info);

    return GST_FLOW_OK;
}

@implementation GStreamerBackend

- (instancetype)init
{
    if (self = [super init]) {
        self.searchForSPSAndPPS = true;
        self.ppsData = nil;
        self.spsData = nil;
        self.displayLayer = [[AVSampleBufferDisplayLayer alloc] init];
        self.displayLayer.bounds = CGRectMake(0, 0, 300, 300);
        self.displayLayer.backgroundColor = [UIColor blackColor].CGColor;
        self.displayLayer.position = CGPointMake(500, 500);
        self.queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
        dispatch_async(self.queue, ^{
            [self app_function];
        });
    }
    return self;
}

- (void)start
{
    if(gst_element_set_state(self.pipeline, GST_STATE_PLAYING) == GST_STATE_CHANGE_FAILURE) {
        NSLog(@"Failed to set pipeline to playing");
    }
}

- (void)app_function
{
    GstElement *udpsrc, *rtphdepay, *capsfilter;
    GMainContext *context; /* GLib context used to run the main loop */
    GMainLoop *main_loop;  /* GLib main loop */

    context = g_main_context_new ();

    g_set_application_name ("appsink");

    self.pipeline = gst_pipeline_new ("testpipe");

    udpsrc = gst_element_factory_make ("udpsrc", "udpsrc");
    GstCaps *caps = gst_caps_new_simple("application/x-rtp", "media", G_TYPE_STRING, "video", "clock-rate", G_TYPE_INT, 90000, "encoding-name", G_TYPE_STRING, "H264", NULL);
    g_object_set(udpsrc, "caps", caps, "port", 5000, NULL);
    rtphdepay = gst_element_factory_make("rtph264depay", "rtph264depay");
    capsfilter = gst_element_factory_make("capsfilter", "capsfilter");
    caps = gst_caps_new_simple("video/x-h264", "stream-format", G_TYPE_STRING, "byte-stream", "alignment", G_TYPE_STRING, "nal", NULL);
    g_object_set(capsfilter, "caps", caps, NULL);
    self.appsink = gst_element_factory_make ("appsink", "appsink");

    gst_bin_add_many (GST_BIN (self.pipeline), udpsrc, rtphdepay, capsfilter, self.appsink, NULL);

    if(!gst_element_link_many (udpsrc, rtphdepay, capsfilter, self.appsink, NULL)) {
        NSLog(@"Cannot link gstreamer elements");
        exit (1);
    }

    if(gst_element_set_state(self.pipeline, GST_STATE_READY) != GST_STATE_CHANGE_SUCCESS)
        NSLog(@"could not change to ready");

    GstAppSinkCallbacks callbacks = { NULL, NULL, new_sample,
        NULL, NULL};
    gst_app_sink_set_callbacks (GST_APP_SINK(self.appsink), &callbacks, (__bridge gpointer)(self), NULL);

    main_loop = g_main_loop_new (context, FALSE);
    g_main_loop_run (main_loop);

    /* Free resources */
    g_main_loop_unref (main_loop);
    main_loop = NULL;
    g_main_context_unref (context);
    gst_element_set_state (GST_ELEMENT (self.pipeline), GST_STATE_NULL);
    gst_object_unref (GST_OBJECT (self.pipeline));
}

@end


What I get when running the App and starting to stream to the iOS device:

NALU with Type "Sequence parameter set (non-VCL)" received.
NALU with Type "Picture parameter set (non-VCL)" received.

Found all data for CMVideoFormatDescription. Creation: successfully..

NALU with Type "Coded slice of an IDR picture (VCL)" received.
BlockBufferCreation: successfully.
BlockBufferReplace: successfully.
SampleBufferCreate: successfully.
Error: (null), Status:unknown

NALU with Type "Coded slice of a non-IDR picture (VCL)" received.
BlockBufferCreation: successfully.
BlockBufferReplace: successfully.
SampleBufferCreate: successfully.
Error: (null), Status:rendering
[...] (repetition of the last 5 lines)

So it seems to decode as it should, but my problem is that I cannot see anything in my AVSampleBufferDisplayLayer. It might be a problem with kCMSampleAttachmentKey_DisplayImmediately, but I have set it as I was told to here (see the 'important' note).

Every idea is welcome ;)

  • I am almost done implementing this exact thing, but why are you checking for the start code? Isn't that only present in byte streams (for example over TCP)? I thought that over RTP (UDP) the stream is packetized, so a start code is no longer needed. This RFC is where I learned everything I know about this process, and it does not mention looking for a start code, since the data is in packets. I know the video you linked to does mention it, but I was always confused about why the two conflict.
    – ddelnano
    Feb 13, 2015 at 17:58
  • I am not sure what the specification says. But because I use GStreamer before getting access to the stream, and specifically request NALUs as output, GStreamer could convert it into something that was not in the UDP packets originally. So the start code may be added by GStreamer even if it was not in the UDP packet.
    – Zappel
    Mar 5, 2015 at 11:07
  • Am I correct in saying that your code looks for the start code 0x0001 or 0x000001? On the server side that is streaming, are you using gstreamer as a command line utility? If so, could you show me what command you used?
    – ddelnano
    Mar 5, 2015 at 15:18
  • Neither nor. My code looks for both of them, so it can detect the 3-byte and the 4-byte start code. On the server side, which is a Raspberry Pi in my case, I run the following command: raspivid -t 0 -h 720 -w 1280 -fps 45 -vf -hf -b 6500000 -o - | gst-launch-1.0 -v fdsrc ! h264parse ! rtph264pay ! udpsink host= port=5000
    – Zappel
    Mar 12, 2015 at 22:59
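
For reference, the 3-byte versus 4-byte start code check discussed in these comments could be sketched roughly as follows in plain C (which also compiles inside an Objective-C file). The function names start_code_length and nal_unit_type are hypothetical and not part of the question's code. The sketch assumes each buffer holds exactly one NAL unit preceded by a start code, as requested by the capsfilter in the question:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the length of the Annex B start code at the
 * front of `data` (3 for 00 00 01, 4 for 00 00 00 01), or 0 if none is found. */
static size_t start_code_length(const uint8_t *data, size_t size)
{
    assert(data != NULL);
    if (size >= 3 && data[0] == 0 && data[1] == 0 && data[2] == 1)
        return 3;
    if (size >= 4 && data[0] == 0 && data[1] == 0 && data[2] == 0 && data[3] == 1)
        return 4;
    return 0;
}

/* Hypothetical helper: returns nal_unit_type (the low 5 bits of the first
 * byte after the start code), or -1 if the buffer has no start code or
 * no NAL header byte. */
static int nal_unit_type(const uint8_t *data, size_t size)
{
    size_t sc = start_code_length(data, size);
    if (sc == 0 || size <= sc)
        return -1;
    return data[sc] & 0x1F;
}
```

Checking the start code pattern explicitly, rather than scanning for the first or last 0x01 byte, avoids misreading a NAL header byte that happens to be 0x01 (a non-IDR slice with nal_ref_idc 0) as part of the start code.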

2 Answers


Got it working now. The length prefix of each NALU does not count the 4-byte length header itself, so I have to subtract 4 from info.size before writing it into sourceBytes.
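
As a rough illustration of that fix, here is a minimal sketch in plain C (which also compiles inside an Objective-C file). The helper name write_avcc_length is hypothetical and not part of the original code. It assumes the buffer starts with a 4-byte start code that is overwritten in place by the big-endian length header:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: overwrite the first 4 bytes of `buf` (the start code)
 * with a big-endian length that counts only the NAL payload, i.e. the total
 * buffer size minus the 4 header bytes. Returns the payload length written. */
static uint32_t write_avcc_length(uint8_t *buf, size_t total_size)
{
    assert(buf != NULL && total_size >= 4);
    uint32_t payload_len = (uint32_t)(total_size - 4);
    buf[0] = (uint8_t)(payload_len >> 24);
    buf[1] = (uint8_t)(payload_len >> 16);
    buf[2] = (uint8_t)(payload_len >> 8);
    buf[3] = (uint8_t)payload_len;
    return payload_len;
}
```

In terms of the question's code, this corresponds to building sourceBytes from info.size - 4 rather than from info.size.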

  • Hey Zappel! My team and I have been working on a pretty basic AVSampleBufferDisplayLayer + AVAssetReader setup for the past few days. We have made great progress but are stuck on some issues. We'd love help if possible (paid if needed). Any chance I can contact you somehow?
    – Roi Mulia
    Jun 24, 2019 at 22:24

Following your code, I wrote a program to decode and display a live H.264 stream using AVSampleBufferDisplayLayer. I use live555 rather than GStreamer to receive the H.264 NAL units.

Unfortunately, my app only displays a few frames, and then no more images are shown. Has your app ever run into the same problem?

  • No, my app does not have any of these problems. Did you check the OSStatus return values of the Core Media functions? If so, what are their values? Give us some more input so we can help you. Maybe you should have a look at the thread safety of your app. Try implementing it on the main thread first.
    – Zappel
    Sep 29, 2014 at 10:56
