19

I have been testing playback of multiple live streams in different players to find the one with the lowest latency. I tried the GStreamer player (gst-launch-0.10), mplayer, totem and the ffmpeg player (ffplay). For each of them I tried different configuration values to get the lowest latency, for example:

ffplay -fflags nobuffer 
mplayer -benchmark

The protocol I am streaming with is UDP, and I am getting better values with ffplay than with mplayer or gst-launch. To be honest, I don't know what configuration GStreamer needs to get a lower latency. Now, I need two things:

  1. I would like to know if someone has a better suggestion for playing a live stream with a latency below 100 ms. I am currently getting more than 100 ms, which is too high for my purposes.

  2. I am currently using ffplay because it is the best so far. I would like to build a simple GUI with a play button, a record button and 3 screens to stream from different video servers, but I don't know what kind of wrapper (which should be really fast) to use.

  • 100ms is an unreasonably low latency target. Most sound cards can't manage latency that low. You need purpose-built hardware for latencies that low, and it won't work over the internet.
    – Brad
    Jan 19, 2014 at 7:41
  • I don't have sound in my stream, it is disabled. Anyhow, where can I buy such hardware?
    – user573014
    Jan 20, 2014 at 4:54
  • That's not the point... my point is that 100ms is very low latency, and if you have a requirement of latency this low you are doing something very specialized. I'm suggesting you revisit your requirements so you can get a reasonable solution. Even most of the pro stuff doesn't get that low: vtx.co.uk/product.aspx?id=205 And, that's not over the internet.
    – Brad
    Jan 20, 2014 at 5:09
  • @Brad VLC streaming is very slow and the latency I get is very high, which is why I don't want to use it.
    – user573014
    Jan 20, 2014 at 6:19
  • trac.ffmpeg.org/wiki/StreamingGuide has a section on latency.
    – rogerdpack
    Apr 7, 2014 at 21:39

2 Answers

16

Well, for a really low latency streaming scenario, you could try NTSC. Its latency can be under 63us (microseconds) ideally.

For digital streaming with quality approaching NTSC and a 40ms latency budget see rsaxvc's answer at 120hz. If you need Over The Air streaming, this is the best low-latency option I've seen and it's very well thought out and the resolution will scale with hardware capability.

If you mean digital streaming and you want good compression ratios, i.e. 1080p over wifi, then you are out of luck if you want less than 100ms of latency with today's commodity hardware, because in order for a compression algorithm to give a good compression ratio, it needs a lot of context. For example, MPEG-1 used 12 frames in an ipbbpbbpbbpb GOP (group of pictures) arrangement, where i is an 'intra' frame which is effectively a jpeg still, p is a predictive frame which encodes some movements between i and p frames, and b frames encode some spot fixups where the prediction didn't work very well.

Anyhow, 12 frames even at 60fps is still 200ms, so that's 200ms just to capture the data, then some time to encode it, then some time to transmit it, then some time to decode it, then some time to buffer the audio so the soundcard doesn't run out of data while the CPU is sending a new block to the DMA memory region, and at the same time 2-3 frames of video need to be queued up to send to the video display in order to prevent tearing on a digital display. So really there's a minimum of 15 frames or 250ms, plus latency incurred in transmission.

NTSC doesn't have such latencies because it's transmitted analog with the only 'compression' being two sneaky tricks. The first is interlacing, where only half of each frame is transmitted each time as alternate rows, even on one frame, odd on the next. The second is colour space compression by using 3 black and white pixels plus its phase discrimination to determine what colour is shown, so colour is transmitted at 1/3 the bandwidth of the brightness (luma) signal. Cool eh?

And I guess you could say that the audio has a sort of 'compression' as well, in that automatic gain control could be used to make a 20dB analog audio signal appear to provide closer to a 60dB experience by blasting our ears out of our heads at commercials due to the AGC jacking up the volume during the 2-3 seconds of silence between the show and the commercial. Later when we got higher fidelity audio circuits, commercials were actually broadcast louder than shows, but that was just their way of providing the same impact as the older TVs had given the advertisers.
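The frame-count arithmetic in the GOP discussion above can be checked with a few lines of Python. This is only a sketch: the function name and constants are made up for illustration, and it models nothing beyond "frames divided by frame rate", using the upper end of the 2-3 queued display frames.

```python
def frames_to_ms(frames: int, fps: float) -> float:
    """Time spent waiting for `frames` frames at `fps` frames per second, in ms."""
    return frames * 1000.0 / fps


GOP_FRAMES = 12          # the ipbbpbbpbbpb arrangement (12 frames)
DISPLAY_QUEUE = 3        # upper end of the 2-3 frames queued for the display
FPS = 60

gop_ms = frames_to_ms(GOP_FRAMES, FPS)
total_ms = frames_to_ms(GOP_FRAMES + DISPLAY_QUEUE, FPS)

print(f"GOP capture alone: {gop_ms:.0f} ms")
print(f"GOP + display queue: {total_ms:.0f} ms (before encode/transmit/decode)")
```

Shrinking the GOP (or dropping B-frames entirely) is the only term here that the encoder settings can change; the frame rate and the display queue are fixed by the hardware.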

This walk down memory lane brought to you by Nostalgia (tm). Buy Nostalgia brand toilet soap! ;-)

Here's the best I've achieved under Ubuntu 18.04 with stock ffmpeg and mpv. This requires a 3rd gen Intel Core processor or later. See the ffmpeg site for directions on using NVidia hardware encoding instead.

ffmpeg -f x11grab -s 1920x1080 -framerate 60 -i :0.0 \
  -vaapi_device /dev/dri/renderD128 \
  -vf 'format=nv12,hwupload,scale_vaapi=w=1920:h=1080' \
  -c:v h264_vaapi -qp:v 26 -bf 0 -tune zerolatency -f mpegts \
  udp://$HOST_IP:12345

And then on the Media box:

mpv --no-cache --untimed --no-demuxer-thread --video-sync=audio \
  --vd-lavc-threads=1 udp://$HOST_IP:12345 

This achieves about 250ms latency for 1080p@60hz at around 3Mbps, which is ok for streaming shows over wifi. mpv can adjust for lip sync (CTRL +- during play). It's tolerable for streaming desktop mouse/keyboard interactions for media control, but it's unusable for real-time gaming (see NVidia Shield, Google Stadia for remote gaming)

One other thing: LCD/OLED/Plasma TVs, and some LCD monitors have Frame Interpolation, either via de-interlacing, or via SmoothVision (the "Soap Opera Effect"). This processing adds input lag. You can usually turn it off in the display's settings, or by connecting to the "PC" or "Console" input port if the display has a port marked that way. Some displays have a way to rename the inputs. In that case, selecting "PC" or "Console" may reduce the input lag, but you may notice colour banding, flickering, etc as a result of the extra processing being turned off.

CRT monitors have effectively zero input lag. But you'll get baked with ionizing radiation. Pick your poison.

  • It's pretty common to skip B-frames in order to achieve lower latency. Newer systems skip I and P frames and instead use a hybrid I+P format, where there's a column of I-blocks that moves around over time.
    – rsaxvc
    Apr 6, 2016 at 0:39
  • True, but I said if "you want any kind of efficient compression" and throwing out B and P frames basically turns it into an MJPEG stream. Also I was trying to paint a picture of all the things which contribute to latency, most of which are unavoidable... I forgot to also mention input lag of the display itself which can also be a pretty significant portion of the OP's highly optimistic "<100ms" latency budget. VNC can manage under 100ms, but that's not "video streaming".
    – Wil
    Jun 8, 2018 at 19:57
  • mplayer playing a UDP pipe of intraframe-refresh frames easily beats 100ms over WiFi: 20ms or less for input lag, 60ms for compression, transport and decompression, and 20ms or less for output lag.
    – rsaxvc
    Jun 22, 2018 at 14:13
  • Done. I didn't want to post it originally because it's pretty niche.
    – rsaxvc
    Jul 13, 2018 at 3:14
  • Wow, thanks a lot for this, it's helping me get closer to my goal (think gamestream for emulators). Our framebuffer is very small (think 640x480 at most), so bandwidth is not really a problem. This is my current profile: dpaste.com/0Y3T6M2 And current results: streamable.com/cftfv Still not there but closer. I still have problems with the player side though, but I hope I'll get there.
    – Radius
    May 26, 2019 at 3:53
15

The problem with traditional media players like VLC, ffmpeg, and to some extent, mplayer, is that they'll try to play at a consistent framerate, and this requires some buffering, which kills the latency target. The alternative is to render the incoming video as fast as you can, and not care about anything else.

@genpfault and I made a custom UDP protocol, planned for flying RC cars and quads. It targets low latency at the expense of pretty much everything else (resolution, bitrate, packet rate, compression efficiency). At smaller resolutions, we got it to run over 115200 baud UART and XBEE, but video under those restrictions was not as useful as we'd hoped. Today I'm testing in a 320x240 configuration, running on a laptop (Intel i5-2540M), since I no longer have the original setup.

You need to plan your latency budget, here's where I spent mine:

  1. Acquisition - We picked 125FPS PS3 Eye cameras, so our latency here is at most a little over 8ms. 'Smarter' cameras which do compression onboard (either h264 or MJPEG) are to be avoided. Also, if your camera has any sort of auto-exposure timing, you'll need to disable it to lock the camera at its fastest framerate, or provide ample lighting (today, my builtin webcam is only doing 8 FPS due to AE).
  2. Conversion - If possible, have the camera emit frames in a format you can compress directly (generally YUV, which the Eye supports natively). Then you can skip this step, but I'm spending 0.1ms here.
  3. Encoding - We used a specially tuned H.264. It takes ~2.5ms, and requires no buffering of future frames, at the cost of compression ratio.
  4. Transport - We used UDP over WiFi, <5ms when working correctly without a bunch of other radios interfering.
  5. Decoding - This is pretty much limited by the receiver's CPU. The encoder can help by sending work that is multithread decodable. This is usually faster than encode. ~1.5ms today.
  6. Conversion - Your decoder might do this step for you, but generally encoders/decoders speak YUV, and displays speak RGB, and someone has to convert between them. 0.1ms on my laptop.
  7. Display - Without VSYNC, a 60 FPS monitor has latency of up to ~17ms, plus some LCD latency, maybe 6ms? It really depends on the display and I'm not sure which panel this laptop has.

The total comes to: 40.2ms.
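For anyone planning their own budget, the list above can be kept as a small table and summed. The sketch below just restates the numbers from the list; the dictionary keys and the function name are illustrative, and the "<5" transport figure is taken at its upper bound.

```python
# Per-stage latency in milliseconds, copied from the list above.
LATENCY_BUDGET_MS = {
    "acquisition": 8.0,       # 125 FPS camera: 1000 / 125 = 8 ms per frame
    "conversion_in": 0.1,     # camera format -> encoder format
    "encoding": 2.5,
    "transport": 5.0,         # "<5" taken at its upper bound
    "decoding": 1.5,
    "conversion_out": 0.1,    # YUV -> RGB
    "display_refresh": 17.0,  # 60 FPS without VSYNC: up to 1000 / 60 ~= 16.7 ms
    "lcd_response": 6.0,      # the answer's own guess for the panel
}


def total_latency_ms(budget: dict[str, float]) -> float:
    """Sum the per-stage latencies, rounded to 0.1 ms to hide float noise."""
    return round(sum(budget.values()), 1)


total = total_latency_ms(LATENCY_BUDGET_MS)
worst = max(LATENCY_BUDGET_MS, key=LATENCY_BUDGET_MS.get)
print(f"total: {total} ms, largest single stage: {worst}")
```

Laid out this way it is easy to see that the display, not the codec or the network, is the largest single entry in this particular budget.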

Encoding:

At the time, x264 was the best H.264 Annex B encoder we could find. We had to control for bitrate, slice-max-size, vbv-bufsize and vbv-maxrate. Start with the defaults for the "superfast" preset and the "zerolatency" tune, which will disable B-frames.

Additionally, intra-frame refresh is a must! Effectively this allows chopping up the normal 'I' frame and mingling it up with the following P-frames. Without this, you'll have 'bubbles' in the bitrate demand that will temporarily clog your transport, increasing latency.
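As a rough illustration of how those knobs fit together, here is a sketch that builds an x264 command line. The option names are the x264 CLI's own, but every value, the file names and the helper function are hypothetical and not the settings used in this answer. Sizing the VBV buffer to about one frame's worth of bits is a common low-latency starting point and is an assumption here.

```python
def x264_low_latency_args(input_path: str, output_path: str,
                          bitrate_kbps: int, fps: int,
                          max_nalu_bytes: int) -> list[str]:
    """Build an x264 argv list for a low-latency, intra-refresh encode (sketch)."""
    # Assumption: a VBV buffer of roughly one frame's worth of bits (in kbit).
    vbv_bufsize_kbit = max(1, round(bitrate_kbps / fps))
    return [
        "x264",
        "--preset", "superfast",
        "--tune", "zerolatency",        # disables B-frames and lookahead
        "--intra-refresh",              # spread the I-frame across P-frames
        "--bitrate", str(bitrate_kbps),
        "--vbv-maxrate", str(bitrate_kbps),
        "--vbv-bufsize", str(vbv_bufsize_kbit),
        "--slice-max-size", str(max_nalu_bytes),  # cap each slice NALU in bytes
        "-o", output_path,
        input_path,
    ]


# Hypothetical values: 1000 kbit/s, 125 FPS, slices capped at 1400 bytes.
args = x264_low_latency_args("capture.y4m", "out.264", 1000, 125, 1400)
print(" ".join(args))
```

The list can be passed to `subprocess.run(args)` on a machine with x264 installed; expect to tune the bitrate and buffer size against your own link.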

Encoding-Transport-Planning:

The encoder was tuned to generate UDP-sized H264 NALUs. This way, when a UDP packet was dropped, an entire H264 NALU was dropped, and we didn't have to resynchronize; the decoder just sort of...burped...and continued with some graphical corruption.
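The idea of one NALU per datagram can be sketched as follows. This is not the custom protocol described above (its wire format isn't given here), just a minimal illustration: split an Annex B byte stream on its start codes and send each NALU as its own UDP packet. The 1400-byte payload limit and both function names are assumptions.

```python
import socket

START_CODE = b"\x00\x00\x01"   # Annex B start code (4-byte form adds a leading 0x00)
MAX_PAYLOAD = 1400             # assumed safe UDP payload size, below a 1500-byte MTU


def split_annexb(stream: bytes) -> list[bytes]:
    """Split an H.264 Annex B byte stream into NAL units without start codes."""
    # Everything before the first start code is padding, so drop chunk 0.
    chunks = stream.split(START_CODE)[1:]
    # A NALU never ends in 0x00, so trailing zeros belong to the next
    # 4-byte start code (or are padding) and can be stripped.
    nalus = (chunk.rstrip(b"\x00") for chunk in chunks)
    return [nalu for nalu in nalus if nalu]


def send_nalus(sock, addr, nalus, max_payload: int = MAX_PAYLOAD) -> int:
    """Send each NALU as exactly one datagram; refuse NALUs that would not fit."""
    for nalu in nalus:
        if len(nalu) > max_payload:
            raise ValueError(f"NALU of {len(nalu)} bytes exceeds {max_payload}")
        sock.sendto(nalu, addr)
    return len(nalus)


if __name__ == "__main__":
    # Toy stream with three fake NALUs; the address is a placeholder.
    demo = (b"\x00\x00\x00\x01\x67\x42" + b"\x00\x00\x01\x68\xce"
            + b"\x00\x00\x00\x01\x65\x88\x84")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        print(send_nalus(s, ("127.0.0.1", 12345), split_annexb(demo)))
```

Because every datagram holds a whole NALU, a lost packet costs exactly one NALU and the receiver never has to hunt for the next start code in a partial buffer.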

Final Results 320x240

It's...faster than I can measure reliably with a cell phone pointed at a camera pointed at my laptop. Compression ratio: a raw 320x240 frame at 2 bytes per pixel is 150kB, compressed down to a little over 3kB/frame (roughly 50:1).
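The frame-size figures above work out as follows. This sketch assumes "kB" means 1024 bytes (which is what makes 320x240x2 come to exactly 150) and treats "a little over 3kB" as 3 kB, so the ratio is an approximation.

```python
def raw_frame_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


raw = raw_frame_bytes(320, 240, 2)   # 2 bytes per pixel, as in the figure above
compressed = 3 * 1024                # "a little over 3kB", taken as 3 kB (assumption)

print(f"raw frame: {raw} bytes = {raw / 1024:.0f} kB")
print(f"compression ratio: about {raw / compressed:.0f}:1")
```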

  • Your panel input lag budget is extremely optimistic... not impossible, but few people will experience anything below 9ms and typical is much much higher, even into the hundreds of ms. But I'll adjust my answer :-)
    – Wil
    Jul 16, 2018 at 22:57
  • @Wil, could you send me an example of a 'hundreds of ms' panel? I thought most panels didn't do any buffering (or at most a frame for rescaling) and would then be limited by framerate.
    – rsaxvc
    Jul 16, 2020 at 2:32
  • 3-year-old TVs with SmoothVision are a good example. Feed one a 24FPS source and it will use 3 buffered frames to interpolate 60FPS (or higher) output. That results in 125ms of input lag from the interpolation engine alone. SmoothVision (Soap Opera Effect) is on by default with most TVs. Many people are using TVs as monitors. Recently (last 3 years) awareness is on the rise among gamers. TV manufacturers have begun adding a low-latency 'game' mode (ALLM) and some even support FreeSync to get <8ms input lag.
    – Wil
    Aug 5, 2020 at 19:09
  • As processing resources have gotten cheaper and 8-bit panels (now dithering to 10 and 12 bit per RGB) it's become possible to perform frame processing in less time and in some cases even eliminate a need for buffering more than a few lines of raster, which is how some high-end panels with FreeSync can now start to claim <2ms input lag with interpolation disabled. That will become more common over time. Today, those low-latency panels tend to have poor colour accuracy, sadly.
    – Wil
    Aug 5, 2020 at 19:25
  • Here's a list of panel response times: displaylag.com/display-database It's easy to get much worse response times than this by feeding a <60hz input to the devices that have interpolation enabled, or a 60hz input with sequential identical frames, i.e. a game that's fallen below 60fps, a movie playing at 24fps, a console emulator that's locked at 30fps, etc.
    – Wil
    Aug 5, 2020 at 20:37
