Simulating HTTP Live Streaming (HLS) – A way to ensure video playback works great

July 25th, 2011 by Ludo Antonov

Hulu Plus launched out of beta in November of last year, and it’s currently available on a number of mobile and living-room connected devices. One of the core delivery protocols that Hulu relies on for video streaming is HTTP Live Streaming (HLS). The protocol is documented by Apple Inc. in an IETF draft, and HLS is now widely used as a video delivery vehicle on many of the major devices on the market, including the iPhone/iPad, Sony PlayStation®3, Roku, Android 3.0, and more.

In a nutshell, the HLS protocol delivers video over HTTP via a playlist of small segments that are made available in a variety of bitrates from one or more delivery servers. This allows the playback engine to switch, on a segment-by-segment basis, between different bitrates and content delivery networks (CDNs), which helps compensate for some of the network variance and infrastructure failures that might occur during playback.
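To make that concrete, here is a simplified, hypothetical pair of playlists; the host, path, and segment names are made up for illustration. The master playlist advertises the available bitrate variants, and each variant playlist lists the individual segments the player will fetch:

#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=650000
http://cdn1.example.com/show/650k.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1500000
http://cdn1.example.com/show/1500k.m3u8

master.m3u8 – master playlist advertising two bitrate variants (illustrative)

#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:1
#EXTINF:10,
http://cdn1.example.com/show/650k_s1.ts
#EXTINF:10,
http://cdn1.example.com/show/650k_s2.ts
#EXT-X-ENDLIST

650k.m3u8 – variant playlist listing individual 10-second segments (illustrative)

Because every segment is an independent HTTP request, the player is free to switch to a different variant playlist, or to the same content on a different CDN, at any segment boundary.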

But what determines a quality viewing experience and uninterrupted video playback? That largely depends on the client’s playback engine. Different device platforms usually have different playback engines, each with its own implementation of the HLS protocol. These implementations often differ not only in how completely they support HLS, but also in their streaming and bitrate-switching heuristics, such as how they behave under different network conditions. The differences are especially noticeable during sudden changes in network stability, when playing content on low-bandwidth networks, and during partial failures in the video delivery infrastructure.

Hulu reaches millions of consumers every month, so we’re exposed to a wide spectrum of network conditions, and hitting edge playback scenarios is not uncommon. Because we provide a client-facing application, maintaining the best viewing experience possible, even in suboptimal conditions, is ultimately the responsibility of the Hulu Plus app. When a user reports a problem with playback, we need to be able to simulate the user’s environment (network conditions, device, content played) so we can determine the root cause of the problem and find out whether there is a viable solution for it. Sometimes this results in discovering issues with the playback engines; unless we can reproduce these sorts of problems, it would be virtually impossible to hand them off to a device manufacturer to fix. Other times, the right solution is to make the UI more forgiving and the app smarter about recovering from unexpected failures. Either way, having the ability to reproduce the troubled scenarios is key to taking appropriate action.

To address the need to reproduce these kinds of scenarios, we recently started working on an infrastructure service called DripLS (short for Drip LiveStreaming). The purpose of the service is to traffic-shape a video stream in accordance with a set of rules. It’s an attempt to simulate real-world network conditions to help ensure that clients and streaming engines degrade gracefully and deliver the best viewing experience possible. DripLS acts as an intermediary between the server hosting HLS video segments and the HLS client, caching segments that need to be traffic-shaped and rewriting the m3u8 playlists that the HLS clients receive. The basic flow of the service is outlined in Figure 1.

Figure 1. DripLS workflow

For example, DripLS allows us to simulate a sudden network drop that causes video playback to “stall.” It can also simulate missing segments that cause a playback “skip,” or a mid-stream CDN failure that exercises CDN fallback scenarios. It’s also capable of serving video files as they would be transmitted on a low-bandwidth or “lossy” network. DripLS has almost countless useful applications for validating video playback, and these are just a few of the ones we’ve been able to capture and experiment with since we built the service. The results have already helped us make streaming more reliable and resilient to failures, and make our client-side monitoring infrastructure more aware of these problems when they occur in production.

How does it work?

DripLS appears as a normal HLS endpoint that can be used directly by any HLS client, so any device that supports HTTP Live Streaming can use the service without additional provisioning. To achieve the desired traffic shaping, the DripLS URL carries a set of rules in its query string that control how the incoming stream will be shaped. The DripLS URL has the following format:

http://<dripls-host>/master.m3u8?authkey=<authkey>&cid=<cid>&[r=<rule-expression>~<action>,…]

A sample of what an actual DripLS URL might look like:

http://<dripls-host>/master.m3u8?authkey=<authkey>&cid=<cid>&r=650k~e404,1500k.*~e500,cdn1.*.s2~net10loss1

In the example above, the transmitted stream, denoted by cid (content id), is instructed to return HTTP error code 404 for the variant playlist encoded at the 650 kbit/s bitrate, and HTTP error code 500 for all video segment files in the 1500 kbit/s bitrate playlist. Additionally, segment 2 from CDN 1 in every variant bitrate playlist will be transmitted back at 10 kb/s with 1% packet loss.
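Building these URLs by hand gets tedious, so a test harness can assemble them programmatically. The short Python sketch below shows one way to do that; the host name, authkey, and cid values are placeholders, and the helper itself is our own illustration rather than part of DripLS.

def build_dripls_url(host, authkey, cid, rules):
    """Assemble a DripLS master playlist URL from (rule-expression, action) pairs."""
    # Each rule is written as <rule-expression>~<action>; rules are joined by commas.
    rule_param = ",".join("{0}~{1}".format(expr, action) for expr, action in rules)
    return "http://{0}/master.m3u8?authkey={1}&cid={2}&r={3}".format(
        host, authkey, cid, rule_param)

# Reproduce the example above: 404 the 650k variant playlist, 500 every segment
# of the 1500k variant, and shape segment 2 from CDN 1 to 10 kb/s with 1% loss.
url = build_dripls_url("dripls.example.com", "my-auth-key", "12345", [
    ("650k", "e404"),
    ("1500k.*", "e500"),
    ("cdn1.*.s2", "net10loss1"),
])
print(url)

build_dripls_url.py – hypothetical helper for composing DripLS URLs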

DripLS supports two classes of rules: e<> and net<>. Matches from the e<> class result in direct rewrites of URLs in the HLS m3u8 playlists to URLs that return the specified HTTP error code. Matches from the net<> class are a little more involved and result in caching the matched segments and transmitting them under the network conditions the rule specifies.
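Conceptually, DripLS walks every playlist and segment URL it is about to serve, tests it against each rule expression, and rewrites the matches accordingly. The Python sketch below illustrates just the matching step; the segment identifiers and the helper are our own simplification for illustration, not DripLS’s actual code.

import re

# Hypothetical identifiers of the form <cdn>.<bitrate>.s<segment-number>.
RULES = [
    ("650k", "e404"),             # e<> action: rewrite matches to a URL returning HTTP 404
    ("cdn1.*.s2", "net10loss1"),  # net<> action: cache and serve matches through a shaped port
]

def first_matching_action(identifier):
    """Return the action of the first rule whose expression matches the identifier."""
    for expr, action in RULES:
        if re.search(expr, identifier):
            return action
    return None  # no rule matched: leave the original CDN URL untouched

for item in ["cdn1.650k.s1", "cdn1.1500k.s2", "cdn2.1500k.s3"]:
    print("{0} -> {1}".format(item, first_matching_action(item)))

match_rules.py – minimal sketch of rule matching (illustrative)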

Technology

DripLS uses a combination of technologies to achieve the desired traffic-shaping effect. Under the hood, the current setup consists of two nginx sites that proxy to each other on different ports and ultimately forward to a CherryPy server that handles the business logic for DripLS (all on a single machine). A segment request always comes in through the first nginx site, which listens on port 80; that site proxies to the second nginx site on an arbitrary port that has already been shaped for the segment, which in turn forwards to the CherryPy instance. This setup is needed because, to attain the desired traffic shaping, DripLS makes use of tc (traffic control), netem (a Linux kernel module), and iptables (network rule chaining), for which the smallest level of granularity is a port.

The basic architecture of the service is shown in Figure 2.

Figure 2: DripLS architecture
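As a rough sketch of the proxy chain in Figure 2, the two nginx sites might be configured along the following lines. The port number, paths, and upstream addresses here are placeholders chosen for illustration, not DripLS’s actual configuration.

# Front site: all client requests arrive on port 80.
server {
    listen 80;

    # Playlist requests go to the CherryPy application, which rewrites the
    # m3u8 and assigns each shaped segment to a pre-shaped port.
    location / {
        proxy_pass http://127.0.0.1:8080;
    }

    # A segment whose URL was rewritten to a shaped port (20001 here) is
    # proxied to the second site listening on that port.
    location /shaped/20001/ {
        proxy_pass http://127.0.0.1:20001;
    }
}

# Second site: one listener per pre-shaped port; the tc/netem/iptables rules
# applied to that port govern how the segment's bytes leave the machine.
server {
    listen 20001;
    location / {
        proxy_pass http://127.0.0.1:8080;
    }
}

dripls_nginx.conf – hypothetical sketch of the two nginx sites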

Every time an HLS segment needs to be traffic-shaped, a dedicated port is reserved for transmitting that segment to the client. The port is shaped via a small custom shell script (see set_ts_lo.sh below), in accordance with the traffic-shaping rule that the segment matched. The segment’s URL is then rewritten so that the front nginx site can proxy_pass the request to the second nginx site, which accepts it on the already-shaped port. When transmission of the segment’s data starts, netem and iptables ensure that it adheres to the network rules already applied to the port.

#!/bin/bash

# set_ts_lo.sh <port> <speed-limit-in-kbps> <packet-loss-percent>
# Shapes outgoing traffic on a local port using tc (htb), netem and iptables.

E_BADARGS=65

if [ $# -ne 3 ]
then
  echo "Usage: `basename $0` <port> <speed-limit-in-kbps> <packet-loss-percent>"
  exit $E_BADARGS
fi

# tc class ids are hexadecimal, so convert the port number for use as a class id
hexport=$(echo "obase=16; $1" | bc)
# Handle for the netem qdisc: the port number with "2" appended
netem_loss_handle="${1}2"

# Add main classes
/sbin/tc qdisc add dev lo root handle 1: htb
/sbin/tc class add dev lo parent 1: classid 1:1 htb rate 1000000kbps

echo "------- Remove any previous rule"
# Delete any old rules (if no rules exist yet, failures in these commands are expected)
/sbin/tc qdisc del dev lo parent 1:$hexport handle $netem_loss_handle
/sbin/tc filter del dev lo parent 1:0 prio $1 protocol ip handle $1 fw flowid 1:$hexport
/sbin/tc class del dev lo parent 1:1 classid 1:$hexport
/sbin/iptables -D OUTPUT -t mangle -p tcp --sport $1 -j MARK --set-mark $1

echo "------- Adding rule"
# Add the new rule: rate-limit the class, attach a netem qdisc for packet loss,
# and mark outgoing packets from the port so the tc filter routes them to the class
/sbin/tc class add dev lo parent 1:1 classid 1:$hexport htb rate $2kbps ceil $2kbps prio $1
/sbin/tc filter add dev lo parent 1:0 prio $1 protocol ip handle $1 fw flowid 1:$hexport
/sbin/tc qdisc add dev lo parent 1:$hexport handle $netem_loss_handle: netem loss $3%
/sbin/iptables -A OUTPUT -t mangle -p tcp --sport $1 -j MARK --set-mark $1
set_ts_lo.sh – Script to simplify interaction with netem, tc, iptables
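For example, to apply the net10loss1 action from the earlier URL, the port reserved for that segment (say 20001, a made-up value here) would be shaped before the segment is served:

# Limit local port 20001 to 10 kbps with 1% packet loss (run as root).
./set_ts_lo.sh 20001 10 1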

Although DripLS can be used as a remote cloud service, running the service and the device on the same network helps avoid “last mile” network noise being added on top of the alterations DripLS applies. That said, running DripLS on a remote network has yielded consistent results for us so far. Simulations via DripLS are an alternative to hardware-based testing, which is a common way to validate behavior under network alterations. The DripLS approach has several key advantages: it allows multiple developers to use the service at once; it produces precise and consistent simulations; it makes it easy to test a variety of scenarios; it requires little to no setup; it can be used on any network; and, last but not least, it makes it easy to share pre-shaped streams with partners.

Future

Currently, we’re using DripLS mainly for manual testing and ad-hoc reproduction of some interesting playback scenarios. When we receive major device firmware upgrades, we use the service to re-verify the basics and the more common edge-case scenarios. We also want to expand DripLS’s capabilities with support for additional delivery protocols in the future. We’re in the process of deeply integrating our device tests with DripLS, and of arriving at a more standardized set of acid tests that we can execute across a variety of devices. These efforts will help us establish confidence that the playback engine on a device—and the Hulu app running on top of it—can cope with a variety of network conditions and playback scenarios.

Use it, and make it better

DripLS has been so useful for us that we decided to share it with the world as an open-source tool. You can find DripLS on GitHub at https://github.com/hulu/DripLS — please feel free to fork, comment, improve, fix, and repurpose as you see fit. We also welcome your comments at the discussion group at http://groups.google.com/group/dripls-dev — please let us know if you use DripLS, how you like it, and what changes you’d like to see.

Ludo Antonov, a software engineer, is building things that make your brain go mushi-mush.
* Header image by Branden Williams, licensed CC-BY 2.0 (see http://www.flickr.com/photos/captbrando/3336992646)

 

Comments
  • Luke Leatherider says:

    The technical side of things (as described above) is all very interesting, but to the end user it means little. For example, when one changes Playback settings, they appear to be saved, yet they don’t actually take effect when playback is initiated in a browser. I say this because our household is on a limited GB-per-month plan with WildBlue. I prefer “low” quality playback partly because I’m viewing on a relatively small screen and barely discern a difference, but primarily to save bandwidth usage during the month. What gives? I don’t think I am doing something incorrectly (the options are fairly straightforward).

    The second problem is the playback quality of advertising. I can understand that a given advertiser would want to display his product or service in the best light — HD quality, etc., for a large screen. But what really happens on this end is that because the files are large (and thus bandwidth hogs) they sputter along, buffering as they go, and nothing of actual importance or continuity comes across. Bad for the advertiser, frustrating for the end user. Having the option to receive these ads in “Low” quality would be a big plus, not to mention retaining the attention of the end user.

    Reminder: not everyone has the biggest and the best out here. Our connections aren’t the fastest nor our equipment the newest and latest. I was taught in web design classes to consider the user with the lowest denominator, not the highest. I think that still holds true, don’t you?

    Best Regards,

    Lucas Leatherider leatherider@wildblue.net

  • Ludo says:

    Hey Luke, this is a great comment about the fact that video streaming in general should be more efficient and aware of the end consumer’s limitations. Indeed, I agree with you things can be done better on an industry scale. Hulu is just a consumer of the video technologies that every other video site makes use of as well.

    The purpose of the article, and the technology behind it (DripLS) is exactly that – to make sure that the tools for experimenting with different scenarios are readily available and easy to use.

    DripLS has helped Hulu quite a bit in terms of reproducing issues, and reporting back to the OEMs to improve the quality and smartness of their players.
