• Hulu
  • TV
  • Movies
  • More TV. On more devices.
Search
Hulu Tech Blog

At a glance: Hulu hits Rails Conf 2012

May 14th, 2012 by Andrew Carter

RailsConf 2012

RailsConf has always been one of my favorite conferences since I started going to it in 2007. I have been primarily a Rails developer since late 2006, long before coming to Hulu. I’ve seen it transition from a promising upstart to the platform it is today. The influence of Rails has been huge. The ideals of Rails – convention over configuration, DRY, test first, and agile are quickly becoming pervasive throughout software development.

As Hulu has been built with Ruby on Rails since the beginning, we decided to give back and sponsor this year’s RailsConf 2012 in Austin. We had seven Hulu employees at the conference. Our web site as well as several services are built with Ruby on Rails.

We gave a talk Building Asynchronous Communication Layer on our project for programatically managing devices like PlayStation 3 and Android mobile phones. The project allows us to do things like automate playback testing. The technology used to build the framework includes XMPP, JavaScript, and Ruby.

Here’s my daily summary of the events:

Day 1

The morning opened with David Heinemeier Hansson’s keynote. He talked about progress both for us as programmers and the Rails community. He stated that it’s ok to disrupt and to not get too comfortable, and emphasized that new programmers to the platform really need to learn even if it is hard or you make mistakes… I’d agree strongly with that. I’m not a fan of creating artificially easy on-ramps. I want to see Rails include anyone that wants to join but at the same time I expect that developers learn the right set of skills. Some of the talk was a bit preachy, but the sentiment of not settling and accepting change are good advice.

The next talk I went to was Sarah Mei’s presentation on Backbone.js. It was a very well done talk. Sarah is an excellent presenter striking a good balance between content and moving the audience through the topic. She gave a quick tour of what Backbone.js is about and did a nice job of relating it to the Rails structure. She did such a good job I was starting to think maybe I should look at this framework much closer as a solution for my own projects, but then at the very end she sort of torpedoed the entire talk saying that she probably wouldn’t use it anymore. I appreciate the honesty but it was kind of a surprise given the strong case she made the previous 45 minutes.

I went to Mark Bates’s talk on Coffee Script as I’m already a believer so this was not really new. It certainly reinforced how great Coffee Script is and I think Mark convinced at least one of my teammates to consider it seriously.

I hit the first major speed bump at Andy Maleh’s talk on Rails engines. He did little to convince me to look at Rails engines. His solutions using engines sounded more like hacks to me as I didn’t see a clear path to code reuse. It felt like trading one complexity for another.

John Bender’s talk on Progressive Enhancement for Mobile Web was next. John is active with JQuery Mobile. He talked a lot about the state of targeting mobile browsers and had some very particular scorn for Android (as he said the “new IE”). One thing I wish he would have talked about more is progressive design since he talked almost exclusively about segregating mobile from desktop browsers. I think it is time to be thinking mobile first with progressive scaling up. It’s disappointing to see separating a mobile site from the main site.

Finally, the closing keynote for the day was by Rich Hickey. Rich talked about simplifying but not from the programmer’s perspective. He had a lot of great points – we often design software to make our lives as programmers easy but not necessarily the user. The tone of his talk though came off a little sanctimonious as I would have liked a little more acknowledgement of the pragmatism that often leads to the decisions we make.

Day 2

Aaron Patterson presented the keynote in the morning. He didn’t try to compete with his over-the-top presentation from last year. Aaron was downright subdued compared to his previous talks and seemed to be encouraging a bit of a back to basics appeal, talking about normalizing things like the queue interface in Rail, etc.

The first talk I went to on the second day by Mike Moore on presenters and decorators. Mike gave a great talk with good takeaways. He did an excellent job breaking down the decorator, presenter, and mediator patterns. It was great timing for me as I’ve been looking at these very things for a project I’m doing now.

Next was Ilya Grigorik on making the web faster. This was a good overall talk on leveraging a number of tools and techniques. Since Ilya is a Google engineer, he was heavily biased to the Google tool kit, but his advice was valid despite what toolkit you might use. A key take-away was focusing on perceived load time for the user over most other performance metrics. It’s a really good point: the user’s experience trumps most anything else.

After lunch, I attended Will Leinweber’s talk on schemaless SQL in Postgres. There are some pretty amazing things coming to PostgreSQL, like the new key value hstore. They are also integrating V8 into the programming space of Postgres which opens up some new scenarios that compete directly with solutions like MongoDB. It’s early for much of this, but it’s a very forward looking approach for Postgres to be taking.

My colleague Steve Jang and I presented a short tour of our automation project for Hulu devices called Bender. We use XMPP, Ruby, and JavaScript to create a communication framework for controlling devices and running scripts. I think the talk went well, but we weren’t sure if the material would be useful to people or not. People had some great questions and so it was great to get a conversation going on projects we have here at Hulu.

Day 3

Yehuda Katz talked about The Next Five Years for Rails. Honestly, Yehuda’s talk should have been the day 3 keynote since I think it was the most honest and informative talk of the entire conference. Yehuda did a great job calling out what it is about Rails that attracted all of us to the platform in the first place. He said (and I agree) that the next big thing should be making Rails just as good for JSON API services as it is for HTML. It’s a mixed world more than ever; building web services whose clients are primarily mobile devices only is increasing.

Nick Quaranto had a great tour of Basecamp Next: Code Spelunking. It was actually nice to see that 37signals has some of the same kinds of tradeoffs in their software as everyone else. One of the big takeaways was to always be pragmatic about things like changing data stores. There were numerous things I want to investigate including all the cool JavaScript console tricks, using GitHub for API documentation (see 37signals BCX API), and the strong parameters gem.

Jared Ning had a nice overview of minitest. If you hadn’t seen it before, it was a good tour of what minitest is about and how it draws from both Test::Unit and RSpec. I’m already in the minitest camp so there wasn’t a lot new for me. I liked the explanation Jared used to talk about mocking and stubbing. I agree with his advice – use mocks and stubs sparingly and only after you test against the real thing.

Summary

Overall, it was another great RailsConf. As a developer that switches among multiple platforms and languages, it’s always great to dive into my favorite language: Ruby. Rails feels to me like it is continuing to mature. It’s moving a little slower now, but I think that’s to be expected. There is still lots opinions among the faithful and that’s healthy. As always, I look forward to apply what I’ve learned to our own projects.

It was great to share what we are doing with Ruby and Rails at Hulu. We would love to talk to anyone passionate about Ruby (as you can see from our list of jobs). I’m glad I got a chance to visit Austin, but I’m definitely looking forward to going back again when the conference returns to Portland.

Hackathon Spring 2012

April 23rd, 2012 by Ilya Haykinson

As we like to do every so often, a few weeks ago we held our hackathon. For the uninitiated, a hackathon is a chance for developers to take a break from their regular duties and work on something new, something unexpected, something experimental. For this hackathon, most of our development team in LA went off-site, grouped around makeshift workspaces, and hacked on a bunch of different projects in small teams. Our Seattle team also formed ad-hoc groups to write some new code. This time around we chose to spend a Thursday and a Friday working on our projects, and then came together early the following week to share our results with each other. In all, there were 18 different projects that made it far enough to be presented. They ran the gamut from setting up an NFS/HDFS proxy to a keg bot that allows you to remotely pour yourself some beer. Below are a few projects that we worked on.

Blinker

Yupeng and Eugene got curious and wanted to create a visualization of Hulu streams across the US. The result is Blinker, an internal website that plots points on a map corresponding to IP addresses beginning video streams.

To build this, they put up a CherryPy-based web service that listens for UDP unicasts containing rough geo-location data. A quick modification to our stream control service allowed it to send data for each request to the Blinker service. A web page embeds a Google Map, and uses WebSockets (and ws4py on the backend) to continually update data.

 

ShowGraph Hang and Feng decided to spend their hackathon looking at graphical representations of show similarity. They used our recommendation system’s show similarity data to perform multidimensional scaling and clustering, resulting in a zoomable chart that plots shows in similarity clusters.

The scaling and clustering was built using R, and the front-end was developed using processing.js.

Parley Fed up with the complexity and cost of our current phone conferencing system, David, Sherin and Ilya came up with Parley – Hulu’s phone conference room system. Using Django and the Twilio API, they built a service that lets Hulu employees create conference rooms with access codes. Callers can then call into an access number and participate in a phone conference. Meanwhile, the employee that created the room can see a log of room accesses and optionally invite people by typing in their phone number (or selecting them from the company contact list) and having the system call them up with an offer to join a phone conference in progress.

Cage One of our teams in Seattle works on software for connected devices – gaming consoles, set-top boxes, internet-connected televisions, and the like. Since these are often fairly closed platforms, debugging and testing can be a challenge, with some validation steps requiring a human to eyeball a screen and see if it’s correctly displaying an application. To help humans with this task, Steve and Dallas built Cage – a system that uses an Android phone and a backend service to capture screenshots of an application at various stages of testing.



When a connected device application runs, it displays a QR code describing the device and the test being performed. The Android app uses the camera to capture the barcode, and then takes a picture of the application as it performs various tasks. These photos are then sent to a service which displays the photo that was taken alongside an “expected results” picture. This will allow us to run some validation tests unattended, and then have humans verify the outcomes in bulk.

DJ Bot Unsatisfied with a quiet environment, Scott decided to build a bot to let us all play DJ. Scott took Alain Gilbert‘s code for creating bots that talk to the turntable.fm API, tweaked it a bit and integrated it into our node.js-based bot framework. Since our bot sits in all the channels in our HipChat chatrooms, we can now issue control a “DJ session” by issuing commands like this:

!dj start             starts playing !dj stop              stops playing !dj skip              skips the current song !dj info              shows the curent song info !dj history           shows playlist history !dj search            search for songs, artists, albums !dj q list            shows songs in your dj queue !dj q clear           removes all songs in your dj queue !dj q add             adds song as next in your dj queue, where is from search output !dj q rm              removes song at from your dj queue !dj q mv              moves a song index index in your queue

And the bot commands a turntable session appropriately. Just add a big speaker, and we’re (collaboratively) grooving.

The Search for the Perfect Stream: Hulu’s New Quality of Service Portal

February 21st, 2012 by Josh Siegel

At Hulu, quality is at the heart of everything we do. It’s right at the top of our team credo: “What defines Hulu”. Last year we formed a new Quality of Service team, dedicated to ensuring our video playback always looks great across every device on every Internet connection. This is especially challenging when you consider how many potential points of failure exist in our video publishing pipeline and request chain:

  • File integrity: Files delivered from our content providers are transcoded and uploaded to our Content Delivery Networks (CDNs). During this process, they are moved multiple times, risking corruption and forcing us to balance the overhead of running MD5 hash checks against the throughput of the pipeline.
  • CDN cache: Once on the CDN, cache misses force the stream to be served from the origin, introducing intra-CDN network latency and slowing end-user delivery.
  • Network routing: Depending on the transit and peering relationships established by the CDN, the video may be routed inefficiently over the Internet to the end user.
  • ISP congestion: Once the video reaches the ISP network, it may encounter congestion in the last mile, causing added latency and dropped packets.
  • Video playback: Once the client receives and buffers the video, the local CPU is taxed both by the decryption and decoding overhead necessary to render the video in the player, often resulting in dropped frames.

Oh, and you have to add to this the complexity of serving over a variety of protocols: RTMPE, HLS, HTTP progressive, etc. Each requires specific configuration, such as optimizing our Flash Media Server (FMS) API interactions or ensuring a device player is sufficiently tolerant to handle a corrupted HLS video chunk. Any of the problems listed above can lead to user visible playback errors. Our first goal was to create a set of dashboards that give us insight into the errors our users are experiencing and detailed diagnostic information to address them. To this end, we recently launched a new internal Quality of Service Portal, shown below. The portal tracks key quality metrics that impact the streaming experience for users, breaking them down by platform, region, and timeframe, and enabling us to track them over time. These include:

  • Rebuffering rates: The percent of sessions that stall because the player’s buffer has under-run, leaving the user to watch a spinning circle icon instead of their video.
  • Incorrect ends: The percent of sessions in which the video stops unexpectedly, requiring a user to re-start the stream in order to continue watching their video.
  • Skip sessions: The percent of sessions in which the video skips ahead a few seconds unexpectedly.

We recognized early on that if we wanted to ensure a great streaming video experience, we had to deepen our relationships with our CDN partners. The portal enables these partners to log in externally and pull the graphs themselves. But insight alone is rarely sufficient to address problems that arise. We also introduced functionality to make the data actionable. When our CDN partners discover an anomaly in performance, they navigate to the “Logs” tab and pull records with detailed session data. This helps them isolate issues in specific hosts and improve network performance.

Architecturally, the primary challenge in building this system was scaling our data ingest capacity and backend data aggregation capabilities. The players built into each of our device endpoints (Flash in the browser, iOS, PS3, Xbox, etc) periodically send “beacons” to our QoS front ends, reporting a wide variety of statistics in a common JSON format. Because most of our usage comes from the US, there’s a large delta between our baseline and peak usage throughout the day. To accommodate changing traffic patterns, we use virtual machines to create on-demand ingress capacity, scaling up our first tier dynamically as needed. The first tier parses the beacons, validates structure and content, and filters out bad records. It then forwards the data to our second tier, comprised of MongoDB instances. Each beacon has a GUID which is modded to distribute load evenly across this persistence layer. Once written to memory, the second tier runs a periodic batch process using Mongo’s native mapreduce framework to aggregate the session data across a number of keys that ultimately enable low latency queries from the portal UI. The aggregated data is then pushed to a third MongoDB tier that persist it for serving to the portal. See this diagram for a high level visual representation:

Now that the initial version of the portal is in production, our next step is to enable greater insight at a finer level of granularity. We plan to add new metrics such as detailed player load latency, average bitrate, and CDN failover. We believe with a continued focus on quality of service, we’ll see significant improvement in our video delivery in 2012.

Joshua Siegel is Senior Program Manager for Hulu’s Content Platform and Video Distribution, guiding development of the technology that gets content from our Media Partners to our users.

Greg Femec is lead engineer on our Quality of Service initiatives in addition to launching Hulu Latino and working on third party authentication services.

An Algorithm for Unique User Prediction

January 18th, 2012 by Tony Zhang

One of the reasons I love working on advertising at Hulu is that many of the problems we are trying to solve have an algorithmic flavor. The problem of predicting how many unique users a given advertising campaign will reach lends itself to an algorithmic approach. More specifically: if we are given two factors – the number of ad views the advertising campaign is scheduled to deliver and the dates the campaign is scheduled to run – how can we better predict the number of unique users the campaign will reach?

A First Approach

A first approach at solving this problem is to compute the average number of ad views per unique user in the given time period. For example, if I want to estimate unique users reached for a two-week long campaign with one million ad views, I might do the following:

  1. From historical data we find that the average number of impressions seen by a user over two weeks is X.
  2. The ad will deliver one million views so the estimated number of unique users it will hit is 1000000 / X.

Analysis

This approach drastically underestimates the actual number of uniques reached. Imagine if all users watched X ads in a two-week period. Then this approach is actually calculating the smallest possible number of users that can be reached and still have this ad deliver one million views!

A Second Approach

To more accurately estimate the reach of this campaign, we need more information. We need to know the distribution of views by user –  that is, how many users watched 1 ad in the time period, 2 ads, 3 ads, etc. The approach we use at Hulu is to run a simulation using this distribution. The concept is fairly simple: given the distribution of views per user, randomly pick one million views out of the distribution and keep track of how many unique users are reached. The distribution of views specifies a discrete probability distribution; the probability that a given view reaches a given user is the number of ad views for that user divided by the total number of ad views over all users.

We can reduce this problem to the problem of simulating the drawing of colored balls at random out of a bag where the number of balls in each color is given to us. The user IDs are the colors and the number of ad views the user sees is the number of balls in the bag. In our case, there are millions of colors and we are choosing many random balls from a given distribution, so we should choose an algorithm that scales well with the number of colors and that makes the “choose” operation fast, possibly with the help of pre-computations.

There are two related problems to consider. The first is the classic problem of drawing from a discrete distribution with replacement where we do not alter the distribution after every draw. Algorithms for this problem are well known and we’ll discuss some standard algorithms for solving it. The second, more difficult, problem is to draw from a distribution without replacement where we alter the distribution after every draw by decreasing the count of the drawn color by one. In this blog post, we’ll discuss an extension to one of the standard algorithms for solving the first problem that allows it to handle the problem of the changing distribution with the same asymptotic runtime efficiency as algorithms to the first problem achieve.

Random Sampling from a Static Discrete Probability Distribution

First we’ll look at the classic problem of drawing a random color from the distribution with replacement. The standard algorithms for solving this problem are a simple linear search and a faster binary search.

Linear Search

A simple solution to this problem is to iterate through the distribution. Say we are given the distribution in the form of a hash map mapping colors to counts, and we have standard random number generation routines.

//randomly choose an integer between 0 (inclusive)
//and sumCounts (exclusive)
int randInt = random(0, sumCounts);
int curSum = 0;

foreach (string color in dist.keys) int colorCount = dist[color];

curSum += colorCount;

if (curSum > randInt) return color;

This solution requires O(n) time where n is the number of colors. Indeed, this is the best we can do if we must process the distribution when we choose the random color, but if we invoke the choose operation many times on a distribution, we can achieve a better amortized bound using some pre-computation.

Binary search

The first solution determines in which interval the randomly chosen integer falls. As an example, suppose the distribution has 3 red balls, 2 green balls and 1 white ball. In this distribution, we choose a random integer from 0 to 5 (6 possible choices corresponding to the 6 balls). The red interval is 0 to 2, the green interval is 3 to 4, and the white interval is 5 to 5. The end points of the intervals, 2, 4 and 5 respectively, will always be strictly increasing, and so we can do a binary search over them. We require pre-computation to compute the end points of the intervals, but after that, randomly choosing a color is an O(log n)  operation.

//Pre-computation
int[] partialSums = new int[dist.length];
string[] colorMap = new string[dist.length];
int psumInd = 0;

foreach (string color in dist.keys) {   if (psumInd == 0) partialSums[psumInd] = dist[color]; else partialSums[psumInd] = partialSums[psumInd - 1] + dist[color];

colorMap[psumInd] = color; psumInd++; }

//Choosing a random color int randInt = random(0, sumCounts);

/* If randInt is not in the array, most binary search implementations will allow you to compute which index it would be in if it were inserted into the array in sorted order. This is the index we want. */

int colorInd = binarySearch(partialSums, randInd);

return colorMap[colorInd];

The second solution requires O(n) for the pre-computation step, which is done once for the distribution. It requires O(log n) for the choose operation.

I should mention that there is another solution that yields O(1) time for the choose operation. We can expand out the distribution array, creating an array of length 6 with 3 reds, 2 greens,and 1 white in the above example. This approach uses space proportional to the sum of all the counts of the colors which is too much space for our application at Hulu. If space is not an issue, then this is the fastest implementation of the choose operation.

An Extension: Simulation Without Replacement

The previous solutions solved the problem of drawing from the probability distribution with replacement. For our problem of estimating uniques, it is more accurate to do simulations without replacement – we would like to keep the ball out of the bag and alter the original distribution by decreasing the count of the chosen color by one. If we apply the binary search solution to this problem, we will need to update the partial sum array. Specifically, we need to decrease the partial sum of every element whose index is greater than or equal to the index of the chosen color. This update requires O(n)  time making the final running time of the solution O(n). Ideally, we would like to preserve the O(log n) bound. To solve the main difficulty of updating all counts after a certain index, we use a modified binary search tree that supports bulk updates.

First we start with a standard binary search tree where each node stores a color and the partial sum of the color:

class Node {
  int partialSum;
  String color;
  Node left, right;
}
The tree satisfies the standard binary search property: every node in the left sub-tree of a node has a smaller partial sum, and every node in the right sub-tree has a larger partial sum. Searching through this tree is easy but we still run into the problem of updating all values after a certain value. If the tree allowed us to efficiently bulk update all the nodes in the right sub-tree of a given node, then we could use this operation to efficiently update all the nodes in a tree whose partial sum is greater than the chosen partial sum. We can modify the tree to support this operation by storing the partial sums of the nodes in the right sub-tree as a delta from the root of that tree. For example, if the original tree was:

We modify the tree to look like:

All nodes are stored as the delta from the partial sum value of its most immediate left ancestor. If it has no left ancestors, the partial sum of the node is stored directly. The stored values in the modified tree do not necessarily satisfy the binary search property, but the implied values (the partial sums) do. As we traverse the tree, we can easily compute the implied value of any node by keeping track of the accumulated deltas as we go, updating the total if and only if we go down the right branch. This tree allows us to efficiently update the counts of all nodes that follow the chosen node. First we observe that the set of all nodes that follow our chosen node is not just the nodes in the right sub-tree of the chosen node. For example, the set of all nodes greater than the 5 in the above tree includes the 7, as well as the 10 and all the nodes in 10’s right sub-tree. We can see that the set of all nodes greater than our chosen node is precisely the union of the right sub-trees of all ancestor nodes from which we went left in order to arrive at the chosen node. We can decrease the count of all these nodes by 1 by decreasing the count of all such ancestors by 1. There are log(n) ancestors in a balanced tree (more about this in a bit) so this algorithm runs in O(log n) time.

C# Implementation of this Algorithm:

public int Draw() {
  if (_totalSum == 0) {
    throw new Exception("Tried to draw from empty dist.");
  }

int r = rand.Next(0, _totalSum);

Node n = Search(_tree, r); _totalSum--;

return n.index; }

/* Try to find the least node whose partial sum is greater than the targetDelta */

private Node Search(Node node, int targetDelta) { if (targetDelta > node.delta || node.delta == 0) { if (node.right == null) { /* No Nodes in this subtree have partial sum greater than the targetDelta */

  return null;
} else {
  /*
  All nodes in the right subtree have partial sum increased
  by the delta of the root. Handle this by decreasing our
  target delta.
  */

  return Search(node.right, targetDelta - node.delta);
}

} else { /* Update delta of ancestors that are greater than the chosen node */ node.delta--;

if (node.left == null) {
  return node;
} else {
  /*
  Search the left subtree for the smallest node whose partial
  sum is greater than target delta. If none exists, then the root
  is the node we want.
  */
  Node leftSearch = Search(node.left, targetDelta);

  return leftSearch == null ? node : leftSearch;
}

} }

Balancing the tree

We’re lucky. We know the number of nodes at pre-computation time so we can just go ahead and build a balanced binary tree right off the bat. We start with the sorted partial sum array and aim to build a balanced tree. We do this by taking the midpoint of the array as the root, recursively building a tree out of the left sub-array and attaching it as the left child of the root and doing the same for the right sub-tree:

/*
psums is the sorted array of partial sums we wish to convert into a tree
minInd and maxInd define the portion of the array we are working on
curSum is our current delta for the right subtree
*/

private Node BuildTree(int[] psums, int minInd, int maxInd, int curSum) { if (minInd > maxInd) return null;

Node ret = new Node();

int mid = (minInd + maxInd) / 2;

ret.index = mid; ret.delta = psums[mid] - curSum; ret.left = BuildTree(psums, minInd, mid - 1, curSum); ret.right = BuildTree(psums, mid + 1, maxInd, psums[mid]);

return ret; }

Optimizations

Instead of explicitly storing the nodes, we can make the implementation faster by storing the tree in a flat array using the array representation of a binary tree. In the array representation of a binary tree, the root is at index 0 and the left and right children of a node at index i are at 2i + 1 and 2i + 2, respectively. For the array to not contain any null entries, it is necessary for our binary tree to be both balanced and complete, that is, all the nodes are filled in from left to right at each level, and we go to the next level only after the current level is full.

Building a complete tree out of a sorted array is a bit more difficult, since taking the midpoint no longer suffices. That’s because, in a complete tree, nodes are filled in from left to right, so generally the left sub-tree will have more nodes than the right sub-tree. In turn, the index of the root will tend to be larger than the midpoint of the array. To compute the index of the root, we first have to compute the depth of the tree and the number of leaves in the last level.

Here’s an example: Say we want to compute the index of the root in a tree with 40 nodes.

  1. The largest power of two less than 40 is 32 = 25. Thus a complete tree with 40 nodes will be a complete tree with five levels plus a partially filled sixth level.
  2. There are 25 – 1 = 31 nodes in the first five levels. One of them is the root which leaves 30 non-root nodes. 15 of these nodes will be in the left sub-tree.
  3. The sixth level can hold up to 25 = 32 nodes. The first 16 of these will be in the left sub-tree. 40 – 31 = 9, so there will be 9 nodes in this level. Since 9 < 16, all 9 are in the left sub-tree. Thus, the total number of nodes in the left sub-tree is 15 + 9 = 24. Therefore, the root is the 25th node and is at index 24.

The revised version of the above method to build a complete binary tree is this:

private Node BuildTree(int[] psums, int minInd, int maxInd, int curSum) {
  if (minInd > maxInd)
    return null;

int numNodes = maxInd - minInd + 1; int pow2 = 1;

while (true) { if (pow2 * 2 - 1 > numNodes) { break; }

pow2 *= 2;

}

int maxLeavesBefore = pow2 / 2;

pow2--;

int numBefore = (pow2 - 1) / 2; numBefore += Math.Min(numNodes - pow2, maxLeavesBefore);

int mid = numBefore + minInd;

Node ret = new Node(); ret.index = mid; ret.delta = psums[mid] - curSum; ret.left = BuildTree(psums, minInd, mid - 1, curSum); ret.right = BuildTree(psums, mid + 1, maxInd, psums[mid]);

return ret; }

Possible improvements

The algorithm is fast enough for our application at Hulu, but I’m sure it could still be improved. The most obvious area of improvement that I see is in the construction of the tree. Right now, we build a complete binary tree fairly haphazardly. Since a search can be terminated as soon as it finds the chosen node (without necessarily having reached a leaf), we can improve the average running time of the algorithm by placing nodes with larger counts closer to the root so that we don’t have to go as deep on average. This won’t improve the worst case time bound of this algorithm from O(log n) but it will make it run faster in practice.

Here we were able to use an algorithm to significantly improve the accuracy of our estimates for a problem that was not well handled by previous approaches. In this blog post, I presented an example of a problem that had an “algorithmic core” underlying it. Such problems appear quite often at Hulu, and being able to apply interesting algorithms to solve them and make an impact on the business is one of the best parts of working at Hulu.  

When Tony Zhang’s not mining vespene gas, he’s busy cracking Russian math competition problems with his bare hands. He also happens to work as a developer on inventory forecasting for our advertising team.

Hostess Menu System

January 3rd, 2012 by Scott Post

When people ask me what it is I do at Hulu, I usually have a few answers prepared ranging from, “I work on computers” (this is surprisingly satisfactory sometimes) to specific details on day-to-day tech projects.  For this post, I’d like to crack open the hood and dive deep into one challenge I’ve helped solve here at Hulu:  A project aptly named Hostess.

No More Control

In late 2008, our Hulu Desktop application was in full-on development.  It became clear that this app was to be the first of many out-of-browser Hulu experiences. (Today, of course, we have a substantial and growing device line-up.)  The fundamental problem we foresaw with these applications was control: we’d lose the luxury of swift deployments, patches, and bug fixes and instead have to deal with third-party quality assurance approval processes and the probable latency associated with them.  This meant each new build deployed could potentially be delayed for weeks before going live. If we weren’t careful, this would include trivial user interface changes, additional content views, or—more specifically—implementation details such as content organization.

Our Solution

The fix was a relatively simple one: let the server determine as much of the experience as possible. Rather than managing this all on the client, our solution was to have the server maintain all of the views, filters, and even UI layout specifications in a database.  This enables the structure and content of our applications to be data driven and decoupled from our deployed bits.  Because the navigation capabilities of devices can be limited (sometimes remote control), the structure of an application is composed of a familiar set of simple Menu objects.  At the root of a tree structure sits the main menu, and each submenu its child.

In addition to Menu objects, the server also maintains associations with Query objects.  These objects basically represent SQL fragments—which are stored in the database themselves (very meta, I know).  Simple menus don’t need to have a Query associated with them. Perhaps only metadata such as the menu’s textual name and ranking relative to its siblings is enough to navigate (look at Now Playing, Popular, Recently Added, or Help, for example). But for menus that represent actual Hulu content—like a most popular shows listing—there is a Query object on the server that represents the command that will ultimately get executed on the database.  This fragmentation scheme allows for cascading of queries and provides a mechanism for children to inherit parent filters.  In addition, it allows us to have a centralized service to manage menus for all of our devices, removes redundancy, and cuts down on the amount of SQL we have to write (and any developer can vouch for SQL as sometimes being cumbersome to deal with in code. Formatting strings anyone?).

 

Example Menu

 It’s not obvious, but the above screenshot actually represents three levels of the Hulu TV application’s menu hierarchy.  Beginning at the top level, notice the menu called Most Popular.  This menu has siblings (which are not shown) and children: Shows, Episodes, Movies, etc.  Looking at the Shows menu, it has a single child of type ShowThumbnailCarousel which the client knows how to render.  Here is the same view represented in the actual database:

mysql> select * from menus where parent_menu_id = 185332;
+---------------+----------------+-----------------+----------+--------------------------------------+
| child_menu_id | parent_menu_id | display         | query_id | app_data                             |
+---------------+----------------+-----------------+----------+--------------------------------------+
|        185333 |         185332 | Search          |       -1 | <view_name>SearchView</view_name>    |
|        185334 |         185332 | Browse Movies   |    24856 | <view_name>BrowseView</view_name>    |
|        185371 |         185332 | Browse TV       |    24881 | <view_name>BrowseView</view_name>    |
|        185383 |         185332 | Recently Added  |       -1 | <view_name>ContentView</view_name>   |
|        185389 |         185332 | Most Popular    |       -1 | <view_name>ContentView</view_name>   |
|        185395 |         185332 | Queue / Profile |       -1 | <view_name>ContentView</view_name>   |
|        185404 |         185332 | Recommended     |       -1 | <view_name>ContentView</view_name>   |
|        185407 |         185332 | Help            |       -1 | <view_name>HelpView</view_name>      |
+---------------+----------------+-----------------+----------+--------------------------------------+

mysql> select * from menus where parent_menu_id = 185389; +---------------+----------------+----------+----------+---------------------------------------------+ | child_menu_id | parent_menu_id | display | query_id | app_data | +---------------+----------------+----------+----------+---------------------------------------------+ | 185390 | 185389 | Shows | -1 | <cmtype>ShowThumbnailCarousel</cmtype> | | 185391 | 185389 | Episodes | -1 | <cmtype>VideoThumbnailCarousel</cmtype> | | 185392 | 185389 | Movies | -1 | <cmtype>ShowThumbnailCarousel</cmtype> | | 185393 | 185389 | Clips | -1 | <cmtype>VideoThumbnailCarousel</cmtype> | | 185394 | 185389 | Trailers | -1 | <cmtype>VideoThumbnailCarousel</cmtype> | +---------------+----------------+----------+----------+---------------------------------------------+

mysql> select * from menus where parent_menu_id = 185390; +---------------+----------------+---------+----------+----------+ | child_menu_id | parent_menu_id | display | query_id | app_data | +---------------+----------------+---------+----------+----------+ | 185337 | 185390 | {name} | 24895 | NULL | +---------------+----------------+---------+----------+----------+

mysql> select * from queries where id = 24895; +-------+--------+--------+--------+-------------------------+-----------------------+ | id | object | params | fields | filter | order_by | +-------+--------+--------+--------+-------------------------------------------------+ | 24895 | Show | NULL | NULL | full_episodes_count > 0 | view_count_today DESC | +-------+--------+--------+--------+-------------------------+-----------------------+

The Menu object responsible for the show carousel is represented by the last two rows above.  It has an associated Query object, so the items of this menu aren’t just further text submenus, but rather are objects of type Show with a particular filter and sorting.  Upon request of this menu, the server looks up the menu data, pieces together a SQL query looking something like “SELECT * FROM shows WHERE full_episodes_count > 0 ORDER BY view_count_today DESC” and finally formats a response containing these items. If we continue on down the tree, each item in this carousel points to yet another submenu, and so on. Below is the child menu you’ll see when you select Parks and Recreation:

And has corresponding data:

mysql> select * from menus where parent_menu_id = 185337;
+---------------+----------------+--------------+----------+-----------------------------------------+
| child_menu_id | parent_menu_id | display      | query_id | app_data                                |
+---------------+----------------+--------------+----------+-----------------------------------------+
|        185339 |         185337 | Episodes     |       -1 | <cmtype>VideoThumbnailCarousel</cmtype> |
|        185342 |         185337 | Clips        |       -1 | <cmtype>VideoThumbnailCarousel</cmtype> |
|        185343 |         185337 | Seasons      |       -1 | <cmtype>NonSlidingMenu</cmtype>         |
|        185345 |         185337 | Rate         |       -1 | <cmtype>RatingControl</cmtype>          |
|        185346 |         185337 | Subscribe    |       -1 | <cmtype>SubscriptionControl</cmtype>    |
|        185347 |         185337 | Related      |       -1 | <cmtype>ShowThumbnailCarousel</cmtype>  |
|        185348 |         185337 | Home         |       -1 | <cmtype>TextControl</cmtype>            |
+---------------+----------------+--------------+----------+-----------------------------------------+

mysql> select * from menus where parent_menu_id = 185343; +---------------+----------------+------------------------+----------+-----------------------------------------+ | child_menu_id | parent_menu_id | display | query_id | app_data | +---------------+----------------+------------------------+----------+-----------------------------------------+ | 185344 | 185343 | Season {season_number} | 24862 | <cmtype>VideoThumbnailCarousel</cmtype> | +---------------+----------------+------------------------+----------+-----------------------------------------+ mysql> select * from queries where id = 24862; +--------+--------+---------+-----------------------+------------------+---------------+-----------------------+ | id | object | params | fields | filter | order_by | group_by | +--------+--------+---------+-----------------------+------------------+---------------+-----------------------+ | 24862 | Video | show_id | season_number,show_id | show_id=:show_id | season_number | season_number,show_id | +--------+--------+---------+-----------------------+------------------+---------------+-----------------------+

mysql> select * from menus where parent_menu_id = 185344; +---------------+----------------+---------+----------+----------+ | child_menu_id | parent_menu_id | display | query_id | app_data | +---------------+----------------+---------+----------+----------+ | 185340 | 185344 | {title} | 24863 | NULL | +---------------+----------------+---------+----------+----------+ mysql> select * from queries where id = 24863; +--------+--------+-----------------------+---------------------------------------------------+----------------+ | id | object | params | filter | order_by | +--------+--------+-----------------------+---------------------------------------------------+----------------+ | 24863 | Video | show_id,season_number | show_id=:show_id AND season_number=:season_number | episode_number | +--------+--------+-----------------------+---------------------------------------------------+----------------+

The above represents another three levels deeper into the menu hierarchy (the previous view can be considered off screen).  Notice the server requires a show_id parameter to properly construct a query for these children menus. In this case, Parks and Recreation was selected and show_id is passed down.  The client accesses these children via URLs which are returned in each response from the server.  So for the above view, the Hulu TV client had to make a few calls to the server:  one to retrieve the menu for Parks and Recreation containing metadata and links to its children, another to follow the link to the Seasons menu, and finally a call to get the menu with episodes for the second season. Now, let’s say we discovered it would be better to sort the Seasons menu by current seasons first and shorten the display text to S2, S1.  We could simply make an update in our CMS so display is ‘S{season_number}’ and order_by is ‘season_number DESC’ and we are done for any device that reads this menu.

Components and Technology Stack

The system has three main components:  the data sync, the web services, and the CMS.  All components were written in Python and built using the CherryPy web framework and the Sqlalchemy ORM module.  The CMS component has a front end UI written in Javascript with jQuery and uses jinja HTML5 templates.  All requests to the Hostess web services are proxied through a caching layer using Varnish.  There is also another Memcached layer to store Menu objects in shared memory since they’re not changed frequently.

Above is a diagram of the entire system and interaction between components.  The data sync component simply syncs metadata from Hulu’s master site database into its own schema.  This is mainly to keep traffic to the site database and devices separate, but devices also have additional metadata to query and join against—and Hulu’s website does not.  The master runs the data sync and CMS.  On a handful of slaves, MySQL replication keeps data synchronized with the master.  Each of these slaves runs as many Hostess web service components as there are CPU cores on that machine. As we scale, we can simply add more slaves to the system.  The web services are load balanced and serve the majority of metadata (in XML or JSON) found on Hulu devices.  This includes living room devices like Xbox, PS3, and Roku and mobile devices like the iPad, iPhone, and Android phones.  Each device uses a particular set of menus (starting at different roots) which we call Hostess applications.  Managing these applications is done via the CMS component.

Hostess CMS

The CMS allows our developers to manage, create, edit, reorganize, and deploy new applications as often as we want, as was our ultimate goal. Applications are edited offline and then exported to either staging environments for testing or production for deployment.  When they’re exported, the application’s menus are inserted into the master database, and these trickle down to all the replicated slaves.  As cache expires, devices pick up the new root menus and their fresh child menus.  In the future we hope to expand the CMS to abstract away having to know SQL to manage apps.  This would enable any project manager to be able create and maintain basic menus for their apps.  A challenge we have (and will continue to face) is making sure tables are properly indexed and queries are optimal.

For me, it’s been fun working with smart problem-solvers here on the Hulu team to build and scale Hostess as our Hulu Plus device offerings have grown.  If you’re still curious why we chose the name Hostess: the job of a hostess  is to serve menus as you prepare to consume something really delicious. We thought this name was a good fit.  It has ultimately solved our deployment woes, enabling us to keep our applications flexible, reusable, and agile as we continue to build on this framework.  And in the process, it has grown into something more than originally intended—but that’s a topic for another post.

Hulu Hacks TechCrunch’s Disrupt Beijing

December 13th, 2011 by Jiasi Zeng

When we heard that TechCrunch would be holding its first international Disrupt Hackathon in Beijing, it was a no-brainer for us: We had to go. Hacking is what Hulugans do, so hackathons are where Hulugans go. And winning the Grand Prize? That would’ve just been a bonus. We went to the venue with our gear, picked a table, and sat down. Since lots of hackers came alone, people started mingling and teaming up with others. Turns out, the hacker world a smaller world that we could have even imagined. We met a guy who sat at our table, and it turned out that he was one of our former dev interns, Yichuan Wang, who had previously worked with the Hulu platform team for three months at the Los Angeles office. At the TC Hackathon, we ended up building a web application named WeDiary which organizes a user’s tweets into a personal diary and streams it along a horizontal timeline in the browser. The app’s technology is not very advanced – after all, we weren’t creating an operating system or building a super-fast new compiler. But the execution of our ideas was very “Hulu” — relentless focus on quality all the way down to the pixel. Since we had fewer than 20 hours to build, we decided at the very beginning that our app should do one thing and do it well.  We used Sina Weibo’s API because it is the Twitter equivalent in China with more than 200 million users. Getting tweets through the API is very straightforward:
  1. Ask the user’s permission to access his/her tweets.
  2. Request an access token from Weibo’s authorization server, with the user’s permission.
  3. Get the user’s tweets with the access token.
Writing the code to get the actual tweets through APIs only took about 30 minutes.  Because we didn’t have a designer on the team,  coming up with a consistent UI style or writing the CSS and JavaScript wasn’t really possible in the time allowed.  But since we’re developers, we knew that there are always some great devs in the community who may have already done some of the heavy lifting for us. We remembered reading about Twitter’s Bootstrap CSS framework on Hacker News a while ago, and we decided to give it a try. It brought us Hulu’s signature lights-out feature, nice pop-ups, hover boxes, etc. We also hacked the framework a bit and made it suitable to our app by hiding scroll bars, tuning the hover boxes’ positions, etc. We worked on a lot of details that could have been completely missed, but without them,  the app just didn’t feel right. For example, we spent a lot of time choosing the perfect background pattern for our app. We wanted to make sure that it looked great not only on our laptops but also on the demo projector screen, since it had noticeably lower definition and contrast. We went over more than 100 patterns (really!) using Subtle Patterns and finally found the right one that we all liked.
At 6 a.m. the next morning we submitted our app, grabbed some food, and got 30 minutes of sleep on the chairs.  After the power nap, each team was given 60 seconds to demo their work to the judges and the rest of the hackers. To our delight, we were thrilled to have been awarded the Grand Prize of the TC Hackathon. Although we really saw that the process of hacking as the real prize, the award was great validation of our work. Since this was the first time we had participated in an overnight hackathon, we were all pleasantly surprised by what can be done in 18 hours.  Even though the app we created was still buggy and might not be ready for end-users, we went through the typical stages of development: brainstorming, prototyping, testing, iterating and delivering the product. We’d like to think the experience has made us even better at working in an intense, fast-paced environment. We would like to thank our awesome teammates: Cheng Chen, Yi Ding, and Yichuan Wang whom we met at the TC Hackathon and who have since become our good friends.
Jiasi Zeng works on the front-end of Hulu search, bringing new features and improving the service’s usability. Jia Cao, a software engineer, is working on the site and other systems for Hulu Japan.

“I had this room!”

October 30th, 2011 by Matt Meredith

At Hulu, our way of working tends to be a bit unconventional.  I am on the DevOps team.  We are a small team with a big mission:  solve traditional IT problems in non-traditional ways.  When we heard that Hulu was exploring third-party vendors to purchase expensive conference room scheduling displays, we thought this would be a great opportunity to hack together a custom solution that would better fit our needs—and save us a ton of money.

So what are we trying to pull off?

  • Resolve conflicts (like double-booked rooms) by providing up to date conference room schedules
  • Completely automate the process of displaying the schedules outside each room
  • Minimize change in process by integrating with our existing hosted Exchange service
  • Accomplish all of this as cheaply as possible

Technical Summary

Step 1 – Generate the Exchange Web Services web references

Visual Studio 2008 did the trick.  Using the “Add Web Reference…” dialog and pointing it to our OWA URL (https://our.owa.url.com/EWS/Services.wsdl), we were able to generate a client library capable of communicating with Exchange.

Step 2 – Import the namespace and poll Exchange

Using the ExchangeServiceBinding object we are able to pull down a list of scheduled events for each conference room.  We run this routine as part of a Windows Service, and run it in a loop every couple of minutes.  To keep things simple, we just dump the data into a text file.

Step 3 – Create a generic ASP.NET website

With a bit of CSS, HTML, and C# we put together a basic web interface which could display the data from the meetings.txt file.  The code behind portion simply draws divs for each of the meetings and places them in the correct slots on the calendar (table).  We wrapped the whole thing in a separate aspx page that uses javascript to refresh the page every 15 seconds.  This allows for the page to automatically update, while still recovering from any transient failures that may occur.

Step 4 – Point the iPads at the website

If there is one thing that an iPad does well, it’s loading a generic website.  We thought we had users tricked by covering the home screen button, but they found they could stray off course with the address bar.  We eventually purchased some $2 apps to put the whole thing in kiosk mode, hiding everything but the scheduling site.  Since this is a simple web site, we get the great side effect of being able to use any web-enabled client to view the calendars.

“Gotchas” (Little Hiccups We Fixed Along the Way)

  • The data was too honest — This project uncovered that some users’ events weren’t actually getting booked in Exchange.  We eventually fixed these users’ issues.
  • People love to touch things — Our employees decided the iPads on the wall were a great opportunity to navigate to our Hulu App and watch TV while waiting for their meetings.  Which is great for viewership numbers, but not so great for keeping the schedule up! We solved this by purchasing steel cases which covered up the Home button, and installed an app to hide the browser controls.
  • iPads just look expensive — We have  a strong culture of frugality at Hulu.  Our solution (including the iPads) came in around 90% cheaper than the next cheapest commercially available solution.   We tried a variety of tablets, but the iPad had the best screen, and by purchasing a first generation refurbished model they weren’t much more expensive.
  • Error handling — This was just a weekend project, not a full-out development effort.  We have incrementally improved error handling.  If the iPad loses its wireless connection, for example, the page will timeout and display a giant message to alert our helpdesk.
  • Monitoring — We added basic heartbeat monitoring into the PollExchange service to make sure the service remained started, and was actually pulling valid data.  We also started logging client IPs in memory, and exposed this into an ASHX status page to see the last pull from each iPad.

The coolest thing about this project is that it grew out of a weekend hack session, and has ended up being very valuable to the organization.  It has given us a platform that can be extended to do all sorts of cool things — and we saved money doing it!  The code is checked in to our source repository, so we’re looking forward to tons more innovation.

Matt is a systems engineer on the devops team, where he takes IT to the lim.

Introducing Hulu Ad Swap

October 3rd, 2011 by Raymond Mak

Hulu Ad Swap puts complete control in the hands of the user by enabling them to instantly swap out of an ad they are watching for one that is more relevant. While it is a simple concept, it represents the culmination of the key insight we gained with the introduction of the original Hulu Ad Selector — empowering users with choices over the ads they consume can lead to a virtuous cycle where advertisers benefit from a more engaged audience, content providers can see a greater return on their content, and users get more control and greater choice in their experience.

Appearances can be deceiving, and as simple as an idea like Hulu Ad Swap may sound, it brought with it a host of implementation challenges. We’d like to share some of the technical problems we solved with the development of Hulu Ad Swap.

Hulu Ad Swap differs from the recently launched dynamic Hulu Ad Selector in two significant ways:

  1. A default ad is shown automatically to the user with Hulu Ad Swap; whereas dynamic ad selector requires the users to choose a particular ad or wait for a timer to expire before showing the actual video ad, and…
  2. Hulu Ad Swap is enabled by default for all ads whereas advertisers have to opt-in explicitly for dynamic ad selectors.

Given the similarities between Hulu Ad Swap and Hulu Ad Selector, I thought I would explain a bit about the implementation behind dynamic ad selector before delving into Hulu Ad Swap. Unlike the original ad selector where choices are statically defined within the selector slate, the ad selection server (or simply the adserver) would need to assemble all the user-selectable options in a dynamic ad selector slate. If you considered the main output of the adserver to be selecting an ad that the user will see, the efficiency of constructing a dynamic ad selector slate is effectively 33% of the norm as only one ad is watched by a user for every three ads selected. Moreover, ads chosen for a dynamic ad selector must obey Hulu’s anti-ad-fatigue (no two ads from the same brand can run side-by-side) and industry separation (no two ads of the same industry but from different advertisers can run side-by-side). This puts limits on how far you can reduce the overhead of selecting the extra ads for a dynamic ad selector. Based on these considerations, we intentionally limit dynamic ad selector to serve, at most, once per stream.

Compared with dynamic ad selector, Hulu Ad Swap is a far more ambitious undertaking since alternative ads have to be chosen for every single ad we serve. If we had gone through with the plan of choosing three alternatives for every ad we serve, the efficiency of the adserver would drop to 25% of what it was before, and anyone who has read the manual on The Challenge of Scaling an Adserver would be extremely nervous about implementing a new feature with this sort of performance implication.

For illustrative purposes, let’s consider how someone would implement Hulu Ad Swap in the most straight-forward manner:

Upon receiving an ad request for an ad break from the video player, the adserver would:

  1. Select the default ad, plus three alternative ads.
  2. Given that each ad break could have more than one ad, Step 1 is repeated until the adserver has all the default + alternative ads the player will need for the ad break, at which point the ads are returned to the player.

Of course, the pseudo code above omits a lot of the difficulties involved in the ad selection process. For example, all of the default and alternative ads chosen have to obey their own targeting rules which could include but are not limited to content, day-part, demographic, or geographic restrictions as well as anti-ad-fatigue and industry separation rules. Most advertisers also require their campaigns to be delivered evenly over time, and that places further constraints on the number of ads available for selection at any given time.

The Hulu adserver also allows advertisers to specify the maximum number of ads that their ads are allowed to run alongside within a single ad break (or pod). This gives advertisers a lot of flexibility in executing their marketing strategy by making trade-offs between focusing and having broad exposure of their messages. However, having different ads-per-pod limits among different campaigns can introduce significant difficulties in carrying out Step 2 of the Ad Swap selection algorithm, as you don’t necessarily know how many ads will be needed in an ad break until the user has chosen among the default and alternative ads. We could use a more conservative approach by selecting ads with ads-per-pod limits equal to or greater than the minimum ads-per-pod limit of all the ads that we have selected for the ad break so far, but that places yet further restriction on the pool of available ads.

Apart from the considerable processing overhead, the large number of rules we need to apply for alternative ads selection had also raised concerns over whether the adserver can actually find that many alternative ads in real-world conditions. So, to increase the chance of the adserver finding the number of alternative ads that Hulu Ad Swap requires, we considered relaxing the anti-ad-fatigue and industry separation rules to apply only to ads that are chosen by the user. This, however, leads to a drastic change in the ad selection protocol between the player and the adserver that resembles the following:

On each ad request from the player, the adserver:

  1. Selects default ad plus alternative ads.
  2. The player waits for the user to make the final selection before making another ad request with the ad chosen by the user. (This way, the ad server has the information it needs to apply anti-ad-fatigue and industry separation rules only to the ad that the user actually watched. And to ensure a good user experience, the adserver needs to avoid serving the same ad that the user has just swapped out.)

This switch from our current one-request-per-ad-break protocol, to a one-request-per-ad (+ alternatives) protocol might not immediately appear to be feasible because:

  1. It introduces a lot more processing overhead in the form of extra network calls between the player and the adserver, on top of the overhead of selecting the alternative ads, and…
  2. The communication protocol between the player and the adserver would need to be more complicated in order to maintain additional state information between ad selections that is currently hidden within a one-request-per-ad-break protocol. (For example, the one-request-per-ad protocol would need to keep track of the ads-per-pod limit of the ads that have been served so far, whereas the adserver keeps track of this internally in the current one-request-per-ad-break protocol.)

Nevertheless, this was the best idea we went with for a long time, until we stumbled upon the following observation: If switching to a one-request-per-ad protocol is the source of undue complexities, can’t we fit alternative ad selection into a one-request-per-ad-break protocol instead? As it turned out, yes we can, and here is how it works:

  1. Select all the default ads within an ad break as the adserver does today.
  2. Select alternative ads for the entire ad break with anti-ad-fatigue and industry separation rules applied across the default and alternative ads within the ad break.

Since we know exactly how many default ads we are going to serve in the ad break, it becomes easy to select alternative ads with a compatible ads-per-pod limit. As an added bonus, given that the alternative ads are available for all ads within the ad break instead of being selected separately for each ad, it becomes possible to select fewer alternative ads while maintaining the same number of choices. More precisely, the adserver only needs to select n + 2 alternative ads to maintain a minimum of at least 3 choices in an n-ads-per-pod (napp) scenario, whereas 3n alternatives would be needed if the adserver selects them separately for each ad. So for a 2app scenario, only 2+2=4 alternative ads would be needed if alternative ads are available for the entire ad break versus the 3×2=6 alternative ads required if they are selected separately for each ad.

As promising as the latest approach looked, we still had a few doubts as to whether or not the adserver could handle the overhead of selecting the reduced number of alternative ads, and if the adserver could find sufficient number of alternative ads to make it worthwhile. So to find out, we actually implemented the code to select the alternative ads in the adserver. This wouldn’t have been possible if we decided to go with the one-request-per-ad protocol, as we would have had to make changes in both the adserver and our video player before we could measure the full impact in production.

The results of our little experiment were definitely encouraging. The overhead of selecting the alternative ads was barely noticeable, and while there were indeed cases where the adserver could not find as many alternatives as we hoped, it could find them in a majority of the cases. At that point, we were fully convinced that this was the correct approach for alternative ad selection.

Looking back, the various designs we have explored were quite similar to one another, but it is the tiniest difference in mindset that turned a seemingly infeasible design into a viable one, and we hope you get a kick out of this the next time you swap an ad on Hulu.

Raymond Mak is the Principal Software Developer focused on keeping Hulu’s ad server delivering harder, better, faster and stronger.
Last comment: Oct 5th 2011 4 Comments

Hulu’s Recommendation System

September 19th, 2011 by Liang Xiang

As the Internet gets more and more popular, information overload poses an important challenge for a lot of online services. With all of the information pouring out from the web, users can be overwhelmed and confused as to what, exactly, they should be paying attention.

A recommendation system provides a solution when a lot of useful content becomes too much of a good thing. A recommendation engine can help users discover information of interest by analyzing historical behaviors. More and more online companies — including Netflix, Google, Facebook, and many others — are integrating a recommendation system into their services to help users discover and select information that may be of particular interest to them.

With literally tens of thousands of hours of premium video content, Hulu users are also prone to content overload. Given the wide variety of content available on the service at any one time, it may be difficult for Hulu users to discover new video that best matches their historic interests. So the first goal of Hulu’s recommendation system is to help users find content which will be of interest to them.

In addition to users, Hulu’s recommendation system should also help content owners promote their video. Part of our mission is to deliver a service that users, advertisers, and content owners all unabashedly love. We have many different content partners, and we understand that these content partners want to more Hulu users to watch their videos — especially when new videos are released. By using personal recommendation instead of more traditional recommendation systems, we can promote video content more effectively since we will promote directly to users who are likely to enjoy the content we are recommending.

Data Characteristics

Before explaining the design of our recommendation system, we wanted to explain some characteristics of our data.

Since a lot of our content is comprised of episodes or clips within a show, we have decided to recommend shows to users instead of individual videos. Shows are a good method of organization, and videos in the same show are usually very closely related.

Our content can be mainly divided into two parts: on-air shows and library shows. On-air shows are highly important since more than half of our streaming comes from them.

Although on-air shows occupy a large part of our content, they are touched by a seasonal effect. During summer months, most of on-air shows do not air, causing on-air show streaming to decrease. Furthermore, there are fewer shows aired during weekends, thus the streaming of library shows will increase. Keeping this information in mind we can design the recommendation system to recommend more library shows to users during the weekend or summer months, as an example.

The key data that drives most recommendation systems is user behavior data. There are two main types of user behavior data: implicit user feedback data and explicit user feedback data. Explicit user feedback data primarily includes user voting data. Implicit feedback data includes information on users watching, browsing, searching, etc. Explicit feedback data can show a user’s preference on a show explicitly, but implicit feedback data cannot. For example, if a user gives a 5-star rating to a show, we know that this user likes the show very much. But if a user only watches a video from a show page or searches for a show, we don’t know whether this user likes the show.

As the quantity of implicit data at Hulu far outweighs the amount of explicit feedback, our system should be designed primarily to work with implicit feedback data.

Architecture

There are many different types of recommendation algorithms, and perhaps the most famous algorithm is collaborative filtering (CF). CF relies on user behavior data, and its main idea is to predict user preferences by analyzing their behaviors. There are two types of CF methods: user-based CF (UserCF) and item-based CF (ItemCF). UserCF assumes that a user will prefer items which are liked by other users who have similar preferences to that user. ItemCF assumes that a user will prefer items similar to the assets he or she preferred previously. ItemCF is widely used by many others (for example, Amazon and Netflix), as it has two main advantages. Firstly, it is suitable for sites where there are a lot more users than items. Secondly, ItemCF could easily explain recommendations given users’ historical behaviors. For example, if you have watched “Family Guy” on Hulu, we will recommend “American Dad” to you and tell you that we recommend this because you have watched “Family Guy”. So we use ItemCF as our basic recommendation algorithm in Hulu.

On-line Architecture

Figure 1 shows our on-line architecture of the recommendation system. This system contains 5 main modules:

<

ol>

  • User profile builder: When a user first comes into the recommendation system, we will first build a profile for them. The profile includes the user’s historical behaviors and topics, and these are generated from their old behaviors. Users can have many different types of behaviors. For example, they can watch videos, add shows to favorites, search for videos and vote on videos and shows. All these behaviors are all considered by our system and, after extracting all these behaviors, we use a topic model which is trained offline to generate users’ preference on topics.
  • Recommendation Core: After generating the list of user’s historical preferences on shows and topics, we put all of those similar shows into raw recommendations.
  • Filtering: For some pretty obvious reasons, raw recommendation results cannot be presented to users directly. We need to filter out shows the user has already seen or engaged with, so we can increase the recommendations shows a little more precise.
  • Ranking: The ranking module will re-rank raw recommendations to make them better fit users preferences. First, we’ll make recommendation more diverse. Then we’ll increase novelty of recommendations so that users will find shows they like, but have never seen before.
  • Explanation:Explanation is one of the most important components of every recommendation system. The explanation module generates some reasoning for every recommendation result using the user’s historical behaviors. For example, we will recommend “American Dad” to a user who had previously watched “Family Guy.” The explanation will say, “We recommend ‘American Dad’ to you because you have watched ‘Family Guy’”.

    Figure 1 : Architecture for Hulu

    Off-line Architecture

    In the above on-line architecture, some components rely on offline resources, such as the topic model, related model, feedback model, etc. The off-line system is also an important part of our recommendation system. Our off-line system has these main components:

    1. Data Center: The data center contains all user behavior data in Hulu. Some of them are stored in Hadoop clusters and some of them are stored in a relational database.
    2. Related Table Generator: The related table is an important resource for on-line recommendation. We use two main types of related table: one that’s based on collaborative filtering (which we’ll call CF), and another based on content. In CF, show A and show B will have high similarity if users who like show A also like show B. With content filtering, we use content information including title, description, channel, company, actor/actress, and tags.
    3. Topic Model: A topic is represented by a group of shows that have similar content. Topics are thus larger in scope than shows, but they’re still smaller than channels. Our topics are learned by LDA, which is a popular topic model in machine learning.
    4. Feedback Analyzer: Feedback specifically means users’ reactions to recommendation results. Using user feedback can improve recommendation quality. For example, say a show is recommended to many users, but most of them do not click this show. In that case, we’ll decrease the rank of this show. Users will also have different types of behavior, so we’ll use all these behaviors in developing the recommendations. However, some users may prefer recommendations to come from their prior watch history, and some users may prefer their recommendations to come from their voting behavior. All these effects can be modeled offline by analyzing users’ feedback on their recommendations.
    5. Report Generator: Evaluation is most important part of the recommendation system. The report generator will generate a report including multiple metrics every day to show the quality of recommendations. At Hulu we monitor metrics including CTR, conversion ratio, etc.

     

    Figure 2 : Architecture for Hulu

     

    Algorithms

    So far, we’ve given a brief overview of our recommendation architecture. From previous discussion, we can see that Hulu’s recommendation system is primarily based on ItemCF. We’ve added many improvements on top of the ItemCF algorithm, too, in order to make it generate better recommendations. To test these improvements, we’ve performed many A/B tests on different algorithms. In following sections, we’ll introduce some of these algorithms and the experiment results.

    Item-based Collaborative Filtering

    Item-based Collaborative Filtering (ItemCF) is the basis of all our algorithms. In ItemCF, let N(u) be a set of items user u has preferred previously. User u’s preference on item i (i is not in N(u)) can then be measured by:

    p(u,i)=\sum\limits_{j\in N(u)}^{} r(u,j)s(i,j)

    Here, r(u,j) is the preference weight of user u on show j, and s(i,j) is the similarity between show i and show j. In CF, the similarity between two shows is calculated by user behavior data on these two shows. Let N(i) be a set of users who watched show i and N(j) be a set of users who watched show j. Then, the similarity s(i,j) between show i and show j is calculated by following formula:

    s(i,j)=\frac{\left | N(i)\cap N(j) \right |}{\sqrt{\left | N(i) \parallel N(j) \right |}}

    In this definition, show i will be highly relevant to show j if most users who watch show i will also watch show j. However, this definition will have the “Harry Potter problem,” which means that every show will have high relevance with popular shows.

    Recent Behavior

    The first lesson we learned from A/B testing is that recommendations should fit users’ recent preference and that users’ recent behavior is more important than their older, historical behaviors. So, in our engine, we will put more weight on users’ recent behaviors. In our system, CTR of recommendations that originate from users’ recent watch behavior is 1.8 times higher than CTR of recommendations originating from users’ old watch behavior.

     

     

    Novelty

    Just because a recommendation system can accurately predict user behavior does not mean it produces a show that you want to recommend to an active user. For example, “Family Guy” is a very popular show on Hulu, and thus most users have watched at least some episodes from this show. These  users do not need us to recommend this show to them — the show is popular enough that users will decide whether or not to watch it by themselves.

    Thus, novelty is also an important metric to evaluate recommendations. The first way we think can increase novelty is by revising ItemCF algorithm:

    1. First, we will decrease weight of popular shows that users have watched before.
    2. Then, we’ll put more weight on shows that are not only similar to shows the active user watched before, but also less popular than shows the active user watched before.

    Explanation-based Diversity

    Most users have diverse preferences, so the recommendation should also meet their diverse interests. In our system, we use explanations to diversify our recommendations. We think a diverse recommendation means most of the recommendation shows have different explanations.

    We have performed an A/B test to show the usefulness of diversification (shown in the above figure). The results of the experiment show that, for active users who had previously watched 10 or more shows, diversification can increase recommendation CTR significantly.

    Temporal Diversity

    A good recommendation system should not generate static recommendations. Users want to see new suggestions every time they visit the recommendation system. If a user has new behaviors, she will find her recommendations have changed because we have put more weight on the user’s recent behaviors. But if a user has no new behaviors, we also need to change our recommendations. We use three methods to keep temporal diversity of our system:

    1. First, we’ll recommend recently-added shows to users. Many new shows are added to Hulu every day, and we will suggest these shows to users who will like them. Thus, users will see fresh ideas for shows to watch when new ones are added.
    2. Second, we will randomize our recommendations. Randomization is the simplest way to keep recommendations fresh.
    3. Finally, we’ll decrease rank of recommendations which users have seen many times. This is called implicit feedback, and data show that CTR is increased by 10% after using this method.

    Performance of Hulu’s Recommendation Hub

    The recommendation hub is a personal recommendation page for every user. On this page users will see 6 carousels. The top carousel is “top recommendations”, which includes shows that we think users will prefer very much. After top recommendations, there are three carousels for three genres. These three genres are selected by analyzing users’ historical preferences. The next carousel is bookmarks, which include shows that users have indicated they’d like to watch later. The last carousel is filled with shows that the user has already rated. This carousel is designed to collect more explicit feedback from users.

    We have performed an A/B test to compare our recommendation algorithms with two simple recommendation algorithms: Most Popular (which recommends the most popular shows to every user) and Highest Rated (which recommends highly-rated shows to every user). As shown in the above figure, experiment results show that the CTR of our algorithm is much higher than both simple methods.

    Lessons

    Every user behavior can reflect user preferences.

    In our system, we use a slew of user behaviors to come up with our recommendations. We’ve calculated the CTR of recommendations originating from different types of behaviors. As shown in Figure 3, we can see that recommendations from every type of behavior can generate recommendations that will be clicked by users.

    Figure 3 : CTR of recommendations come from different types of behaviors

    Explicit Feedback data is more important than implicit feedback data

    As shown in Figure 3, CTR of recommendations that originate from users’ historically loved (vote 5 stars on shows) and liked (vote 4 stars on shows) behaviors is higher than CTR of recommendations that come from users’ historical subscribe/watch/search behavior. So although the size our explicit feedback data is much smaller than implicit feedback data, they’re much more important.

    Recent behaviors are much more important than old behaviors

    Novelty, Diversity, and offline Accuracy are all important factors

    Most researchers focus on improving offline accuracy, such as RMSE, precision/recall. However, recommendation systems that can accurately predict user behavior alone may not be a good enough for practical use. A good recommendation system should consider multiple factors together. In our system, after considering novelty and diversity, the CTR has improved by more than 10%.

    Based on the paper “Recommendation System at Hulu” by Liang Xiang, Hua Zheng and Hang Li.
    Hua Zheng is the senior lead developer in charge of the Hulu content recommendation and behavior targeting systems.
    Dr. Xiang and Dr. Li, associate researchers, are working together on the recommendation system, helping users discover and enjoy relevant premium videos.
  • Last comment: May 8th 2012 3 Comments

    Introducing Ectyper, a dynamic image generation service

    August 22nd, 2011 by Daniel Bear

    Ectyper is a Python project designed to make it trivial to create an image transformation web service. It’s still in the early stages, but is being used in production here at Hulu. We’re releasing it under an open source license, so you’re welcome to use it (see the end of this post for details).

    All you have to do to get started is extend the base Ectyper classes to define where your source images reside. The codebase is a set of Tornado request handlers that allows on-the-fly conversion of images according to a request’s query string options. By default it provides a way to resize, reflect and reformat images.

    Motivation

    Every video, series, network, and genre on Hulu has an image to represent it. Originally these images were generated as part of our ingestion and publish workflows. This is basically when content providers provide raw assets and we perform the automated and manual steps to prepare it for user-visible products. A predefined set of image formats and dimensions were generated and stored alongside the output of our video transcoding tasks. These cut images were then pushed to our origin servers and/or our content delivery network (CDN) servers for storage and highly cached delivery.

    This worked fine for hulu.com, but as we started to build new user experiences for Hulu Plus on mobile phones, tablets, gaming consoles, set-top boxes, and connected TV/Blu-Ray players, it became clear that we would need a variety of new image dimensions. While we could just have the device applications resize the existing images, we had a few motivations for creating custom sizes to fit each UI layout. First, visual quality can suffer from scaling artifacts on certain devices. Also, for performance reasons, it’s preferable to fetch the smallest possible file size, especially on mobile devices on cell networks. And we observed that some low-powered connected TVs could not scale images at all without incurring drastic performance hits.

    Additional Transformations

    While image resizing was the initial motivation for Ectyper, we continued to find opportunities to leverage server-side image transformations. For example, the new UIs our design team dreamed up made use of effects such as image reflections. Certain devices had the power and APIs to do these client-side, but most low-powered devices didn’t, so we added additional parameters to Ectyper to perform these types of operations. Anything you can perform using the underlying image libraries could be exposed in this way.

    Implementation

    Ectyper provides a primary ImageHandler class (extending Tornado’s RequestHandler class). This class parses each request’s query string and converts it into a command-line call into ImageMagick’s convert. This class is wrapped by ectyper.magick.ImageMagick which can be overridden to provide other options.

    A request into an ImageHandler will automatically generate an ImageMagick object based on standard options (see help(ectyper.handlers.ImageHandler) for more). Once your handler calls convert_image() with a local file path or a remote URL, ImageHandler kicks off an ImageMagick convert process with the calculated options. The result is streamed to the browser. You can optionally extend CachingImageHandler or FileCachingImageHandler to stream the output elsewhere for caching purposes.

    A full list of options supported by default:

    &size=NxM
       Resize the source image to N pixels wide and M pixels high.
    
    &maintain_ratio=1
       Maintain aspect ratio when resizing (ignored if size is not
       provided).  If requested size is a different aspect ratio than
       the source image, the resize will scale to fit the width or height,
       whichever is reached first.  The other dimension will be centered
       vertically or horizontally as necessary.  Defaults to 0.
    
    &reflection_height=N
       Flip the image upside down and apply a gradient to mimic a
       reflected image.  reflection_alpha_top and reflection_alpha_bottom
       can be used to set the gradient parameters.
    
    &reflection_alpha_top=N
       The top value to use when generating the gradient for
       reflection_height, ignored if that parameter is not set.  Should be
       between 0 and 1. Defaults to 1.
    
    &reflection_alpha_bottom=N
       The bottom value to use when generating the gradient for
       reflection_height, ignored if that parameter is not set.  Should be
       between 0 and 1.  Defaults to 0.
    
    &format=(jpeg|png|png16)
       Format to convert the image into.  Defaults to jpeg.
       png16 is 24-bit png pre-dithered for 16-bit (RGB555) screens.

    Examples

    Here’s a simple example. Assume you have a directory of images at /path/to/images and this hulu.jpg is one of them:

    Here’s a trivial ImageHandler that directs the request path to the image directory:

    from ectyper.handlers import ImageHandler
    import os
    from tornado import web, ioloop
    
    class StreamLocal(ImageHandler):
        def handler(self, *args):
            self.convert_image(os.path.join("/path/to/images", args[0]))
    
    app = web.Application([
        ('/images/(.*)', StreamLocal),
    ])
    
    if __name__ == "__main__":
        app.listen(8888)
        ioloop.IOLoop.instance().start()

    Now if you run the above to start up this ImageHandler, a resize can be performed like this: http://localhost:8888/images/hulu.jpg?size=200×96

    A reformat like this: http://localhost:8888/images/hulu.jpg?format=png

    And a reflection like so: http://localhost:8888/images/hulu.jpg?size=200×96&format=png&reflection_height=60

    Check out examples.py in the source code for samples of how to introduce logic in the handler to achieve more interesting use cases.

    Tornado

    Tornado serves as the underlying web server. It’s well suited for the use case due to its asynchronous nature. As a result, both the image transformations and necessary interactions with other Hulu services are non-blocking.

    ImageMagick

    The initial implementation leveraged Python Image Library, which was intuitive to develop the necessary transformation operations. After a variety of tests, we found the output quality of ImageMagick to be superior, although the programmatic interface is less intuitive.

    Ectyper in Production

    Image processing is an inherently costly operation relative to many of the tasks our backend services perform. We’ve gone through a variety of designs to bring the Ectyper functionality we require up to production scale, and we’re still tweaking it. Caching is critically important. We leverage an on-disk cache for each of the Ectyper machines (refer to FileCachingImageHandler), as well as edge caching on our CDNs’ servers. While it’s possible to perform transformations on-the-fly, we instead favor a fail-first approach where new transformations are entered into a queue for background processing. When we know new transformations are going to be utilized in production, we have a simple warmer to backfill the new requests without impacting live user requests.

    Future

    Remember, Ectyper is still in the early stages. Here are a few possible next steps:

    • Switch to MagickWand/AsyncHTTPClient instead of forking convert/curl.
    • Change the ImageMagick loading method (right now you set a static variable on the ImageHandler to the class which is a little wonky).
    • Push more of the backfill and warming functionality we’re using in production into the base Ectyper codebase.
    • Add more image transformations (http://www.imagemagick.org/image/examples.jpg).
    • Extract Tornado and ImageMagick functionality into pluggable modules.

    Source

    The Ectyper repository lives at https://github.com/hulu/ectyper, and we would encourage you to join the discussion group at http://groups.google.com/group/ectyper-dev and let us know how you’re using Ectyper. Your feedback and patches are welcome!

    Daniel Bear leads software development for connected device applications.
    Last comment: Oct 17th 2011 1 Comment