Developing for Chromecast, a $35 Internet-to-TV streaming stick. Worth your while?

It’s been over two months since Google made a splash with Chromecast, a 2-inch gizmo that plugs into your TV’s HDMI port and can be controlled by more than one device – phone, tablet and computer – as long as they’re on the same Wi-Fi network as the Chromecast.

One of our projects at Five minutes involved implementing the technology in an Android and an iOS app for a client. We gave an intro to developing apps for Google Chromecast a few weeks back at an Android meetup organized by the Google Developer Group Zagreb, hosted at the Faculty of Electrical Engineering and Computing, but we would like to share our experience with all those who couldn’t make it.

Google Cast is, essentially, a screen sharing technology that lets users send and control content like video from a small computing device like a phone, tablet or laptop to a large display device like a TV. For instance, if you’d like to send a YouTube video from your phone to your TV, you can do it easily, with no on-screen menus to navigate and no extra devices.

How it works

The Chromecast stick runs a scaled-down Chrome browser with a receiver application that uses WebSockets to maintain a control channel to mobile devices or a Chrome browser running on a Mac or a PC. For video playback, Google provides an implementation of a special protocol, called RAMP (Remote Application Media Protocol), on top of this channel. When playing a video, the control device uses RAMP to send a URL pointing to the video resource located on the internet or local network, which is then loaded in an HTML5 video element on the Chromecast stick. RAMP provides a means to easily send the most common video control commands from the control device to the stick and to send playback status in the other direction. This flow can be customized to facilitate authentication, DRM and other scenarios.

The interaction between control devices and the Chromecast stick is not limited to video playback, and other scenarios, like playing a multiplayer board game where each player has their own control device, are certainly possible. But, at least for now, Google places heavy emphasis on multimedia consumption.

UX matters – developing a video playback app

At this point, to even start developing for Chromecast, you need to be whitelisted by Google. Doing this will map your development Chromecast unit to the publicly accessible URL from which the receiver app will be loaded. If this step is skipped, Chromecast will silently refuse to load the receiver app.

Google also places great emphasis on user experience and provides a document with guidelines to follow when developing a video playback app specifically. These are thorough and cover everything from the placement and behaviour of the Chromecast button on the mobile device to the conditions for showing video playback status on the TV screen. For an app to be considered for public availability, it will need to adhere to most, if not all, of the guidelines. A good reference, although it does not follow the guidelines to the letter, is Google’s own YouTube app.

Arguably, the most important part of the whole user experience is the ability to control the playback of a video from multiple devices simultaneously. It is important that there is no difference between the device which initiated playback, and other devices that later joined to take control of the video. This poses several challenges:

  • Trying to play a video that is already being cast from another device needs to be handled differently from starting a new video
  • Defining the correct behaviour when switching to Chromecast playback from the local video player when the video that is being played locally is already being cast from another device – should the video resume playing from the last local playback position or should the player just attach and not interrupt the video on the Chromecast device?
  • Synchronizing the UI on all control devices when pausing, resuming playback, changing volume etc. on one control device
  • Detecting the number of connected control devices on the receiver and stopping playback only when the last device has disconnected

Dealing with these challenges means working on both the sender and the receiver side of the system, sometimes fighting against the system or dealing with documentation that is, at this moment, sparse.

iOS sender application development

The sender part of the system will usually be embedded in an existing video playback application. How the Chromecast menu button (a central point for interaction with Chromecast as defined in the UX guidelines) will be implemented, where it will be placed and how switching from and to local playback will be performed therefore depends on the specific application.

The API that the Google Cast SDK provides is in essence similar to any other video player object (e.g. Cocoa’s own AVPlayer). It provides methods to load a video, pause or resume playback, set volume, get the current playback position, enable autoplay when the video loads etc., and it also provides a delegate callback that is called when the playback status has changed. This makes it convenient to embed in an existing video player app, especially if custom wrappers around native player objects and APIs already exist, and conceptually makes casting as simple as dealing with any other video player API. There are, however, a couple of implementation details to keep in mind.

One of these is that when the playback status update callback is called, the information on which state property has changed is not available, so on every status update each property of interest needs to be compared with its last known value to see if it changed. In addition, the status update callback is not called every time the playback position changes on the receiver side. Instead, the SDK tracks the playback time of the last status update and then tries to compute the current playback position when it is requested, without contacting the receiver. In theory, this should give the same result as requesting the info from the receiver, but in practice this mechanism couldn’t be relied on. The playback position being reported would often not change, and it would take a couple of seconds for it to finally jump to the correct position. Luckily, we can force a status update on the receiver side each time the playback position changes.
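The "remember the last value and compare" bookkeeping described above can be sketched roughly as follows. This is only an illustration written in JavaScript rather than in the sender’s native language; createStatusTracker and the property names playerState, volume and muted are hypothetical and are not taken from the Google Cast SDK.

```javascript
// Hypothetical helper, not part of the Google Cast SDK.
// Remembers the last seen value of each tracked property and reports
// which of them differ when a new status snapshot arrives.
function createStatusTracker(trackedKeys) {
  var last = {};
  return function onStatusUpdate(status) {
    var changed = {};
    trackedKeys.forEach(function (key) {
      if (last[key] !== status[key]) {
        changed[key] = { from: last[key], to: status[key] };
        last[key] = status[key];
      }
    });
    return changed;
  };
}

// Example: only playerState differs between the two snapshots.
var track = createStatusTracker(['playerState', 'volume', 'muted']);
var first = track({ playerState: 'Stopped', volume: 0.5, muted: false });
var second = track({ playerState: 'Playing', volume: 0.5, muted: false });
```

On the first update every tracked property is reported as changed (from undefined), which is usually what a UI needs in order to initialize itself; later updates report only real differences.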

The most important part of the playback status is probably the player state property. It can take one of four values:

  • Unknown
  • Idle
  • Stopped
  • Playing

If no session to a Chromecast device is established, the player state will be Unknown. Once the session is established and the receiver app launches, the player state will be Idle. After the video has been loaded, the player state will be Stopped. Starting or resuming playback will change the state to Playing, and pausing playback will change the state back to Stopped. When the video ends, the player state will change back to Idle. This change from Stopped/Playing back to Idle is very important, since it is the only means of detecting on the sender side that playback has reached the end of the video (short of sending some kind of flag via the content info map).
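The end-of-video check described above amounts to watching for one particular state transition. A minimal sketch, again in JavaScript for illustration; the PlayerState object and didPlaybackEnd are hypothetical names that merely mirror the four states listed above:

```javascript
// Hypothetical constants mirroring the four player states from the article.
var PlayerState = {
  UNKNOWN: 'Unknown',
  IDLE: 'Idle',
  STOPPED: 'Stopped',
  PLAYING: 'Playing'
};

// Returns true when a loaded video (Stopped or Playing) drops back to Idle.
function didPlaybackEnd(previousState, currentState) {
  var wasLoaded = previousState === PlayerState.STOPPED ||
                  previousState === PlayerState.PLAYING;
  return wasLoaded && currentState === PlayerState.IDLE;
}

var endedAfterPlaying = didPlaybackEnd(PlayerState.PLAYING, PlayerState.IDLE);
var pausedOnly = didPlaybackEnd(PlayerState.PLAYING, PlayerState.STOPPED);
```

Since loading a video also passes through the Idle state, a real app would probably need to combine this check with knowledge of whether it has just requested a load itself.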

As mentioned before, to meet Google’s user experience guidelines, we need to be able to attach the sender app to a video that is already playing on a Chromecast device. This is not handled automatically by the SDK: loading a video that is already playing will first switch to the Idle state and then start playback on Chromecast, as if a different video had been loaded. Therefore, the currently playing URL needs to be tracked and compared with the URL that is being loaded. If the two match, video loading should be skipped, but any post-loading logic will probably still need to be executed in order to get the app into a consistent state.
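The attach-or-load decision could look roughly like the sketch below. Everything here is hypothetical: loadOrAttach, the player object and its load and finishSetup methods stand in for whatever wrapper the app uses around the SDK, and the currently playing URL is assumed to be available from the last status update.

```javascript
// Hypothetical sketch, not SDK code. `player` is an app-level wrapper with
// load(url) and finishSetup(url) methods (both names are assumptions).
function loadOrAttach(player, currentlyPlayingUrl, requestedUrl) {
  var alreadyPlaying = currentlyPlayingUrl !== null &&
                       currentlyPlayingUrl === requestedUrl;
  if (!alreadyPlaying) {
    // A different video (or nothing) is on the receiver: do a real load.
    player.load(requestedUrl);
  }
  // Post-loading logic runs in both cases to keep the app consistent.
  player.finishSetup(requestedUrl);
  return alreadyPlaying ? 'attached' : 'loaded';
}

// Example with a fake player that records the calls made on it.
var calls = [];
var fakePlayer = {
  load: function (url) { calls.push('load ' + url); },
  finishSetup: function (url) { calls.push('setup ' + url); }
};
var r1 = loadOrAttach(fakePlayer, 'http://example.com/a.mp4', 'http://example.com/a.mp4');
var r2 = loadOrAttach(fakePlayer, 'http://example.com/a.mp4', 'http://example.com/b.mp4');
```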

Receiver application development

A receiver application is just a simple HTML/CSS/Javascript single-page app that has one HTML5 media (audio or video) element defined and runs in a stripped-down Chrome browser. It is completely independent of the rest of the system, and the only hard requirement is the need to use the Javascript SDK that Google hosts on their servers. There are a couple of things to keep in mind when developing the receiver app:

  • the Javascript and rendering engines are the same as in desktop/mobile versions of Chrome and any libraries that are usually used when developing HTML apps can be used on Chromecast as well (e.g. jQuery, underscore, LESS, etc.)
  • the deployment target is just a Chrome browser, which means we have a modern, full-featured browser with support for the latest HTML5 and CSS3 standards and standard drafts and a fast Javascript engine, and we are not required to handle various browser inconsistencies (e.g. no need to use -moz- or -ms- CSS prefixes)
  • the app is running full-screen on a narrow set of possible screen resolutions, so the need for responsive design considerations is minimal, if not non-existent; in addition, user experience guidelines strictly define the must-have screens and their general look – in effect, this means that our app will transition between a couple of absolutely positioned divs that stretch across the whole screen (top: 0, bottom: 0, left: 0, right: 0), and video UI overlaid in some player states will also be absolutely positioned
  • since the app is running on constrained hardware that also needs to handle playback of HD videos, any UI animations/transitions will need to be as subtle and simple as possible, and preferably done via CSS instead of Javascript
  • UI graphics are loaded from the receiver host (location defined when whitelisting the device) located on the Internet, and it may take some time to download them for the first time, which can degrade the experience if not handled properly; try implementing any graphics in CSS if possible, and if not, prefer embedding images via the Data URI scheme, since they are as much a part of your app as other page elements. If it is not possible to include an image via the Data URI scheme (e.g. a video thumbnail), try preloading it first or use a Javascript library to detect when it has finished loading, and show the image element only then.
  • While not required, the Google Closure Library is a natural fit to use
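The image preloading advice from the list above can be sketched without any library by using a detached Image object. The function name showWhenLoaded and its optional createImage parameter (there only so the loader can be swapped out) are hypothetical; in a browser the default new Image() is used.

```javascript
// Hypothetical helper: keeps `element` hidden until `url` has been fetched.
// `createImage` is an optional factory so the loader can be replaced.
function showWhenLoaded(element, url, createImage) {
  var loader = createImage ? createImage() : new Image();
  element.style.visibility = 'hidden';
  loader.onload = function () {
    // The image is now in the browser cache, so assigning src is cheap.
    element.src = url;
    element.style.visibility = 'visible';
  };
  loader.onerror = function () {
    // Leave the element hidden if the image could not be loaded.
    element.style.visibility = 'hidden';
  };
  loader.src = url; // starts the download
  return loader;
}
```

In the receiver this would typically be called with the thumbnail’s img element and the URL taken from the content info map, so the TV never shows a half-loaded picture.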

The most important part of the receiver application, when implementing a RAMP-based receiver app, is the cast.receiver.RemoteMedia “class”. It acts as a wrapper and a bridge between an HTML5 media element and its events on one side and the communication channels that the SDK establishes between the receiver app and the sender apps running on control devices on the other side.

To implement the functionality specific to the app, methods of cast.receiver.RemoteMedia need to be overridden. Existing functionality can be dropped and completely reimplemented, but we found that we mostly needed to make some additions to the current behaviour related to managing the UI or the content info map. Unfortunately, this object does not have its prototype.constructor property set to the proper constructor, so we can’t use the Closure Library’s goog.base function to refer to the base class implementation. Using the superClass_ property provided by the Closure Library also didn’t work, as it seems some of the methods need to be defined on the leaf of the prototype chain. In the end, this worked perfectly for us:

goog.provide('custom.namespace.ExtendedRemoteMedia');

goog.require('goog.object');

custom.namespace.ExtendedRemoteMedia = function() {
    // define private properties and methods here
};

goog.inherits(custom.namespace.ExtendedRemoteMedia, cast.receiver.RemoteMedia);

goog.object.extend(custom.namespace.ExtendedRemoteMedia.prototype, {
    // keep a reference to the inherited implementation before overriding it
    baseMethod: custom.namespace.ExtendedRemoteMedia.prototype.method,
    method: function() {
        // do stuff
        this.baseMethod();
        // do stuff
    }
});
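For readers without the Cast SDK or the Closure Library at hand, the same "alias the inherited method, then override it" trick can be tried in plain Javascript. In this sketch Object.create and Object.assign are stand-ins for goog.inherits and goog.object.extend, and Base and Extended are hypothetical placeholders for cast.receiver.RemoteMedia and our subclass.

```javascript
// Stand-in for cast.receiver.RemoteMedia (hypothetical).
function Base() {}
Base.prototype.method = function () {
  return ['base'];
};

// Stand-in for the extended class.
function Extended() {
  Base.call(this);
}
Extended.prototype = Object.create(Base.prototype);
Extended.prototype.constructor = Extended;

Object.assign(Extended.prototype, {
  // The object literal is evaluated before the override is installed, so
  // this lookup still resolves through the prototype chain to Base's method.
  baseMethod: Extended.prototype.method,
  method: function () {
    var result = ['before'];
    result = result.concat(this.baseMethod());
    result.push('after');
    return result;
  }
});

var out = new Extended().method();
```

Because both baseMethod and method end up directly on the leaf prototype, no reference to a constructor or superclass property is needed at call time.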

In the end, we needed to override nine methods:

  • setMediaElement
  • onLoad
  • onMetadataLoaded
  • onLoadMetadataError
  • onOpen
  • onClosed
  • onEnded
  • onPlay
  • onStop

We also defined two new methods:

  • onTimeUpdate
  • onMediaElementError

We bound these to their respective media element events in the overridden setMediaElement method:

goog.object.extend(custom.namespace.ExtendedRemoteMedia.prototype, {
    baseSetMediaElement:
        custom.namespace.ExtendedRemoteMedia.prototype.setMediaElement,
    setMediaElement: function(mediaElement) {
        this.baseSetMediaElement(mediaElement);
        this.mediaElement_ = mediaElement;

        this.mediaElement_.addEventListener('timeupdate', this.onTimeUpdate.bind(this));
        this.mediaElement_.addEventListener('error', this.onMediaElementError.bind(this));
    }
});

In the onTimeUpdate method, in addition to doing some UI transitions, we forced the receiver to send a status update to its clients. This gave the sender much more precise playback position data:

goog.object.extend(custom.namespace.ExtendedRemoteMedia.prototype, {
    onTimeUpdate: function() {
        this.broadcastCurrentStatus();
        ...
    }
});

We also discovered that clients are not notified of all the errors that happen on the media element. Specifically, if the video can’t be loaded from the supplied URL, no error will be sent on the channel of the client that tried to load the video, so we needed to do this manually:

goog.object.extend(custom.namespace.ExtendedRemoteMedia.prototype, {
    baseOnLoad: custom.namespace.ExtendedRemoteMedia.prototype.onLoad,
    onLoad: function(channel, message) {
        // remember who requested the load so an error can be reported back
        this.loading_ = {
            channel: channel,
            message: message
        };
        this.baseOnLoad(channel, message);
        ...
    },
    baseOnMetadataLoaded:
        custom.namespace.ExtendedRemoteMedia.prototype.onMetadataLoaded,
    onMetadataLoaded: function(channel, message) {
        this.baseOnMetadataLoaded(channel, message);
        this.loading_ = null;
        ...
    },
    baseOnLoadMetadataError:
        custom.namespace.ExtendedRemoteMedia.prototype.onLoadMetadataError,
    onLoadMetadataError: function(channel, message) {
        this.baseOnLoadMetadataError(channel, message);
        this.loading_ = null;
    },
    onMediaElementError: function() {
        // only report errors that happen while a load is still pending
        if (this.loading_) {
            this.sendErrorResponse(
                this.loading_.channel,
                this.loading_.message,
                cast.receiver.RemoteMedia.ErrorCode.LOAD_FAILED);
            this.loading_ = null;
        }
    }
});

Finally, we needed to keep track of the open channel count. When no more channels were connected, we needed to unload the video and set up a timer to close the app after some period of inactivity:

// uses goog.async.Delay, so goog.require('goog.async.Delay') is needed
goog.object.extend(custom.namespace.ExtendedRemoteMedia.prototype, {
    baseOnLoad: custom.namespace.ExtendedRemoteMedia.prototype.onLoad,
    onLoad: function(channel, message) {
        ...
        // a new load cancels any pending shutdown
        this.closeDelay_ && this.closeDelay_.dispose();
        this.closeDelay_ = null;
        ...
    },
    baseOnClosed: custom.namespace.ExtendedRemoteMedia.prototype.onClosed,
    onClosed: function(event) {
        this.baseOnClosed(event);
        var channels = this.getChannels();
        if (!channels || !channels.length) {
            // unload video
            ...
            // close the app after five minutes without connected clients
            this.closeDelay_ = new goog.async.Delay(function() {
                window.close();
            }, 5 * 60 * 1000);
            // a goog.async.Delay does not fire until it is started
            this.closeDelay_.start();
        }
    }
});

Our view on Chromecast

Although developing for the Chromecast exposes the fact that it is not yet a mature product (from the third-party application development standpoint), which can in turn cause plenty of headaches, in the end it proved to be a rewarding and interesting endeavour. The highlights for me were working across different platforms (iOS, Android, web) and using bleeding-edge technologies like CSS3 transitions and animations or HTML5 video.


While Google currently positions the device primarily as a remote media streamer, there is potential here for a more general solution that bridges the divide between the two kinds of devices we mostly use for entertainment today – mobile phones and tablets on one side and big-screen TVs on the other. With projects like http://www.quakejs.com/ it is not hard to imagine a future where a computer game could be running in a browser on a big-screen TV, while a phone or a tablet is used as a controller and secondary display. Although Chromecast doesn’t provide us with this ability just yet, it is certainly a step in the right direction, which is just one reason to follow this product and the technology it is based on in the months to come.

Additional article contributors: Krešimir Mišura & Branimir Conjar