Are we Approaching Responsive Images the Wrong Way?

There's a lot of talk about responsive images right now. Some white knights in the web design community are rallying around this problem, and fair play to them: it's a tough problem that needs solving. Responsive design is great in theory, but if we force mobile users to download ridiculously sized images and grind their experience to a halt, who can blame those who declare separate mobile sites to be the better solution?

Our brightest hope for the future seems to rest on the shoulders of the (currently fictional) <picture> element. Here's the HTML set-up:


    <picture alt="A giant stone face at The Bayon temple in Angkor Thom, Cambodia">
        <source src="small.jpg">
        <source src="medium.jpg" media="(min-width: 400px)">
        <source src="large.jpg" media="(min-width: 800px)">
    </picture>

Simple enough, right? Any browser width below 400 pixels gets served small.jpg, widths between 400 and 800 pixels get medium.jpg, and everything wider gets large.jpg. A simple, purely front-end solution, no doubt. There are some drawbacks to this method, though.
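To make that behaviour concrete, here's a minimal sketch in Python of the selection logic the markup encodes. The function name and structure are my own invention; it simply mirrors the breakpoints described above.

```python
def pick_source(viewport_width):
    """Pick an image the way the breakpoints above are described:
    small below 400px, medium from 400-799px, large from 800px up."""
    sources = [
        ("small.jpg", 0),     # no media query: matches everything
        ("medium.jpg", 400),  # (min-width: 400px)
        ("large.jpg", 800),   # (min-width: 800px)
    ]
    chosen = sources[0][0]
    for src, min_width in sources:
        if viewport_width >= min_width:
            chosen = src  # keep the last (largest) matching source
    return chosen

print(pick_source(399))  # small.jpg
print(pick_source(401))  # medium.jpg
print(pick_source(800))  # large.jpg
```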

Rigid breakpoints

Remember a technique we called Adaptive Layouts? Instead of using fluid, percentage-based layouts, it serves several fixed-width, pixel-based layouts via media queries. Browser widths that fall between breakpoints get an unoptimised layout, with empty space on either side of the design.

Doesn't this feel a lot like our proposed solution for the <picture> element? Any browser width that falls between the specified breakpoints still loads unused, wasted pixels. What if my device width is 401 pixels? Using the example code above, it loads medium.jpg, an image that has to be sized for widths approaching 800 pixels: nearly double what the device needs. And remember, that's per image, so it can add up to a lot of wasted bandwidth.

Adaptive Layouts eventually receded as the philosophy of Responsive Design took hold, advocating layouts accessible to all kinds of devices, not just those optimised for the devices you happened to target at build time.

Rigid, pre-defined breakpoints based on current devices... how much does this solution feel like Adaptive Layouts all over again?

It does not scale

Let's imagine that, in the future, we might need to adapt our images to other factors. Factors like colour: no point wasting bandwidth on RGB colour values for a greyscale screen, right? Or factors like network speed: even if the device has a large screen, is there any point loading a massive image over a slow connection? Maybe we'd rather load an aggressively compressed JPEG?

Let's imagine the same code with just these two factors thrown in:


    <picture alt="A giant stone face at The Bayon temple in Angkor Thom, Cambodia">
        <source src="small_bw.jpg">
        <source src="small_color.jpg" media="(color)">
        <source src="medium_bw.jpg" media="(min-width: 400px)">
        <source src="medium_bw_slow.jpg" media="(min-width: 400px) and (network: <= 3g)">
        <source src="medium.jpg" media="(min-width: 400px) and (color)">
        <source src="medium_slow.jpg" media="(min-width: 400px) and (color) and (network: <= 3g)">
        <source src="large_bw.jpg" media="(min-width: 800px)">
        <source src="large_bw_slow.jpg" media="(min-width: 800px) and (network: <= 3g)">
        <source src="large.jpg" media="(min-width: 800px) and (color)">
        <source src="large_slow.jpg" media="(min-width: 800px) and (color) and (network: <= 3g)">
    </picture>

A futuristic example, but not an unrealistic one: these factors already exist and already affect people browsing the web. Now imagine adding yet another factor into the mix, like the very real and very scary pixel-density (PPI) variations. We just doubled our mark-up again; every factor combination also needs a hi-res copy.
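The growth is easy to quantify: with three size breakpoints and n independent two-valued factors, full coverage needs 3 × 2ⁿ sources. A quick sketch (the factor names here are just labels for counting):

```python
from itertools import product

# Three size breakpoints crossed with two binary factors, as in the
# markup above: full coverage needs one source per combination.
sizes = ["small", "medium", "large"]
factors = {"color": ["bw", "color"], "network": ["fast", "slow"]}

variants = list(product(sizes, *factors.values()))
print(len(variants))  # 12

# Add one more binary factor, pixel density, and the mark-up doubles:
factors["density"] = ["1x", "2x"]
variants = list(product(sizes, *factors.values()))
print(len(variants))  # 24
```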

To hell and back

All I've done so far is point out two major problems with the currently proposed concept. I want this conversation to continue, not die, so here's my proposal:

You know what's really good at handling complex logic and serving different types of content? Server-side code. Programming languages are very adept at handling complex logic with many variables, and there are some very mature and robust image generators written in server-side code that do all kinds of magic: dynamically resizing, reformatting, and modifying images. It's no surprise that the most robust solution so far takes advantage of dynamic image generation on the server.

An algorithm could dynamically resize to the exact size required by the visiting device, without any extra work from developers or clients. Just upload the largest size you have and let the server scale it down. We could also conditionally convert the image to greyscale and increase compression when we need to, again with no extra work from us. It scales like a dream.
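As a sketch of the idea, the whole decision could collapse into one small function that an image generator then executes. Everything here — the function name, the thresholds, the quality values — is my own illustration, not an existing API:

```python
def image_params(viewport_width, supports_color, network="fast"):
    """Decide how to transform the master image for one request.

    Returns (target_width, greyscale, jpeg_quality): a hypothetical
    recipe a server-side image generator could then apply.
    """
    width = min(viewport_width, 2048)          # never upscale past the master
    greyscale = not supports_color             # drop colour for b/w screens
    quality = 45 if network == "slow" else 80  # compress harder on slow links
    return width, greyscale, quality

print(image_params(401, True))           # (401, False, 80)
print(image_params(401, False, "slow"))  # (401, True, 45)
```

Note the 401-pixel device from earlier now gets a 401-pixel image: exact-size, no wasted bandwidth, and no combinatorial mark-up.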

What's the catch?

Of course there's a problem with all this! The problem with moving all this logic onto the server is context: the server needs to know the variables involved to make these decisions. Variables like colour support, screen size, network speed, and pixel density are only available on the front-end, through JavaScript APIs.

If we want this information to be available to the server, I see three options:

  1. Write a cookie in JavaScript to pass the values to the server. There are already libraries that do this.
  2. Use a user-agent library like WURFL to map the UA string to device capabilities.
  3. Pressure browser vendors to add more contextual information to the HTTP request.
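Option 1 is simple enough to sketch end to end. Assuming the client has written a cookie like `caps=width:401,color:1` (a made-up format, just for illustration), the server side is a few lines of parsing:

```python
def parse_caps_cookie(cookie_header):
    """Parse a hypothetical 'caps' cookie out of the Cookie header
    into a capabilities dict the image generator can act on."""
    caps = {}
    for part in cookie_header.split(";"):
        name, _, value = part.strip().partition("=")
        if name == "caps":
            for pair in value.split(","):
                key, _, val = pair.partition(":")
                caps[key] = val
    return caps

print(parse_caps_cookie("session=abc123; caps=width:401,color:1"))
# {'width': '401', 'color': '1'}
```

The obvious catch is the first request: the cookie doesn't exist until the JavaScript has run once, so the very first page load still has to guess.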

With this article I'm trying to do the third. Client requests already contain information that servers use to serve different content; let's just add a bit more. "I'd like this content, please. Oh, and by the way, here's some useful information to help you pick the right format for me."

Is the server the holy grail? Let me know what you think.