Editor’s note: The term ‘computational photography’ gets used a lot these days, but what exactly does it mean? In this article, the first in a three-part series, guest contributor Vasily Zubarev takes us on a journey from present to future, explaining computational photography today, where it’s going and how it will change the very essence of photography.
Series overview:
- Part I: What is Computational Photography?
- Part II: Computational sensors and optics (coming soon)
- Part III: Computational lighting, 3D scene and augmented reality (coming soon)
You can visit Vasily’s website, where he also demystifies other complex subjects. If you find this article useful, we encourage you to give him a small donation so that he can write about other interesting topics.
Computational Photography: From Selfies to Black Holes
It’s impossible to imagine a smartphone presentation today without dancing around its camera. Google makes the Pixel shoot in the dark, Huawei zooms like a telescope, Samsung puts lidars inside, and Apple presents the world’s new roundest corners. An illegal level of innovation is happening here.
DSLRs, on the other hand, seem half dead. Sony showers everybody with a rain of new sensor megapixels every year, while camera manufacturers lazily bump the minor version number and keep lounging on piles of cash from moviemakers. I have a $3,000 Nikon on my desk, but I take an iPhone on my travels. Why?
I went online with this question. There, I found a lot of debate about “algorithms” and “neural networks”, though no one could explain how exactly they affect a photo. Journalists loudly read out megapixel counts from press releases, bloggers shut down the Internet with yet more unboxings, and camera nerds flood it with musings on the “sensual perception of the sensor’s color palette”. Ah, Internet. You gave us access to all the information. Love you.
So I spent half my life trying to understand the whole thing on my own. I’ll try to explain everything I found in this article; otherwise I’ll forget it in a month.
What is Computational Photography?
Everywhere, including Wikipedia, you get a definition like this: computational photography is a set of digital image capture and processing techniques that use digital computation instead of optical processes. Everything is fine with it except that it’s bullshit. The fuzziness of the official definitions kind of indicates that we still have no idea what we are doing.
Stanford professor and computational photography pioneer Marc Levoy (who was also behind many of the innovations in Google’s Pixel cameras) gives another definition: computational imaging techniques that enhance or extend the capabilities of digital photography, in which the output is an ordinary photograph, but one that could not have been taken by a traditional camera. I like it more, and it’s the definition I’ll follow in this article.
So, smartphones were to blame for everything. They had no choice but to give life to a new kind of photography: computational.
They had small, noisy sensors and tiny, slow lenses. According to all the laws of physics, they could only bring us pain and suffering. And they did, until some devs figured out how to use their strengths to overcome their weaknesses: fast electronic shutters, powerful processors, and software.
Most of the significant research in the computational photography field was done in 2005-2015, which counts as yesterday in science. That means, right now, just in front of our eyes and inside our pockets, there’s a new field of knowledge and technology rising that never existed before.
Computational photography isn’t just about the bokeh on selfies. The recent photograph of a black hole could not have been taken without computational photography methods. To take such a picture with a standard telescope, we would have had to make it the size of the Earth. However, by combining data from eight radio telescopes at different points on our Earth-ball and writing some cool Python scripts, we got the world’s first picture of the event horizon.
It’s still good for selfies though, don’t worry.
- Link: Computational Photography: Principles and Practice
- Link: Marc Levoy: New Techniques in Computational Photography
(I’m going to insert links like these throughout the story. They will lead you to rare, brilliant articles and videos I found, and let you dive deeper into a topic if you suddenly become interested, because I physically can’t cover everything in one article.)
The Beginning: Digital Processing
Let’s get back to 2010. Justin Bieber had released his first album and the Burj Khalifa had just opened in Dubai, but we couldn’t even capture these two great events of the universe because our photos were noisy 2-megapixel JPEGs. We felt the first irresistible desire to hide the worthlessness of mobile cameras with “vintage” presets. Instagram came out.
Math and Instagram
With the release of Instagram, everyone got obsessed with filters. As the man who reverse engineered the X-Pro II, Lo-Fi, and Valencia for, of course, research (hehe) purposes, I still remember that they comprised three components:
- Color settings (Hue, Saturation, Lightness, Contrast, Levels, etc.) are simple coefficients, just like in any preset photographers have used since ancient times.
- Tone mapping is a vector of values, each of which tells us that “red with a value of 128 should be turned into a value of 240”. It’s often represented as a single-pixel-tall picture, like the one shipped with the X-Pro II filter.
- Overlay is a translucent picture with dust, grain, a vignette, and everything else that can be laid on top to get the (not at all, yeah) banal old-film effect. Rarely used. (A rough sketch of how these three pieces fit together follows this list.)
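Here’s a minimal sketch of how those three ingredients might combine, assuming NumPy and Pillow; the function name, parameter names, and specific coefficients are mine for illustration, not anything Instagram actually shipped.

```python
# A minimal sketch of the three classic filter ingredients, assuming NumPy and
# Pillow. All names and coefficients here are illustrative, not Instagram's.
import numpy as np
from PIL import Image

def apply_filter(img, contrast=1.1, saturation=1.2, lut=None, overlay=None, overlay_alpha=0.3):
    """Apply color settings, a per-channel tone-mapping LUT, and an overlay."""
    rgb = np.asarray(img.convert("RGB"), dtype=np.float32) / 255.0

    # 1. Color settings: simple per-pixel coefficients (contrast around mid-gray,
    #    saturation as a blend between the image and its grayscale version).
    rgb = (rgb - 0.5) * contrast + 0.5
    gray = rgb.mean(axis=2, keepdims=True)
    rgb = np.clip(gray + (rgb - gray) * saturation, 0.0, 1.0)

    out = (rgb * 255).astype(np.uint8)

    # 2. Tone mapping: a 256-entry lookup table per channel,
    #    e.g. "input value 128 becomes output value 240".
    if lut is not None:                      # assumed shape: (3, 256), dtype uint8
        for c in range(3):
            out[..., c] = lut[c][out[..., c]]

    # 3. Overlay: blend in a translucent texture with dust, grain, vignette.
    if overlay is not None:
        tex = np.asarray(overlay.convert("RGB").resize(img.size), dtype=np.float32)
        out = np.clip(out * (1 - overlay_alpha) + tex * overlay_alpha, 0, 255).astype(np.uint8)

    return Image.fromarray(out)
```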
Modern filters haven’t gone far beyond these three, but they have become a little more complicated mathematically. With the advent of hardware shaders and OpenCL on smartphones, they were quickly rewritten for the GPU, which was considered insanely cool. For 2012, of course. Today any kid can do the same thing in CSS, but he still won’t invite a girl to prom.
However, progress in the area of filters has not stopped there. The folks at Dehancer, for example, are getting very hands-on with non-linear filters. Instead of poor man’s tone-mapping, they use posher and more complex non-linear transformations which, according to them, open up many more possibilities.
You can do a lot with non-linear transformations, but they are incredibly complex, and we humans are incredibly stupid. As soon as non-linear transformations are involved, we prefer to hand the job to numerical methods or neural networks. The same thing happened here.
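To make the distinction concrete, here is a minimal sketch, assuming NumPy: a LUT can only remap each of 256 input values to a fixed output, while a non-linear transformation is a continuous function of the input. The gamma-plus-rolloff curve below is purely an illustrative choice, not Dehancer’s actual math.

```python
# A toy comparison of a lookup-table remap vs. a smooth non-linear curve.
# The specific curve (gamma lift + Reinhard-style rolloff) is an illustrative
# assumption, not any product's actual math.
import numpy as np

def lut_remap(values, lut):
    """Tone mapping as a table: 256 fixed output values, one per input value."""
    return lut[values]

def nonlinear_curve(x, gamma=2.2, white=1.5):
    """Tone mapping as a continuous function of a normalized input x in [0, 1]."""
    x = np.clip(x, 0.0, 1.0) ** (1.0 / gamma)        # lift the shadows
    return x * (1.0 + x / white**2) / (1.0 + x)      # roll off the highlights

# The curve can always be baked back into a LUT once the math is decided:
lut = (nonlinear_curve(np.linspace(0, 1, 256)) * 255).astype(np.uint8)
pixels = np.array([0, 64, 128, 255], dtype=np.uint8)
print(lut_remap(pixels, lut))
```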
Automation and Dreams of a “Masterpiece” Button
When everybody got used to filters, we started integrating them right into our cameras. History doesn’t record which manufacturer was first to do it, but just to understand how long ago it was: iOS 5.0, released in 2011, already had a public API for auto-enhancing images. Only Steve Jobs knows how long it had been in use before it was opened to the public.
The automation did the same thing any of us does when opening a photo editor: it fixed the shadows and highlights, increased the brightness, removed red-eye, and fixed the face color. Users didn’t even know that the “dramatically improved camera” was just the merit of a couple of new lines of code.
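As an illustration, here’s a minimal sketch of what such an auto-enhance could look like, assuming NumPy and Pillow: a generic histogram stretch plus a brightness nudge, not Apple’s (or anyone else’s) actual pipeline.

```python
# A minimal, hypothetical auto-enhance: clip the darkest/brightest 1% of pixels,
# stretch the remaining range to full contrast, then nudge overall brightness.
# This is a guess at the general idea, not any vendor's actual implementation.
import numpy as np
from PIL import Image

def auto_enhance(img, clip_percent=1.0, brightness=1.05):
    rgb = np.asarray(img.convert("RGB"), dtype=np.float32)
    lo, hi = np.percentile(rgb, [clip_percent, 100 - clip_percent])
    stretched = (rgb - lo) / max(hi - lo, 1e-6) * 255.0   # fix shadows and highlights
    out = np.clip(stretched * brightness, 0, 255)          # increase brightness a touch
    return Image.fromarray(out.astype(np.uint8))
```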
ML Enhance in Pixelmator.
Today, the battle for the Masterpiece button has moved to the machine learning field. Tired of playing with tone-mapping, everyone rushed onto the CNN and GAN hype train and started forcing computers to move the sliders for us. In other words, to use an input image to determine a set of optimal parameters that will bring that image closer to some particular, subjective understanding of “good photography”. Check out how it’s implemented in Pixelmator Pro and other editors that lure you with fancy “ML” features on their landing pages. It doesn’t always work well, as you can guess. But you can always take the datasets and train your own network to beat these guys, using the links below; a toy sketch of the general idea follows them. Or not.
- Link: Image Enhancement Papers
- Link: DSLR-Quality Photos on Mobile Devices with Deep Convolutional Networks
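As a toy illustration of the idea (not the method from the paper linked above), here’s a sketch assuming PyTorch: a tiny CNN looks at a downscaled photo and regresses a few “slider” values, which a conventional pipeline could then apply. The architecture and parameter names are hypothetical.

```python
# A toy PyTorch sketch of the "move the sliders for us" idea: a tiny CNN looks at
# a downscaled photo and regresses a few enhancement parameters (exposure,
# contrast, saturation). Architecture and parameter names are hypothetical;
# published models are far more sophisticated.
import torch
import torch.nn as nn

class SliderNet(nn.Module):
    def __init__(self, num_params=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),          # global average pooling
        )
        self.head = nn.Linear(32, num_params)

    def forward(self, x):
        x = self.features(x).flatten(1)
        return torch.tanh(self.head(x))       # each "slider" lands in [-1, 1]

# Usage: predict sliders for a batch of thumbnails, then apply them with
# ordinary image math (e.g. the apply_filter sketch earlier in the article).
net = SliderNet()
thumbnails = torch.rand(4, 3, 64, 64)         # 4 downscaled RGB photos
exposure, contrast, saturation = net(thumbnails)[0]
print(exposure.item(), contrast.item(), saturation.item())
```

Training such a network is the hard part: you need pairs of “before” photos and expert-edited “after” targets, which is exactly what datasets like the one in the paper above provide.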
Vasily Zubarev is a Berlin-based Python developer and a hobbyist photographer and blogger. To see more of his work, visit his website or follow him on Instagram and Twitter.