As OS updates sneak into our smartphones, uninvited and undetected, new features make their appearance and surprise users. AI, it seems, is getting a firmer grip on photographic practices.
Among the features that awaited me on my recent trip to the UK was one abysmal “Live” thingy that captures images before the user even clicks, creating unsolicited mini-videos instead of just the desired photograph. Not only does this feel creepy and invasive to me, it also gobbles up tons of memory (and therefore natural resources), and I struggle to imagine a scenario where this would be of any use to me …
Far more interesting, and even less expected, is a recommended framing guide that pops up automatically as soon as the phone is held steady in photo mode for a second or two. Instead of aggravation, this one produced amused puzzlement 😉
And so many questions … Let’s dig deeper
When my senile brain finally understood what the “Best framing” spot in the middle of the frame meant (I tried clicking it, feeling really past my prime for a few baffled seconds 😉 😉 😉), the idea amused me a lot. Maybe to compensate for my lack of mental speed, my amusement at the feature was quite smug. As if a phone were going to tell me, the great PJ, winner of absolutely every competition I entered (zero, that’s correct), and favourite photographer of my mummy, how to frame. Pshhh
But that soon faded, because of my twin interests in composition and artificial intelligence. So I decided to grab some pairs of shots – first my idea of the best shot, then the camera’s – to review them and present them to you, in order to decide how well AI was doing. The first pairing is above.
Both are valid. But I kinda prefer the phone’s more tranquil version. Ouch 😉
Above is another pair. There’s a slight difference here. The AI version is probably the one that would get published more easily, but my vote goes to mine, by a slight margin. What’s starting to emerge in my mind after inspecting a few shots is that the AI seems to be applying “rules”, in particular the @$#ffing rule of thirds 😉 How?
OK, so AI is any technology that allows human-like processing faculties to be performed automatically by a machine: speech, vision, mapping your way from A to B … But, most often, today, the term AI is (wrongly) used to talk about machine learning, a subset of AI that learns patterns from examples. Whenever a specification can be written, a developer can write a program that will produce exact results in accordance with the specification. But, in real life, very few problems actually lend themselves to accurate specifying.
How, for example, do you specify the action of finding a cat in a photograph? You count pointy ears, whiskers, round eyes … But what if the cat is turning its back to you? You’ll probably only spot one round feature, no ears and a long fluffy thing that’s not described in the specifications for a face. What if the cat is lying down on its back like an otter waiting for its stone? 😉 Plus, even if you constrain your task to identifying head-on cats, how do you identify a nose, eyes, whiskers, ears …? We don’t have specifications for those either. You could try one for identifying eyes based on the round shape. But what if that cat’s eyes are half shut, what if its pupils are tiny, or huge? What if there’s a strong reflection on the eyes? What if one eye is hidden behind a leaf? Is that cat no longer a cat? Nope, real life problems can rarely be specified accurately enough for exact software to be written.
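To make that concrete, here’s a deliberately naive sketch of what a hand-written “cat spec” might look like. The feature names are entirely hypothetical (real photos don’t come annotated with ear counts); the point is how quickly the rules fall apart:

```python
# A toy, hand-written "specification" for cat detection.
# The inputs are hypothetical: real photos don't come with these counts.

def looks_like_a_cat(pointy_ears: int, whiskers: int, round_eyes: int) -> bool:
    # The "spec": two pointy ears, some whiskers, two round eyes.
    return pointy_ears == 2 and whiskers > 0 and round_eyes == 2

print(looks_like_a_cat(2, 12, 2))  # head-on cat: True
print(looks_like_a_cat(0, 0, 0))   # the same cat, seen from behind: False!
print(looks_like_a_cat(2, 12, 1))  # one eye behind a leaf: False again
```

The cat fails its own specification the moment it turns around, which is exactly why we give up on exact rules.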
So we write inexact software that gets it right often enough.
Because it’s hard enough to write software that meets a spec, imagine writing software based on no spec at all … Impossible. What machine learning teams do instead is provide a set of examples (photos of cats, for example), a set of negative examples (pics of dogs, humans dressed as cats at a child’s birthday …) and some sort of decision process that answers Yes or No when fed a new example (such as a new image that wasn’t in the training set). The whole trick is to train this decision process (called the model) as we’d train a kid. Simple models include regression lines (a line that could split data points based on the number of legs they have, for example) and neural networks, which stack many such regression “lines” (not necessarily straight ones) with multiple inputs, a bit like actual neurons.
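For the curious, here’s a minimal sketch of that training idea, assuming two made-up numeric features per example (a real detector would learn from raw pixels). A perceptron – the simplest of those regression “lines” – nudges itself towards a line that separates the positives from the negatives:

```python
# Minimal perceptron: learn a separating line from labelled examples.
# The two features are hypothetical stand-ins (say, "ear pointiness" and
# "whisker count"); a real cat detector would learn from raw pixels.

# (feature_1, feature_2, label) with label 1 = cat, 0 = not-cat
training_set = [
    (0.9, 0.8, 1), (0.8, 0.9, 1), (0.7, 0.7, 1),  # cats
    (0.2, 0.1, 0), (0.1, 0.3, 0), (0.3, 0.2, 0),  # dogs, costumed kids ...
]

w1, w2, bias = 0.0, 0.0, 0.0  # the "model": one line in feature space

for _ in range(100):  # a few passes over the examples
    for x1, x2, label in training_set:
        prediction = 1 if (w1 * x1 + w2 * x2 + bias) > 0 else 0
        error = label - prediction  # 0 if right, +1/-1 if wrong
        w1 += 0.1 * error * x1      # nudge the line towards the example
        w2 += 0.1 * error * x2
        bias += 0.1 * error

# Ask it about an example that was not in the training set:
x1, x2 = 0.85, 0.75
print("cat" if (w1 * x1 + w2 * x2 + bias) > 0 else "not a cat")
```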
Training the model is an optimisation process, and the most interesting part of the whole thing, if you ask me. Back in another millennium, I did my PhD on the mathematical models behind those optimisation processes. Very interesting progress has been made since, in the way models can train themselves by playing simulations and evaluating the outcome of each. This is how the programs that beat grandmasters at chess and Go were trained: they played against themselves, under the supervision of a brilliant team of developers, and learned as they played. It’s also how helicopters learn to fly themselves and perform manoeuvres that no human could achieve. It’s impressive and brilliant stuff.
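Stripped of all the glamour, that optimisation looks something like this: measure the error, work out which way is downhill, take a small step, repeat. A toy sketch with a single parameter (real training does the same thing across millions of them):

```python
# Gradient descent in one dimension: find the w that minimises loss(w).
# Real training minimises an error measured over the whole training set,
# across millions of parameters, but the mechanics are identical.

def loss(w):
    return (w - 3.0) ** 2  # toy error curve, lowest at w = 3

def gradient(w):
    return 2.0 * (w - 3.0)  # slope of the loss at w

w = 0.0  # start from an arbitrary guess
for _ in range(50):
    w -= 0.1 * gradient(w)  # step against the slope

print(w)  # very close to 3.0, the optimum
```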
But nothing so complex is at play in smart-cameras, I’m guessing.
What I’m seeing in most of the AI-suggested framings is a strong reliance on the rule of thirds. And it works superbly. My guess is the team picked a large set of photographs heavily dominated by rule-of-thirds compositions and declared them to be good examples. Then another set of ‘other’ compositions favouring the center or something else – the sort I would do 😉 – and declared those to be negative examples. They then obtained a model that detects the main masses of colour and suggests a center spot that creates a rule-of-thirds composition from those, and uploaded it to the photo app in phones.
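If my guess is anywhere near the truth, the output stage could be as simple as this sketch: find the centroid of the dominant mass (a crude brightness-weighted centroid stands in here for whatever learned saliency model the phone actually uses) and shift the frame so it lands on the nearest thirds intersection. Every name and number below is my assumption, not the phone’s actual code:

```python
# Sketch: suggest a frame shift that puts the image's dominant mass on a
# rule-of-thirds intersection. A brightness-weighted centroid is a crude
# stand-in for a learned saliency model.

import numpy as np

def suggest_pan(image: np.ndarray):
    """image: 2D array of brightness values (height x width)."""
    h, w = image.shape
    ys, xs = np.indices(image.shape)
    total = image.sum()
    cy, cx = (ys * image).sum() / total, (xs * image).sum() / total

    # The four rule-of-thirds intersections of the current frame.
    thirds = [(h / 3, w / 3), (h / 3, 2 * w / 3),
              (2 * h / 3, w / 3), (2 * h / 3, 2 * w / 3)]
    ty, tx = min(thirds, key=lambda p: (p[0] - cy) ** 2 + (p[1] - cx) ** 2)

    # Pan the camera (down, right) by this much to land the mass there.
    return cy - ty, cx - tx

# Toy example: a bright blob dead centre in a dark frame.
img = np.zeros((300, 400))
img[140:160, 190:210] = 1.0
print(suggest_pan(img))  # suggests panning to push the blob off-centre
```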
Let’s be honest, it works brilliantly. In most cases, my initial nod goes to the camera’s suggested framing! On second viewing, I go back to some of mine, but not all of them! Sometimes the recommended framing is very close to my original idea, sometimes it departs from it significantly.
For the record, AI isn’t composing. I am, by placing myself in a specific spot and choosing the focal length. Nor is it correcting sloping horizons … I still have to do that in post (I didn’t in these examples, so as not to crop into the frame), for now. But AI is finding interesting framings from the chosen spot. And that’s great already. Not failsafe, but definitely worth a try. What baffles me is how the camera does this, beyond the model-recommendation part, which seems quite straightforward.
How does the camera “know” what data lies outside the frame that I’m using for my own composition? It has to know since it is recommending that some of it be included in the final photograph.
My guess is that, as with the “Live” feature mentioned at the start, the camera starts scanning as soon as it detects that it’s about to be used (as it is being raised to eye level). In a way, this is no less creepy than the “Live” gizmo. But the result subjectively feels like an interesting little tool rather than an invasion of my privacy. Maybe it’s just me 😉
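If that’s the case, the mechanics needn’t be exotic: a rolling buffer of recent preview frames, captured while the phone is being raised and swept across the scene, would give the camera imagery from beyond the current frame edges. A sketch of that idea (my assumption about the mechanism, not the phone’s actual code):

```python
# Sketch of a pre-capture ring buffer: keep the last N preview frames in
# memory at all times, so "before the click" imagery is already available
# when the shutter fires or a framing suggestion is computed.

from collections import deque

class PreCaptureBuffer:
    def __init__(self, capacity: int = 30):   # roughly 1 s at 30 fps
        self.frames = deque(maxlen=capacity)  # oldest frames drop off

    def on_preview_frame(self, frame):
        self.frames.append(frame)  # runs for every preview frame

    def on_shutter(self, frame):
        # The result can now include what happened just before the click.
        return list(self.frames) + [frame]

buf = PreCaptureBuffer(capacity=3)
for f in ["f1", "f2", "f3", "f4"]:  # preview frames streaming in
    buf.on_preview_frame(f)
print(buf.on_shutter("click"))      # ['f2', 'f3', 'f4', 'click']
```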
Two aspects of this fun innovation are particularly interesting to me.
First of all, my style hinges on clean edges and balanced weights (unless there’s a reason to introduce imbalance, such as in Shark attack). I was trained by books written by traditional guys such as Ansel Adams and Charlie Waite, and like it that way. But this can get stale and the AI version forces me to rethink and double check.
Plus, I tend to be quite lazy with the phone, mainly recording a memory with what appears to be a correct composition, opting to PP my way out of a mess whenever disaster strikes 😉 And this little tool forces me to be a little more careful. Instinctively, there’s always this little dialogue going on at the back of my brain: “hang on, why is the thingy not aligned with the center of my frame? What am I messing up now?” 😉
It is giving me options, not imposing itself. I love that.
The second aspect is related to the growing chasm between traditional cameras and smart-cameras (let’s call them that, rather than phone cameras).
For the millionth time, why are camera makers still head-butting in a pixel race that was obsolete ten years ago, when phones have not only (largely) caught up in purely qualitative terms but are also providing really interesting and useful features? It’s like traditional manufacturers are begging to lose their market share …
For a long time, the technology narrative with regard to human work has been that robots replaced humans because they need no sleep, don’t hurt their backs, don’t go on strike … and that jobs replaced by machines would always be compensated for by more valuable jobs and higher qualifications. Blue-collar workers suffered initially, but more white-collar workers made a better living. But then machines started doing intelligent work, not just manual labour. And now phone cameras are making artistic decisions while expensive photo cameras are still comparing pixel muscle in the playground. This can only go one way …
My phone is over 3 years old and new ones incorporate far better cameras than mine. But there’s still enough of a quality difference between your [insert brand] high-quality APS-C or FF camera and modern smart-cameras to consider going traditional. However, I don’t give that advantage more than 5 years to fade away in 90% of scenarios. Add to that the innumerable tools such as automatic online backup, easy sharing … and the reasons to buy a traditional camera are looking thinner and thinner.
Smart cameras are fast, fun and easy to use. Their ergonomics, not great at the best of times, are slowly improving. And they come with a huge variety of side applications – a phone, a GPS, note taking, mapping … – all immensely useful to a photographer.
I do hope someone gets their head out of the sand and understands how useful it would be to have those features on a traditional camera, or how much fun it would be to depart from the traditional recipe using different sensors or a different approach. In the meantime, smart cameras: keep pushing, please keep pushing!
What say you?