Falsified Photos: Fooling Adobe’s Cryptographically-Signed Metadata


Last week, we wrote about the Leica M11-P, the world’s first camera with Adobe’s Content Authenticity Initiative (CAI) credentials baked into every shot. Essentially, each file is signed with Leica’s private key such that any changes to the image, whether edits to the photo itself or to the metadata, are tracked. The goal is not only to prove ownership, but to prove that photos are real — not tampered with or AI-generated. At least, that’s the main selling point.

Although the CAI has been around since 2019, its adoption is far from widespread. Only a handful of programs support it (though that list does include Photoshop), and it’s unlikely anybody outside the professional photography space was aware of it until recently. This isn’t too surprising, as it really isn’t relevant to the casual shooter — when I take a shot to upload to Instagram, I’m rarely thinking about whether or not I’ll need cryptographic proof that the photo wasn’t edited — usually adding #nofilter to the description is enough. Where the CAI is supposed to shine, however, is in the world of photojournalism. The idea is that a photographer can capture an image that is signed at the time of creation and maintains a tamper-proof log of any edits made. When the final image is sold to a news publisher or viewed by a reader online, either party can inspect that data.

At this point, there are two thoughts you might have (or, at least, there are two thoughts I had upon learning about the CAI):

  1. Do I care that a photo is cryptographically signed?
  2. This sounds easy to break.

Well, after some messing around with the CAI tools, I have some answers for you.

  1. No, you don’t.
  2. Yes, it is.

What’s The Point?

There really doesn’t seem to be one. The CAI website makes grand yet vague claims about creating tamper-proof images, yet when you dig into the documentation a bit more, it all sounds quite toothless. Their own FAQ page makes it clear that Content Credentials don’t prove whether or not an image is AI-generated, can easily be stripped from an image by taking a screenshot of it, and don’t really tackle the misinformation problem.

That’s not to say that the CAI fails at its stated goals. The system does let you embed secure metadata; I just don’t really care about it. If I come across a questionable image with CAI credentials on a news site, I could theoretically download it and learn, quite easily, who took it, what camera they used, when they edited it and in which software, what shutter speed they used, and so on. And thanks to the signature, I would willingly believe all of those things are true. The trouble is, I don’t really care. None of that tells me whether the image was staged, or whether any of the edits obscured some critical part of the image, changing its meaning. At least I can be sure that the aperture was set to f/5.6 when the image was captured.

Comparing Credentials

[Screenshot of the CAI Verify tool, reading: “About this Content Credential. Issued by Leica Camera AG. This is the trusted organization, device, or individual that recorded the details above and issued this Content Credential.”]
The CAI Verify Tool

At least, I think I can be sure. It turns out that it isn’t too hard to misuse the system. The CAI provides open-source tools for generating and verifying signed files. While these tools aren’t too difficult to install and use, terminal-based programs do have a certain entry barrier that excludes many potential users. Helpfully, Adobe provides a website that lets you upload any image and verify its embedded Content Credentials. I tested this out with an image captured on the new CAI-enabled camera, and sure enough, it was able to tell me who took the image (well, what they entered their name as), when it was captured (well, what they set the camera time to), and other image data (well — you get the point). Interestingly, it also added a little Leica logo next to the image, reminiscent of the once-elusive Blue Check Mark, which gave it an added feel of authenticity.

I wondered how hard it would be to fool the Verify website — to make it show the fancy red dot for an image that didn’t come from the new camera. Digging into the docs a bit, it turns out you can sign any old file using the CAI’s c2patool — all you need is a manifest file, which describes the data to be encoded in the signed image, and an X.509 certificate to sign it with. The CAI website advises you to purchase a certificate from a reputable source, but of course there’s nothing stopping you from just self-signing one. Which I did.
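For the curious, the manifest is just a JSON file. Here is a minimal sketch of the kind of thing I fed to c2patool (field names per the c2patool documentation as I understand it; the file names and EXIF values are placeholders of my own):

```json
{
  "alg": "es256",
  "private_key": "fake.key",
  "sign_cert": "fake.pem",
  "claim_generator": "Leica Camera AG M11-P",
  "assertions": [
    {
      "label": "stds.exif",
      "data": {
        "@context": { "exif": "http://ns.adobe.com/exif/1.0/" },
        "exif:Make": "Leica Camera AG",
        "exif:Model": "LEICA M11-P"
      }
    }
  ]
}
```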

Masquerading Metadata

I used openssl to generate a key and a SHA-256 certificate, self-signed as “Leica Camera AG” rather than under my own name. I pointed the c2pa manifest file at my freshly minted certificate set, pasted in some metadata I had extracted from a real Leica M11-P image, and ran c2patool. After some trial and error in which it kept rejecting my fake certificate for one reason or another, it finally spit out a genuine fake image. I uploaded it to the Verify tool and — lo and behold — not only did the website say that my fake had been taken on a Leica camera and signed by “Leica Camera AG,” but it even sported the little red Leica logo.
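If you want to try this yourself, the whole process boils down to a handful of commands. This is a sketch rather than an exact transcript of what I ran, and the certificate extensions are my best guess at what the C2PA spec wants (it requires signing certificates to carry particular key-usage values, which I suspect was the source of my trial and error):

```bash
# Generate a P-256 private key; es256 is c2patool's default algorithm
openssl ecparam -name prime256v1 -genkey -noout -out fake.key

# Self-sign a certificate as "Leica Camera AG". Nothing checks that the
# subject name bears any relationship to who actually holds the key.
openssl req -new -x509 -key fake.key -out fake.pem -days 365 -sha256 \
    -subj "/C=DE/O=Leica Camera AG/CN=Leica Camera AG" \
    -addext "keyUsage=digitalSignature" \
    -addext "extendedKeyUsage=emailProtection"

# Embed and sign the manifest into the image
c2patool fake.jpg -m manifest.json -o signed.jpg
```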

 


One of the images above was taken on a Leica M11-P, and the other on a Gameboy Camera. Can you tell the difference? Adobe’s Verify tool can’t. Download the original left image here, the right image here, then head over to https://contentcredentials.org/verify to try for yourself.

Of course, a cursory inspection of the files with c2patool would reveal the signature’s public key, and it would be a simple matter to compare that key against Leica’s to find that something was amiss. Surprisingly, Adobe’s Verify tool didn’t seem to do that. It would appear that it just string-matches — if it sees “Leica” in the name, it slaps the red dot on there. While there’s nothing technically wrong with this, it does lend the appearance of authenticity to the image, making any other falsified information easier to believe.
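If you have c2patool installed, you can do that inspection yourself. The detailed report includes the signature and certificate information, which you can then compare against a file fresh off a real M11-P (flag names are from the c2patool version I used; check c2patool --help if yours differs):

```bash
# Dump the full manifest report, including certificate and signature info
c2patool signed.jpg --detailed

# Do the same for a known-genuine image and compare the issuer and
# public key; they won't match, even though both subjects say "Leica"
c2patool real_m11p.jpg --detailed
```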

Of course, I’m not the only one who figured out some fun ways to play with the CAI standard. [Dr. Neal Krawetz] over at the Hacker Factor Blog recently dove into several methods of falsifying images, including faked certificates with a method a bit more straightforward than the one I worked out. My process for generating a certificate took a few files and different commands, while his distills it into a nice one-liner.
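I haven’t reproduced his exact command here, but openssl will happily do the key and certificate generation in one shot, along these lines:

```bash
# Key plus self-signed "Leica" certificate in a single command (the same
# idea as [Krawetz]'s one-liner, though not necessarily his invocation)
openssl req -x509 -newkey ec -pkeyopt ec_paramgen_curve:prime256v1 -nodes \
    -keyout fake.key -out fake.pem -days 365 \
    -subj "/O=Leica Camera AG/CN=Leica Camera AG" \
    -addext "keyUsage=digitalSignature" \
    -addext "extendedKeyUsage=emailProtection"
```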

Secure Snapshots?

So, if the system really doesn’t seem to work that well, why are hundreds of media and tech organizations involved in the project? As a consumer, I’m certainly not going to pay extra for a camera just because it has these features baked in, so why are companies spending extra to do so? In the CAI’s perfect world, all images are signed under their standard when captured. It becomes easy to immediately tell both whether a photograph is real or AI-generated, and who the original artist is, if they’ve elected to attach their name to the work. This serves a few purposes that could be very useful to the companies sponsoring the project.

In this perfect world, Adobe can make sure that any image they’re using to train a generative neural network was captured and not generated. This helps to avoid a condition called Model Autophagy Disorder, which plagues AIs that “inbreed” on their own data — essentially, after a few generations of being re-trained on images that the model generated, strange artifacts begin to develop and amplify. Imagine a neural network trained on millions of six-fingered hands.

To Adobe’s credit, they tend to be better than most other companies about sourcing their training data. Their early generative models were trained solely on images that they had the rights to, or that were explicitly public-domain or openly licensed. They’ve even talked about how creators can attach a “Do Not Train” tag to CAI metadata, signaling their refusal to allow the image to be included in training data sets. Of course, whether or not these tags will be respected is another question, but as a photographer, this is the main feature of Content Credentials that I find useful.
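For reference, the C2PA specification defines a “training and data mining” assertion for exactly this purpose. My reading of the spec suggests it looks roughly like this when embedded in a manifest, though the exact labels may differ between spec versions:

```json
{
  "label": "c2pa.training-mining",
  "data": {
    "entries": {
      "c2pa.ai_generative_training": { "use": "notAllowed" },
      "c2pa.ai_training": { "use": "notAllowed" },
      "c2pa.data_mining": { "use": "notAllowed" }
    }
  }
}
```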

Other than that, however, I can’t find many benefits to end users in Content Credentials. At best, this feels like yet another well-intentioned yet misguided technical solution to a social issue, and at worst it can lend authenticity to misleading or falsified images when exploited. Misinformation, AI ethics, and copyright are complicated issues that require more than a new file format to fix. To quote Abraham Lincoln, “Don’t believe everything you read on the internet.”