Who is working to end the threat of AI-generated Deepfakes, and why is it so difficult?

A diagram showing an image manipulated by AI to depict two men ballroom dancing, and another showing the unrealistic result the same prompt produces when the source image has been protected.

The images above show what happens when photos of Trevor Noah and Michael Kosta are run through an AI image generator with the prompt “two men ballroom dancing,” depending on whether or not the original photo has been modified to resist AI image manipulation.
Picture: Aleksander Madry

Like many of the world’s best and worst ideas, the MIT researchers’ plan to combat AI-generated deepfakes was hatched while one of them was watching their favorite news program.

On the October 25 episode of The Daily Show with Trevor Noah, OpenAI CTO Mira Murati spoke about AI-generated images. While she could likely discuss OpenAI’s AI image generator DALL-E 2 in great detail, it wasn’t a particularly in-depth interview; after all, the segment was aimed at viewers who probably know little or nothing about AI art. Still, it offered some nuggets of thought. Noah asked Murati if there was a way to make sure AI programs don’t lead us to a world “where nothing is real, and everything that is real isn’t?”

Last week, researchers at the Massachusetts Institute of Technology said they wanted to answer that question. They produced a relatively simple program that uses data poisoning techniques to subtly scramble the pixels in an image, creating invisible noise that effectively renders AI art generators unable to produce realistic deepfakes from the images they are fed. Aleksander Madry, a computer science professor at MIT, worked with the team of researchers to develop the program and posted the findings on Twitter and his lab’s blog.

Using photos of Noah with Daily Show comedian Michael Kosta, the researchers showed how this imperceptible noise prevents a diffusion-model AI image generator from creating a new photo based on the original. They suggested that anyone planning to upload an image to the internet could first run it through the program, essentially immunizing it against AI image generators.

Hadi Salman, a graduate student at MIT whose work revolves around machine learning models, told Gizmodo in a phone interview that the system he helped develop takes only a few seconds to introduce noise into a photo. Higher-resolution images work even better, he said, because they contain more pixels that can be slightly perturbed.
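The article doesn’t reproduce the researchers’ algorithm, but the general shape of this kind of “immunization” (a tiny, carefully bounded adversarial perturbation found with a projected-gradient-descent-style loop) can be sketched in a few lines of PyTorch. Everything below, including the stand-in encoder, the epsilon budget, and the loss, is an illustrative assumption rather than the MIT team’s actual code.

```python
import torch
import torch.nn.functional as F

def immunize(image, encoder, epsilon=8 / 255, step_size=2 / 255, steps=40):
    """Hedged sketch: add an imperceptible, epsilon-bounded perturbation that
    pushes the image's latent representation away from its clean encoding, so
    a generator conditioned on that encoding produces degraded output.
    `encoder` stands in for whatever differentiable latent encoder the target
    diffusion model uses -- an assumption, not the MIT team's released code."""
    original = image.clone().detach()
    with torch.no_grad():
        clean_latent = encoder(original)            # latent of the untouched photo
    delta = torch.zeros_like(original, requires_grad=True)

    for _ in range(steps):
        latent = encoder(original + delta)
        # Maximize the distance from the clean latent (gradient ascent).
        loss = F.mse_loss(latent, clean_latent)
        loss.backward()
        with torch.no_grad():
            delta += step_size * delta.grad.sign()
            delta.clamp_(-epsilon, epsilon)                         # imperceptibility budget
            delta.copy_((original + delta).clamp(0, 1) - original)  # keep pixels in [0, 1]
        delta.grad.zero_()

    return (original + delta).detach()
```

The small epsilon is what keeps the added noise invisible to people while still pushing the generator off course, and it is also why higher-resolution images, with more pixels to nudge, give the method more room to work.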

Google is building its own AI image generator, Imagen, though few people have been able to put the system through its paces, and the company is also working on a generative AI video system. Salman said the team hasn’t tested its approach on video, but in theory it should still work, even though MIT’s program would have to individually immunize each frame of a video, which can mean tens of thousands of frames for anything longer than a few minutes.

Can data poisoning be applied to AI generators at scale?

Salman said he could envision a future in which companies, even the ones building the AI models, certify that uploaded images have been immunized against those models. Of course, that’s not great news for the millions of images already uploaded to open-source libraries like LAION, but it could make a difference for images uploaded in the future.

Madry also told Gizmodo over the phone that, while the data poisoning has worked in many of their tests, the system is more of a proof of concept than a product release of any kind. The researchers’ program proves that there are ways to defeat deepfakes before they happen.

Companies, he said, need to familiarize themselves with this technology and implement it in their own systems to make them even more resistant to tampering. They would also need to ensure that future versions of their diffusion models, or any other type of AI image generator, cannot simply ignore the noise and generate new deepfakes.

Above left is the original photo with Trevor Noah and Michael Kosta. Top right is an image created with an AI image generator, and bottom right is what happened when AI researchers tried the same thing, but introduced imperceptible noise into the original image.
Photo: MIT/Aleksander Madry/Gizmodo

“What should really happen going forward is that any company that develops diffusion models should provide the capacity for healthy, robust immunization,” Madry said.

Other experts in the field of machine learning found points to criticize in the MIT researchers’ approach.

Florian Tramèr, a computer science professor at ETH Zurich in Switzerland, tweeted that the biggest difficulty is that you essentially get one attempt to fool all future attempts to create a deepfake from an image. Tramèr co-authored a 2021 paper presented at the International Conference on Learning Representations which essentially found that data poisoning, like what the MIT system does with its image noise, will not stop future systems from finding ways around it. What’s more, creating these data poisoning systems will spark an “arms race” between commercial AI image generators and those trying to prevent deepfakes.

There have been other data poisoning programs intended to deal with AI-based surveillance, such as Fawkes (yes, as in the 5th of November), developed by researchers at the University of Chicago. Fawkes also distorts pixels in images in a way that prevents companies like Clearview AI from achieving accurate facial recognition. Other researchers, from the University of Melbourne in Australia and Peking University in China, have also explored systems that could create “unlearnable examples” that AI image generators cannot use.

The problem is, as Fawkes developer Emily Wenger pointed out in an interview with MIT Technology Review, programs like Microsoft Azure have managed to defeat Fawkes and detect faces despite its adversarial techniques.

Gautam Kamath, a computer science professor at the University of Waterloo in Ontario, Canada, told Gizmodo in a Zoom interview that in the “cat and mouse game” between those building AI models and those finding ways to defeat them, the makers of new AI systems seem to have the advantage, because once an image is on the internet it never really disappears. If a future AI system manages to bypass the protections applied to an image, there is no real way to go back and fix it.

“It’s possible, if not likely, that in the future we’ll be able to bypass any defenses you put on that particular image,” Kamath said. “And once it’s out there, you can’t take it back.”

Of course, there are some AI systems that can detect deepfake videos, and there are ways to train people to spot the small inconsistencies that reveal a video has been faked. The question is: will there come a time when neither human nor machine can tell whether a photo or video has been tampered with?

What about the biggest AI generator companies?

For Madry and Salman, the answer is to get the AI companies to play ball. Madry said they want to contact some of the big AI generator companies to see if they’d be interested in facilitating their proposed system, though of course it’s still early days, and the MIT team is still working on a public API that would let users immunize their own photos (the code is available here).
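If such a public tool does materialize, using it might be as simple as running a photo through an immunization function before posting it. The snippet below is a purely hypothetical usage example built on the `immunize` sketch earlier in this article, not the MIT team’s API; the filenames and the `encoder` argument are placeholders.

```python
from PIL import Image
from torchvision.transforms.functional import to_tensor, to_pil_image

# Hypothetical workflow using the immunize() sketch above -- not the MIT API.
photo = to_tensor(Image.open("noah_kosta.jpg").convert("RGB")).unsqueeze(0)
protected = immunize(photo, encoder)  # `encoder` is whatever latent encoder is being targeted
to_pil_image(protected.squeeze(0).clamp(0, 1)).save("noah_kosta_protected.png")
```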

In that way, everything depends on the people who make the AI imaging platforms. OpenAI’s Murati told Noah in that October episode that the company has “some guardrails” for its system, claiming that it doesn’t allow people to create images based on public figures (a pretty vague term in an age of social media where virtually everyone has a public face). The team is also working on more filters to restrict the system from creating violent or sexual imagery.

Back in September, OpenAI announced users could once again upload human faces to its system, but claimed it had built in ways to stop users from showing faces in violent or sexual contexts. It also asked users not to upload images of people without their consent, but it’s a lot to ask of the public internet to make promises without crossing its fingers.

That’s not to say, however, that other AI generators and the people behind them are equally game to moderate the content their users create. Stability AI, the company behind Stable Diffusion, has shown itself to be much more reluctant to introduce barriers that prevent people from creating porn or derivative artwork with its system. And while OpenAI has been, ahem, open about trying to keep its system from showing bias in the images it generates, Stability AI has kept mum.

Emad Mostaque, CEO of Stability AI, has argued for a system free of government or corporate influence, and so far has pushed back against calls to place more restrictions on his company’s AI model. He has said he believes image generation will be “solved in a year,” letting users create “anything you can dream of.” Of course, that may just be hype, but it shows that Mostaque isn’t willing to back away from pushing the technology further and further.

Still, the MIT researchers remain steadfast.

“I think there are a lot of very uncomfortable questions about what the world is when this kind of technology is readily available, and again, it’s already readily available and will be even easier to use,” Madry said. “We’re really happy about the fact that we can now do something about this.”

