
Deep Fakes

If you're of a certain age, then you may remember the infamous car commercial from the late 1980s with the phrase "This is not your father's Oldsmobile".

The commercial was intended to recharacterize the car brand as exciting and new rather than the staid and boring brand it had become.

While it was a marketing disaster (hint: don't ever disparage your own brand), the intended message was supposed to be clear: Times have changed.

And boy have they. That blunderous phrase lives on today, popping up occasionally like it did right here.

In this article, we'll discuss how times have changed for another idea whose roots go back decades: imitating a person in order to trick or mislead others, whether for good or bad. Today's deep fakes are not your father's Oldsmobile.

Some history for context.

Photoshop

Photo editing began to take off in the late 1980s and early '90s with a product called Photoshop, a name most people recognize even today. Just like Google, Photoshop became so popular that its very name became a verb. To photoshop an image meant to edit it in some way, usually to crop it or to correct color casts caused by bad lighting.

But as Photoshop grew in capabilities, new uses emerged that were heretofore impossible or, at least, very difficult to achieve using practical methods.

You could crop people out of a scene even if they were highly integrated into the photo. You could remove zits, change eye color, hair color, or the shape of a nose. You could also "clone in" items or people that weren't originally present. There were many other manipulations you could do.

In short, you could alter reality.

This was usually benign. Back when I shot weddings professionally, I would use Photoshop to fix imperfections in an otherwise good photo. I might remove some zits on the bride's face, clean up the black eye the groom earned the night before, etc. All of this made for preserving a memory without the incidental cruft of the moment.

But bad actors quickly seized on the opportunity to create highly doctored pictures. If you were sufficiently skilled, your end results could be quite convincing. Enough to sway a court case, perhaps.

Multimedia enters the fray

The early days of imitating other people were mainly in fixed, photographic form. That's because photos are easy. A photo is, well, a snapshot in time: a single frame of a person's appearance with no time duration, an absolute instant, zero seconds long.

But as time went on, imitating other people became more than just doctoring individual photos.

Computers and software became more powerful, allowing masters of the craft to create manipulated videos. And what is a video if not a sequence of numerous individual photographs shot at a high speed, audio notwithstanding?

Of course, manually editing each individual photograph that comprises a video is incredibly tedious and error prone. That's a job that software could handle with ease. Make your manual edits to a "master frame" and let the software automatically apply those edits to subsequent frames, taking into account natural movements of the subject.
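
To make the idea concrete, here's a rough, hypothetical sketch using OpenCV in Python. The file names are placeholders of my own, and the "edit" is just a blur re-applied wherever simple template matching finds the edited region in each frame. Professional tools use far more sophisticated tracking, but the principle of propagating one edit across many frames is the same.

    # Hypothetical sketch: re-apply one edit to every frame of a video.
    import cv2

    template = cv2.imread("region_from_master_frame.png")  # the patch you edited by hand
    cap = cv2.VideoCapture("input.mp4")
    writer = None

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if writer is None:
            h, w = frame.shape[:2]
            writer = cv2.VideoWriter("output.mp4",
                                     cv2.VideoWriter_fourcc(*"mp4v"), 30.0, (w, h))

        # Crude "tracking": find where the edited region moved to in this frame.
        scores = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
        _, _, _, (x, y) = cv2.minMaxLoc(scores)
        th, tw = template.shape[:2]

        # Re-apply the same edit (here, a blur) at the tracked location.
        frame[y:y+th, x:x+tw] = cv2.GaussianBlur(frame[y:y+th, x:x+tw], (21, 21), 0)
        writer.write(frame)

    cap.release()
    if writer is not None:
        writer.release()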

And if there's video, then we must have audio to match. Additional software made that possible to edit and manipulate.

Creating a convincing, doctored video with audio became possible, and these techniques were being explored and used by filmmakers. It required expensive computers, complex software, and considerable skill and talent to do this so it wasn't something that any old scammer could leverage for their own malignant endeavors.

Rising awareness

As more and more people became aware that "photoshopping" was possible, the bad actors who used photo manipulation to further a fraud or scam, or to misrepresent someone like an opposing politician, needed to improve their tools and capabilities.

As we discussed just above, these tools were costly and required considerable skill to use. This is more or less how things went for some years.

But things were about to dramatically change.

A.I. comes to town

The water against the dam had been rising for years. Then, in November 2022, the dam burst wide open with the release of ChatGPT, the first time A.I. became available to the masses.

While ChatGPT itself is basically a text-based interactive response engine, the underlying tech that made it possible has been undergoing massive development ever since. Big tech companies have invested hundreds of billions of dollars developing A.I. algorithms in just the last couple of years.

And this is where we find ourselves today. This is when the tools to create convincing deep fakes became available to pretty much anyone, thus ushering in an entirely new world of continually improving frauds and scams.

The rest of this article will focus on that.

A Brave New World

Perhaps Huxley, back in 1932, was more prophet than sci-fi author.

What makes it a "deep fake" vs. a regular fake?

A "regular fake" would be all the examples above involving photoshop and early automation tools. They all required significant skill to master and the output generally comprised significant elements of the input. These were largely manual efforts even if some automation was available to speed-up the process. They were beyond the reach of most people.

"Deep fakes", on the other hand, are completly synthetic, created and edited using A.I. tools, particularly deep learning techniques like generative adversarial networks (GANs).

This is very complex stuff that even I don't fully understand, but I'll use a simple explainer: a GAN works by having two neural networks working adversarially to improve the output. The first network is called the generator and the second is the discriminator. Here's the basic loop (with a code sketch after the list):

  1. The discriminator is given an image either from the generator (fake) or from the training dataset (real).

  2. If the discriminator misclassifies the image (gets it wrong), then it "loses" and adjusts its parameters to do better next time.

  3. If the discriminator correctly classifies the image (gets it right), then it "wins" and the generator adjusts its parameters to do better next time.

  4. This loop continues until a balance is reached, where the generator creates data that is nearly indistinguishable from real data, and the discriminator cannot reliably differentiate between real and fake samples.
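
If you're curious what that loop looks like in practice, here's a minimal, hypothetical sketch written with PyTorch in Python. The tiny fully-connected networks, toy sizes, and random "training data" are all stand-ins of my own; real deep-fake systems use far larger models, but the adversarial back-and-forth is the same idea.

    # A toy GAN training loop (illustrative sketch only, not a real deep-fake tool).
    import torch
    import torch.nn as nn

    latent_dim, data_dim = 16, 64                   # arbitrary toy sizes

    # Generator: turns random noise into a fake sample.
    generator = nn.Sequential(
        nn.Linear(latent_dim, 128), nn.ReLU(),
        nn.Linear(128, data_dim), nn.Tanh())

    # Discriminator: scores a sample; closer to 1 means "looks real".
    discriminator = nn.Sequential(
        nn.Linear(data_dim, 128), nn.LeakyReLU(0.2),
        nn.Linear(128, 1), nn.Sigmoid())

    loss_fn = nn.BCELoss()
    opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

    def training_step(real_batch):
        n = real_batch.size(0)
        real_labels, fake_labels = torch.ones(n, 1), torch.zeros(n, 1)

        # Steps 1-2: show the discriminator real and fake samples; its weights
        # are nudged whenever it misclassifies them.
        fake_batch = generator(torch.randn(n, latent_dim))
        d_loss = (loss_fn(discriminator(real_batch), real_labels) +
                  loss_fn(discriminator(fake_batch.detach()), fake_labels))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()

        # Step 3: when the discriminator wins, the generator adjusts its weights
        # so its fakes score as "real" next time.
        g_loss = loss_fn(discriminator(fake_batch), real_labels)
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()

    # Step 4: repeat until fakes and real data are hard to tell apart.
    for _ in range(1000):
        training_step(torch.randn(32, data_dim))    # stand-in for real training data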

These A.I. tools need only minimal samples (perhaps a few still images, or a few seconds of video or audio) and, from those, can generate whatever you want in terms of tone, wording, expressions, etc.

Earlier deep fakes had more rendering errors, such as additional fingers on a hand, an extra leg, three nostrils, and other body horror, so spotting them was pretty easy.

A lot of that has been fixed. Some of today's deep fakes are so utterly convincing that even experts in the field, people who know the telltale signs to look for, are fooled. What will the next five years bring, I wonder...

Scams

As you can imagine, with such cheap or free sophisticated tools for impersonating specific people, we are ripe for a new wave of scary scams.

Kid in trouble

One of the more common, and scariest, scams is to receive a phone call from someone sounding, for all the world, exactly like your kid, telling you they've been arrested and need help now. The scam sometimes involves you receiving a second phone call from a (fake) lawyer or officer of the court with instructions on how to make bail.

The tone of these phone calls is usually one of panic, urgency, and a need to act quickly. All this pressure is designed to reduce your ability to think and process information clearly.

The demand from the "lawyer" or "kid" is to make a bail payment using methods that are irreversible. That could be a casual payment platform like Venmo or Zelle. It could be via gift cards. It could be via cash deposit at a Bitcoin ATM. Stop right there.

No legit lawyer, officer of the court, law enforcement officer, etc. will ever demand any payment over a phone call, especially via these irreversible methods. Just hang up!

It's all the more convincing because you can actually talk to your kid in real time on the phone. The "kid" knows your name, where you live, and other readily public information.

How does that work? The scammer is listening to the call. When you ask your kid what happened, the scammer types a reply into a generative A.I. program that immediately spits out audio perfectly impersonating your kid. The A.I. tools might even generate convincing dialog automatically.

Corporate compromise

Using similar techniques, a bad actor may use A.I. tools to impersonate an executive in order to trick a company employee, such as an executive assistant, into divulging sensitive information. The fake-boss call might not be urgent or framed as an emergency at all; it could simply be an attempt to trick you into giving up sensitive info.

Fraudulent Promotion or Endorsement

Famous and instantly recognizable voices have been used in advertising for years. Morgan Freeman and James Earl Jones come to mind. Before A.I. tools existed, most of the fraudulent endorsements were created by voice actors who were skilled at mimicking famous voices.

A while back, I heard radio ads for a local business in Columbia that I'd swear were voiced by Morgan Freeman. I'd bet good money it wasn't really him.

Today, that practice is comically simple, no voice actor necessary. A.I. voice creation tools have got you covered.

As an authentication aid

I've been reading on some of the tech blogs I frequent that some (legit) companies, such as banks, are asking their customers for a voice sample so that their customer service reps can confirm they're talking to the real person when they call in at some later date.

That sounds (heh) like a good idea but I'm skeptical. As A.I. generated voice impersonation improves, it's quite likely to fool any voice print authentication tools that might be used. It could make you less secure, not more.

--

The familiar voice (created by A.I. impersonation) alone may be enough to convince the mark to do as they're instructed. But the bad actor often reinforces the ploy by knowing certain information. We've all had our private, sensitive information breached numerous times by now, so a targeted "spear phishing" attack can be well-armed. In the moment, the mark is too stressed to think about this logically.

The savvy citizen

How can you tell you're being phished by a scammer using A.I. impersonation? In short, you can't. At least not by detecting a flaw in the deep fake. Experts in this field, people who study impersonation fraud, all agree that most of today's deep fakes are simply too good to detect.

Wow, then what can I do?

Here's where real life multi-factor authentication comes in.

Have a "code word" that everyone in your family knows. Then if you get that panicky call from your relative, simply ask them for the code word. Keep it simple but arcane. A silly but memorable word that no one else would guess. Practice that word once a month with your family so no one forgets.

If needed, you can establish a different code word for your boss and work colleagues.

Don't blindly trust the caller ID even if it matches a legit number for the person who's supposedly calling. Caller ID spoofing, though a bit more difficult today, is still very possible for determined fraudsters.

If you get one of these panicky calls but don't already have a code word in place, all is still not lost.

Kid calling (even an adult one)?

Ask them a question about an arcane fact from their childhood that they would certainly know but that no one else would know the answer to. Perhaps their favorite cartoon or the name of their best friend when they were little.


Or ask them a question with no true answer. If they claim they were in a car accident, ask whether they dropped off their sister Sally (or some other fake relative) before the accident.

If using real verification questions with real answers, they must be arcane: something memorable but otherwise not top of mind, and from many years ago. The idea is that the scammer must not be able to suss out the question or its answer from any data breach or social media searching.

You get the idea.

Your boss calling?

Ask them a question with no true answer relating to the thing they're asking you about. Ask leading, easy-to-answer yes/no questions about people and things that don't exist.

Take a beat...

Problem is, during such a phone call, you may be in a panic yourself. That is deliberate. Thinking up effective, ad-hoc, authenticating questions takes a clear, calm, deliberate mind. It's much better to have a code word figured out beforehand.

Even if the call is legitimate, nothing bad is going to happen because you took an extra ten minutes to verify it. So hang up, mute them, or put them on hold, and call them back on a number you already have. If they don't answer, it doesn't mean the call to you was real.

Media awareness

Because deep fake videos are so easy to make today, you've got to adopt a suspicious, incredulous stance on any video you see posted to social media, especially if that video is inflammatory or outrageous in some manner. If it's just some random video you saw on social media, then vetting it is likely impossible, so don't put too much stock in it.

But if the person(s) in the video are notable, like big name politicians, celebrities, sports figures, or other well-known people that are frequently or even just occasionally in the news, then there's likely some avenue for vetting it.

Vetting can include numerous approaches:

  • Did the video originate on social media? If so, it may or may not be true. That's pretty much the definition of unreliable.

  • Did it originate on a legit news outlet? Check out the news outlet on MediaBiasFactCheck.com.

  • Are the statements made in the video in or out of character for the person being portrayed?

  • Is there corroborating evidence of those statements reported by other legitimate news outlets?

Media literacy is sorely lacking in our country, which helps make these viral deep fakes believable.

Regardless of your politics, which naturally could bias which videos you are more likely to believe or not, no one wants to be a tool -- that is, someone easily manipulated by someone else into having certain feelings and opinions. We all want agency in our thoughts and feelings. So don't believe everything you see, even if it supports your position, without proper and objective vetting. Otherwise, you are a tool. There are many lies, but only one truth.

All deep fakes, which are false by definition*, are specifically designed to elicit an emotional response first and foremost. Alas, the emotional response often wins out over the intellectual response, which is exactly what the deep fake producer wants.

* They are false by definition because they aren't real. They are presenting something that never happened in real life.

I tend to read my news rather than watch it. Reading tends to engage slower, more analytical thinking. To be sure, the written word may contain lies as well. But overall it's a less effective way to spread emotionally charged disinformation, due to the natural human response mechanisms involved in reading vs. watching.

[Image: woman shown with a vector diagram of facial recognition data points]
