Big Tech’s Unconsented Scraping of YouTube Videos: What Creators Need to Know

In a shocking revelation, The Atlantic has uncovered that nearly 16 million YouTube videos were scraped without consent to train generative AI, raising serious concerns for filmmakers and content creators.

Big Tech Scraped Nearly 16 Million YouTube Videos to Train AI—Is Your Channel One of Them?

Just when you thought the conversation around generative AI couldn’t get more alarming for filmmakers and content creators, The Atlantic drops another bombshell.

They have exposed the unconsented scraping of nearly 16 million YouTube videos to train the next generation of generative AI.

And the companies behind the scraping are not upstarts; they’re huge corporations using the work you put on YouTube to train the programs they hope will replace you.

Let’s dive in.

An Unprecedented Heist

The investigation, part of The Atlantic’s new AI Watchdog subsite, reveals that over 15.8 million videos from more than 2 million YouTube channels were downloaded without permission.

You can use their searchable database to see which videos are being used to train generative-AI models, and which tech companies are using that material.

I searched and found a few No Film School videos on there, so that’s fun.

Of course, this kind of scraping is against YouTube’s terms of service, but AI companies are finding ways around that through third-party apps and other workarounds.

Not all YouTube videos are copyrighted, but many of the videos found in The Atlantic’s exposé were.

Here’s a part of the article that stood out to me: “Many major tech companies have used these data sets to train AI, according to research papers I’ve read and AI developers I’ve spoken with. The group includes Microsoft, Meta, Amazon, Nvidia, Runway, ByteDance, Snap, and Tencent. I reached out to each of these companies to ask about their use of these data sets. Only Meta, Amazon, and Nvidia responded. All three said they “respect” content creators and believe that their use of the work is legal under existing copyright law. Amazon also shared that, where video is concerned, it is currently focused on developing ways to generate ‘compelling, high-quality advertisements from simple prompts.'”

How Does This Affect Filmmakers?

I know that most of YouTube is not driven by filmmakers, but guess what? AI companies were found to be going after filmmakers’ works specifically.

Another excerpt from the article reads: “AI companies are more interested in some videos than others. A spreadsheet leaked to 404 Media by a former employee at Runway, which builds AI video-generation tools, shows what the company valued about certain channels: ‘high camera movement,’ ‘beautiful cinematic landscapes,’ ‘high quality scenes from movies,’ ‘super high quality sci-fi short films.’ One channel was labeled ‘THE HOLY GRAIL OF CAR CINEMATICS SO FAR’; another was labeled ‘only 4 videos but they are really well done.'”

That means these AI companies see scraping as a way to train their tech to replace the people making these channels. Imagine learning all these cool camera techniques and ideas and developing your original voice, only to have your work fed into a machine built to copy you.

That’s what’s happening right now.

This Is Our Fight

This is more than just a massive copyright violation. This is an existential threat to creative professions everywhere. And we need to fight back.

Every frame of your work that they ingest is used to build a more effective tool to replace you. They are not augmenting human creativity; they are automating it to cut costs.

The WGA and SAG-AFTRA strikes put a stake in the ground, but this new front—video data—is where the next battle will be fought. This isn’t about stopping technology; it’s about demanding consent, compensation, and control.

What Can We Do?

Feeling overwhelmed is understandable. But giving in is not an option. Share The Atlantic’s reporting and the search tool with every single creator you know.

This problem is too big to solve with lawsuits alone. We need clear, strong legislation that protects creators’ rights and forces transparency from AI developers.

Summing It All Up

I found this report to be staggering and infuriating. It feels like all these companies are just stealing our stuff, waiting to be caught, and paying nominal fees in order to keep going.

It’s a bummer, but we’re all in the fight together.

I’ll keep you updated as the story develops.
