Apple, Nvidia, Anthropic Used Thousands of Swiped YouTube Videos to Train AI

Tech companies are turning to controversial tactics to feed their data-hungry artificial intelligence models, vacuuming up books, websites, photos, and social media posts, often unbeknownst to the creators.

AI companies are generally secretive about their sources of training data, but an investigation by Proof News found some of the wealthiest AI companies in the world have used material from thousands of YouTube videos to train AI. Companies did so despite YouTube’s rules against harvesting materials from the platform without permission.

Our investigation found that subtitles from 173,536 YouTube videos, siphoned from more than 48,000 channels, were used by Silicon Valley heavyweights, including Anthropic, Nvidia, Apple, and Salesforce.

The dataset, called YouTube Subtitles,

→ Continue reading at WIRED
