If you’ve ever posted something to the internet—a pithy tweet, a 2009 blog post, a scornful review, or a selfie on Instagram—it has most likely been slurped up and used to help train the current wave of generative AI. Large language models, like ChatGPT, and image creators are powered by vast reams of our data. And even if it’s not powering a chatbot, the data can be used for other machine-learning features.
Tech companies have scraped vast swathes of the web to gather the data they claim is needed to create generative AI—with little regard for content creators, copyright laws, or privacy. On top of this, increasingly, firms with reams
→ Continue reading at WIRED