Meta ‘stole’ my book to train its AI – but there’s a bigger problem
TECHRADAR | MAY 2025

A shadow library might sound like something from a fantasy novel – but it’s real, and far more troubling.
It’s an online archive of pirated books, academic papers, and other people’s work, taken without permission. These libraries have always been controversial. But in the AI world, they’re an open secret – rich sources of high-quality writing used to train large language models.
The books in them are goldmines because they’re long-form, emotional, diverse and generally well-written. Using them to ‘train’ AI is a shortcut to teaching these tools how humans think, feel, and express themselves. But licensing them properly would be expensive and messy. So tech companies just didn’t bother.
This quiet exploitation exploded into public view in March 2025 when The Atlantic released a tool that lets anyone search for their books in LibGen (Library Genesis), one of the biggest shadow libraries.
And there it was, my book Screen Time, along with millions of others.
Legal documents have revealed that Meta, the parent company of Facebook and Instagram, used LibGen to train its large language models, including Llama 3. Not every title was necessarily used, but the possibility alone is enough to leave authors reeling.
As a tech journalist, I’ve always tried to stay level-headed about AI – curious but critical. But when it’s your book that’s been ‘stolen’ to train AI, it hits differently. You think about the hours, the edits, the emotion. The despair and euphoria of creating something from nothing.
It feels as if all of that has been swallowed whole by a system that mimics creativity while erasing the creator. The outrage from authors is real – the lack of consent, the lack of compensation. But what haunts me is something deeper: a grief for creativity itself, and the sense that it’s slipping away.