our stolen library of voices
it's no longer just a problem for artists
introduction | edition #005
libraries have long been a refuge for our collective knowledge, arguably our most under-utilized public resource, less arguably supplanted by the modern internet.
but one of their rarely-discussed qualities is the implicitly voluntary nature of their assembly. no one forced anyone to write his or her autobiography; no one’s holding fiction writers hostage as they conjure up distant worlds.
library visits are unsurprisingly nosediving nationwide, with annual visits declining 74% in cities like los angeles over the past decade. and the more pages we turn in the 21st century, the deeper the ravine this trendline appears headed for.
and yet we’re all involuntarily part of another library being built in real time via the internet: a library of what each and every individual sounds like, captured in some invisible-to-the-public catalog and housed by digital librarians who don’t yet appear accountable or particularly interested in offering any sort of public benefit in return.
to take just one example, voice actor Allegra Clark told Forbes, “it sucks that we have no personal ownership of our voices. all we can do is kind of wag our finger at the situation…” after stumbling across sexually suggestive versions of a character she played online, her voice intact, saying whatever some AI-based voice-renderers felt like having her say. from the same Forbes piece: “famed actor Stephen Fry said that his voice was scraped from his narration of the Harry Potter books and cloned using AI. in a talk at CogX festival, Fry said the experience ‘shocked’ him.”
bottom-line—there is no control—and any efforts at control are almost exclusively reactive. it’s whack-a-mole with 10,000,000 slots to monitor at once—an impossible exercise, designed entirely for the enjoyment of the mole.
humanity entails retaining agency over the things that make us who we are—and it hardly gets more personal than our own voices. if nothing else, does the quiet, technologically-enabled decision to replicate how we sound not show us that there is no limit to what bad actors leveraging AI will look to capture?
the long-term ramifications of any entity (be it corporate, government, or some high-minded basement-dwellers) having the ability to generate content that replicates how any person sounds are worth grappling with now, before we’re in a vocally untethered world.
what a strange future to imagine: perusing the shelves only to find ourselves, and all our friends, unceremoniously recorded. a future where anyone can check out anyone else’s voice at the front desk, because AI made everyone library cards, and there’s no librarian on staff.
this week’s breakdown:
join our community on reddit
a space where you can discuss each piece, offer thoughts and additional research, and join a group interested in talking through a range of topics on how to stay good, rational people in an era that’s anything but:
one click to join the conversation —> r/stayinghuman
3 questions | what the rise of voice collection reveals about us
voices are being “collected” like trading cards — what’s stopping the spreadsheet-assembly of everything that makes us human? 23andme has our genetics (dubiously stored); CLEAR has our irises, alarming leading privacy experts; and now our voices are recorded, uploaded and replicated without much discussion about whether that’s good for us. where do we imagine our collective apathy on sensory collection will leave us?
does our default setting always have to be one that favors high-speed technological scale-out? what does it say about the intelligence of our governing system that we allow supply-and-demand to rule the arena of artificial intelligence almost unbothered, and not look twice when so many consequences of this smile-and-handwave strategy are detrimental to societal cohesion? the law doesn’t exist in a purely reactive state for healthcare innovation: how many ‘disruptive’ ideas have never gotten off the ground, and for good reason, because they failed to meet quality standards set by an overseeing body like the FDA? how did we let it get to the point where legal precedent is somehow considered “not up with the times”, when it could much more easily be claimed that we’ve gotten ahead of ourselves on all things AI? phenomena that have been demonstrably terrible for us (social media a case in point) were a phenomenal failure in the assessment phase: we simply chose not to acknowledge that the tiger that escaped its cage was very much a tiger until it mauled all the zookeepers. why do we insist on defaulting to exclusively reactive approaches? we’re treating AI like climate change, but AI has hit much more quickly and with no-longer-ignorable force, growing in direct relation with the entropy it perpetuates, feeding on the chaos it manufactures. at least not every country (e.g., Australia, the UK, Germany) is looking to ChatGPT for answers. voice cloning is on the docket now nationally and internationally (in no small part because celebrities made their fair share of noise), with everything from legal avenues to FTC-sponsored hackathons in play, in the hope of getting our arms around this before it’s entirely beyond control.
how do we protect ourselves from AI-infused voice cloning scams? and what are the best ways to inform (particularly older) members of our family to help them protect themselves? by collecting audio samples through readily accessible data sources like youtube and tiktok, bad actors build a sense of urgency (much like other phone-based scammers), but leverage the familiarity of a close friend or family member to push for gift cards, money wires and crypto transfers. knowing their methods, especially when it comes to anything financial, is a good starting point, and groups like Australia’s Media, Entertainment & Arts Alliance are pushing for comprehensive reform to set precedent with stringent regulations against AI theft. but what happens in the not-so-distant future, when all our voices are as easy to obtain as an overplayed meme?
3 facts | about the voice cloning market
the voice cloning market is poised to explode: valued at $1.5 billion in 2022, it’s estimated to reach $16.2 billion by 2032, growing at a CAGR of 27.3% between 2023 and 2032 (Allied Market Research). for perspective, the online gambling market is expected to be $15 billion by 2027, as are today’s plant-based food and pet care markets. these projections might even prove low, as AI developments continue to outpace even the most aggressive forecasts for many industries and applications. not to mention, less-developed nations are catching up to europe and the united states on AI-based applications like voice cloning.
the damage done by voice cloning’s predators can be quantified too: according to the American Bar Association, “In 2023, the FBI reported a 14% increase in the number of complaints of telephone scams filed to them by adults over the age of 60, with losses of US$3.1 billion in 2022 to US$3.4 billion a year later.” they further detail these findings, including how ChatGPT explicitly accelerates the generation of “new audio content based on existing recordings,” in “Imitation is the Sincerest Form of Fraudulent Activity: Artificial Intelligence in Financial Scams Against Older Adults.” protecting your loved ones as they age is paramount: remind them to never send money or share personal information via device, especially in a rush or to an unfamiliar source.
voice cloning can be done in minutes, if not faster: there is no stopping this technology, and companies like ElevenLabs promise to “clone your voice with only a few minutes of audio”. by 2030, there’s no reason to believe this will be a process that lasts longer than a few seconds, restrained only by the time it takes you to record and upload the voices you intend to clone.
3 quotes | to put AI voice cloning into perspective
“Our generation still carry the old feelings. A part of us refuses to let go. The part that wants to keep believing there’s something unreachable inside each of us. Something that’s unique and won’t transfer. But there’s nothing like that, we know that now. You know that. For people our age it’s a hard one to let go.”
“The only condition of fighting for the right to create is faith in your own vocation, readiness to serve, and refusal to compromise.”
“I do things like get in a taxi and say, ‘The library, and step on it.’”
final thought
when the day comes that all our voices, all our ways of communicating, down to every guttural-throat-clearing-signal-between-strangers-in-public-restrooms can be reproduced, stored and distributed, what then? what freedoms will we have let go of? why does it so often seem the modern illusion of creation is that “more” of something underhandedly means “less” of something else: an exchange of something superficial for something inextricably fundamental to the human experience?
Ben L. | Staying Human