What are large language models (LLMs), and why have they become controversial?

August 30, 2021

“Ethical” and “AI” aren’t two words often seen together (and one of them seems rare enough on its own these days), yet artificial intelligence ethics is extremely important for all of the non-artificial beings meandering around, especially when AI has the potential to shape and influence real-world events.

The problems presented by unethical AI actions start with large language models (LLMs) and a fairly high-profile firing in Silicon Valley.

The Morning Brew’s Hayden Field explains that large language models are machine learning processes used to make AI “smarter,” if only in appearance. You’ve seen them in use before if you use Google Docs, Grammarly, or any number of other services contingent on relatively accurate predictive text, including AI-generated emails and copy.
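To make “predictive text” a little more concrete, here is a minimal sketch of the idea, assuming a toy word-pair counter rather than the neural networks that real large language models use. The corpus and the function names `train_bigrams` and `predict_next` are hypothetical and exist only for illustration. The point it shows is narrow: the model can only suggest what it has already seen in its training text.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical toy corpus; real models train on vastly more text.
corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigrams(corpus)

print(predict_next(model, "the"))    # "cat" follows "the" most often here
print(predict_next(model, "zebra"))  # None: the word never appeared in training
```

Swap in a different corpus and the suggestions change with it, which is the small-scale version of a model mirroring whatever it was trained on.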

This style of machine learning is the reason we have things like GPT-3 (one of the most expansive large language models available) and Google’s BERT, which is responsible for the prediction and analysis you see in Google Search. It’s a clear convenience that represents one of the more impressive discoveries in recent history.

However, Field also summarizes the problem with large language models, and it’s not one we can ignore. “Left unchallenged, these models are effectively a mirror of the internet: the good, the mundane, and the disturbing,” she writes. Remember Microsoft’s AI experiment, Tay?! Yikes.

If you’ve spent any time in the darker corners of the Internet (or even just in the YouTube comment section), you’re aware of how profoundly problematic people’s observations can be. The fact that most, if not all, of those interactions end up in the text that large language models learn from is infinitely more troubling.

GPT-3 was trained on a dataset spanning much of the known (and relatively unknown) Internet; as Field mentions, “the entirety of English-language Wikipedia makes up just 0.6% of GPT-3’s training data,” making it nearly impossible to comprehend just how much information the large language model has taken in.

So when the word “Muslim” was given to GPT-3 in an exercise in which it was supposed to finish the sentence, it should come as no surprise that in over 60 percent of cases, the model returned violent or stereotypical results. The Internet has a nasty habit of holding on to old information or biases as well as ones that are evergreen, and they’re equally available to inform large language models.

Dr. Timnit Gebru, a former member of Google’s Ethical AI division, recognized these problems and teamed up with Dr. Emily Bender of the University of Washington and coworker Margaret Mitchell to publish a paper detailing the true dangers of the largest language models.

Gebru and Mitchell were fired within a few months of each other, around the time the paper warning of LLM dangers came out.

There is a hilariously high number of other ethical issues regarding large language models. They take up an inordinate amount of processing power, with the training of a single model generating up to 626,000 pounds of CO2. They also tend to grow, making that impact higher over time.

They also have a lot of trouble incorporating languages other than American English, because the majority of training takes place in the United States. That makes it tough for smaller countries or cultures to develop their own machine learning at a comparable pace, which widens the gap and strengthens ill perceptions that feed into the potential for prejudicial commentary from the AI.

The future of large language models is uncertain, but with the models being unsustainable, potentially problematic, and largely inaccessible to the majority of the non-English-speaking world, it’s hard to imagine that they will continue to accelerate upward. And given what we know about them now, it’s hard to see why anyone would want them to.

Jack Lloyd, Senior Staff Writer

Jack Lloyd has a BA in Creative Writing from Forest Grove's Pacific University; he spends his writing days using his degree to pursue semicolons, freelance writing and editing, Oxford commas, and enough coffee to kill a bear. His infatuation with rain is matched only by his dry sense of humor.
