How We Taught Computers the Difference Between Apple and apples: The NLP Breakthrough

By Emmanuel Wada Jr
@endlessedge__
June 19, 2025
Read Time: 7min

Part 1 of Demystifying AI 

The Language Problem That's Hiding in Plain Sight

Everyone talks about AI being the future, but here's what they don't tell you: computers are terrible at understanding language. Take the word "bat" – is it a flying mammal or sports equipment? The word "bank" – are we talking money or rivers? Without context, computers have no clue what you mean.

Over the last 20 years, massive breakthroughs have taught computers to understand human language, and the core ideas behind them are surprisingly simple. In this article, we'll pull back the curtain on AI and see how it actually works.

The Embarrassing Truth About Computers and Language

Picture this: to a computer, "Apple" (the trillion-dollar tech giant) and "apple" (the fruit in your lunch) are effectively the same. Same letters, same spelling, one capital letter apart. The computer has zero idea that one makes iPhones while the other is used to make pie.

Think about that for a second. When you read "Apple stock," your brain instantly knows:

  • We're talking about a tech company
  • Stock means shares, not soup stock
  • This has absolutely nothing to do with fruit

But a computer sees: a-p-p-l-e-[space]-s-t-o-c-k. Just characters. Meaningless symbols. It might as well be "xqzpt wrngl" for all the difference it makes to the machine.
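To make that concrete, here's a tiny Python sketch of what "apple stock" actually looks like to a machine: nothing but numbers.

```python
# To a computer, text is just a sequence of character codes, plain numbers.
text = "apple stock"
print([ord(c) for c in text])
# [97, 112, 112, 108, 101, 32, 115, 116, 111, 99, 107]
# Nothing in these numbers says "fruit" or "tech company". Meaning isn't in there.
```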

How Search Used to Fail (And Why It Matters)

To understand how far we've come, let's look at the disaster that was pre-2000s search:

  • Search for "Java" → coffee results when you wanted programming
  • Look up "Mercury" → thermometer ads instead of planet facts
  • Ask about "Amazon" → shopping suggestions, not river information
  • Type "Paris weather" → Paris, Texas. Not France. Seriously.

These weren't bugs. These were features of a system that couldn't understand context. Computers were matching characters, not meaning. And it was costing businesses millions.

Enter Natural Language Processing

This is where Natural Language Processing (NLP) comes in. It sounds complicated, but here's what it actually is: teaching computers that "Apple" next to "iPhone" means the company, while "apple" next to "pie" means fruit.

The solution is based on something your mom probably told you: "You are the company you keep." Turns out, this kindergarten wisdom is exactly how we teach computers to understand language.

The Simple Principle Behind AI Language Understanding

The principle: words that hang out together probably mean something similar (linguists call this the distributional hypothesis). If "doctor" keeps showing up with "hospital," "patient," and "medicine," even a computer can figure out these words are related.

Think of it like this: If you see a kid who's always at basketball courts, wearing jerseys, carrying a ball – you don't need to be a genius to figure out they like basketball. Same principle. We just do it with words.
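Here's a toy sketch of that principle in Python. The two-sentence "corpus" is invented for illustration; the point is just that counting neighbors already reveals who "doctor" hangs out with:

```python
from collections import Counter

# A toy corpus; real systems use millions of sentences.
sentences = [
    "the doctor examined the patient at the hospital",
    "the doctor prescribed medicine for the patient",
]

neighbors = Counter()
for sentence in sentences:
    words = sentence.split()
    for i, word in enumerate(words):
        if word == "doctor":
            # Collect the words immediately around "doctor".
            neighbors.update(words[max(0, i - 2):i] + words[i + 1:i + 3])

print(neighbors.most_common())
# [('the', 3), ('examined', 1), ('prescribed', 1), ('medicine', 1)]
# Filler words aside, medical terms are exactly who "doctor" keeps company with.
```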

The 5-Step Process

Let’s see how this actually works:

Step 1: Feed the Machine

First, we dump massive amounts of text into the computer. Wikipedia, books, news articles, websites – everything. This mountain of text is what we call a corpus. Simple concept: it's just a huge collection of text that becomes the computer's textbook for learning language. Think of it like dropping someone in Rio de Janeiro with every Portuguese book ever written and saying "figure out Portuguese." Total immersion.
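In code, Step 1 can be as humble as this sketch (a real corpus would be gigabytes of text, and real tokenizers handle far more than periods):

```python
# Step 1: the corpus. Here it's one string; in practice it's Wikipedia,
# books, news articles, and websites.
raw_text = "The doctor examined the patient at the hospital. Cats love fish."

# Minimal preprocessing: lowercase, strip periods, split into words (tokens).
tokens = raw_text.lower().replace(".", "").split()
print(tokens)
# ['the', 'doctor', 'examined', 'the', 'patient', 'at', 'the', 'hospital',
#  'cats', 'love', 'fish']
```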

Step 2: Count Like Your Life Depends On It

Next, the computer counts every single word. How often does "the" appear? Common words like "the" show up millions of times. Weird words like "supercalifragilisticexpialidocious" might appear once.

This frequency list is what we call a vocabulary. Think of it as the computer's dictionary, but instead of definitions, it tracks how popular each word is. Most used to least used. This vocabulary becomes crucial because it tells us which words actually matter in the language.
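A sketch of Step 2, reusing the tokens from Step 1: Python's built-in Counter does the whole job.

```python
from collections import Counter

tokens = "the doctor examined the patient at the hospital cats love fish".split()

# Step 2: count every word. most_common() ranks them, most used to least used.
vocabulary = Counter(tokens)
print(vocabulary.most_common())
# [('the', 3), ('doctor', 1), ('examined', 1), ('patient', 1),
#  ('at', 1), ('hospital', 1), ('cats', 1), ('love', 1), ('fish', 1)]
```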

Step 3: Give Every Word an ID Badge

Next, we assign each word a number. Like employee IDs at a company. So "cats love fish" becomes:

  • "cats" → ID #42
  • "love" → ID #108
  • "fish" → ID #67

Now the computer can work with nice, clean numbers instead of messy human language, and it can look up any word instantly by its ID.
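Step 3 in code: a dictionary that maps each word to its position in the ranked vocabulary. The exact numbers don't matter (the #42 above was just for illustration), only that each word gets a unique one.

```python
# Step 3: give every word an ID badge, its rank in the vocabulary.
vocabulary = ["the", "doctor", "examined", "patient", "at",
              "hospital", "cats", "love", "fish"]
word_to_id = {word: i for i, word in enumerate(vocabulary)}

print([word_to_id[w] for w in "cats love fish".split()])
# [6, 7, 8]  ...clean numbers the machine can index and compare instantly
```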

Step 4: Map the Relationships (This Is Where the Magic Happens)

We look at every word and check out its neighbors. We pick a target word (the word we're focusing on) and see what hangs around it. Those nearby words are called the context window.

Example: "The doctor examined the patient at the hospital"

  • Target word: "doctor" (this is our focus)
  • Context window: "examined," "patient," "hospital" (these are doctor's neighbors)

Now we create what we call training pairs – basically connections between words. But there are two ways to do this:

Skip-gram (what I just showed you):

  • Start with target word → predict context words
  • From "doctor" → predict ["examined", "patient", "hospital"]
  • Training pairs: ["doctor"→"examined"], ["doctor"→"patient"], ["doctor"→"hospital"]
  • Like saying "If I see 'doctor', what words probably appear nearby?"

CBOW (Continuous Bag of Words - the reverse):

  • Start with context words → predict target word
  • From ["examined", "patient", "hospital"] → predict "doctor"
  • Training pairs: [["examined", "patient", "hospital"]→"doctor"]
  • Like saying "If I see these medical words together, the missing word is probably 'doctor'"

Think of Skip-gram as "given a person, guess their friends" and CBOW as "given a group of friends, guess the missing person." Both teach the computer word relationships, just from different angles.

Do this millions of times and patterns start to emerge from the text. "Doctor" loves medical terms. "Pizza" hangs with "cheese," "delivery," "Italian." The computer builds a massive web of word relationships.
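Here's a compact sketch of Step 4: generating skip-gram training pairs from a sentence. Filtering filler words like "the" is a simplification I've added so the output matches the example above; real pipelines handle this in various ways.

```python
# Common filler words; filtering them out is a simplification for this sketch.
STOP_WORDS = {"the", "at"}

def skipgram_pairs(tokens, window=3):
    """Step 4, skip-gram style: (target -> context) pairs for every word."""
    words = [t for t in tokens if t not in STOP_WORDS]
    pairs = []
    for i, target in enumerate(words):
        context = words[max(0, i - window):i] + words[i + 1:i + 1 + window]
        pairs.extend((target, c) for c in context)
    return pairs

tokens = "the doctor examined the patient at the hospital".split()
print(skipgram_pairs(tokens)[:3])
# [('doctor', 'examined'), ('doctor', 'patient'), ('doctor', 'hospital')]
# CBOW is the same data flipped around: predict the target from its context words.
```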

Step 5: Turn Words into GPS Coordinates

Here's where we put the intelligence into artificial intelligence. Remember those training pairs from Step 4? We feed them to something called an embedding model. This model converts these word relationships into numbers called vectors.

Think of vectors as GPS coordinates for words. Just like New York and Boston are close on a map while New York and Tokyo aren't, "Apple" near "iPhone" gets coordinates close to "Microsoft," while "apple" near "pie" lands in food territory. These vectors let computers "see" that some words are related and others aren't.
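If you want to run Step 5 yourself, the open-source gensim library packages this whole pipeline as Word2Vec. A minimal sketch (the three-sentence corpus is far too small to learn meaningful coordinates, so treat the output as illustrative only):

```python
from gensim.models import Word2Vec  # pip install gensim

# A toy corpus of tokenized sentences; real training uses millions.
sentences = [
    "the doctor examined the patient at the hospital".split(),
    "the doctor prescribed medicine for the patient".split(),
    "we baked an apple pie for dessert".split(),
]

# sg=1 selects skip-gram; vector_size is how many "GPS coordinates" each word gets.
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1)

print(model.wv["doctor"][:5])                    # first 5 coordinates of "doctor"
print(model.wv.similarity("doctor", "patient"))  # how close two words sit
```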

See It In Action Right Now

[Infographic: How computers learned to understand human language in 5 simple steps]

Want proof this works? Try this:

  1. Google "python tutorial" – watch it know you mean programming, not snakes
  2. Search "apple stock" – notice how it never shows fruit prices
  3. Type "jaguar speed" – it figures out if you mean the car or the cat

This wasn't magic. This was Natural Language Processing doing its job, understanding context through word relationships.

Why This Changes Everything

Once computers learned word relationships, everything changed, first for search and now for AI:

  • Search engines know "Apple + iPhone" = tech company, "apple + recipe" = fruit
  • Translators understand "bank" (money) vs "bank" (river)
  • Chatbots get that "cancel" = "unsubscribe" = "I want to quit"
  • Amazon can tell whether you're after python documentaries or Python programming books

This didn't happen overnight. But these breakthroughs fundamentally changed how computers understand human language.

It Continues to Get Interesting

Remember those GPS coordinates I mentioned? In Part 2, we’re going to go over what happens when you do math with words.

Spoiler alert: Take the coordinates for "King," subtract "Man," add "Woman" and you get... "Queen."

I'm not making this up. This isn't a party trick. This is the foundation of how AI understands meaning, how Google reads your mind, and how Netflix knows what you'll binge next before you do.

Ready for the rabbit hole? Part 2: "The Magic of Word Embeddings" reveals how teaching computers that words are just points in space unlocks abilities that seem like magic but are really just clever math.

Quick Recap

  • Computers see words as meaningless symbols
  • NLP teaches computers that words appearing together are related
  • We turn these relationships into coordinates (vectors)
  • This simple idea powers every AI system you use
  • Next up: How doing math with words creates AI magic