KIKIneAhnung

The Engine Room: How Does Artificial Intelligence Work?

You now know what AI is and where it comes from. But how does it all actually work under the hood? Don't worry — we'll explain the technology so everyone can understand it. Promise.

1. Machine Learning — The Trial and Error Principle

Machine Learning is the heart of modern AI. Instead of giving a computer exact rules (“If the email contains the word lottery, move it to spam”), we show it thousands of examples and let it find the rules on its own.
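To make the contrast concrete, here is a toy sketch in Python. The four example emails and the word-counting "learner" are invented for illustration; a real spam filter is far more sophisticated:

```python
from collections import Counter

# Classic programming: the programmer states the rule explicitly.
def rule_based_spam_filter(email_text):
    return "lottery" in email_text.lower()

# Machine learning instead derives its rule from labelled examples.
examples = [
    ("You won the lottery claim now", True),
    ("Meeting moved to 3pm", False),
    ("Lottery jackpot waiting for you", True),
    ("Lunch tomorrow?", False),
]

# Count how often each word appears in spam vs. normal mail.
spam_words, ham_words = Counter(), Counter()
for text, is_spam in examples:
    target = spam_words if is_spam else ham_words
    target.update(text.lower().split())

def learned_spam_filter(email_text):
    """Flags an email whose words occur more often in the spam examples."""
    words = email_text.lower().split()
    spam_score = sum(spam_words[w] for w in words)
    ham_score = sum(ham_words[w] for w in words)
    return spam_score > ham_score
```

Nobody typed a "lottery" rule into the second filter; it emerged from the data, just like the apprentice's side-dish rule below.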

Think of an apprentice working in a restaurant. On the first day, they guess which side dish each guest wants with every order — and usually get it wrong. But after hundreds of orders, they notice: “Anyone who orders fish almost always takes rice.” Nobody dictated that rule; they learned it from the data. That's exactly how machine learning works.

The machine starts with random assumptions, compares its results with reality, measures the error, and corrects itself — billions of times. The end result is a “model” that can make surprisingly good predictions for new, unseen data.
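That loop fits in a few lines. A minimal, hypothetical example: the hidden rule is y = 3 × x, and the model must discover the 3 on its own, starting from a random guess:

```python
import random

# Toy data: the hidden rule is y = 3 * x (the model must discover the 3).
data = [(x, 3 * x) for x in range(1, 6)]

weight = random.uniform(-1, 1)  # start with a random assumption
learning_rate = 0.01

for step in range(1000):                     # repeat many times
    for x, target in data:
        prediction = weight * x              # make a guess
        error = prediction - target          # compare with reality
        weight -= learning_rate * error * x  # correct in the right direction

# After training, weight has converged very close to 3.
```

Real systems adjust billions of weights instead of one, but the rhythm is identical: guess, measure, correct, repeat.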

2. Neural Networks — The Brain Analogy

The name “neural network” sounds biological — and that's intentional. The idea is inspired by our brains: billions of nerve cells (neurons) communicate via electrical signals. The more often a particular pathway is used, the stronger the connection becomes — that's how we learn.

An artificial neural network works similarly. It consists of many small computing units (“neurons”) arranged in layers:

  • Input layer: Raw data enters here — e.g., the pixels of an image.
  • Hidden layers: This is where the real magic happens. Each neuron takes signals from the previous layer, weights them, and passes on a result.
  • Output layer: The answer comes out here — e.g., “That's a cat” or “This email is spam.”

The “weights” on the connections are the knobs that get adjusted during training. Think of them as volume sliders on a mixing desk: through fine-tuning thousands of sliders, a clear signal eventually emerges from the noise.
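A minimal forward pass through such layers might look like this (the weights here are hand-picked for illustration; a trained network would have learned them):

```python
def neuron(inputs, weights, bias):
    """One computing unit: weight the incoming signals and sum them up."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return max(0.0, total)  # pass the signal on only if it is positive (ReLU)

# Input layer: raw data, e.g. two pixel brightness values.
pixels = [0.8, 0.2]

# Hidden layer: two neurons, each with its own "volume slider" weights.
hidden = [
    neuron(pixels, [0.5, -0.3], 0.1),
    neuron(pixels, [-0.2, 0.9], 0.0),
]

# Output layer: one neuron combines the hidden signals into an answer.
output = neuron(hidden, [1.0, 1.0], 0.0)
```

Training never changes this structure; it only turns the sliders (the weight numbers) until the output is reliably right.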

3. Deep Learning — Many Layers, Great Knowledge

Deep Learning is essentially “machine learning with many layers.” The “deep” refers to the depth of the network — the number of hidden layers. A simple network might have three layers; modern language models like GPT have billions of parameters spread across dozens of layers.

Why are more layers better? Each layer learns something different:

  • The first layers detect simple patterns — edges, colours, shapes.
  • Middle layers combine these into more complex structures — eyes, noses, wheels.
  • The final layers recognise entire concepts — “That's a face” or “This sentence expresses joy.”

It's like a detective team: the first finds fingerprints, the second matches them, the third solves the case. The more detectives (layers) working together, the more complex cases they can crack.
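A rough sketch of this layer-by-layer abstraction; the "detectors" below are deliberately crude stand-ins, not real convolutional filters:

```python
# Each "layer" is a function; stacking them builds ever more abstract features.

def detect_edges(pixels):
    """Early layer: simple patterns -- how sharply do neighbours differ?"""
    return [abs(a - b) for a, b in zip(pixels, pixels[1:])]

def structure_score(edges):
    """Middle layer: combine edges into a crude 'how much structure?' score."""
    return sum(edges)

def classify(score):
    """Final layer: turn the score into a whole concept."""
    return "textured object" if score > 1.0 else "blank wall"

image_row = [0.1, 0.9, 0.1, 0.9, 0.1]  # alternating bright/dark pixels
result = classify(structure_score(detect_edges(image_row)))
```

Each function only sees the output of the one before it, which is exactly the detective hand-off described above.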

4. Training: Data In, Patterns Out, Prediction

Training an AI model happens in three phases:

Phase 1: Collect data

Enormous amounts of data are gathered — texts from the internet, millions of images, audio files. The quality and diversity of the data largely determine how good the model will be.

Phase 2: Recognise patterns

The model processes the data and adjusts its internal weights. With each pass, the error is measured and minimised — a process called “backpropagation.” Imagine throwing a ball at a target: after each throw, you adjust your angle slightly.
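Backpropagation computes these corrections through all layers at once; the core idea is easiest to see with a single weight. A toy sketch of the ball-throwing loop, with invented numbers:

```python
# One training example: the hidden rule is y = 2 * x.
weight = 0.0          # current guess
x, target = 4.0, 8.0

for throw in range(20):
    prediction = weight * x
    error = (prediction - target) ** 2        # squared distance from the target
    gradient = 2 * (prediction - target) * x  # which direction reduces the error
    weight -= 0.01 * gradient                 # small correction: adjust the angle
```

After twenty "throws" the weight has crept from 0 to almost exactly 2. The gradient is the crucial ingredient: it tells the model not just that it missed, but in which direction to adjust.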

Phase 3: Prediction

After training, the model can be applied to completely new data. It has learned to recognise general patterns — not memorised individual examples. That's why ChatGPT can answer questions it has never seen during training.

5. Supervised vs. Unsupervised Learning

  • Principle: Supervised learning means learning with a “teacher”: every example has a correct answer. Unsupervised learning means learning without a teacher: the AI searches for structures on its own.
  • Analogy: Supervised is like learning vocabulary with flashcards; unsupervised is like a child sorting toys by colour.
  • Example: Supervised: a spam filter trained on thousands of labelled emails. Unsupervised: customer segmentation that automatically detects groups.
  • Strength: Supervised learning is very precise for clear-cut tasks; unsupervised learning finds hidden patterns that humans overlook.
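A toy sketch of the difference; the numbers and the one-feature "models" are invented purely for illustration:

```python
# Supervised: every training example carries its correct answer (a label).
# The single made-up feature here is an exclamation-mark count per email.
labelled = [(150, "spam"), (5, "ham"), (120, "spam"), (8, "ham")]

# A trivial "model": a threshold halfway between the two labelled groups.
spam_values = [v for v, label in labelled if label == "spam"]
ham_values = [v for v, label in labelled if label == "ham"]
threshold = (min(spam_values) + max(ham_values)) / 2

def predict(value):
    return "spam" if value > threshold else "ham"

# Unsupervised: the same numbers arrive WITHOUT labels. The algorithm
# can still notice that they fall into two natural clusters.
values = [150, 5, 120, 8]
midpoint = (min(values) + max(values)) / 2
clusters = {
    "group A": sorted(v for v in values if v <= midpoint),
    "group B": sorted(v for v in values if v > midpoint),
}
```

The supervised model can name its answer ("spam"); the unsupervised one can only say "these belong together", and a human must interpret what the groups mean.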

There's also Reinforcement Learning, where the AI acts like a player: it tries different actions and receives rewards or penalties. That's how AlphaGo learned Go well enough to play the legendary Move 37.

6. Why GPUs Matter So Much

You've probably heard of GPUs (Graphics Processing Units) — originally graphics chips for video games. Why are they so crucial for AI?

A normal processor (CPU) is like a brilliant solo worker: it solves tasks step by step, but only one at a time. A GPU, on the other hand, is like a huge team of thousands of simple workers: each individual is less specialised, but they all work simultaneously.

AI training consists of billions of simple calculations that can run in parallel. This is exactly where GPUs shine. Without them, training a modern language model would take centuries instead of weeks. Companies like NVIDIA have therefore become some of the most important players in the AI revolution.
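The point is easiest to see in a dot product, the basic building block of neural-network maths: every multiplication is independent of all the others. This conceptual Python sketch just shows the decomposition; real GPU code would use CUDA or a framework built on it, not Python loops:

```python
a = [1.0, 2.0, 3.0, 4.0]
b = [5.0, 6.0, 7.0, 8.0]

# CPU-style: one worker computes the products one after another.
serial = 0.0
for x, y in zip(a, b):
    serial += x * y

# GPU-style (conceptually): every product is independent, so thousands
# of cores could each compute one pair at the same time; only the final
# sum needs the partial results brought back together.
products = [x * y for x, y in zip(a, b)]  # each entry is parallelisable
parallel = sum(products)
```

Both routes give the same answer; the GPU route simply gets there in one step instead of four — and in real training, in millions of simultaneous steps instead of billions of sequential ones.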

7. Practical Example: Image Recognition Step by Step

Let's bring it all together with a concrete example: an AI is supposed to learn to distinguish dogs from cats.

Step 1: Collect data
10,000 images of dogs and 10,000 images of cats, each labelled “dog” or “cat.”

Step 2: Convert images to numbers
Each pixel becomes a number (brightness/colour value). An image of 224 x 224 pixels with three colour channels results in approximately 150,000 input values.

Step 3: Feed through the network
The numbers flow through the layers. Early layers detect edges, middle layers recognise shapes (pointed ears? snout?), final layers make the decision.

Step 4: Measure error & correct
Does the model say “dog” when it's actually a cat? The error is calculated and the weights are adjusted. This step is repeated millions of times.

Step 5: Recognise new images
The trained model can now analyse an image it has never seen before and say with high probability: “That's a dog.”
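The whole pipeline, shrunk to absurdity: 2 x 2 "images", and hand-set weights standing in for what steps 3 and 4 would normally learn from the 20,000 labelled photos:

```python
def image_to_numbers(image):
    """Step 2: flatten the pixel grid into a list of brightness values."""
    return [pixel for row in image for pixel in row]

# Illustrative weights a trained single-layer network might have ended
# up with -- positive where "dog" images tend to be bright.
weights = [0.9, -0.5, 0.4, -0.8]

def classify(image):
    """Steps 3 and 5: feed the numbers through the (one-layer) network."""
    values = image_to_numbers(image)
    score = sum(v * w for v, w in zip(values, weights))
    return "dog" if score > 0 else "cat"

dog_like = [[1.0, 0.0],
            [1.0, 0.0]]  # bright exactly where the weights are positive
cat_like = [[0.0, 1.0],
            [0.0, 1.0]]
```

A real classifier uses ~150,000 inputs and many layers instead of four inputs and one, but the flow is the same: numbers in, weighted sums through the network, label out.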

This exact principle — just with far more layers and far more data — is behind facial recognition, autonomous driving, and also the language models you chat with.

What would you like to discover next?