Machine Learning


What is Machine Learning?

Machine Learning (ML) is the central branch of artificial intelligence today. It's the technology behind almost everything we know as AI—from ChatGPT to Netflix recommendations.

Formal definition: Machine learning is a field that enables computers to learn from data and improve at tasks—without being explicitly programmed for every possible situation.

💡 Main difference from regular programming:
  • Regular programming: You write rules → the computer executes
  • Machine learning: You give examples → the computer discovers the rules by itself

It sounds simple, but it's a revolution. Instead of a programmer having to think of every possible case and write a rule—the computer learns by itself from examples and discovers patterns even humans might not spot.

How does it work? The full picture

The child and the dog

Suppose we want to teach a small child what a "dog" is:

👶 Traditional way (regular programming)

We explain rules to the child:

  • "A dog is an animal with 4 legs"
  • "It has fur"
  • "It has ears"
  • "It has a tail"

Problems:

  • Cats also have 4 legs, fur, ears, and a tail
  • Some dogs have no tail
  • Some dogs have pointy ears, some floppy
  • Tiny dogs and huge dogs
  • You can't write all the rules!

🧒 Machine learning way

We show the child thousands of images:

  • "This is a dog" (1,000 example images)
  • "This is not a dog" (1,000 images of cats, rabbits, etc.)

What happens:

  • The child learns the patterns by themselves
  • They can't explain exactly how they know—but they know
  • They can recognize new dogs they've never seen

That's exactly what machine learning does! The computer sees many examples, finds patterns, and builds an internal "model" that lets it generalize to new cases.

The technical process (simplified)

📊 Step 1: Collect data

Gather many examples. More is usually better.

Example: 10,000 dog images and 10,000 non-dog images

🏷️ Step 2: Labeling (sometimes)

Label the data—what the correct answer is for each example.

Example: Each image is labeled "dog" or "not dog"

🏋️ Step 3: Training

The computer goes through all examples, analyzes them, and looks for patterns.

It's like "studying for a test"—the computer practices again and again until it succeeds.

Time: Can take minutes to weeks, depending on complexity

🎯 Step 4: Model

The result is a "model"—a mathematical structure that represents what the computer learned.

The model is like the "brain" built from learning.

🔮 Step 5: Inference

Now you can give the model new input and it "guesses" the answer.

Example: New image → model says "93% chance this is a dog"

📈 Step 6: Evaluation and improvement

Check how often the model is right. If it makes many mistakes—improve and adjust.

It's an iterative process of continuous improvement.
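The six steps above can be sketched in a few lines of code. This is a toy illustration, not a real system: the task, the two numeric features, and the threshold "learner" are all invented for the example.

```python
# Toy walk-through of the six steps, assuming an invented task:
# classify an animal as "dog" (1) or "not dog" (0) from two made-up
# numeric features (ear_floppiness, bark_volume).

# Steps 1-2: collect and label data as (features, label) pairs
data = [
    ((0.9, 0.8), 1), ((0.7, 0.9), 1), ((0.8, 0.7), 1),  # dogs
    ((0.2, 0.1), 0), ((0.1, 0.3), 0), ((0.3, 0.2), 0),  # not dogs
]

# Step 3: "training" - search for the threshold on the average feature
# value that gets the most training examples right
def train(examples):
    best_t, best_correct = 0.0, -1
    for t in [i / 20 for i in range(21)]:  # candidate thresholds 0.0..1.0
        correct = sum(1 for (x, y), label in examples
                      if (1 if (x + y) / 2 > t else 0) == label)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

# Step 4: the "model" here is just the learned threshold
model = train(data)

# Step 5: inference on a new, unseen example
def predict(model, features):
    return 1 if sum(features) / 2 > model else 0

# Step 6: evaluation - accuracy on held-out examples
test_set = [((0.85, 0.75), 1), ((0.15, 0.25), 0)]
accuracy = sum(1 for f, y in test_set if predict(model, f) == y) / len(test_set)
```

Real models replace the threshold search with far richer mathematics, but the shape of the loop—data in, pattern found, model out, predictions checked—is the same.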

Three types of machine learning

There are three main approaches, each suited to different situations:

1. Supervised Learning 👨‍🏫

The most common type. The computer gets examples with correct answers (labels) and learns to predict the answer for new examples.

How it works:

Like learning from a teacher who gives you exercises with solutions:

  • The computer sees: input (X) → answer (Y)
  • The computer learns: what's the relationship between X and Y?
  • Now when there's new X → the computer guesses Y
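The X → Y idea can be shown with one of the simplest supervised methods, 1-nearest-neighbour: "training" just stores the labelled examples, and prediction copies the label of the closest one. The feature values and labels below are invented.

```python
# Minimal supervised-learning sketch: 1-nearest-neighbour classification.
# Each example is (features X, label Y); all numbers are made up.
import math

train_data = [
    ((1.0, 1.0), "dog"), ((1.2, 0.9), "dog"),
    ((0.1, 0.2), "cat"), ((0.0, 0.3), "cat"),
]

def predict(x):
    # find the stored example closest to x and return its label Y
    closest = min(train_data, key=lambda ex: math.dist(ex[0], x))
    return closest[1]

print(predict((1.1, 1.0)))  # closest to the "dog" examples
```

Even this trivial method captures the essence of supervised learning: generalising from labelled pairs to a new, unseen X.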

Real-world examples:

  • Spam detection: Thousands of emails labeled "spam" / "not spam" → model learns to tell them apart
  • House price prediction: Data on sold houses and prices → model predicts price for a new house
  • Medical diagnosis: X-rays labeled "healthy" / "disease" → model detects conditions
  • Face recognition: Images labeled with names → model recognizes who's in the photo
  • Translation: Millions of translated sentences → model learns to translate
💡 Advantage: Accurate, reliable results
⚠️ Disadvantage: Need lots of labeled data (expensive and time-consuming)

2. Unsupervised Learning 🔍

The computer gets data without answers and looks for patterns and structure by itself. Like giving a child a pile of toys and asking them to sort—without saying how.

How it works:

  • The computer receives unlabeled data
  • Looks for similarity between data points
  • Clusters or finds hidden structure
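The steps above are exactly what the classic k-means algorithm does. Here is a minimal sketch with k=2 on made-up one-dimensional data; note the algorithm receives only the points, never any labels.

```python
# Minimal k-means sketch (k=2) on invented 1-D data: repeatedly assign
# each point to its nearest centre, then move each centre to the mean
# of its assigned points.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
centers = [points[0], points[3]]           # naive initialisation

for _ in range(10):                        # repeat assign/update steps
    clusters = [[], []]
    for p in points:
        # assign each point to the nearest centre
        i = 0 if abs(p - centers[0]) < abs(p - centers[1]) else 1
        clusters[i].append(p)
    # move each centre to the mean of its cluster
    centers = [sum(c) / len(c) for c in clusters]

print(sorted(centers))   # one centre near the low group, one near the high
```

The two centres end up summarising the hidden structure (two groups) without anyone telling the algorithm that two groups exist in the data.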

Real-world examples:

  • Customer segmentation: System analyzes buying behavior and divides into groups (VIP, occasional, dormant...)
  • Anomaly detection: Finding unusual credit card transactions
  • Data compression: Finding a more efficient representation of information
  • Recommendation engines: Grouping similar products/users
  • Social network analysis: Finding communities
💡 Advantage: No need to label data, discovers hidden patterns
⚠️ Disadvantage: Harder to judge if results are good

3. Reinforcement Learning 🎮

The computer learns by trial and error—takes actions, gets feedback (reward or penalty), and learns what's worth doing.

How it works:

Like training a dog:

  • Agent: Who learns (the dog / the software)
  • Environment: The world it operates in
  • Actions: What the agent can do
  • Reward: Feedback on actions (+1 for success, -1 for failure)

The agent gradually learns which actions lead to the highest reward.
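A minimal concrete case of agent/environment/action/reward is the two-armed bandit. The sketch below uses an epsilon-greedy agent with invented reward probabilities; the agent never sees them and must discover the better action by trial and error.

```python
# Tiny reinforcement-learning sketch: a two-armed bandit.
# The true reward probabilities are invented and hidden from the agent.
import random
random.seed(0)

true_reward = {"left": 0.2, "right": 0.8}   # environment (unknown to agent)
estimates = {"left": 0.0, "right": 0.0}      # agent's learned values
counts = {"left": 0, "right": 0}

for step in range(1000):
    # explore 10% of the time, otherwise exploit the best estimate
    if random.random() < 0.1:
        action = random.choice(["left", "right"])
    else:
        action = max(estimates, key=estimates.get)
    # environment gives a reward of 1 or 0
    reward = 1 if random.random() < true_reward[action] else 0
    counts[action] += 1
    # incremental average: nudge the estimate toward the observed reward
    estimates[action] += (reward - estimates[action]) / counts[action]

# after enough trials the agent typically prefers "right"
print(max(estimates, key=estimates.get))
```

The same explore/exploit tension, scaled up enormously, is at the heart of systems like AlphaGo.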

Real-world examples:

  • AlphaGo: Learned to play Go at world-champion level
  • Robots: Learning to walk, grasp objects
  • Autonomous driving: Learning from driving experience
  • Video games: AI that beats human players
  • Optimization: Managing power in data centers
💡 Advantage: Can learn complex tasks, improve without limit
⚠️ Disadvantage: Needs many trials, sometimes unstable

Deep Learning

Deep learning is a subfield of machine learning that drove the revolution in recent years. It's the technology behind ChatGPT, face recognition, automatic translation, and more.

What is a neural network?

An artificial neural network is a mathematical structure that mimics (in a very simplified way) how the human brain works.

How it works:

  • Neurons: Small computing units that take input and produce output
  • Layers: Neurons are organized in layers
  • Connections: Each neuron connects to neurons in the next layer
  • Weights: Each connection has a "weight" that sets its influence

During learning, the system adjusts the weights to improve results.
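The four ingredients above—neurons, layers, connections, weights—fit in a few lines. This is only a forward pass with arbitrary hand-picked weights (no learning step), to show what the structure computes.

```python
# Sketch of a tiny neural network's forward pass. Weights and biases
# are arbitrary; in real training they would be adjusted from data.
import math

def neuron(inputs, weights, bias):
    # weighted sum of inputs plus bias, squashed by a sigmoid to (0, 1)
    z = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))

def tiny_network(x):
    # layer 1: two hidden neurons, each connected to both inputs
    h1 = neuron(x, [0.5, -0.2], 0.1)
    h2 = neuron(x, [-0.3, 0.8], 0.0)
    # layer 2: one output neuron connected to both hidden neurons
    return neuron([h1, h2], [1.0, 1.0], -1.0)

out = tiny_network([1.0, 2.0])
assert 0.0 < out < 1.0   # sigmoid output always lies in (0, 1)
```

"Learning" means running examples through this structure, measuring the error, and nudging every weight to reduce it—repeated millions of times.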

Why "deep"?

"Deep" refers to the number of layers. Modern neural networks can have dozens or even hundreds of layers—that's what lets them learn complex patterns.

Example—face recognition:

  • Early layers: Detect edges and basic curves
  • Middle layers: Detect shapes (eyes, nose, mouth)
  • Deep layers: Recognize full faces and identity

Each layer learns more and more abstract representations.

Types of neural networks

CNN - Convolutional Neural Networks

Specialize in images and video. Used for face recognition, object detection, medical image analysis.

RNN / LSTM - Recurrent Neural Networks

Specialize in sequences (text, time). Used for translation, sentiment analysis, forecasting.

Transformers

The revolutionary architecture from 2017. The basis for GPT, BERT, and most modern language models. Can process long input and understand distant relationships in text.

Detailed real-world examples

🏥 Medicine - cancer diagnosis

Problem: Doctors sometimes miss small tumors in scans

Solution:

  • Training on millions of labeled scans
  • Model learns to spot tumor patterns
  • Shows "suspicious areas" to the doctor

Result: Detection rate of 95%+, sometimes better than specialist doctors

🏦 Banking - fraud detection

Problem: Millions of transactions per day, can't check manually

Solution:

  • Model learns what's "normal" for each customer
  • Detects anomalies: unusual location, amount, buying pattern
  • Sends alert or blocks the transaction

Result: Saves billions of dollars per year

🎵 Spotify - music recommendations

Problem: Tens of millions of songs—how to find what you'll love?

Solution (combination of approaches):

  • Collaborative Filtering: "People who liked X also liked Y"
  • Content-Based: Musical analysis (tempo, instruments, genre)
  • NLP: Analyzing lyrics and descriptions
  • Audio Analysis: Analyzing the sound itself

Result: Discover Weekly—a personal list that surprises you every week
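The "people who liked X also liked Y" idea can be sketched with cosine similarity over rating vectors. The users, songs, and ratings below are entirely invented, and real systems use far larger matrices and more refined models.

```python
# Rough collaborative-filtering sketch: find the user most similar to
# you and recommend a song they rated that you haven't heard.
import math

ratings = {                      # song: rating 1-5, 0 = not heard
    "ana":  {"song_a": 5, "song_b": 4, "song_c": 0},
    "ben":  {"song_a": 5, "song_b": 5, "song_c": 4},
    "carl": {"song_a": 1, "song_b": 1, "song_c": 5},
}

def similarity(u, v):
    # cosine similarity between two users' rating vectors
    dot = sum(u[s] * v[s] for s in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend(user):
    others = [n for n in ratings if n != user]
    # the other user with the most similar taste
    best = max(others, key=lambda n: similarity(ratings[user], ratings[n]))
    # among songs `user` hasn't heard, pick the one `best` rated highest
    unheard = [s for s, r in ratings[user].items() if r == 0]
    return max(unheard, key=lambda s: ratings[best][s])

print(recommend("ana"))
```

Ana's taste matches Ben's, so she is recommended the song Ben liked that she hasn't heard yet.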

🚗 Tesla - autonomous driving

Problem: Driving requires understanding a complex environment and real-time reactions

Solution:

  • Cameras: 8 cameras around the car
  • Computer Vision: Detecting cars, pedestrians, signs, lanes
  • Prediction: Predicting behavior of other road users
  • Path planning: Deciding where to drive
  • Fleet learning: Every Tesla drive teaches the system

Result: Billions of kilometers of driving data

💬 ChatGPT - language model

Problem: Creating natural conversation with a computer

Solution:

  • Pre-training: Training on trillions of words from the internet
  • Task: Predict the next word in text
  • Fine-tuning: Tuning with examples of good conversations
  • RLHF: Reinforcement learning from human feedback

Result: A model that can answer almost any question in natural language

What does it take to work?

📊 1. Data - the fuel

Without data there's no machine learning. You need:

  • Quantity: Usually more is better (thousands to billions)
  • Quality: Clean, accurate data without errors
  • Diversity: Data that represents all possible cases
  • Labeling: In supervised learning—answers must be marked
⚠️ "Garbage in = garbage out"

If the data is bad or biased—the model will be too!

💻 2. Compute

Training models requires heavy processing:

  • GPU: Graphics processors that excel at parallel computation
  • TPU: Google's special chips for ML
  • Cloud: AWS, Google Cloud, Azure offer powerful hardware

Cost: Training a GPT-5-scale model is estimated at tens to hundreds of millions of dollars.

🧮 3. Algorithm

The mathematical method the computer uses to learn:

  • Choosing architecture (CNN, Transformer, etc.)
  • Objective function (what is the model trying to improve?)
  • Optimization (how to update the weights?)
  • Hyperparameters (learning rate, batch size, etc.)

👨‍💻 4. Expertise

You need people who understand:

  • Data Scientists—analyze data and build models
  • ML Engineers—implement and deploy systems
  • Domain Experts—understand the specific field

⏰ 5. Time

Training takes time:

  • Small model: minutes to hours
  • Medium model: hours to days
  • Large model (e.g. GPT-5): weeks to months

Key concepts

🎯 Overfitting

When the model "memorizes" the training data instead of learning general patterns.

Example: A student who memorized all answers on a practice test—but fails the real test.

Solution: More data, regularization, testing on new data.

📉 Underfitting

When the model is too simple and fails to learn the patterns.

Solution: More complex model, more features, more training.

🔄 Training / Validation / Test Sets

Split the data into three sets:

  • Training (70%): For training the model
  • Validation (15%): For tuning and calibration
  • Test (15%): For final evaluation—don't touch until the end!

📊 Accuracy, Precision, Recall

Metrics for evaluating model quality:

  • Accuracy: How often we were right overall
  • Precision: When we said "yes"—how often were we right?
  • Recall: Of all the true "yes"—how many did we find?
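These three metrics follow directly from four counts of right and wrong predictions. The numbers below are made up for a binary "dog" / "not dog" task:

```python
# Accuracy, precision and recall from raw prediction counts
# (invented numbers for a binary classification task).
tp = 40   # said "dog", was a dog        (true positive)
fp = 10   # said "dog", was not a dog    (false positive)
fn = 5    # said "not dog", was a dog    (false negative)
tn = 45   # said "not dog", was not a dog (true negative)

accuracy  = (tp + tn) / (tp + tn + fp + fn)   # right overall
precision = tp / (tp + fp)                    # of our "yes", how many were right
recall    = tp / (tp + fn)                    # of the real "yes", how many we found

print(accuracy, precision, recall)   # 0.85 0.8 0.888...
```

Note how the three can diverge: a model that never says "dog" has perfect precision on paper but zero recall, which is why all three are reported together.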

⚖️ Bias

When the model discriminates systematically. Usually happens because the training data is unbalanced or unrepresentative.

Example: A hiring model trained on existing employees (mostly men) discriminates against women.

Summary 📝

  • Machine learning: Computers that learn from examples instead of explicit rules
  • 3 types: Supervised (with answers), unsupervised (without), reinforcement (trial and error)
  • Deep learning: Neural networks with many layers—the technology behind ChatGPT
  • What you need: Data + compute + algorithm + expertise + time
  • Examples: Recommendations, fraud detection, medical diagnosis, autonomous driving
  • Challenges: Overfitting, bias, need for lots of data

📝 Test yourself

Answer 10 questions to check your understanding of machine learning.

1. What is machine learning?

2. What are the three main types of learning?

3. In supervised learning, what does the system receive?

4. Which type of learning is used to train a robot to play games?

5. What is a neural network?

6. What happens with overfitting?

7. What is unsupervised learning?

8. Why is it important to split data into training and test sets?

9. What is the role of "weights" in a neural network?

10. What is needed for machine learning to succeed?
