Machine Learning


What is Machine Learning?

Machine Learning (ML) is the central branch of artificial intelligence today. It's the technology behind almost everything we know as AI—from ChatGPT to Netflix recommendations.

Formal definition: Machine learning is a field that enables computers to learn from data and improve at tasks—without being explicitly programmed for every possible situation.

💡 Main difference from regular programming:
  • Regular programming: You write rules → the computer executes
  • Machine learning: You give examples → the computer discovers the rules by itself

It sounds simple, but it's a revolution. Instead of a programmer having to think of every possible case and write a rule—the computer learns by itself from examples and discovers patterns even humans might not spot.

How does it work? The full picture

The child and the dog

Suppose we want to teach a small child what a "dog" is:

👶 Traditional way (regular programming)

We explain rules to the child:

  • "A dog is an animal with 4 legs"
  • "It has fur"
  • "It has ears"
  • "It has a tail"

Problems:

  • Cats also have 4 legs, fur, ears, and a tail
  • Some dogs have no tail
  • Some dogs have pointy ears, some floppy
  • Tiny dogs and huge dogs
  • You can't write all the rules!

🧒 Machine learning way

We show the child thousands of images:

  • "This is a dog" (1,000 example images)
  • "This is not a dog" (1,000 images of cats, rabbits, etc.)

What happens:

  • The child learns the patterns by themselves
  • They can't explain exactly how they know—but they know
  • They can recognize new dogs they've never seen

That's exactly what machine learning does! The computer sees many examples, finds patterns, and builds an internal "model" that lets it generalize to new cases.

The technical process (simplified)

📊 Step 1: Collect data

Gather many examples. More is usually better.

Example: 10,000 dog images and 10,000 non-dog images

🏷️ Step 2: Labeling (sometimes)

Label the data—what the correct answer is for each example.

Example: Each image is labeled "dog" or "not dog"

🏋️ Step 3: Training

The computer goes through all examples, analyzes them, and looks for patterns.

It's like "studying for a test"—the computer practices again and again until it succeeds.

Time: Can take minutes to weeks, depending on complexity

🎯 Step 4: Model

The result is a "model"—a mathematical structure that represents what the computer learned.

The model is like the "brain" built from learning.

🔮 Step 5: Inference

Now you can give the model new input and it "guesses" the answer.

Example: New image → model says "93% chance this is a dog"

📈 Step 6: Evaluation and improvement

Check how often the model is right. If it makes many mistakes—improve and adjust.

It's an iterative process of continuous improvement.
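The six steps above can be sketched in a few lines of code. This is a toy illustration, not a real system: the task, the two numeric features, and the threshold "learner" are all invented for the example.

```python
# Toy walk-through of the six steps, assuming an invented task:
# classify an animal as "dog" (1) or "not dog" (0) from two made-up
# numeric features (ear_floppiness, bark_volume).

# Steps 1-2: collect and label data as (features, label) pairs
data = [
    ((0.9, 0.8), 1), ((0.7, 0.9), 1), ((0.8, 0.7), 1),  # dogs
    ((0.2, 0.1), 0), ((0.1, 0.3), 0), ((0.3, 0.2), 0),  # not dogs
]

# Step 3: "training" - search for the threshold on the average feature
# value that gets the most training examples right
def train(examples):
    best_t, best_correct = 0.0, -1
    for t in [i / 20 for i in range(21)]:  # candidate thresholds 0.0..1.0
        correct = sum(1 for (x, y), label in examples
                      if (1 if (x + y) / 2 > t else 0) == label)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

# Step 4: the "model" here is just the learned threshold
model = train(data)

# Step 5: inference on a new, unseen example
def predict(model, features):
    return 1 if sum(features) / 2 > model else 0

# Step 6: evaluation - accuracy on held-out examples
test_set = [((0.85, 0.75), 1), ((0.15, 0.25), 0)]
accuracy = sum(1 for f, y in test_set if predict(model, f) == y) / len(test_set)
```

Real models replace the threshold search with far richer mathematics, but the shape of the loop—data in, pattern found, model out, predictions checked—is the same.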

Three types of machine learning

There are three main approaches, each suited to different situations:

1. Supervised Learning 👨‍🏫

The most common type. The computer gets examples with correct answers (labels) and learns to predict the answer for new examples.

How it works:

Like learning from a teacher who gives you exercises with solutions:

  • The computer sees: input (X) → answer (Y)
  • The computer learns: what's the relationship between X and Y?
  • Now when there's new X → the computer guesses Y
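The X → Y idea can be shown with one of the simplest supervised methods, 1-nearest-neighbour: "training" just stores the labelled examples, and prediction copies the label of the closest one. The feature values and labels below are invented.

```python
# Minimal supervised-learning sketch: 1-nearest-neighbour classification.
# Each example is (features X, label Y); all numbers are made up.
import math

train_data = [
    ((1.0, 1.0), "dog"), ((1.2, 0.9), "dog"),
    ((0.1, 0.2), "cat"), ((0.0, 0.3), "cat"),
]

def predict(x):
    # find the stored example closest to x and return its label Y
    closest = min(train_data, key=lambda ex: math.dist(ex[0], x))
    return closest[1]

print(predict((1.1, 1.0)))  # closest to the "dog" examples
```

Even this trivial method captures the essence of supervised learning: generalising from labelled pairs to a new, unseen X.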

Real-world examples:

  • Spam detection: Thousands of emails labeled "spam" / "not spam" → model learns to tell them apart
  • House price prediction: Data on sold houses and prices → model predicts price for a new house
  • Medical diagnosis: X-rays labeled "healthy" / "disease" → model detects conditions
  • Face recognition: Images labeled with names → model recognizes who's in the photo
  • Translation: Millions of translated sentences → model learns to translate
💡 Advantage: Accurate, reliable results
⚠️ Disadvantage: Need lots of labeled data (expensive and time-consuming)

2. Unsupervised Learning 🔍

The computer gets data without answers and looks for patterns and structure by itself. Like giving a child a pile of toys and asking them to sort—without saying how.

How it works:

  • The computer receives unlabeled data
  • Looks for similarity between data points
  • Clusters or finds hidden structure
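The steps above are exactly what the classic k-means algorithm does. Here is a minimal sketch with k=2 on made-up one-dimensional data; note the algorithm receives only the points, never any labels.

```python
# Minimal k-means sketch (k=2) on invented 1-D data: repeatedly assign
# each point to its nearest centre, then move each centre to the mean
# of its assigned points.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
centers = [points[0], points[3]]           # naive initialisation

for _ in range(10):                        # repeat assign/update steps
    clusters = [[], []]
    for p in points:
        # assign each point to the nearest centre
        i = 0 if abs(p - centers[0]) < abs(p - centers[1]) else 1
        clusters[i].append(p)
    # move each centre to the mean of its cluster
    centers = [sum(c) / len(c) for c in clusters]

print(sorted(centers))   # one centre near the low group, one near the high
```

The two centres end up summarising the hidden structure (two groups) without anyone telling the algorithm that two groups exist in the data.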

Real-world examples:

  • Customer segmentation: System analyzes buying behavior and divides into groups (VIP, occasional, dormant...)
  • Anomaly detection: Finding unusual credit card transactions
  • Data compression: Finding a more efficient representation of information
  • Recommendation engines: Grouping similar products/users
  • Social network analysis: Finding communities
💡 Advantage: No need to label data, discovers hidden patterns
⚠️ Disadvantage: Harder to judge if results are good

3. Reinforcement Learning 🎮

The computer learns by trial and error—takes actions, gets feedback (reward or penalty), and learns what's worth doing.

How it works:

Like training a dog:

  • Agent: Who learns (the dog / the software)
  • Environment: The world it operates in
  • Actions: What the agent can do
  • Reward: Feedback on actions (+1 for success, -1 for failure)

The agent gradually learns which actions lead to the highest reward.
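A minimal concrete case of agent/environment/action/reward is the two-armed bandit. The sketch below uses an epsilon-greedy agent with invented reward probabilities; the agent never sees them and must discover the better action by trial and error.

```python
# Tiny reinforcement-learning sketch: a two-armed bandit.
# The true reward probabilities are invented and hidden from the agent.
import random
random.seed(0)

true_reward = {"left": 0.2, "right": 0.8}   # environment (unknown to agent)
estimates = {"left": 0.0, "right": 0.0}      # agent's learned values
counts = {"left": 0, "right": 0}

for step in range(1000):
    # explore 10% of the time, otherwise exploit the best estimate
    if random.random() < 0.1:
        action = random.choice(["left", "right"])
    else:
        action = max(estimates, key=estimates.get)
    # environment gives a reward of 1 or 0
    reward = 1 if random.random() < true_reward[action] else 0
    counts[action] += 1
    # incremental average: nudge the estimate toward the observed reward
    estimates[action] += (reward - estimates[action]) / counts[action]

# after enough trials the agent typically prefers "right"
print(max(estimates, key=estimates.get))
```

The same explore/exploit tension, scaled up enormously, is at the heart of systems like AlphaGo.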

Real-world examples:

  • AlphaGo: Learned to play Go at world-champion level
  • Robots: Learning to walk, grasp objects
  • Autonomous driving: Learning from driving experience
  • Video games: AI that beats human players
  • Optimization: Managing power in data centers
💡 Advantage: Can learn complex tasks, improve without limit
⚠️ Disadvantage: Needs many trials, sometimes unstable

Deep Learning

Deep learning is a subfield of machine learning that drove the revolution in recent years. It's the technology behind ChatGPT, face recognition, automatic translation, and more.

What is a neural network?

An artificial neural network is a mathematical structure that mimics (in a very simplified way) how the human brain works.

How it works:

  • Neurons: Small computing units that take input and produce output
  • Layers: Neurons are organized in layers
  • Connections: Each neuron connects to neurons in the next layer
  • Weights: Each connection has a "weight" that sets its influence

During learning, the system adjusts the weights to improve results.
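The four ingredients above—neurons, layers, connections, weights—fit in a few lines. This is only a forward pass with arbitrary hand-picked weights (no learning step), to show what the structure computes.

```python
# Sketch of a tiny neural network's forward pass. Weights and biases
# are arbitrary; in real training they would be adjusted from data.
import math

def neuron(inputs, weights, bias):
    # weighted sum of inputs plus bias, squashed by a sigmoid to (0, 1)
    z = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))

def tiny_network(x):
    # layer 1: two hidden neurons, each connected to both inputs
    h1 = neuron(x, [0.5, -0.2], 0.1)
    h2 = neuron(x, [-0.3, 0.8], 0.0)
    # layer 2: one output neuron connected to both hidden neurons
    return neuron([h1, h2], [1.0, 1.0], -1.0)

out = tiny_network([1.0, 2.0])
assert 0.0 < out < 1.0   # sigmoid output always lies in (0, 1)
```

"Learning" means running examples through this structure, measuring the error, and nudging every weight to reduce it—repeated millions of times.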

Why "deep"?

"Deep" refers to the number of layers. Modern neural networks can have dozens or even hundreds of layers—that's what lets them learn complex patterns.

Example—face recognition:

  • Early layers: Detect edges and basic curves
  • Middle layers: Detect shapes (eyes, nose, mouth)
  • Deep layers: Recognize full faces and identity

Each layer learns more and more abstract representations.

Types of neural networks

CNN - Convolutional Neural Networks

Specialize in images and video. Used for face recognition, object detection, medical image analysis.

RNN / LSTM - Recurrent Neural Networks

Specialize in sequences (text, time). Used for translation, sentiment analysis, forecasting.

Transformers

The revolutionary architecture from 2017. The basis for GPT, BERT, and most modern language models. Can process long input and understand distant relationships in text.

Detailed real-world examples

🏥 Medicine - cancer diagnosis

Problem: Doctors sometimes miss small tumors in scans

Solution:

  • Training on millions of labeled scans
  • Model learns to spot tumor patterns
  • Shows "suspicious areas" to the doctor

Result: Detection rate of 95%+, sometimes better than specialist doctors

🏦 Banking - fraud detection

Problem: Millions of transactions per day, can't check manually

Solution:

  • Model learns what's "normal" for each customer
  • Detects anomalies: unusual location, amount, buying pattern
  • Sends alert or blocks the transaction

Result: Saves billions of dollars per year

🎵 Spotify - music recommendations

Problem: Tens of millions of songs—how to find what you'll love?

Solution (combination of approaches):

  • Collaborative Filtering: "People who liked X also liked Y"
  • Content-Based: Musical analysis (tempo, instruments, genre)
  • NLP: Analyzing lyrics and descriptions
  • Audio Analysis: Analyzing the sound itself

Result: Discover Weekly—a personal list that surprises you every week
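The "people who liked X also liked Y" idea can be sketched with cosine similarity over rating vectors. The users, songs, and ratings below are entirely invented, and real systems use far larger matrices and more refined models.

```python
# Rough collaborative-filtering sketch: find the user most similar to
# you and recommend a song they rated that you haven't heard.
import math

ratings = {                      # song: rating 1-5, 0 = not heard
    "ana":  {"song_a": 5, "song_b": 4, "song_c": 0},
    "ben":  {"song_a": 5, "song_b": 5, "song_c": 4},
    "carl": {"song_a": 1, "song_b": 1, "song_c": 5},
}

def similarity(u, v):
    # cosine similarity between two users' rating vectors
    dot = sum(u[s] * v[s] for s in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend(user):
    others = [n for n in ratings if n != user]
    # the other user with the most similar taste
    best = max(others, key=lambda n: similarity(ratings[user], ratings[n]))
    # among songs `user` hasn't heard, pick the one `best` rated highest
    unheard = [s for s, r in ratings[user].items() if r == 0]
    return max(unheard, key=lambda s: ratings[best][s])

print(recommend("ana"))
```

Ana's taste matches Ben's, so she is recommended the song Ben liked that she hasn't heard yet.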

🚗 Tesla - autonomous driving

Problem: Driving requires understanding a complex environment and real-time reactions

Solution:

  • Cameras: 8 cameras around the car
  • Computer Vision: Detecting cars, pedestrians, signs, lanes
  • Prediction: Predicting behavior of other road users
  • Path planning: Deciding where to drive
  • Fleet learning: Every Tesla drive teaches the system

Result: Billions of kilometers of driving data

💬 ChatGPT - language model

Problem: Creating natural conversation with a computer

Solution:

  • Pre-training: Training on trillions of words from the internet
  • Task: Predict the next word in text
  • Fine-tuning: Tuning with examples of good conversations
  • RLHF: Reinforcement learning from human feedback

Result: A model that can answer almost any question in natural language

What does it take to work?

📊 1. Data - the fuel

Without data there's no machine learning. You need:

  • Quantity: Usually more is better (thousands to billions)
  • Quality: Clean, accurate data without errors
  • Diversity: Data that represents all possible cases
  • Labeling: In supervised learning—answers must be marked
⚠️ "Garbage in = garbage out"

If the data is bad or biased—the model will be too!

💻 2. Compute

Training models requires heavy processing:

  • GPU: Graphics processors that excel at parallel computation
  • TPU: Google's special chips for ML
  • Cloud: AWS, Google Cloud, Azure offer powerful hardware

Cost: Training a GPT-5-scale model is estimated at tens to hundreds of millions of dollars.

🧮 3. Algorithm

The mathematical method the computer uses to learn:

  • Choosing architecture (CNN, Transformer, etc.)
  • Objective function (what is the model trying to improve?)
  • Optimization (how to update the weights?)
  • Hyperparameters (learning rate, batch size, etc.)

👨‍💻 4. Expertise

You need people who understand:

  • Data Scientists—analyze data and build models
  • ML Engineers—implement and deploy systems
  • Domain Experts—understand the specific field

⏰ 5. Time

Training takes time:

  • Small model: minutes to hours
  • Medium model: hours to days
  • Large model (e.g. GPT-5): weeks to months

Key concepts

🎯 Overfitting

When the model "memorizes" the training data instead of learning general patterns.

Example: A student who memorized all answers on a practice test—but fails the real test.

Solution: More data, regularization, testing on new data.

📉 Underfitting

When the model is too simple and fails to learn the patterns.

Solution: More complex model, more features, more training.

🔄 Training / Validation / Test Sets

Split the data into three sets:

  • Training (70%): For training the model
  • Validation (15%): For tuning and calibration
  • Test (15%): For final evaluation—don't touch until the end!

📊 Accuracy, Precision, Recall

Metrics for evaluating model quality:

  • Accuracy: How often we were right overall
  • Precision: When we said "yes"—how often were we right?
  • Recall: Of all the true "yes"—how many did we find?
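These three metrics follow directly from four counts of right and wrong predictions. The numbers below are made up for a binary "dog" / "not dog" task:

```python
# Accuracy, precision and recall from raw prediction counts
# (invented numbers for a binary classification task).
tp = 40   # said "dog", was a dog        (true positive)
fp = 10   # said "dog", was not a dog    (false positive)
fn = 5    # said "not dog", was a dog    (false negative)
tn = 45   # said "not dog", was not a dog (true negative)

accuracy  = (tp + tn) / (tp + tn + fp + fn)   # right overall
precision = tp / (tp + fp)                    # of our "yes", how many were right
recall    = tp / (tp + fn)                    # of the real "yes", how many we found

print(accuracy, precision, recall)   # 0.85 0.8 0.888...
```

Note how the three can diverge: a model that never says "dog" has perfect precision on paper but zero recall, which is why all three are reported together.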

⚖️ Bias

When the model discriminates systematically. Usually happens because the training data is unbalanced or unrepresentative.

Example: A hiring model trained on existing employees (mostly men) discriminates against women.

Summary 📝

  • Machine learning: Computers that learn from examples instead of explicit rules
  • 3 types: Supervised (with answers), unsupervised (without), reinforcement (trial and error)
  • Deep learning: Neural networks with many layers—the technology behind ChatGPT
  • What you need: Data + compute + algorithm + expertise + time
  • Examples: Recommendations, fraud detection, medical diagnosis, autonomous driving
  • Challenges: Overfitting, bias, need for lots of data

📝 Test yourself

Answer 10 questions to check your understanding of machine learning.

1. What is machine learning?

2. What are the three main types of learning?

3. In supervised learning, what does the system receive?

4. Which type of learning is used to train a robot to play games?

5. What is a neural network?

6. What happens with overfitting?

7. What is unsupervised learning?

8. Why is it important to split data into training and test sets?

9. What is the role of "weights" in a neural network?

10. What is needed for machine learning to succeed?
