Machine Learning Paradigms
Machine learning algorithms learn from data in different ways. The learning paradigm defines how the algorithm receives feedback and improves its performance over time. Understanding these paradigms is fundamental to choosing the right approach for your problem.
The choice of learning paradigm is determined by the type of data you have and the problem you’re trying to solve. Labeled data? Use supervised learning. Unlabeled data? Consider unsupervised or self-supervised. Sequential decision-making? Reinforcement learning is your answer.
Supervised Learning
Supervised learning is the most common paradigm. The algorithm learns from labeled data—input data paired with the correct output. Think of it as learning with a teacher who provides the answers.
How It Works
- Provide labeled examples: `(input, label)` pairs
- The model learns to map inputs to labels
- On new data, the model predicts the label
| Data Type | Example | Task Type |
|---|---|---|
| Labeled images | (cat_image, "cat") | Classification |
| Historical prices | (house_features, price) | Regression |
| Email content | (email_text, "spam/not_spam") | Classification |
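The workflow above can be sketched in a few lines of Python. This is a minimal illustration using a closed-form least-squares fit; the toy house-price numbers are invented for the example:

```python
# Minimal supervised-learning sketch: fit simple linear regression
# (one feature) to labeled (input, label) pairs, then predict on new data.

def fit_linear(xs, ys):
    """Closed-form least squares for y = w * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    w = cov / var
    b = mean_y - w * mean_x
    return w, b

# Labeled examples: (house size in m^2, price in $1000s) — toy data
sizes = [50, 80, 120, 200]
prices = [150, 240, 360, 600]

w, b = fit_linear(sizes, prices)
print(round(w * 100 + b))  # predict a price for an unseen 100 m^2 house → 300
```

After fitting on the labeled pairs, the model generalizes: it predicts a price for a house size it never saw during training.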
Types of Supervised Learning
| Type | Predicts | Examples |
|---|---|---|
| Classification | Categories/labels | Spam detection, image recognition, sentiment analysis |
| Regression | Continuous values | House prices, temperature forecasting, sales prediction |
Common Algorithms
| Algorithm | Typical Use |
|---|---|
| Linear Regression | Simple regression tasks |
| Logistic Regression | Binary classification |
| Decision Trees | Interpretable classification/regression |
| Random Forests | High accuracy, robust |
| Support Vector Machines (SVM) | Complex classification boundaries |
| Neural Networks | Complex, high-dimensional data |
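For classification, one of the simplest algorithms to sketch is a nearest-neighbour classifier. The two-feature toy dataset below is invented for illustration:

```python
# Minimal classification sketch: a 1-nearest-neighbour classifier,
# one of the simplest supervised algorithms.

def predict_1nn(train, query):
    """Return the label of the training point closest to `query`."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    features, label = min(train, key=lambda ex: dist2(ex[0], query))
    return label

# Labeled examples: (features, label) pairs — toy data
train = [((1.0, 1.0), "cat"), ((1.2, 0.8), "cat"),
         ((5.0, 5.0), "dog"), ((5.5, 4.5), "dog")]

print(predict_1nn(train, (1.1, 0.9)))  # nearest neighbours are "cat" examples
```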
Real-World Examples
- Email filters: Learn to classify spam vs. not spam from labeled examples
- Credit scoring: Predict loan default risk from historical customer data
- Medical diagnosis: Classify tumors as benign/malignant from labeled images
- Price prediction: Predict house prices from feature data
Limitation: Requires large amounts of labeled data, which can be expensive and time-consuming to create.
Unsupervised Learning
Unsupervised learning works with unlabeled data. The algorithm finds patterns, structures, or relationships in the data without being told what to look for.
How It Works
- Provide unlabeled data
- The model discovers hidden patterns or structures
- Output reveals relationships, groupings, or reduced representations
Types of Unsupervised Learning
| Type | Goal | Common Algorithms |
|---|---|---|
| Clustering | Group similar items together | K-Means, Hierarchical Clustering, DBSCAN |
| Dimensionality Reduction | Reduce features while preserving information | PCA, t-SNE, UMAP |
| Association | Discover relationships between items | Apriori, FP-Growth |
| Anomaly Detection | Identify unusual patterns | Isolation Forest, One-Class SVM |
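As a minimal illustration of clustering, here is a toy k-means in plain Python. The first-k initialization is a simplification; real implementations (e.g. scikit-learn's KMeans) use smarter schemes such as k-means++:

```python
def kmeans(points, k, iters=20):
    """Toy k-means on 2-D points: repeatedly assign each point to its
    nearest centroid, then recompute each centroid as the cluster mean."""
    def dist2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    # Naive initialization: the first k points (real code uses k-means++)
    centroids = list(points[:k])
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centroids[c]))
            clusters[nearest].append(p)
        new_centroids = []
        for c, cluster in zip(range(k), clusters):
            if cluster:
                new_centroids.append((sum(p[0] for p in cluster) / len(cluster),
                                      sum(p[1] for p in cluster) / len(cluster)))
            else:  # keep an empty cluster's old centroid
                new_centroids.append(centroids[c])
        centroids = new_centroids
    return centroids

# Six unlabeled points forming two obvious groups — no labels provided
points = [(1, 1), (1.5, 2), (2, 1), (8, 8), (9, 9), (8, 9)]
centroids = kmeans(points, k=2)
print(sorted(round(x) for x, _ in centroids))  # one centroid per group → [2, 8]
```

Note that the algorithm recovers the two groups without ever being told what they are; evaluating whether those groups are *meaningful* is the harder part, as noted below.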
Real-World Examples
- Customer segmentation: Group customers by purchasing behavior
- Recommendation systems: Find similar items/users (collaborative filtering)
- Anomaly detection: Detect fraudulent transactions, network intrusions
- Data compression: Reduce dataset dimensions for visualization or storage
Advantage: Works with unlabeled data, which is more abundant than labeled data.
Challenge: Evaluating results is harder without ground truth labels.
Self-Supervised Learning
Self-supervised learning is a clever approach that bridges the gap between supervised and unsupervised learning. It uses unlabeled data but creates its own “labels” from the data structure.
How It Works
- Take unlabeled data
- Create a “pretext task” that generates labels from the data itself
- Train the model on this self-generated task
- The learned representations transfer to downstream tasks
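The pretext-task idea can be made concrete: the snippet below manufactures `(input, label)` pairs from raw text by masking one word at a time. This is a toy sketch; models like BERT mask subword tokens and train on billions of examples:

```python
# A minimal "masked word" pretext task: labels are generated from the
# raw text itself, so no human annotation is needed.

def make_masked_examples(sentence, mask_token="[MASK]"):
    """For each position, mask one word and use that word as the label."""
    words = sentence.split()
    examples = []
    for i, target in enumerate(words):
        masked = words[:i] + [mask_token] + words[i + 1:]
        examples.append((" ".join(masked), target))
    return examples

for inputs, label in make_masked_examples("the cat sat on the mat"):
    print(inputs, "->", label)
```

Every sentence of unlabeled text yields several self-generated training examples, which is what lets this paradigm scale to huge corpora.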
Key Difference from Unsupervised Learning
| Aspect | Unsupervised Learning | Self-Supervised Learning |
|---|---|---|
| Goal | Discover hidden patterns/structures | Learn useful representations |
| Output | Clusters, reduced dimensions | Feature representations for downstream tasks |
| Task-driven | No specific task | Has a specific pretext task |
Common Pretext Tasks
| Task Type | Description | Example |
|---|---|---|
| Masked Language Modeling | Predict masked words in text | BERT, RoBERTa |
| Next Token Prediction | Predict the next token | GPT models |
| Contrastive Learning | Identify similar/dissimilar pairs | SimCLR, MoCo |
| Rotation Prediction | Predict image rotation angle | Self-supervised vision models |
Why Self-Supervised Learning Matters
This paradigm has been crucial for training large language models like GPT-4 and Claude, as well as earlier models like BERT. It allows models to learn from massive amounts of unlabeled data (much of the public internet) while still having a clear objective function.
Example: A model trained to predict the next word in a sentence (self-supervised) can later be fine-tuned for sentiment analysis, translation, or summarization (supervised transfer learning).
Reinforcement Learning
Reinforcement learning (RL) is about learning through interaction. An agent learns to make decisions by performing actions in an environment and receiving rewards or penalties.
Core Components
| Component | Description | Example |
|---|---|---|
| Agent | The learner/decision-maker | A robot, a game-playing AI |
| Environment | The world the agent interacts with | A maze, a game board, a simulation |
| State | Current situation of the agent | Position in maze, game configuration |
| Action | What the agent can do | Move left/right, jump, shoot |
| Reward | Feedback signal | +10 for winning, -1 for each step |
How It Works
- Agent observes current state
- Agent selects an action
- Environment returns new state and reward
- Agent updates its policy based on reward
- Repeat to maximize cumulative reward
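This loop can be sketched as tabular Q-learning on a toy one-dimensional corridor. The environment, reward scheme, and hyperparameters below are invented for illustration:

```python
import random

# Toy RL environment: states 0..4 on a corridor; the agent moves
# left/right and reaching state 4 (the goal) gives reward +10.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                     # move left, move right
alpha, gamma, epsilon = 0.5, 0.9, 0.1  # illustrative hyperparameters
Q = [[0.0, 0.0] for _ in range(N_STATES)]
rng = random.Random(0)

for episode in range(200):
    state = 0
    while state != GOAL:
        # Exploration vs. exploitation: epsilon-greedy action choice
        if rng.random() < epsilon:
            a = rng.randrange(2)
        else:
            a = 0 if Q[state][0] >= Q[state][1] else 1
        next_state = max(0, min(N_STATES - 1, state + ACTIONS[a]))
        reward = 10.0 if next_state == GOAL else -1.0
        # Q-learning update: move Q(s,a) toward reward + gamma * max Q(s',·)
        Q[state][a] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][a])
        state = next_state

# The learned greedy policy should move right in every non-goal state
print([("left", "right")[q.index(max(q))] for q in Q[:GOAL]])
```

The epsilon-greedy choice illustrates the exploration/exploitation trade-off described in the next table, and the update rule illustrates the Q-function: expected future reward for a state-action pair.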
Key Concepts
| Concept | Description |
|---|---|
| Policy | The strategy the agent uses (mapping from state to action) |
| Value Function | Expected future reward from a state |
| Q-Function | Expected future reward from a state-action pair |
| Exploration vs Exploitation | Balancing trying new things vs. using known good strategies |
Common Algorithms
| Algorithm | Type | Notable Use |
|---|---|---|
| Q-Learning | Value-based | Simple RL problems |
| DQN (Deep Q-Network) | Value-based | Atari games |
| PPO | Policy gradient | Robotics, game playing |
| A3C | Actor-Critic | Complex environments |
Real-World Examples
- Game playing: AlphaGo (Go), AlphaZero (Chess, Shogi, Go)
- Robotics: Robots learning to walk, grasp objects
- Autonomous driving: Learning to navigate traffic
- Recommendation systems: Optimizing long-term user engagement
Challenge: Requires many interactions; can be sample-inefficient and unstable.
Quick Comparison
| Paradigm | Data Type | Feedback | Use When… | Complexity |
|---|---|---|---|---|
| Supervised | Labeled | Direct answers | You have labeled data and need predictions | Low-Medium |
| Unsupervised | Unlabeled | None | You want to discover patterns in unlabeled data | Medium |
| Self-Supervised | Unlabeled | Self-generated | You have lots of unlabeled data and want representations | Medium-High |
| Reinforcement | Interactive | Rewards/Penalties | Problem involves sequential decision-making | High |
- Supervised Learning: Learn from labeled examples. Use when you have `(input, label)` pairs.
- Unsupervised Learning: Discover patterns in unlabeled data. Use for clustering, dimensionality reduction, anomaly detection.
- Self-Supervised Learning: Generate labels from data structure. Key to training Foundation Models on massive unlabeled datasets.
- Reinforcement Learning: Learn through trial and error via rewards. Use for sequential decision-making problems (games, robotics).
The paradigm you choose depends on your data, your problem, and your resources.