# When Is Machine Learning Not Appropriate?
Machine learning is powerful, but it’s not the answer to every problem. Using ML when it’s not needed adds complexity, cost, and maintenance burden without delivering value. Knowing when not to use ML is just as important as knowing when to use it.
ML is a tool, not a default. Start with the simplest solution that works. Only reach for ML when simpler approaches can’t meet your requirements.
## When ML is Overkill
ML adds unnecessary complexity when simpler solutions exist.
### Signs ML is Overkill
| Sign | Simple Alternative |
|---|---|
| Clear rules can solve the problem | Rule-based system |
| Deterministic calculation works | Formula/function |
| Small, fixed set of cases | Lookup table |
| One-time or rarely changes | Manual process or simple script |
| Strict guarantees/provable correctness needed | Deterministic/verified methods + human review |
### Examples: ML Overkill
| Problem | ML Approach | Simple Approach |
|---|---|---|
| Calculate sales tax | Train ML model on receipts | price × tax_rate |
| Check email has basic format | Classifier on emails | Simple check (contains @, valid domain) + confirmation email |
| Shortest-path routing on a map | Reinforcement learning | Dijkstra/A* algorithm |
| Sort numbers | Neural network | QuickSort |
| Password authentication | Deep learning model | Hash comparison |
Keep it simple and maintainable. If rules or math work, use them.
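For instance, the email check from the table needs only a few string rules plus a confirmation email, no classifier. A minimal sketch (the regex and function name are illustrative choices, not from this guide):

```python
import re

# A deliberately simple format check: one "@" and a plausible domain.
# Actual deliverability is confirmed by sending a confirmation email,
# not by a model.
BASIC_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[a-zA-Z]{2,}$")

def looks_like_email(address: str) -> bool:
    """Cheap syntactic check; no ML required."""
    return bool(BASIC_EMAIL.match(address))

assert looks_like_email("user@example.com")
assert not looks_like_email("not-an-email")
```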
## When Data is Insufficient
ML requires quality data. Without it, you’re wasting time.
### Data Requirements
| Requirement | Insufficient When… |
|---|---|
| Quantity | Too few samples for problem complexity (can be <100 or millions depending on task; transfer learning can help) |
| Quality | High noise, many errors, inconsistencies |
| Relevance | Features don’t relate to target |
| Coverage | Gaps in feature space, edge cases missing |
| Labeling | No ground truth available |
| Stability | Objective or labels keep changing (you’ll train to noise) |
### Cost-Benefit Analysis
| Factor | Question |
|---|---|
| Data collection cost | Is it cheaper to build a rules system? |
| Labeling effort | Can humans directly make the decisions? |
| Data quality | Will improving data cost more than value gained? |
| Ongoing maintenance | Who maintains the data pipeline? |
### Example: When Data is the Bottleneck
Problem: Predict customer churn for a startup with 50 customers.
ML approach:
- Collect more data? Need months/years
- Label data? Expensive and slow
- Train model? Won’t generalize from 50 samples
Simple approach:
- Interview churned customers directly
- Identify common patterns
- Build simple rules from findings (see the sketch below)
- Iterate as the customer base grows
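As a sketch of what those hand-built rules might look like, assuming hypothetical field names and thresholds drawn from interview findings:

```python
# Hypothetical churn-risk rules distilled from customer interviews.
# Field names and thresholds are illustrative, not from real data.
def churn_risk(customer: dict) -> str:
    if customer["days_since_last_login"] > 30:
        return "high"
    if customer["open_support_tickets"] >= 3:
        return "high"
    if customer["seats_used"] < customer["seats_paid"] * 0.5:
        return "medium"
    return "low"

print(churn_risk({"days_since_last_login": 45,
                  "open_support_tickets": 0,
                  "seats_used": 10, "seats_paid": 10}))  # high
```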
## When Interpretability is Critical
Some domains require explainable decisions.
### High-Stakes Domains Requiring Explanation
| Domain | Why Interpretability Matters |
|---|---|
| Medical diagnosis | Doctors need to understand reasoning |
| Credit decisions | Legal requirement to explain denials |
| Legal sentencing | Due process requires justification |
| Financial auditing | Regulators need clear logic |
| Safety-critical systems | Need to verify correctness |
### Interpretability Tradeoffs
| Approach | Intrinsic Interpretability | When to Use |
|---|---|---|
| Rules/decision trees | High | Must explain decisions |
| Linear models | Medium-High | Good balance |
| Ensemble methods | Low | Performance > explanation (post-hoc explanations possible) |
| Deep learning | Low | Maximum performance needed (post-hoc explanations possible) |
Note: Post-hoc explanations (SHAP, LIME) exist for complex models but may not satisfy regulatory requirements.
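When intrinsic interpretability is required, a shallow decision tree can be printed as plain if/else logic. A minimal sketch using scikit-learn’s `export_text` (the toy data and feature names are invented for illustration):

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy training data, invented for illustration:
# [debt_to_income_pct, years_employed] -> approved (1) or denied (0)
X = [[20, 5], [35, 2], [50, 1], [45, 8], [15, 10], [60, 3]]
y = [1, 1, 0, 0, 1, 0]

# A shallow tree stays readable; the depth limit is part of the
# interpretability budget, not just a regularizer.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# export_text renders the learned rules as human-readable conditions.
print(export_text(tree, feature_names=["debt_to_income_pct",
                                       "years_employed"]))
```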
### Example: Credit card approval
Problem: Approve or deny credit applications
ML approach: “Denied” (why? The model says so)
→ Customer can’t appeal; regulatory issues
Rule-based approach: “Denied because debt-to-income > 40%”
→ Clear reason; the customer can address the specific issue
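A sketch of that rule-based path, where every denial carries the rule that triggered it (the thresholds are hypothetical):

```python
# Hypothetical approval rules; each denial names the rule that fired,
# so the decision can be explained and appealed.
def review_application(debt_to_income: float, credit_score: int):
    if debt_to_income > 0.40:
        return ("denied", "debt-to-income ratio above 40%")
    if credit_score < 620:
        return ("denied", "credit score below 620")
    return ("approved", None)

decision, reason = review_application(debt_to_income=0.45, credit_score=700)
print(decision, "-", reason)  # denied - debt-to-income ratio above 40%
```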
## When Latency Constraints are Too Tight
Some applications require faster responses than ML can provide.
### Latency Requirements vs. ML Capabilities
| Application | Latency Requirement | ML Feasible? |
|---|---|---|
| High-frequency trading | <1ms | Simple models only |
| Real-time control systems | <10ms | Usually limited |
| Interactive voice response | <100ms | Challenging |
| Web page loading | <500ms | Yes |
| Batch processing | Minutes/hours | Yes |
### ML Inference Latency (Rough Estimates)
Latency is highly implementation- and hardware-dependent, so always measure p50/p95 for your specific setup:
| Model Type | Rough Latency Range |
|---|---|
| Linear models | Microseconds to <1ms |
| Decision trees | Microseconds to low ms |
| Small ensembles | Low ms (depends on tree count) |
| Small neural nets | Low-mid ms |
| Large LLMs | 100 ms to seconds (scales with output tokens and context) |
If latency is extremely tight, consider: simpler/optimized models, distillation, quantization, caching, specialized hardware (GPU/TPU/NPU), or non-ML methods.
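A minimal sketch of measuring p50/p95 inference latency for any candidate predictor (the `predict` stub stands in for a real model call):

```python
import statistics
import time

def predict(x):
    # Stand-in for your real model's inference call.
    return x * 2

# Warm up first so one-time costs (JIT, caches) don't skew the tail.
for _ in range(100):
    predict(1.0)

samples = []
for _ in range(1000):
    start = time.perf_counter()
    predict(1.0)
    samples.append((time.perf_counter() - start) * 1000)  # milliseconds

quantiles = statistics.quantiles(samples, n=100)  # 99 percentile cuts
print(f"p50={statistics.median(samples):.4f} ms  p95={quantiles[94]:.4f} ms")
```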
## When Cost Outweighs Benefit
ML has ongoing costs that might exceed its value.
### Cost Categories
| Cost Type | Description |
|---|---|
| Development | Data collection, labeling, experimentation |
| Infrastructure | Compute, storage, serving |
| Maintenance | Monitoring, retraining, debugging |
| Operations | MLOps, pipelines, personnel |
| Opportunity | Time spent vs. other priorities |
### Cost-Benefit Framework
| Question | If “No,” consider alternatives |
|---|---|
| Will ML significantly improve metrics? | Simple rules or heuristics |
| Is the improvement worth the cost? | Status quo |
| Can we afford ongoing maintenance? | One-time solution |
| Do we have the right team? | Outsource or simplify |
### Example: Cost vs. Benefit
Problem: Automatically categorize support tickets
ML approach:
- Development: 3 months of engineer time
- Training data: 500 labeled tickets
- Infrastructure: $500/month
- Maintenance: monthly retraining
- Benefit: saves 30 min/day for the support team
Rule-based approach:
- Development: 1 week
- Data: none needed
- Infrastructure: $0
- Maintenance: occasional rule updates
- Benefit: saves 25 min/day for the support team
→ Rules are 95% as effective and roughly 10x cheaper to build and maintain.
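A sketch of such a rule-based categorizer (the categories and keywords are invented; real ones would come from reading actual tickets):

```python
# Hypothetical keyword rules; real rules would be distilled from
# a sample of historical tickets.
CATEGORY_KEYWORDS = {
    "billing": ("invoice", "refund", "charge", "payment"),
    "access": ("password", "login", "locked", "2fa"),
    "bug": ("error", "crash", "broken", "exception"),
}

def categorize(ticket_text: str) -> str:
    text = ticket_text.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(word in text for word in keywords):
            return category
    return "other"  # uncategorized tickets go to a human

print(categorize("I was charged twice, please refund"))  # billing
```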
## Alternatives to ML
When ML isn’t the right choice, consider these alternatives:
### 1. Rule-Based Systems
Explicit if-then rules defined by domain experts.
| Pros | Cons |
|---|---|
| Highly interpretable | Brittle, doesn’t adapt |
| Fast to implement | Maintenance burden |
| Predictable behavior | Doesn’t handle edge cases well |
When to use: Clear business rules exist and the domain is well understood.
### 2. Heuristics
Simple rules of thumb based on experience.
| Examples |
|---|
| “If email has ‘FREE’ and ‘WIN’ in subject, flag as spam” |
| “Recommend items with highest rating” |
| “Prioritize by oldest date first” |
When to use: A quick solution is needed and approximately optimal is acceptable.
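For instance, the “highest rating” heuristic from the table above is only a few lines (the item structure is hypothetical):

```python
# Hypothetical catalog; in practice this would come from your database.
items = [
    {"name": "widget", "rating": 4.2},
    {"name": "gadget", "rating": 4.8},
    {"name": "gizmo", "rating": 3.9},
]

def recommend(items, k=2):
    """Heuristic: just surface the highest-rated items."""
    return sorted(items, key=lambda i: i["rating"], reverse=True)[:k]

print([i["name"] for i in recommend(items)])  # ['gadget', 'widget']
```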
### 3. Classical/Simple Models
Simple predictive models without complex ML infrastructure.
| Method | Use Case |
|---|---|
| Moving averages | Trend forecasting |
| Thresholds | Anomaly detection |
| Correlations | Simple relationships |
| Linear/logistic regression | Clear relationships with few features |
When to use: Relationships are simple and interpretability matters.
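Combining two rows of the table, a moving average plus a threshold gives a serviceable anomaly detector. A sketch (the window size and threshold are arbitrary example choices):

```python
from collections import deque

def detect_anomalies(series, window=5, threshold=2.0):
    """Flag points more than `threshold` x the recent moving average."""
    recent = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(series):
        if len(recent) == window:
            mean = sum(recent) / window
            if mean > 0 and value > threshold * mean:
                anomalies.append(i)
        recent.append(value)
    return anomalies

print(detect_anomalies([10, 11, 9, 10, 12, 11, 48, 10]))  # [6]
```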
### 4. Human Decision-Making
Sometimes humans are the best solution.
| When humans beat ML |
|---|
| Complex moral judgments |
| Novel situations |
| Very small data |
| High-stakes one-time decisions |
## Decision Framework
Use this flow to decide:
1. Is the problem deterministic? YES → use rules or formulas. NO → continue.
2. Is there sufficient quality data? NO → collect data or use a simpler approach. YES → continue.
3. Is interpretability critical? YES → use interpretable models (linear, trees). NO → continue.
4. Are latency constraints extreme? YES → consider optimized or simpler models. NO → continue.
5. Does the benefit justify the cost? NO → use a simpler alternative. YES → ML is appropriate.
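The same flow expressed as a checklist function, a sketch where the boolean inputs represent your own assessments:

```python
def choose_approach(deterministic, enough_quality_data,
                    needs_interpretability, extreme_latency,
                    benefit_justifies_cost):
    """Walk the decision flow above and return a suggested approach."""
    if deterministic:
        return "rules or formulas"
    if not enough_quality_data:
        return "collect data or use a simpler approach"
    if needs_interpretability:
        return "interpretable models (linear, trees)"
    if extreme_latency:
        return "optimized or simpler models"
    if not benefit_justifies_cost:
        return "simpler alternative"
    return "ML is appropriate"

print(choose_approach(False, True, False, False, True))  # ML is appropriate
```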
## Quick Reference: When Not to Use ML
| Scenario | Use Instead |
|---|---|
| Simple rules work | Rule-based system |
| Mathematical formula exists | Direct calculation |
| <100 samples | Collect more data or use rules |
| Must explain decisions | Interpretable models or rules |
| <10ms latency required | Optimized algorithms |
| One-time decision | Human judgment |
| Static problem | Fixed solution |
| High maintenance cost | Simpler, stable approach |
| Small improvement expected | Current system or heuristics |
Don’t use ML when:
- Simple rules or formulas work
- Data is insufficient or poor quality
- Interpretability is required (regulations, safety)
- Latency requirements are too tight (and can’t be optimized)
- Cost exceeds benefit
- Problem is deterministic or static
- It’s a one-time or rare decision
- The objective or labels are unstable/undefined
Use simpler alternatives:
- Rule-based systems
- Heuristics
- Classical/simple statistical models
- Human decision-making
Decision framework:
- Is the problem deterministic? → Rules
- Is there good data? → Otherwise, collect or skip
- Is interpretability critical? → Simple models
- Are latency constraints tight? → Optimized algorithms
- Does benefit justify cost? → Otherwise, don’t use ML
Start simple. Only use ML when it’s clearly the best solution for your problem, constraints, and resources.