Skip to content

Generative AI Challenges

Generative AI introduces unique challenges beyond traditional AI systems. These models create new content—text, images, code, audio—which creates new types of risks and ethical considerations that organizations and practitioners must navigate.

What makes Generative AI different? GenAI is optimized to generate content, which expands the risk surface compared to classic predictive ML. A traditional AI might tell you whether an image contains a cat; Generative AI can create a brand new image of a cat. This creative capability brings novel challenges around truth, ownership, and control.


Hallucination is when GenAI confidently produces false or fabricated information.

TypeDescriptionExample
Factual errorsWrong information about the worldInventing historical events
Citation fabricationFake or non-existent referencesMade-up papers, URLs, people
Logical inconsistenciesContradictory statementsSaying X, then later saying not X
Mathematical errorsIncorrect arithmetic or logicWrong calculations, flawed reasoning
Code errorsNon-functional or incorrect codeWrong APIs, non-compiling code, insecure patterns
CauseExplanation
Probabilistic natureModels predict likely next tokens, not truth
Lack of groundingWithout retrieval/tools, models have no live fact source; with RAG/tools they can be grounded—if those components are correct
Weak calibrationModels can’t reliably distinguish known from unknown
Objective misalignmentOptimized for fluency, not accuracy
StrategyHow It Works
RAG (Retrieval-Augmented Generation)Ground responses in retrieved facts
Human reviewPeople verify outputs before use
Confidence signalsCan help but are not reliably calibrated; use with retrieval evidence and verification
Fact-checking layersSeparate verification step
Constrained generationLimit responses to verified sources

Practical tip: Treat GenAI as a creative assistant, not a truth-teller. Always verify important information.


GenAI models can perpetuate, amplify, or introduce biases present in their training data.

Bias TypeDescriptionImpact
Representation biasUnder/over-representation in training dataStereotypical outputs
Content biasSkewed perspectives in source dataOne-sided viewpoints
Objective/optimization biasLoss function, RLHF, decoding push outputs toward certain stylesCertain outputs systematically favored
Deployment biasUsed in contexts different from trainingPoor performance for some groups
AreaBias Manifestation
ImagesStereotypical portrayals of professions, cultures
TextDialect or accent bias, cultural assumptions
CodeWestern coding conventions, English-only comments
RecommendationsPerpetuates existing popularity disparities
ApproachWhat It Involves
Diverse training dataInclude multiple perspectives, cultures, languages
Bias testingEvaluate outputs across demographic groups
Fine-tuningAdjust model for specific use cases
Red-teamingAdversarial testing to find problematic outputs
Human oversightReview and curation of training data and outputs

GenAI models trained on copyrighted content raise complex legal questions that are still being resolved.

IssueCurrent Status
Training data copyrightCan you train on copyrighted content without permission?
Output copyrightIs AI-generated content protectable?
Style mimicryDoes imitating an artist’s style violate copyright?
AttributionHow do you credit training data sources?
JurisdictionKey Developments
United StatesCopyright Office: AI-generated works not copyrightable without human authorship
European UnionAI Act: GPAI providers must publish summary of training content and comply with EU copyright rules (incl. opt-outs)
IndiaActive policy debate; proposals include licensing/compensation frameworks for AI training data (evolving)
PracticeDescription
Opt-out mechanismsAllow creators to exclude their work from training
** licensing agreements**Platform-specific licenses for AI training
Content credentialsMetadata indicating AI-generated vs. human-created
Model attributionDocumenting training data sources

Guidance: If you’re using GenAI commercially, consult legal counsel. This area is rapidly evolving.


GenAI can be weaponized to cause harm at scale.

CategoryRisk Examples
DisinformationFake news, propaganda at scale
Social engineeringConvincing phishing, scam emails
HarassmentTargeted bullying, hate speech
FraudDeepfakes for financial gain
CyberattacksAutomated vulnerability discovery, exploit generation
Non-consensual contentDeepfake explicit imagery
ChallengeWhy It’s Difficult
QualityHigh-quality AI content is hard to distinguish from human
VolumeAutomated generation at overwhelming scale
EvolutionTechniques improve constantly
Plausible deniability”Real” vs. “fake” becomes harder to prove
StrategyApproach
WatermarkingEmbed signals indicating AI-generated content
Content authenticationCryptographic provenance for media
Usage policiesClear terms prohibiting harmful use
Red-teamingTest for vulnerabilities before deployment
MonitoringTrack misuse patterns and respond

Training large GenAI models requires significant computational resources.

Energy use varies widely with model size, hardware efficiency, tokens generated, and data center infrastructure. Estimates are highly uncertain.

AspectScale
Training (frontier models)Third-party estimates: tens of GWh (high uncertainty; depends on compute, efficiency, re-runs)
Training (smaller models)0.1-1 GWh
Inference (per query)Highly variable: fractions of a Wh to several Wh (depends on model, tokens, hardware, PUE)
HardwareGPU manufacturing, data center infrastructure
StrategyImpact
Efficient architecturesSmaller models, quantization, distillation
Renewable energyPower data centers with green energy
Model reuseShare models instead of training from scratch
Carbon trackingMonitor and report energy consumption

ChallengeDescription
Instruction followingModels may not obey constraints perfectly
JailbreakingUsers can bypass safety measures
Prompt injectionMalicious instructions hidden in content
Objective misgeneralizationModel pursues narrow objective in unintended ways
IssueManifestation
InconsistencyDifferent responses to the same prompt
Context windowLimited memory of conversation
Reasoning errorsFlawed logic in multi-step problems
Factual currencyKnowledge cutoff date limits

  • Hallucination: GenAI confidently produces false information; verify important outputs
  • Bias: Models perpetuate training data biases; diverse data and testing help
  • Copyright: Complex legal landscape; training on copyrighted content is unsettled law
  • Misuse: GenAI enables new forms of abuse; watermarking and policies help
  • Environment: Training is energy-intensive; efficient models and green energy help
  • Technical: Control and reliability are ongoing challenges; human oversight remains essential

GenAI’s capabilities come with new responsibilities. Understanding these challenges is the first step toward mitigating them.