Gemini CLI

Gemini CLI is an open-source AI agent that lives in the terminal. It’s designed for tasks requiring local file access and integration with multiple tools—making it ideal for developers who work primarily from the command line.

Why Gemini CLI?

Feature	Details
Free Tier	Up to ~60 model requests/min, ~1,000/day with personal Google account (limits vary by model; one prompt may trigger multiple requests)
Powerful Models	Access to Gemini 3 models with 1M token context window
Built-in Tools	Google Search grounding, file operations, shell commands, web fetching
Extensible	MCP (Model Context Protocol) support for custom integrations
Terminal-First	Designed for developers who live in the command line
Open Source	Apache 2.0 licensed—inspect, customize, or fork the code

How It Works

Gemini CLI follows an agentic loop pattern:

flowchart TD
    A["You send a prompt"] --> B["Gemini model reasoning"]
    B --> C{"Need more info?"}
    C -->|Yes| D["Call tools"]
    D --> E["Read files / Run commands / Search web"]
    E --> B
    C -->|No| F["Response delivered"]
    
    style A fill:#4285f4,color:#fff
    style B fill:#34a853,color:#fff
    style C fill:#fbbc04,color:#000
    style D fill:#ea4335,color:#fff
    style E fill:#ea4335,color:#fff
    style F fill:#4285f4,color:#fff

This iterative loop lets Gemini CLI run for extended periods, alternating between reasoning and tool calls, to build entire applications or debug complex issues autonomously.
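The loop in the flowchart can be sketched in a few lines. This is a minimal illustration, not Gemini CLI's actual implementation: `fake_model`, the `TOOLS` registry, and the step limit are all hypothetical stand-ins.

```python
from typing import Callable

# Hypothetical tool registry; names and behavior are illustrative only.
TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": lambda arg: f"<contents of {arg}>",
    "run_command": lambda arg: f"<output of {arg}>",
}


def fake_model(history: list[str]) -> dict:
    """Stand-in for the model: request one tool call, then answer."""
    if not any(line.startswith("tool:") for line in history):
        return {"type": "tool_call", "tool": "read_file", "arg": "README.md"}
    return {"type": "answer", "text": "Done: summarized README.md"}


def agent_loop(prompt: str, model=fake_model, max_steps: int = 5) -> str:
    """Alternate between model reasoning and tool calls until an answer arrives."""
    history = [f"user: {prompt}"]
    for _ in range(max_steps):
        step = model(history)
        if step["type"] == "answer":  # "Need more info?" -> No
            return step["text"]
        # "Need more info?" -> Yes: call the tool and feed the result back
        result = TOOLS[step["tool"]](step["arg"])
        history.append(f"tool: {step['tool']} -> {result}")
    return "Stopped: step limit reached"


print(agent_loop("Summarize the README"))
```

The step limit is a design choice in this sketch: it keeps a loop that never reaches an answer from running forever.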

Terminal Agent Advantages

Direct File System Access

Reads files in your project/workspace (and directories you explicitly add), subject to confirmations and sandbox policies
Context-aware—feels like a partner in your task
No need to copy-paste code into a web interface

Swiss Army Knife of Tools

Can invoke installed CLI tools (git, npm, gcloud, etc.), typically with your approval and within sandbox/policy constraints
Can attempt to install tools for you via package managers (with your approval)
Reduces context switching—one interface for many tasks

Automation & Scripting

Generate and run scripts without learning the syntax
Automate repetitive tasks through natural language

Built-in Tools

Tool Category	Capabilities
File System	List directories, read files, write to files, search and edit
Web Search	Access fresh data, recent releases, news—beyond training data
Shell Commands	Run any CLI tool (GitHub CLI, gcloud, npm, etc.)
Memory	Save preferences and details across sessions with `/memory add`

Automatic Tool Setup

When a task requires external tools (like ffmpeg for video processing or whisper for transcription), Gemini CLI can:

Check if the required tools are installed
Diagnose any missing dependencies
Attempt to set up the environment using package managers like uv, pip, or brew (with your approval)
Retry with alternative approaches if errors occur

This self-healing behavior means you can focus on the task while Gemini CLI attempts to handle the tooling.
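The first step, checking whether required tools are installed, amounts to a lookup on your `PATH`. The sketch below shows that check under the assumption that each tool is a plain executable; the function name `missing_tools` is hypothetical, and Gemini CLI may perform the check differently.

```python
import shutil


def missing_tools(required: list[str]) -> list[str]:
    """Return the tools from `required` that cannot be found on PATH."""
    return [name for name in required if shutil.which(name) is None]


# Result depends on what is installed on the machine running the check.
print(missing_tools(["ffmpeg", "whisper"]))
```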

Supported Data Types

Gemini CLI can analyze both structured and unstructured data:

Data Type	Examples
Structured Data	CSVs, spreadsheets, JSON, databases
Unstructured Data	Google Docs, PDFs, system logs, text files
Source Code	Any programming language—Python, JavaScript, Go, etc.
Visual Data	Images, diagrams, screenshots, UI mockups
Video & Audio	MP4, podcasts—transcribe, clip, and analyze multimedia
Web Content	Live web pages, documentation, APIs

This flexibility makes Gemini CLI ideal for data pipelines that span multiple formats and sources.

Extensibility

Gemini CLI can be extended in several ways:

MCP Servers — Connect custom capabilities (e.g., media generation with Imagen, Veo, or Lyria)
Custom Commands — Define your own shortcuts and workflows
GEMINI.md — Project-specific context files to tailor behavior

Model Context Protocol (MCP)

MCP is an open standard that enables AI agents to seamlessly and securely connect with external tools, APIs, and data sources. It provides a standardized way for LLMs to interact with external context.

Why MCP?

Benefit	Description
Portability	Not locked into Gemini CLI—same tools work with other MCP-compatible agents
Extensibility	Build on top of built-in capabilities with custom integrations
Ecosystem	Popular cloud services (MongoDB, GitHub, Canva, Snyk, Google Cloud) available as MCP servers
Custom Servers	Build your own MCP server to run proprietary code

Adding an MCP Server

Use the gemini mcp add subcommand to connect a remote MCP server:

# Add a remote HTTP MCP server
gemini mcp add -t http <server-name> <remote-url>

# Example: Add Canva MCP server
gemini mcp add -t http canva https://mcp.canva.com

Once added, the server appears in your settings and is available the next time you start Gemini CLI.
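To confirm which servers are configured, you can inspect the settings file yourself. The sketch below lists server names found under an `mcpServers` key. The example entry, including the `httpUrl` field, is an assumed shape for illustration only; check your own settings file for what the CLI actually wrote.

```python
import json


def configured_mcp_servers(settings_text: str) -> list[str]:
    """Return the sorted MCP server names found in a settings JSON document."""
    settings = json.loads(settings_text)
    return sorted(settings.get("mcpServers", {}))


# Assumed example content; the real file may contain other fields.
example = '{"mcpServers": {"canva": {"httpUrl": "https://mcp.canva.com"}}}'
print(configured_mcp_servers(example))
```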

Authentication

Some MCP servers require authentication. Use /mcp auth to complete the OAuth flow:

/mcp auth
# Select the server from the list (e.g., canva)
# Complete OAuth in your browser

Managing MCP Servers

Command	Description
`/mcp`	List all configured MCP servers with connection status
`/mcp auth`	Authenticate to a remote MCP server via OAuth
`/mcp desc`	Show detailed descriptions of all available MCP tools

Connected servers display a green icon in the /mcp list.

Extensions

Extensions build on MCP by bundling one or more MCP servers into an easy-to-install package, together with custom context files and custom slash commands.

Why Extensions?

Feature	Benefit
Bundled MCP Servers	Pre-configured tools ready to use
Custom Context	Teaches Gemini how to use tools effectively for complex workflows
Slash Commands	Purpose-built shortcuts (e.g., `/sprint-summary` for Monday.com)
One-Click Install	Simple installation from the gallery

Extensions Gallery

Explore the extensions catalog at geminicli.com. Popular extensions include:

Google Workspace — Access Docs, Sheets, Calendar, and Gmail
Linear — Manage issues, cycles, and project updates
Monday.com — View tasks, update boards, get sprint summaries
GitHub — Manage repos, PRs, and issues
Nano Banana — AI image generation and enhancement for content creation
Databases — MongoDB, PostgreSQL, and more

Installing Extensions

Find your extension in the gallery and run the one-click install command:

gemini extensions install <extension-name>

# Example: Install Google Workspace
gemini extensions install google-workspace

After installation, Gemini CLI will trigger an OAuth flow on startup to authenticate with the service.

Managing Extensions

Command	Description
`/extensions list`	View all installed extensions
`/extensions`	Manage extensions (install, remove, update)

Example: Google Workspace Workflow

With the Google Workspace extension, you can orchestrate across multiple services:

Read a Google Doc: “Read my conference schedule from Google Docs”
Create Calendar Events: “Schedule holds on my calendar for each session”
Update Sheets: “Add attendee counts to my tracking spreadsheet”

Gemini CLI automatically selects the right tools (getMe, readDoc, createCalendarEvent) and accounts for conference sessions that run at the same time.

Memory & Context

Context is critical for defining what your agent knows and how it behaves. Gemini CLI uses GEMINI.md files to manage project-specific and global instructions.

Context Hierarchy

Gemini CLI loads context by traversing upward from your current directory:

Local Context: Checks for a GEMINI.md in the current folder.
Subdirectory Context: Loads folder-specific instructions (e.g., for /tests or /build).
Ancestor Search: Traverses upward until it reaches a .git folder or your home directory, loading every GEMINI.md file it finds along the way.
Global Context: User-level instructions stored in ~/.gemini/GEMINI.md.
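The upward traversal described above can be approximated as follows. This is a sketch of the idea only: the function name `find_context_files` is hypothetical, and the ordering of results is an assumption rather than Gemini CLI's documented precedence.

```python
from pathlib import Path


def find_context_files(start: Path, home: Path) -> list[Path]:
    """Collect GEMINI.md files from `start` upward, stopping at a .git root or `home`."""
    found: list[Path] = []
    current = start.resolve()
    home = home.resolve()
    while True:
        candidate = current / "GEMINI.md"
        if candidate.is_file():
            found.append(candidate)
        # Stop at the repository root, the home directory, or the filesystem root.
        if (current / ".git").exists() or current == home or current.parent == current:
            break
        current = current.parent
    # The user-level file is added last in this sketch.
    global_file = home / ".gemini" / "GEMINI.md"
    if global_file.is_file() and global_file not in found:
        found.append(global_file)
    return found
```

Calling `find_context_files(Path.cwd(), Path.home())` from inside a project would return the nearest files first and the global file last.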

Domain-Specific Context

Tailor your GEMINI.md with domain knowledge to get better results. For example, a data analysis project might include:

# Lead Scoring Criteria
- **Hot Lead**: 3+ booth visits, attended keynote, requested demo
- **Warm Lead**: 1-2 booth visits, attended sessions
- **Cold Lead**: Badge scan only, no engagement

# Dashboard Requirements
- Use company brand colors (#4285F4, #34A853)
- Include executive summary for leadership audience
- Show top companies, job titles, and session engagement

This context helps Gemini CLI make informed decisions when cleaning, analyzing, or visualizing your data.
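To see how criteria like these could turn into code, here is a minimal sketch of a scoring function. It assumes a hot lead must meet all three hot conditions and that any booth visit or session attendance counts as warm; the field names are hypothetical and your data may need different rules.

```python
def score_lead(booth_visits: int, attended_keynote: bool,
               requested_demo: bool, sessions_attended: int) -> str:
    """Classify a lead as hot, warm, or cold using the criteria above."""
    # Assumption: all three hot conditions must hold together.
    if booth_visits >= 3 and attended_keynote and requested_demo:
        return "hot"
    # Assumption: any booth visit or session attendance is enough for warm.
    if booth_visits >= 1 or sessions_attended > 0:
        return "warm"
    # Badge scan only, no engagement.
    return "cold"


print(score_lead(3, True, True, 2))
print(score_lead(1, False, False, 1))
print(score_lead(0, False, False, 0))
```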

Large Document Optimization

For large files (like 200+ page textbooks), avoid context window limits by instructing Gemini CLI to read in chunks:

# Resource Handling
- Use `pypdf` to read the textbook in 50-page chunks
- Summarize each chapter individually before aggregating
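The chunking instruction comes down to simple page arithmetic. The helper below is a sketch with a hypothetical name, `page_chunks`, that computes the page ranges a script would read one at a time.

```python
def page_chunks(total_pages: int, chunk_size: int = 50) -> list[tuple[int, int]]:
    """Return (start, end) page ranges, 1-indexed and inclusive."""
    if total_pages < 0 or chunk_size <= 0:
        raise ValueError("total_pages must be >= 0 and chunk_size must be > 0")
    return [
        (start, min(start + chunk_size - 1, total_pages))
        for start in range(1, total_pages + 1, chunk_size)
    ]


print(page_chunks(220))
# [(1, 50), (51, 100), (101, 150), (151, 200), (201, 220)]
```

Each range could then be read with a PDF library such as `pypdf` and summarized before moving on to the next chunk.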

Managing Memory

Use the memory system to save preferences (e.g., tech stack preferences, your name, or coding habits) across all projects.

Command	Action
`/memory show`	View current global and project memory contents
`/memory add`	Add new instructions to your global context

Project Initialization (`/init`)

Starting a new project? Use the /init command to automatically bootstrap your context.

/init

Gemini CLI will:

Analyze Your Project: Reads package.json, dependencies, and key files (e.g., index.js, data files).
Determine Tech Stack: Detects frameworks (React, Vite, Python, etc.).
Generate GEMINI.md: Creates a well-structured context file including project overview, prerequisites, and common commands.

Use Cases

Category	Examples
Software Development	Implement features, debug issues, review code
CI/CD Integration	Run as a GitHub Action for automated PR reviews
Content Creation	Process podcasts, generate social media content
Data Analysis	Clean datasets, analyze trends, build dashboards
Cloud Operations	Deploy resources, query databases, manage infrastructure
Study Buddy	Personal tutor, flashcard generation, interactive quizzes, research

Example: Data Analysis with GCP

Gemini CLI can orchestrate entire data pipelines using tools already installed on your machine—like gcloud CLI and bq (BigQuery CLI). Here’s a real-world workflow:

The Pipeline

flowchart LR
    A["Raw Data"] --> B["Clean & Dedupe"]
    B --> C["Analyze & Segment"]
    C --> D["Export to BigQuery"]
    D --> E["Generate Report"]
    E --> F["Build Dashboard"]
    
    style A fill:#4285f4,color:#fff
    style D fill:#34a853,color:#fff
    style F fill:#ea4335,color:#fff

Step-by-Step

Ingest Data: “Read lead scan data from leads.csv and the Google Sheets export”
Clean & Process: Gemini CLI creates Python scripts to remove duplicates and invalid entries
Analyze: Segment leads by priority (hot/warm/cold), identify top companies and trends

Upload to BigQuery: Uses your local bq CLI to create datasets and load tables

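bq mk --dataset conference_leads
# --autodetect infers the table schema from the CSV; pass an explicit schema instead if you have one
bq load --autodetect --source_format=CSV conference_leads.processed ./cleaned_leads.csv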

Generate Reports: Creates analysis documents, optionally saving to Google Docs
Build Dashboard: Scaffolds a Flask/React app with your brand colors and logo
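The Clean & Process step typically produces a small script along these lines. This is a sketch of the kind of code that might be generated, not actual Gemini CLI output; the `email` column and the validity rule (non-empty and containing `@`) are assumptions about the data.

```python
import csv
import io


def dedupe_leads(csv_text: str, key: str = "email") -> list[dict]:
    """Drop rows whose key column is missing, invalid, or already seen."""
    seen: set[str] = set()
    cleaned: list[dict] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row.get(key) or "").strip().lower()
        # Assumed validity rule: a non-empty value containing "@".
        if not value or "@" not in value or value in seen:
            continue
        seen.add(value)
        cleaned.append(row)
    return cleaned


sample = "name,email\nAda,ada@example.com\nAda L,ADA@example.com\nBob,\nCy,cy@example.com\n"
print([row["name"] for row in dedupe_leads(sample)])
# ['Ada', 'Cy']
```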

Using `/dir add` for Multi-Project Context

Need assets from another project (like brand guidelines or logos)? Add directories to your session:

/dir add /path/to/brand-assets

This gives Gemini CLI access to files outside your current project without switching directories.

GCP Tools Integration

Gemini CLI leverages any CLI tools installed on your machine:

Tool	Use Case
`gcloud`	Deploy Cloud Run services, manage IAM, pull logs
`bq`	Create BigQuery datasets, run queries, load data
`gsutil`	Upload/download from Cloud Storage
`kubectl`	Manage GKE clusters and deployments

Example: Multimedia Content Creation

Gemini CLI excels at processing video and audio content for marketing workflows. Here’s how to turn a podcast recording into promotional assets:

The Pipeline

flowchart LR
    A["Video/Podcast"] --> B["Transcribe"]
    B --> C["Create Clips"]
    C --> D["Generate Blog"]
    D --> E["Social Posts"]
    
    style A fill:#4285f4,color:#fff
    style C fill:#34a853,color:#fff
    style E fill:#ea4335,color:#fff

Step-by-Step

Transcribe with Timestamps: Gemini CLI uses ffmpeg and whisper to generate accurate transcriptions
```
"Transcribe the video with timestamps and save to transcript.txt"
```
Output includes multiple subtitle formats (SRT, VTT) automatically.
Create Short Clips: Identify key moments and extract them as standalone videos
```
"Create 3 short clips highlighting the best insights from the video"
```
Gemini CLI analyzes the transcript, picks compelling timestamps, and uses ffmpeg to cut the clips.
Generate Blog Post: Create long-form content with embedded visuals
```
"Write a blog post based on the transcription, include screenshots from the video"
```
With the Nano Banana extension, screen grabs can be enhanced with AI-generated imagery.

Social Media Posts: Generate platform-ready promotional content

"Create 4 social media posts promoting this video for Twitter and LinkedIn"

Tools Used

Tool	Purpose
`ffmpeg`	Video/audio processing, clip extraction
`whisper`	Speech-to-text transcription
Nano Banana	AI image generation and enhancement
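For the clip-extraction step, the work reduces to calling `ffmpeg` with a start and end timestamp. The sketch below builds such a command as an argument list without running it. The helper name and file names are hypothetical, and the command Gemini CLI actually generates may differ.

```python
def clip_command(source: str, start: str, end: str, output: str) -> list[str]:
    """Build an ffmpeg command that copies the span [start, end] into `output`."""
    # "-c copy" avoids re-encoding, so cuts snap to nearby keyframes.
    return ["ffmpeg", "-i", source, "-ss", start, "-to", end, "-c", "copy", output]


print(" ".join(clip_command("talk.mp4", "00:01:00", "00:01:30", "clip1.mp4")))
# ffmpeg -i talk.mp4 -ss 00:01:00 -to 00:01:30 -c copy clip1.mp4
```

Passing the list to `subprocess.run` would execute it. Dropping `-c copy` re-encodes the clip, which is slower but gives frame-accurate cuts.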

Example: Personal Tutor & Study Companion

Use Gemini CLI as a proactive study partner to master complex topics using the Socratic method.

The Workflow

Ingest Material: “Read my course-notes.pdf and the textbook in resources/”
Summarize & Structured Notes: Have Gemini CLI create chapter-by-chapter summaries in Google Docs
```
"Summarize Chapter 1-5 and save to a new Google Doc"
```

Interactive Quiz App: Ask Gemini CLI to build a local web app to test your knowledge

"Create a simple HTML/JS quiz app based on these summaries. Make it interactive so I can see which answers I got wrong."

Socratic Tutoring:

User: “Quiz me on Support Vector Machines.”
Gemini: “Sure. What is the primary advantage of using the kernel trick?”
User: “It maps data to infinite dimensions?”
Gemini: “Close! It allows you to operate in a high-dimensional space without computing the coordinates explicitly. Can you explain why that’s computationally cheaper?”

Key Features for Learners

Chunk-based processing for massive textbooks (using pypdf)
Active Recall through generated quizzes
Cross-examination to identify knowledge gaps

Installation

Prerequisites: Node.js v20+, macOS/Linux/Windows

# Run instantly with npx (no install)
npx @google/gemini-cli

# Install globally with npm
npm install -g @google/gemini-cli

# Install with Homebrew (macOS/Linux)
brew install gemini-cli

Authentication Options

Option 1: Login with Google (Recommended)

Best for individual developers. Free tier included.

gemini
# Follow the browser authentication flow

Option 2: Gemini API Key

Best for specific model control or paid tier access.

export GEMINI_API_KEY="YOUR_API_KEY"
gemini

Get your key from Google AI Studio.

Option 3: Vertex AI

Best for enterprise teams and production workloads.

export GOOGLE_API_KEY="YOUR_API_KEY"
export GOOGLE_GENAI_USE_VERTEXAI=true
gemini

GitHub Integration

Integrate directly into GitHub workflows with the Gemini CLI GitHub Action:

PR Reviews — Automated code review with contextual feedback and severity-colored suggestions
Issue Triage — Auto-labeling and prioritization
On-demand Help — Comment @gemini-cli in issues/PRs for explanations or debugging
Custom Workflows — Build automated pipelines tailored to your team

Setting Up GitHub Actions

Use the /setup-github command to automatically configure GitHub Actions for your repository:

/setup-github

This downloads workflow files, configures them for your repo, and opens a setup guide for authentication. Once configured, Gemini CLI will:

Automatically review every PR with a summary and inline suggestions
Respond to @gemini-cli mentions in comments
Label suggestions by severity (e.g., yellow for medium priority)

Custom Slash Commands

Create your own slash commands for repetitive workflows. Commands live in .gemini/commands/ as .toml files.

Command Structure

description = "Implement a feature from a GitHub issue"
prompt = """
Fetch the GitHub issue using: !{gh issue view {{args}}}

Then:
1. Plan the implementation
2. Implement the feature iteratively
3. Run tests and fix any failures
4. Commit with a descriptive message
"""

Syntax

Syntax	Description
`!{command}`	Execute a shell command inline
`{{args}}`	Pass arguments from the slash command invocation

Usage

# Run the custom command with issue number
/implement-feature 5

Gemini CLI will prompt for confirmation before executing shell commands.
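To make the two placeholders concrete, the sketch below expands `{{args}}` and then lists the `!{...}` shell commands the prompt would ask to run, without executing anything. It is a simplified model of the substitution, not the CLI's actual parser; in particular, it does no shell escaping of the arguments.

```python
import re


def expand_prompt(template: str, args: str) -> tuple[str, list[str]]:
    """Fill in {{args}} and return the prompt plus the shell commands it embeds."""
    filled = template.replace("{{args}}", args)
    # After substitution, each !{...} block holds a plain shell command.
    commands = re.findall(r"!\{([^{}]*)\}", filled)
    return filled, commands


template = "Fetch the GitHub issue using: !{gh issue view {{args}}}"
filled, commands = expand_prompt(template, "5")
print(filled)
print(commands)
# Fetch the GitHub issue using: !{gh issue view 5}
# ['gh issue view 5']
```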

Image Input

Gemini CLI supports multimodal input—paste images directly into your prompt for visual analysis:

Take a screenshot or copy an image
Type your prompt in Gemini CLI
Paste the image (it appears in the prompt box)
Press Enter

Use cases:

UI Feedback: “Make this layout use columns instead of full-width”
Bug Analysis: “What’s wrong with this error screenshot?”
Design Implementation: “Build this mockup as a React component”

Interactive Shell

For commands that require interaction (like test watchers or REPLs), use Ctrl+F to focus the shell:

Ctrl+F — Focus on the running shell process
Q — Quit the focused process (for tools like Jest)
Ctrl+C — Cancel or exit

This lets Gemini CLI run tests, see failures, and automatically fix them in a loop.

Commands

Command	Description
`/help`	List all available built-in commands and keyboard shortcuts
`/model`	View or change the model (default: `auto`)
`/settings`	Configure preferences like vim mode, hide footer
`/theme`	Customize the color scheme
`/clear`	Wipe conversation history—use between tasks to start fresh
`/dir add <path>`	Add an external directory to your session for cross-project access
`/memory`	Show or add to your global/project memory
`/init`	Bootstrap a new `GEMINI.md` file for your project
`/mcp`	List configured MCP servers and their connection status
`/mcp auth`	Authenticate to a remote MCP server
`/mcp desc`	Show detailed descriptions of available MCP tools
`/extensions list`	View all installed extensions
`/setup-github`	Configure GitHub Actions for automated PR reviews
`/stats`	View session statistics: code changes, tool calls, model routing
`/docs`	Open documentation directly within the CLI
`/exit` or `/quit`	Exit with a session history snapshot

Model Selection

The default model is auto, which intelligently routes requests based on task complexity, capacity, and your auth method:

Simple prompts → Uses a faster, lighter model
Complex tasks → Uses the most capable available model

Use /model to override or see available models.

/model

Preview Models

You can opt in to preview features (including preview models like gemini-3-pro-preview or gemini-3-flash-preview) via /settings. Availability depends on account type and rollout.

See the official model selector docs for the latest model IDs.

File References

Use the @ symbol to reference files directly in your prompt:

@suggestions.md summarize this file

This reads the file immediately and adds it to the context. Without @, Gemini CLI can usually still find the file, but it first has to spend time working out which file you mean.
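As an illustration of how `@` references can be told apart from ordinary text, the sketch below extracts tokens that start with `@` at a word boundary. The function name and the matching rule are assumptions; the CLI's own parsing may handle more cases, such as quoted paths.

```python
import re


def referenced_files(prompt: str) -> list[str]:
    """Return the paths that follow an @ at the start of a word."""
    # (?<!\S) requires the @ to begin a token, so email addresses are skipped.
    return re.findall(r"(?<!\S)@(\S+)", prompt)


print(referenced_files("@suggestions.md summarize this file"))
# ['suggestions.md']
```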

Shell Mode

For quick command execution (bypassing the agent’s reasoning), use shell mode by prefixing commands with !:

! npm install
! npm run dev

This passes the command directly to your shell. Caution: Commands executed this way have the same permissions as if you ran them directly in your terminal.

Tool Confirmation

When Gemini CLI wants to perform an action (like writing a file), it shows a confirmation prompt:

Shortcut	Action
`Ctrl+S`	Expand to see the full diff
Allow once	Execute this specific action
Always allow	Auto-approve this tool in the future
Open in editor	View diff in external editor (VS Code, etc.)
Reject	Decline the suggestion

This keeps you in control of all file modifications and tool executions.
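The difference between "Allow once" and "Always allow" can be modeled as a small piece of state. The sketch below is a hypothetical model of that behavior, with made-up tool names and choice strings; it does not reflect how Gemini CLI stores approvals.

```python
from typing import Optional


class ApprovalPolicy:
    """Track which tools the user has approved for the rest of the session."""

    def __init__(self) -> None:
        self.always_allowed: set[str] = set()

    def decide(self, tool: str, choice: Optional[str] = None) -> bool:
        """Return True if the tool call may proceed."""
        if tool in self.always_allowed:
            return True  # approved earlier with "always"
        if choice == "always":
            self.always_allowed.add(tool)
            return True
        # "once" approves only this call; anything else is a rejection.
        return choice == "once"


policy = ApprovalPolicy()
print(policy.decide("write_file", "once"))    # True
print(policy.decide("write_file"))            # False
print(policy.decide("write_file", "always"))  # True
print(policy.decide("write_file"))            # True
```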

Keyboard Shortcuts

Access the full list with /help. Key shortcuts include:

Ctrl+S — Expand diff preview
Ctrl+C — Cancel current operation
Escape — Exit settings/menus

Maintenance & Updates

Auto-Update

Gemini CLI can be configured to check for updates automatically in the background, ensuring you always have the latest tools and model definitions.

/settings set auto_update true

Manual Update

Alternatively, you can manually update the package via npm:

npm update -g @google/gemini-cli

Troubleshooting

If the agent behaves unexpectedly, you can check the logs for debugging details:

Logs Directory: ~/.gemini/logs/
Settings: ~/.gemini/settings.json (user) and .gemini/settings.json (project). See Configuration docs.
