AI & LLM

Advanced

Vision Agent

by Vision AI Community

AI vision agent for image and document analysis. Performs OCR, object detection, scene understanding, and visual question answering on images and PDFs.

CAPABILITIES

OCR and text extraction
Object detection
Scene understanding
Visual question answering
Document analysis
Image classification

INSTALLATION

Terminal

npx -y @vision-agent/mcp-server

CLAUDE DESKTOP CONFIG

claude_desktop_config.json

{
  "mcpServers": {
    "vision-agent": {
      "command": "npx",
      "args": [
        "-y",
        "@vision-agent/mcp-server"
      ]
    }
  }
}

Config file location (macOS): ~/Library/Application Support/Claude/claude_desktop_config.json
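
If claude_desktop_config.json already lists other servers, pasting the whole snippet over it would drop them. The following is a minimal sketch of merging the entry instead. It assumes the intended arguments are -y followed by the package name, matching the Terminal command. The helper names add_server and update_config_file are hypothetical and not part of the package.

```python
import json
from pathlib import Path

# Entry for this server; assumes the args are "-y" plus the package name,
# as in the Terminal command above.
VISION_AGENT_ENTRY = {
    "command": "npx",
    "args": ["-y", "@vision-agent/mcp-server"],
}


def add_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry registered under mcpServers[name].

    Existing servers and other top-level keys are left untouched.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Load the config at path (if present), add the entry, and write it back."""
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    updated = add_server(config, "vision-agent", VISION_AGENT_ENTRY)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
```

To apply it, call update_config_file(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json") and restart Claude Desktop. Back up the file first, since the sketch rewrites it in place.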

QUICK FACTS

Category: AI & LLM
Difficulty: Advanced
Maintained By: Vision AI Community
NPM Package: @vision-agent/mcp-server

RELATED SERVERS

Sequential Thinking
EverArt
AWS KB Retrieval
