Vision Agent

by Vision AI Community

AI vision agent for image and document analysis. Performs OCR, object detection, scene understanding, and visual question answering on images and PDFs.

CAPABILITIES

  • OCR and text extraction
  • Object detection
  • Scene understanding
  • Visual question answering
  • Document analysis
  • Image classification

INSTALLATION

Terminal
npx -y @vision-agent/mcp-server

CLAUDE DESKTOP CONFIG

claude_desktop_config.json
{
  "mcpServers": {
    "vision-agent": {
      "command": "npx",
      "args": [
        "-y",
        "@vision-agent/mcp-server"
      ]
    }
  }
}

Config file location (macOS): ~/Library/Application Support/Claude/claude_desktop_config.json
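
If the config file already lists other servers, pasting a whole new file would overwrite them. The sketch below merges a single entry into an existing file instead. It assumes the intended entry runs `npx -y @vision-agent/mcp-server`, matching the Terminal command above. The helper name `add_server` and the constant `VISION_AGENT_ENTRY` are illustrative and not part of the package.

```python
import json
from pathlib import Path

# Assumed server entry: mirrors the Terminal command `npx -y @vision-agent/mcp-server`.
VISION_AGENT_ENTRY = {
    "command": "npx",
    "args": ["-y", "@vision-agent/mcp-server"],
}


def add_server(config_path: Path, name: str, entry: dict) -> dict:
    """Merge one MCP server entry into the config file, keeping existing content.

    Hypothetical helper for illustration; returns the config as written.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():  # tolerate an empty file
            config = json.loads(text)
    # Create the "mcpServers" object if missing, then add or replace the entry.
    config.setdefault("mcpServers", {})[name] = entry
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

To apply it, call `add_server(path, "vision-agent", VISION_AGENT_ENTRY)` with `path` pointing at the config file location given above, then restart Claude Desktop so it rereads the file.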

QUICK FACTS

  • Category: AI & LLM
  • Difficulty: Advanced
  • Maintained By: Vision AI Community
  • NPM Package: @vision-agent/mcp-server
