CAIRO: The In-App Operator Framework

by ColomboAI

CAIRO is the world's first In-App Operator — a framework that transforms traditional software, robots, and web platforms into autonomous, context-aware, goal-driven systems.

Five Foundational Intelligence Modules

Scraping2Tools

SiteMap2Tools

MCP-RAG

Memo AI

Agent Skills

1. Scraping2Tools

"Convert Any Interface Into Callable Tools"

Key Benefits

• No manual integration required
• Works on legacy systems, desktop apps, and web apps
• Real-time adaptability — any UI change is auto-re-mapped

2. SiteMap2Tools

"Give CAIRO a Global Map of the System"

Key Benefits

• Enables multi-page reasoning and navigation
• Prevents context loss when switching views
• Adds support for cross-workflow operations
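To make multi-page navigation concrete, here is a minimal sketch of path-finding over a site map. The `SiteMap` shape, the view names, and the `findPath` helper are illustrative assumptions, not CAIRO's internal representation.

```typescript
// Hypothetical site map: each view lists the views reachable from it in one step.
type SiteMap = Record<string, string[]>;

// Breadth-first search for the shortest navigation path between two views.
function findPath(map: SiteMap, start: string, goal: string): string[] | null {
  const queue: string[][] = [[start]];
  const visited = new Set<string>([start]);
  while (queue.length > 0) {
    const path = queue.shift()!;
    const current = path[path.length - 1];
    if (current === goal) return path;
    for (const next of map[current] ?? []) {
      if (!visited.has(next)) {
        visited.add(next);
        queue.push([...path, next]);
      }
    }
  }
  return null; // goal is not reachable from start
}

// Illustrative map of an e-commerce site (names are made up for this example).
const siteMap: SiteMap = {
  home: ["catalog", "account"],
  catalog: ["product"],
  product: ["cart"],
  cart: ["checkout"],
  account: ["orders"],
};
const route = findPath(siteMap, "home", "checkout");
console.log(route);
```

With a global map like this, a plan that spans several views can be checked for reachability before any step is executed.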

3. MCP-RAG

"Intelligent Retrieval and Planning Engine"

Key Benefits

• Keeps CAIRO from being flooded with thousands of tools
• Reduces noise and improves reasoning efficiency
• Enables transparent debugging and trust
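As an illustration of how retrieval can keep the working tool set small, the sketch below picks the top-k tools by cosine similarity. The `Tool` shape, the toy three-dimensional embeddings, and the `topKTools` helper are assumptions made for this example; the actual MCP-RAG engine may work differently.

```typescript
// Hypothetical tool record: `embedding` is an assumed, precomputed vector.
interface Tool {
  id: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Return the k tools whose embeddings are closest to the query embedding.
function topKTools(query: number[], tools: Tool[], k: number): Tool[] {
  return tools
    .map(tool => ({ tool, score: cosine(query, tool.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map(entry => entry.tool);
}

// Toy catalog with made-up embeddings.
const catalog: Tool[] = [
  { id: "click_button", embedding: [1, 0, 0] },
  { id: "process_payment", embedding: [0, 1, 0] },
  { id: "issue_refund", embedding: [0, 0.9, 0.1] },
];
const selected = topKTools([0, 1, 0], catalog, 2).map(t => t.id);
console.log(selected);
```

Only the selected tools would then be handed to the planner, rather than the whole catalog.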

4. Memo AI

"CAIRO's Dynamic Memory System"

Key Benefits

• Enhances personalization and adaptability
• Allows CAIRO to improve autonomously over time
• Enables explainable decision-making

5. Agent Skills

"Composable, Testable Modules for Autonomous Action"

Key Benefits

• Modular development for teams
• Reusable and shareable across industries
• Safe and explainable execution pipelines

Security & Compliance

• Policy-based access control
• Transparent execution logs
• Role-scoped credentials

Quick Start

npm install @colomboai/cairo

Cairo SDK Installation

Production-ready SDK with MCP-RAG v2.0, 100% tool selection accuracy, and sub-20ms latency

npm install cairo-sdk

Or using yarn:

yarn add cairo-sdk

Key Features: Multi-tier caching (95%+ hit rate), Native ChromaDB, Skills-first optimization (FREE skills + PAID MC-1 fallback)

Getting Your API Key

1. Visit platform.colomboai.com
2. Sign up or log in to your account
3. Navigate to the API Keys section
4. Generate a new API key for your application
5. Copy the key and add it to your environment variables

Create a .env file in your project root:

CAIRO_API_KEY=sk-your-api-key-here

Important: An API key is required for authentication, usage tracking, and skills-first optimization.
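Because a missing or mis-quoted key is a common source of authentication failures, it can help to validate the variable at startup. The following is a minimal sketch; `readApiKey` is a hypothetical helper and not part of the SDK.

```typescript
// Hypothetical helper: reads CAIRO_API_KEY from an environment-like object,
// throws if it is missing, and strips stray whitespace or surrounding quotes.
function readApiKey(env: Record<string, string | undefined>): string {
  const raw = env["CAIRO_API_KEY"];
  if (raw === undefined || raw.trim() === "") {
    throw new Error("CAIRO_API_KEY is not set");
  }
  // Remove whitespace, then one pair of surrounding single or double quotes.
  return raw.trim().replace(/^["'](.*)["']$/, "$1");
}

// In a Node.js app this would typically be called as readApiKey(process.env).
console.log(readApiKey({ CAIRO_API_KEY: ' "sk-your-api-key-here" ' }));
```

Note that plain Node.js does not read a .env file on its own; a loader such as the dotenv package or the `--env-file` flag (Node.js 20.6 or later) is needed, while frameworks such as Next.js load it automatically.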

Quick Start - Minimal Setup

Only API key required - everything else is automatic!

import { Cairo } from 'cairo-sdk';

// Only API key required - everything else is automatic!
const cairo = new Cairo({
  apiKey: process.env.CAIRO_API_KEY
});

// That's it! Start using Cairo SDK
const result = await cairo.ask("Your query here");
console.log(result);

Core API Methods

Cairo SDK provides simple methods for interacting with the platform. All backend communication is handled automatically.

ask()

Skills-first execution with intelligent routing. The call resolves once with the complete response, which suits automation and background jobs.

const cairo = new Cairo({ apiKey: process.env.CAIRO_API_KEY });

const result = await cairo.ask("Find and click the submit button");
console.log(result);

askStream()

Streaming responses for real-time feedback (chat interfaces, long-form content)

const cairo = new Cairo({ apiKey: process.env.CAIRO_API_KEY });

for await (const chunk of cairo.askStream("Process this form step by step")) {
  console.log(chunk.data);
}
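A common pattern is to display chunks as they arrive while also keeping the full text. The sketch below shows this with a stand-in generator instead of a real `cairo.askStream()` call; the `Chunk` shape (a string `data` field) is an assumption based on the `chunk.data` access above, and `collectStream` is a hypothetical helper.

```typescript
// Assumed chunk shape, inferred from the `chunk.data` access in the example above.
interface Chunk {
  data: string;
}

// Stand-in for cairo.askStream(): yields a few chunks without any network call.
async function* fakeAskStream(): AsyncGenerator<Chunk> {
  for (const data of ["Step 1. ", "Step 2. ", "Done."]) {
    yield { data };
  }
}

// Consume any async iterable of chunks, calling onChunk as each one arrives,
// and resolve with the concatenated text.
async function collectStream(
  stream: AsyncIterable<Chunk>,
  onChunk: (text: string) => void
): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    onChunk(chunk.data);
    full += chunk.data;
  }
  return full;
}

collectStream(fakeAskStream(), text => console.log("chunk:", text)).then(full => {
  console.log("complete:", full);
});
```

Since `collectStream` accepts any `AsyncIterable`, the stand-in generator could be swapped for the SDK stream, provided the chunk shape matches.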

SDK Usage Examples

Next.js Integration (Recommended)

// app/api/cairo/route.ts
import { Cairo } from 'cairo-sdk';
import { NextRequest } from 'next/server';

const cairo = new Cairo({
  apiKey: process.env.CAIRO_API_KEY
});

export async function POST(request: NextRequest) {
  const { query } = await request.json();
  const result = await cairo.ask(query);
  return Response.json(result);
}

// Frontend usage (separate client-side file; userMessage holds the text entered by the user)
const response = await fetch('/api/cairo', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: userMessage })
});
const result = await response.json();

Direct SDK Usage (Server-side)

import { Cairo } from 'cairo-sdk';

const cairo = new Cairo({
  apiKey: process.env.CAIRO_API_KEY
});

// Use ask() for complete responses
const result = await cairo.ask("Qualify this lead");
console.log(result.content);

// Use askStream() for progressive display
for await (const chunk of cairo.askStream("Generate report")) {
  console.log(chunk.data);
}

Choosing the Right Method

// Use ask() for:
// Automation & Background Jobs
const result = await cairo.ask("Qualify this lead");

// Batch Processing
const descriptions = await Promise.all(
  properties.map(p => cairo.ask(`Generate description for ${p}`))
);

// Use askStream() for:
// Chat Interfaces (Progressive Display)
for await (const chunk of cairo.askStream(userMessage)) {
  displayInChat(chunk.data);
}

// Long-Form Content Generation
for await (const chunk of cairo.askStream("Write detailed report")) {
  updateProgress(chunk);
}
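The batch example above starts every request at once with `Promise.all`, which for large batches may run into the RATE_LIMIT error. One option is to cap the number of calls in flight. This is a minimal sketch; `mapWithLimit` and `fakeAsk` are hypothetical names, and `fakeAsk` stands in for `cairo.ask()`.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the order of the input array.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each runner repeatedly claims the next unprocessed index until none remain.
  async function run(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => run()));
  return results;
}

// Stand-in for cairo.ask(); a real call would go through the SDK.
const fakeAsk = async (prompt: string): Promise<string> => `ok: ${prompt}`;

mapWithLimit(["villa", "loft", "studio"], 2, p => fakeAsk(`Generate description for ${p}`))
  .then(descriptions => console.log(descriptions));
```

A suitable limit depends on the rate limits of your plan, which this document does not specify.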

Express.js Integration

import express from 'express';
import { Cairo } from 'cairo-sdk';

const app = express();
app.use(express.json()); // parse JSON request bodies so req.body.query is defined
const cairo = new Cairo({ apiKey: process.env.CAIRO_API_KEY });

app.post('/api/cairo', async (req, res) => {
  const result = await cairo.ask(req.body.query);
  res.json(result);
});

app.listen(3000);

How Cairo SDK Works

Cairo SDK intelligently routes your requests for cost optimization:

Your App → Cairo SDK → Intelligent Routing
                         ↓
                    Skills (FREE)
                         ↓
                    MC-1 API (PAID)

• Common queries use FREE skills (cached responses)
• Complex queries use PAID MC-1 (advanced LLM)
• Cairo SDK automatically chooses the best option
• You save money without any extra code

Authentication

All API requests require authentication using Bearer tokens:

Authorization: Bearer YOUR_API_KEY

Get your API key from the Developer Portal

Response Format

All successful responses follow this structure:

{
  "id": "uuid-here",
  "type": "sync",
  "content": "Task completed successfully",
  "media": null,
  "executionPlan": {
    "steps": [
      {
        "toolId": "btn_submit",
        "action": "execute",
        "parameters": { "tool_name": "click_button" }
      }
    ],
    "mode": "sync"
  },
  "actionResults": [
    {
      "id": "action-uuid",
      "status": "success",
      "result": "Executed click_button tool"
    }
  ]
}
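For typed code, the sample response can be described with interfaces. The types below are inferred only from the sample above, so optional or additional fields are unknown; `allActionsSucceeded` is a hypothetical helper.

```typescript
// Types inferred from the sample response; the real schema may contain more fields.
interface PlanStep {
  toolId: string;
  action: string;
  parameters: Record<string, unknown>;
}

interface ActionResult {
  id: string;
  status: string;
  result: string;
}

interface CairoResponse {
  id: string;
  type: string;
  content: string;
  media: unknown; // null in the sample; its non-null shape is not documented here
  executionPlan: { steps: PlanStep[]; mode: string };
  actionResults: ActionResult[];
}

// True when every recorded action reports status "success"
// (also true when there are no actions at all).
function allActionsSucceeded(response: CairoResponse): boolean {
  return response.actionResults.every(r => r.status === "success");
}

const sampleResponse: CairoResponse = {
  id: "uuid-here",
  type: "sync",
  content: "Task completed successfully",
  media: null,
  executionPlan: {
    steps: [{ toolId: "btn_submit", action: "execute", parameters: { tool_name: "click_button" } }],
    mode: "sync",
  },
  actionResults: [{ id: "action-uuid", status: "success", result: "Executed click_button tool" }],
};
console.log(allActionsSucceeded(sampleResponse));
```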

Error Handling

Error responses include detailed information:

{
  "success": false,
  "error": "Invalid API key",
  "code": "AUTH_ERROR",
  "details": "The provided API key is invalid or expired"
}

Common Error Codes

• AUTH_ERROR - Authentication failed
• RATE_LIMIT - Too many requests
• INVALID_REQUEST - Malformed request
• TOOL_ERROR - Tool execution failed
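One way to act on these codes is to map each to a next step. The sketch below assumes the error shape from the sample above; the `Advice` labels and the `adviseOn` helper are illustrative choices, not SDK behavior.

```typescript
// Shape taken from the sample error response.
interface CairoError {
  success: false;
  error: string;
  code: string;
  details: string;
}

// Illustrative labels for what a caller might do next.
type Advice = "fix-credentials" | "retry-later" | "fix-request" | "inspect-tool" | "unknown";

// Map a documented error code to a suggested next step.
function adviseOn(err: CairoError): Advice {
  switch (err.code) {
    case "AUTH_ERROR":
      return "fix-credentials";
    case "RATE_LIMIT":
      return "retry-later";
    case "INVALID_REQUEST":
      return "fix-request";
    case "TOOL_ERROR":
      return "inspect-tool";
    default:
      return "unknown"; // codes not listed in this document
  }
}

const authError: CairoError = {
  success: false,
  error: "Invalid API key",
  code: "AUTH_ERROR",
  details: "The provided API key is invalid or expired",
};
console.log(adviseOn(authError));
```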

Backend API Endpoints

Tool Discovery (MCP-RAG)

Semantic tool search using vector embeddings:

POST /v1/tools/search

{
  "query": "payment processing",
  "k": 6,
  "scope": "api"
}

Valid values for scope: ui, api, navigation, all

Response:
{
  "tools": [...],
  "query_time_ms": 16.5,
  "total_tools": 150,
  "cache_hit": true
}
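For callers that talk to the endpoint directly, the request can be assembled as below. This is a sketch under assumptions: the API host is not given in this document, so `baseUrl` is a placeholder, and `buildToolSearchRequest` is a hypothetical helper. The Bearer header follows the Authentication section.

```typescript
type Scope = "ui" | "api" | "navigation" | "all";

interface ToolSearchBody {
  query: string;
  k: number;
  scope: Scope;
}

// Build the URL and fetch options for POST /v1/tools/search.
// `baseUrl` is a placeholder: this document does not state the API host.
function buildToolSearchRequest(baseUrl: string, apiKey: string, body: ToolSearchBody) {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/v1/tools/search`,
    init: {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

const searchRequest = buildToolSearchRequest(
  "https://api.example.invalid/", // placeholder host
  "sk-your-api-key-here",
  { query: "payment processing", k: 6, scope: "api" }
);
console.log(searchRequest.url);
```

The result can be passed to `fetch(searchRequest.url, searchRequest.init)` on Node.js 18 or later.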

Skills Execution

Execute autonomous workflows:

POST /v1/skills/execute

{
  "query": "complete checkout process",
  "context": { "userId": "user123" }
}

Response:
{
  "skill_name": "complete_checkout",
  "status": "executed",
  "confidence": 0.95,
  "source": "database",
  "success": true
}

Health Check

System status and capabilities:

GET /v1/health

Response:
{
  "status": "healthy",
  "latency_ms": 12.5,
  "capabilities": {
    "hasToolExecution": true,
    "hasAdvancedRAG": true,
    "supportedModels": ["MC1", "qwen-3"],
    "maxToolsPerRequest": 50
  },
  "performance": {
    "avg_search_latency_ms": 16.5,
    "cache_hit_rate": 0.95
  }
}

Architecture Overview

Request Flow:

1. API Request → Bearer Token Authentication
2. Skills Registry Search (Database + In-Memory)
3. If skill found (confidence > 0.35):
   → Execute skill workflow (FREE)
   → Return result
4. Else:
   → MC-1 API call (PAID)
   → Media generation support
   → Return result
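The routing decision in steps 3 and 4 can be sketched as follows. Only the 0.35 threshold and the FREE/PAID labels come from the flow above; the `SkillMatch` and `Route` shapes and the `chooseRoute` function are assumptions for illustration.

```typescript
// Assumed shape of a candidate returned by the skills registry search.
interface SkillMatch {
  name: string;
  confidence: number;
}

type Route =
  | { via: "skill"; skill: string; cost: "FREE" }
  | { via: "mc1"; cost: "PAID" };

const CONFIDENCE_THRESHOLD = 0.35; // from step 3 of the request flow

// Pick the best candidate; use it only if its confidence exceeds the threshold.
function chooseRoute(matches: SkillMatch[]): Route {
  let best: SkillMatch | undefined;
  for (const match of matches) {
    if (best === undefined || match.confidence > best.confidence) {
      best = match;
    }
  }
  if (best !== undefined && best.confidence > CONFIDENCE_THRESHOLD) {
    return { via: "skill", skill: best.name, cost: "FREE" };
  }
  return { via: "mc1", cost: "PAID" }; // fall back to the MC-1 API
}

console.log(chooseRoute([{ name: "complete_checkout", confidence: 0.95 }]));
console.log(chooseRoute([{ name: "process_refund", confidence: 0.2 }]));
```

Because the comparison is strict, a match at exactly 0.35 falls through to MC-1 in this sketch.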

Vector Search:
- ChromaDB for semantic tool discovery
- PostgreSQL for tool storage
- 50-100ms average response time
- LRU caching for sub-10ms retrieval
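To show what LRU caching means in practice, here is a minimal in-memory sketch built on `Map`, which preserves insertion order. The `LruCache` class and the cached values are illustrative; the cache used by the backend is not described in this document.

```typescript
// Minimal LRU cache built on Map, which iterates keys in insertion order.
class LruCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so the key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  has(key: K): boolean {
    return this.entries.has(key);
  }
}

const toolCache = new LruCache<string, string[]>(2);
toolCache.set("payment processing", ["process_payment"]);
toolCache.set("refund", ["issue_refund"]);
toolCache.get("payment processing"); // touch: now most recently used
toolCache.set("checkout", ["complete_checkout"]); // evicts "refund"
console.log(toolCache.has("refund"), toolCache.has("payment processing"));
```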

Security Features

• Bearer Token Authentication: All endpoints secured with API keys
• Input Validation: Pydantic schemas for all requests
• Rate Limiting: Built-in protection mechanisms
• Request Size Limits: 10MB maximum payload size
• HTTPS Enforcement: Production environment requires HTTPS
• CORS Configuration: Restricted to authorized origins

Skills Registry

Cairo includes pre-built autonomous workflows for common tasks:

complete_checkout

E-commerce checkout flows

process_refund

Refund handling workflows

navigate_to_section

Site navigation automation

fill_form_with_data

Form automation workflows

Best Practices

• Use specific queries: Clear instructions improve accuracy
• Provide context: Include relevant user data and session info
• Handle errors gracefully: Implement retry logic with exponential backoff
• Cache tool manifests: Sync tools once, not on every request
• Monitor usage: Track API calls through the developer portal
• Use streaming: For long-running tasks, use the streaming endpoint
• Secure your keys: Never expose API keys in client-side code
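The retry advice above can be sketched as a small wrapper. `withRetry` is a hypothetical helper, not an SDK function, and the retry count and base delay are example values; the `sleep` option is injectable so delays can be skipped in tests.

```typescript
// Retry an async operation with exponential backoff: baseDelayMs, then 2x, 4x, ...
async function withRetry<T>(
  operation: () => Promise<T>,
  options: { retries: number; baseDelayMs: number; sleep?: (ms: number) => Promise<void> }
): Promise<T> {
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms)));
  let attempt = 0;
  while (true) {
    try {
      return await operation();
    } catch (error) {
      // Give up once the allowed number of retries has been used.
      if (attempt >= options.retries) throw error;
      await sleep(options.baseDelayMs * 2 ** attempt);
      attempt++;
    }
  }
}

// Simulated operation that fails twice before succeeding (stands in for cairo.ask()).
let demoCalls = 0;
const flaky = async (): Promise<string> => {
  demoCalls++;
  if (demoCalls < 3) throw new Error("RATE_LIMIT");
  return "ok";
};

withRetry(flaky, { retries: 3, baseDelayMs: 10 }).then(value => {
  console.log(value, "after", demoCalls, "calls");
});
```

In real use it is usually better to retry only errors that can succeed on a later attempt, such as RATE_LIMIT, and to fail immediately on AUTH_ERROR or INVALID_REQUEST.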

Performance Benchmarks

Metric	Achievement	Target	Status
Query Latency	16.5ms	<50ms	67% better
Tool Accuracy	100%	>85%	Perfect
Cache Hit Rate	95%+	>80%	Excellent
Memory Usage	<100MB	<200MB	Optimized
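The "67% better" figure for query latency is consistent with measuring the improvement relative to the target. A small check, using a hypothetical `percentBetter` helper for lower-is-better metrics:

```typescript
// Relative improvement of a lower-is-better metric against its target, in percent.
function percentBetter(actual: number, target: number): number {
  return Math.round(((target - actual) / target) * 100);
}

// (50 - 16.5) / 50 = 0.67
console.log(percentBetter(16.5, 50)); // 67
```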

Troubleshooting

Common Issues

"Invalid API Key" Error

• Check that the API key matches the one shown on platform.colomboai.com
• Verify that the environment variable is loaded
• Ensure there are no extra spaces or quotes in the .env file

"Request Timeout" Error

• Check your internet connection
• Try increasing the timeout: new Cairo({ apiKey: process.env.CAIRO_API_KEY, timeout: 60000 })

Streaming Not Working

• Ensure you're using a for await loop
• Check that your Node.js version is 18 or later
• Verify that the API route returns a proper stream
