
AI Agent Demo

This demo showcases GenAI observability with OpenTelemetry: automatic instrumentation of LLM calls using the opentelemetry-instrumentation-ollama package.

What is GenAI Observability?

GenAI observability captures telemetry from LLM interactions including:

  • Prompts and responses - Full text of user prompts and model outputs
  • Token usage - Input and output token counts
  • Latency - Response time for each LLM call
  • Model information - Which model was used
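
Concretely, this telemetry lands as attributes on the span that wraps each LLM call, following the OpenTelemetry GenAI semantic conventions. A rough sketch of what one chat span might carry (attribute names, and whether prompt/response text appears as attributes or span events, vary with the instrumentation and convention version):

# Illustrative only: approximate GenAI attributes on one auto-instrumented chat span.
# Exact names depend on the instrumentation package and semantic-convention version.
example_span_attributes = {
    "gen_ai.system": "ollama",              # which LLM provider handled the call
    "gen_ai.request.model": "tinyllama",    # model requested by the agent
    "gen_ai.usage.input_tokens": 12,        # prompt token count
    "gen_ai.usage.output_tokens": 87,       # completion token count
    "gen_ai.prompt": "What is OpenTelemetry?",   # content may also be emitted as span events
    "gen_ai.completion": "OpenTelemetry is an observability framework ...",
}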

Quick Start

Docker

# Start TinyOlly core first
cd docker
./01-start-core.sh

# Deploy AI agent demo (pulls pre-built images from Docker Hub)
cd ../docker-ai-agent-demo
./01-deploy-ai-demo.sh

This starts:

  • Ollama with the TinyLlama model for local LLM inference
  • AI Agent with automatic GenAI span instrumentation

Access the UI at http://localhost:5005 and navigate to the AI Agents tab.

For local development: use ./01-deploy-ai-demo-local.sh to build the images locally instead of pulling them

Stop: ./02-stop-ai-demo.sh

Cleanup (remove volumes): ./03-cleanup-ai-demo.sh

How It Works

The demo uses zero-code auto-instrumentation - no OpenTelemetry imports in the application code:

# agent.py - NO OpenTelemetry imports needed!
from ollama import Client

client = Client(host="http://ollama:11434")

# This call is AUTO-INSTRUMENTED
response = client.chat(
    model="tinyllama",
    messages=[{"role": "user", "content": "What is OpenTelemetry?"}]
)
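
For reference, the return value of chat() is an ordinary response object you can read directly; the instrumentation records its telemetry without you touching it. A minimal sketch of printing the reply (field access may vary slightly across ollama client versions):

# Sketch: reading the reply from the chat() call above.
print(response["message"]["content"])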

The magic happens in the Dockerfile:

# Install auto-instrumentation packages
RUN pip install opentelemetry-distro opentelemetry-instrumentation-ollama

# Run with auto-instrumentation wrapper
CMD ["opentelemetry-instrument", "python", "-u", "agent.py"]
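
Under the hood, the opentelemetry-instrument wrapper performs roughly the setup you would otherwise write by hand. A sketch of the manual equivalent, assuming the package exposes an OllamaInstrumentor entry point in the usual instrumentor pattern (you do not need any of this in the demo itself):

# Roughly what opentelemetry-instrument configures automatically; shown only
# to illustrate the mechanism. Not required in the demo's application code.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.ollama import OllamaInstrumentor  # assumed entry point

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))  # honors OTEL_EXPORTER_OTLP_ENDPOINT
trace.set_tracer_provider(provider)

OllamaInstrumentor().instrument()  # patches the ollama client so chat() calls emit spans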

What You'll See

In the AI Agents tab:

Field        Description
Prompt       The user's input to the LLM
Response     The model's output
Tokens In    Number of input tokens
Tokens Out   Number of output tokens
Latency      Response time in milliseconds
Model        Model name (e.g., tinyllama)

Click any row to expand the full span details in JSON format.

Supported LLMs

The OpenTelemetry GenAI semantic conventions work with any instrumented LLM provider:

  • Ollama - Local LLM inference (this demo)
  • OpenAI - GPT models via opentelemetry-instrumentation-openai
  • Anthropic - Claude models
  • Other providers - Any with OpenTelemetry instrumentation
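
As an illustration of how little changes between providers, here is a hypothetical sketch of the same pattern pointed at OpenAI. It assumes opentelemetry-instrumentation-openai exposes an OpenAIInstrumentor entry point and that OPENAI_API_KEY is set; module paths differ between instrumentation releases:

# Hypothetical: the same instrumentation pattern applied to OpenAI instead of Ollama.
# Assumes the instrumentation package exposes an OpenAIInstrumentor entry point.
from opentelemetry.instrumentation.openai import OpenAIInstrumentor
from openai import OpenAI

OpenAIInstrumentor().instrument()   # or run under opentelemetry-instrument for zero-code

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is OpenTelemetry?"}],
)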

Configuration

The demo is configured via environment variables in docker-compose.yml:

ai-agent:
  environment:
    - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
    - OTEL_SERVICE_NAME=ai-agent-demo
    - OLLAMA_HOST=http://ollama:11434
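
The OTEL_* variables are consumed by the auto-instrumentation wrapper rather than by agent.py itself, so the application code never references them. If you want to confirm the agent container actually received its configuration (for example via docker exec), a small check like this works:

# Quick sanity check: print the configuration the agent container sees.
import os

for var in ("OTEL_EXPORTER_OTLP_ENDPOINT", "OTEL_SERVICE_NAME", "OLLAMA_HOST"):
    print(var, "=", os.environ.get(var, "<unset>"))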

Troubleshooting

No AI traces appearing?

  • Ensure TinyOlly core is running
  • Check agent logs: docker logs ai-agent-demo
  • Verify Ollama is ready: docker logs ollama

Model download taking a long time?

  • TinyLlama is about 600 MB, so the first download may take a few minutes
  • Check Ollama logs for download progress

Agent errors?

  • Ollama needs time to load the model after the container starts
  • The agent waits 10 seconds before making its first call; if errors persist after that, try the smoke test below
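
If the model still seems unresponsive, a quick smoke test against Ollama can narrow things down. A minimal sketch, assuming Ollama's port 11434 is reachable from where you run it (adjust the host, or run it inside the ai-agent container, if it is not):

# Minimal smoke test: confirm Ollama is up and the model answers.
from ollama import Client

client = Client(host="http://localhost:11434")
print(client.list())  # the tinyllama model should appear once the pull has finished
print(client.chat(
    model="tinyllama",
    messages=[{"role": "user", "content": "ping"}],
))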