AI Agent Demo
This demo showcases GenAI observability with OpenTelemetry - automatic instrumentation of LLM calls using the opentelemetry-instrumentation-ollama package.
What is GenAI Observability?
GenAI observability captures telemetry from LLM interactions including:
- Prompts and responses - Full text of user prompts and model outputs
- Token usage - Input and output token counts
- Latency - Response time for each LLM call
- Model information - Which model was used
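These signals are recorded as attributes on each LLM call's span, following the OpenTelemetry GenAI semantic conventions. A rough sketch of what one chat span might carry (names and values are illustrative and can vary with the instrumentation version):

```python
# Illustrative sketch of GenAI span attributes for a single chat call.
# Names follow the OpenTelemetry GenAI semantic conventions; exact keys and
# values depend on the opentelemetry-instrumentation-ollama version in use.
example_span_attributes = {
    "gen_ai.system": "ollama",                    # LLM provider
    "gen_ai.request.model": "tinyllama",          # model that was called
    "gen_ai.usage.input_tokens": 12,              # hypothetical input token count
    "gen_ai.usage.output_tokens": 87,             # hypothetical output token count
    "gen_ai.prompt": "What is OpenTelemetry?",    # user prompt
    "gen_ai.completion": "OpenTelemetry is ...",  # model output (truncated here)
}
```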
Quick Start
Docker
# Start TinyOlly core first
cd docker
./01-start-core.sh
# Deploy AI agent demo (pulls pre-built images from Docker Hub)
cd ../docker-ai-agent-demo
./01-deploy-ai-demo.sh
This starts:
- Ollama with TinyLlama model for local LLM inference
- AI Agent with automatic GenAI span instrumentation
Access the UI at http://localhost:5005 and navigate to the AI Agents tab.
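If you want to confirm the UI is reachable before opening the AI Agents tab, a minimal sketch (port 5005 as listed above; adjust if your deployment differs):

```python
# Minimal reachability check for the TinyOlly UI.
import urllib.request

with urllib.request.urlopen("http://localhost:5005", timeout=5) as resp:
    print("TinyOlly UI is up, HTTP status:", resp.status)
```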
For local development: use ./01-deploy-ai-demo-local.sh to build the images locally instead of pulling them from Docker Hub
Stop: ./02-stop-ai-demo.sh
Cleanup (remove volumes): ./03-cleanup-ai-demo.sh
How It Works
The demo uses zero-code auto-instrumentation - no OpenTelemetry imports in the application code:
# agent.py - NO OpenTelemetry imports needed!
from ollama import Client
client = Client(host="http://ollama:11434")
# This call is AUTO-INSTRUMENTED
response = client.chat(
    model="tinyllama",
    messages=[{"role": "user", "content": "What is OpenTelemetry?"}]
)
The magic happens in the Dockerfile:
# Install auto-instrumentation packages
RUN pip install opentelemetry-distro opentelemetry-instrumentation-ollama
# Run with auto-instrumentation wrapper
CMD ["opentelemetry-instrument", "python", "-u", "agent.py"]
What You'll See
In the AI Agents tab:
| Field | Description |
|---|---|
| Prompt | The user's input to the LLM |
| Response | The model's output |
| Tokens In | Number of input tokens |
| Tokens Out | Number of output tokens |
| Latency | Response time in milliseconds |
| Model | Model name (e.g., tinyllama) |
Click any row to expand the full span details in JSON format.
Supported LLMs
The OpenTelemetry GenAI semantic conventions work with any instrumented LLM provider:
- Ollama - Local LLM inference (this demo)
- OpenAI - GPT models via opentelemetry-instrumentation-openai
- Anthropic - Claude models
- Other providers - Any with OpenTelemetry instrumentation
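The zero-code pattern is the same for other providers: install the matching instrumentation package and launch the process with opentelemetry-instrument. A hypothetical sketch for OpenAI (not part of this demo; assumes opentelemetry-instrumentation-openai is installed and OPENAI_API_KEY is set):

```python
# Hypothetical sketch, not part of this demo: with
# opentelemetry-instrumentation-openai installed and the process started via
# `opentelemetry-instrument python app.py`, this call is auto-instrumented too.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": "What is OpenTelemetry?"}],
)
print(response.choices[0].message.content)
```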
Configuration
The demo is configured via environment variables in docker-compose.yml:
ai-agent:
  environment:
    - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
    - OTEL_SERVICE_NAME=ai-agent-demo
    - OLLAMA_HOST=http://ollama:11434
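The OTEL_* variables are consumed by the opentelemetry-instrument wrapper itself; the only one the agent has to read is OLLAMA_HOST. A minimal sketch of how agent.py might pick it up (the actual code may differ):

```python
# Sketch: OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_SERVICE_NAME are handled by
# opentelemetry-instrument; the agent only needs the Ollama endpoint.
import os
from ollama import Client

client = Client(host=os.environ.get("OLLAMA_HOST", "http://ollama:11434"))
```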
Troubleshooting
No AI traces appearing?
- Ensure TinyOlly core is running
- Check agent logs: docker logs ai-agent-demo
- Verify Ollama is ready: docker logs ollama
Model download taking long?
- TinyLlama is ~600MB; the first download may take a few minutes
- Check Ollama logs for download progress
Agent errors?
- Ollama needs time to load the model after the container starts
- The agent waits 10 seconds before making its first call (a more robust alternative is sketched below)
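If the fixed delay is not enough on a slower machine, a retry loop is a more robust way to wait for the model; a minimal sketch, not necessarily how agent.py is written:

```python
# Sketch: retry the first chat call until Ollama has loaded the model,
# rather than relying on a fixed 10-second sleep.
import time
from ollama import Client

client = Client(host="http://ollama:11434")

for attempt in range(10):
    try:
        client.chat(model="tinyllama",
                    messages=[{"role": "user", "content": "ping"}])
        break  # model is loaded and responding
    except Exception:
        time.sleep(5)  # model still loading; wait and retry
```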