DATA is a production-grade LLM application development platform designed for teams building AI-powered products. From simple chatbots to complex multi-agent workflows, DATA provides the visual tools, infrastructure, and runtime to go from prototype to production in hours, not weeks.
| Feature | Description |
|---|---|
| Visual Workflow Builder | Drag-and-drop pipeline design with real-time preview |
| Multi-Agent Orchestration | Chain, route, and parallel-execute AI agents |
| RAG Pipeline | Built-in document ingestion, chunking, embedding, and retrieval |
| 50+ Built-in Tools | Web search, code execution, image generation, API connectors |
| Plugin Ecosystem | Install community plugins or build your own |
| Human-in-the-Loop | Pause workflows for human approval or input |
| Auto-Scaling | Horizontal scaling for production workloads |
┌─────────────────────────────────────────────────────────┐
│                 Web Frontend (Next.js)                  │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐    │
│  │ Workflow │ │  Agent   │ │ Dataset  │ │ Settings │    │
│  │  Studio  │ │  Config  │ │ Manager  │ │   Page   │    │
│  └──────────┘ └──────────┘ └──────────┘ └──────────┘    │
└──────────────────────┬──────────────────────────────────┘
                       │ HTTP/WebSocket
┌──────────────────────▼──────────────────────────────────┐
│                  API Server (FastAPI)                   │
│  ┌─────────────┐ ┌────────────┐ ┌──────────────────┐    │
│  │ Workflow    │ │ Agent      │ │ Model Runtime    │    │
│  │ Engine      │ │ Runner     │ │ Manager          │    │
│  ├─────────────┤ ├────────────┤ ├──────────────────┤    │
│  │ RAG Service │ │ Tool System│ │ Plugin Registry  │    │
│  └─────────────┘ └────────────┘ └──────────────────┘    │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│                   Data & Cache Layer                    │
│  ┌────────────┐ ┌────────────┐ ┌──────────────────┐     │
│  │ PostgreSQL │ │   Redis    │ │    Vector DB     │     │
│  │            │ │            │ │  (multi-engine)  │     │
│  └────────────┘ └────────────┘ └──────────────────┘     │
└─────────────────────────────────────────────────────────┘
User Request → API Gateway → Auth Check → Workflow Engine
↓
Agent Selection → Context Building → LLM Invocation
↓
Tools Execution ← RAG Retrieval ← Knowledge Base
↓
Response Assembly → Streaming → User
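
The request flow above can be read as a chain of stages that each enrich a shared context before the answer is streamed back. The sketch below is illustrative only: the stage order mirrors the diagram, but `RequestContext`, `KNOWLEDGE_BASE`, `run_pipeline`, and every stage function are hypothetical names and not DATA's actual API. The LLM call is replaced by a stub that echoes the retrieved context.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class RequestContext:
    """Hypothetical state object handed from one stage to the next."""
    user: str
    query: str
    agent: str = ""
    documents: list[str] = field(default_factory=list)
    answer: str = ""


# Toy stand-in for a knowledge base: topic keyword -> stored passage.
KNOWLEDGE_BASE = {"scaling": "Workers scale horizontally."}


def auth_check(ctx: RequestContext) -> RequestContext:
    # Stand-in for real authentication: reject requests without a user.
    if not ctx.user:
        raise PermissionError("anonymous requests are rejected")
    return ctx


def select_agent(ctx: RequestContext) -> RequestContext:
    # Trivial routing rule chosen for illustration: questions go to a search agent.
    ctx.agent = "search-agent" if "?" in ctx.query else "chat-agent"
    return ctx


def build_context(ctx: RequestContext) -> RequestContext:
    # Keyword lookup standing in for RAG retrieval.
    query = ctx.query.lower()
    ctx.documents = [text for key, text in KNOWLEDGE_BASE.items() if key in query]
    return ctx


def invoke_llm(ctx: RequestContext) -> RequestContext:
    # Stub for the model call: echoes whatever context was retrieved.
    context = " ".join(ctx.documents) or "no context found"
    ctx.answer = f"[{ctx.agent}] {context}"
    return ctx


STAGES = [auth_check, select_agent, build_context, invoke_llm]


def run_pipeline(ctx: RequestContext) -> Iterator[str]:
    """Run every stage in order, then stream the answer word by word."""
    for stage in STAGES:
        ctx = stage(ctx)
    yield from ctx.answer.split()
```

Because `run_pipeline` is a generator, a caller can forward each token as soon as it is produced, which is the shape a streaming response needs; a failed auth check surfaces as soon as the first token is requested.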
# Clone the repository
git clone https://github.com/your-org/data-agent
cd data-agent
# Start all services
cd docker
docker compose up -d
# Access the platform
# Web UI: http://localhost:3000
# API: http://localhost:5001
# Swagger: http://localhost:5001/docs
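
After `docker compose up -d` returns, the containers may still be starting. A small script along these lines could confirm that the three addresses listed above respond. It is a sketch using only the Python standard library; the function names `http_status` and `check_services` are made up for this example, and it assumes that any HTTP reply, even an error status, means the service is listening.

```python
import urllib.error
import urllib.request
from typing import Callable

# Addresses taken from the quick-start comments above.
ENDPOINTS = {
    "Web UI": "http://localhost:3000",
    "API": "http://localhost:5001",
    "Swagger": "http://localhost:5001/docs",
}


def http_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status for url, or 0 if nothing answers."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server replied, just not with a success status.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def check_services(fetch: Callable[[str], int] = http_status) -> dict[str, bool]:
    """Map each service name to True if it produced any HTTP reply."""
    return {name: fetch(url) != 0 for name, url in ENDPOINTS.items()}
```

Calling `check_services()` and printing the result gives a quick up/down summary; the `fetch` parameter exists so the check can be exercised without a running stack.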
# Backend
cd api
cp .env.example .env
uv sync
uv run flask db upgrade
uv run python app.py
# Frontend
cd web
cp .env.example .env.local
pnpm install
pnpm dev
| Variable | Default | Description |
|---|---|---|
| `DATA_BIND_ADDRESS` | `0.0.0.0` | API server bind address |
| `DATA_PORT` | `5001` | API server port |
| `SECRET_KEY` | (required) | App encryption key |
| `DB_USERNAME` | `data` | PostgreSQL username |
| `DB_PASSWORD` | `data123456` | PostgreSQL password |
| `DB_HOST` | `localhost` | PostgreSQL host |
| `DB_PORT` | `5432` | PostgreSQL port |
| `REDIS_HOST` | `localhost` | Redis host |
| `REDIS_PORT` | `6379` | Redis port |
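
To show how these variables fit together, here is a minimal sketch of a loader that applies the defaults from the table and fails fast when `SECRET_KEY` is missing. The `Settings` class and `load_settings` function are assumptions made for this example, not DATA's real configuration code. The table does not list a database name, so the PostgreSQL URL below stops at the port.

```python
import os
from dataclasses import dataclass
from typing import Mapping
from urllib.parse import quote


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object built from the documented variables."""
    bind_address: str
    port: int
    secret_key: str
    db_url: str
    redis_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the table's defaults."""
    secret_key = env.get("SECRET_KEY")
    if not secret_key:
        # SECRET_KEY is the only variable the table marks as required.
        raise RuntimeError("SECRET_KEY is required and has no default")

    # Percent-encode credentials so characters such as '@' cannot break the URL.
    db_user = quote(env.get("DB_USERNAME", "data"), safe="")
    db_password = quote(env.get("DB_PASSWORD", "data123456"), safe="")
    db_host = env.get("DB_HOST", "localhost")
    db_port = int(env.get("DB_PORT", "5432"))

    redis_host = env.get("REDIS_HOST", "localhost")
    redis_port = int(env.get("REDIS_PORT", "6379"))

    return Settings(
        bind_address=env.get("DATA_BIND_ADDRESS", "0.0.0.0"),
        port=int(env.get("DATA_PORT", "5001")),
        secret_key=secret_key,
        db_url=f"postgresql://{db_user}:{db_password}@{db_host}:{db_port}",
        redis_url=f"redis://{redis_host}:{redis_port}",
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to try out, for example `load_settings({"SECRET_KEY": "change-me"})`.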
DATA supports a rich plugin ecosystem:
| Type | Examples |
|---|---|
| Model Providers | OpenAI, Anthropic, Ollama, AWS Bedrock, Azure |
| Tools | Web search, Calculator, Code execution, Image gen |
| Vector Stores | Qdrant, Milvus, Weaviate, Pinecone |
| Document Loaders | PDF, HTML, Notion, Confluence, S3 |
| Observability | Langfuse, OpenTelemetry, Sentry |
# Backend tests
cd api
uv run pytest tests/unit_tests/ -v
# Frontend tests
cd web
pnpm test
# E2E tests
pnpm test:e2e
We welcome contributions! See our Contributing Guide for details.
1. Create a feature branch (`git checkout -b feature/amazing`)
2. Commit your changes (`git commit -m 'Add amazing feature'`)
3. Push the branch (`git push origin feature/amazing`)

Please read our Code of Conduct.
This project is licensed under the Apache License 2.0 — see the LICENSE file for details.