AI-Powered Kubernetes Observability & Infrastructure Intelligence
Real-time telemetry • AI-driven anomaly detection • Operational intelligence
PodSage-AI is the flagship open-source project of the PodSage AI organization, which maintains this repository. It is an intelligent Kubernetes observability platform that monitors, analyzes, and correlates real-time infrastructure behavior, providing AI-assisted anomaly detection and infrastructure intelligence.
The project includes a React/Vite frontend dashboard for interactive metric visualization, dependency maps, and AI insights.
Built for the ABB Accelerator 2026 challenge, PodSage-AI combines Kubernetes telemetry, Prometheus metrics, anomaly detection, dependency analysis, and infrastructure intelligence into a unified monitoring ecosystem.
The mission is simple:
Transform raw Kubernetes metrics into actionable operational intelligence.
Traditional observability platforms expose metrics.
PodSage AI goes a step further and uses AI-assisted infrastructure analysis to interpret that telemetry.
Instead of only showing dashboards, PodSage AI helps explain:
flowchart LR
A["Applications / Microservices"]
B["Data Collection Layer
• Prometheus
• Node Exporter
• kube-state-metrics
• cAdvisor"]
C["AI Intelligence Layer
• CPU Analysis Engine
• Memory Analysis Engine
• Dependency Mapper
• Correlation Engine"]
D["Infrastructure Intelligence Layer
• Prometheus
• SQLite
• Loki
• ML Models"]
E["Dashboard & Visualization Layer
• React / Vite
• Recharts
• React Flow
• WebSockets"]
A --> B
B --> C
C --> D
D --> E
PodSage-AI/
├── backend/
│   ├── app/
│   │   ├── api/
│   │   ├── database/
│   │   ├── models/
│   │   ├── services/
│   │   ├── websocket/
│   │   └── main.py
│   │
│   ├── Dockerfile
│   ├── docker-compose.yml
│   ├── prometheus.yml
│   ├── requirements.txt
│   └── podsage.db
├── frontend/
│   ├── index.html
│   ├── package.json
│   ├── README.md
│   ├── vite.config.js
│   └── src/
│       ├── App.jsx
│       ├── main.jsx
│       ├── api/
│       │   └── client.js
│       ├── components/
│       │   ├── AIInsights.jsx
│       │   ├── AnomalyTable.jsx
│       │   ├── ClusterSummary.jsx
│       │   ├── DependencyGraph.jsx
│       │   ├── Header.jsx
│       │   ├── JsonPreview.jsx
│       │   ├── MetricCard.jsx
│       │   └── SeriesChart.jsx
│       ├── styles/
│       │   └── global.css
│       └── utils/
│           └── metrics.js
├── README.md
├── LICENSE
└── .gitignore
Before starting, ensure you have:
git clone https://github.com/PodSageAI/PodSage-AI.git
cd PodSage-AI/backend
python -m venv venv
Activate the virtual environment on Linux/macOS:
source venv/bin/activate
Activate the virtual environment on Windows:
venv\Scripts\activate
pip install -r requirements.txt
cd ../frontend
npm install
From the backend/ directory, with the virtual environment activated:
uvicorn app.main:app --reload
Backend URL:
http://localhost:8000
Swagger Documentation:
http://localhost:8000/docs
ReDoc Documentation:
http://localhost:8000/redoc
cd frontend
npm run dev
Frontend URL:
http://localhost:5173
curl http://localhost:8000/metrics/cpu
http://localhost:8000/docs
docker compose up --build
docker compose down
docker compose up -d
kubectl apply -f k8s/
kubectl get pods
kubectl port-forward svc/podsage-ai 8000:8000
| Endpoint | Description |
|---|---|
| `/` | Root status |
| `/health` | Health check |
| Endpoint | Description |
|---|---|
| `/metrics/cpu` | CPU metrics |
| `/metrics/memory` | Memory metrics |
| `/metrics/restarts` | Restart metrics |
| Endpoint | Description |
|---|---|
| `/anomalies` | Detected anomalies |
| `/insights` | AI-generated insights |
| `/dependencies` | Dependency mapping |
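The endpoints listed above can be called from any HTTP client. The following is a minimal sketch using only the Python standard library. It assumes the backend is reachable at `http://localhost:8000` and that every endpoint returns JSON; the helper names `endpoint_url` and `fetch_json` are hypothetical and not part of the project.

```python
import json
from urllib.request import urlopen

# Assumption: the backend runs locally on the default uvicorn port.
BASE_URL = "http://localhost:8000"

# Endpoint paths taken from the tables above.
ENDPOINTS = [
    "/", "/health",
    "/metrics/cpu", "/metrics/memory", "/metrics/restarts",
    "/anomalies", "/insights", "/dependencies",
]


def endpoint_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path with exactly one slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def fetch_json(path: str, base_url: str = BASE_URL, opener=urlopen, timeout: float = 5.0):
    """GET an endpoint and decode the body as JSON.

    `opener` is injectable so the function can be exercised without a
    running backend (for example in unit tests).
    """
    with opener(endpoint_url(path, base_url), timeout=timeout) as response:
        return json.load(response)


# Print the full URL for each documented endpoint (no network access needed).
for path in ENDPOINTS:
    print(endpoint_url(path))
```

With the backend running, `fetch_json("/anomalies")` should return the decoded anomaly list; without it, the call raises `urllib.error.URLError`.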
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {},
        "value": [
          1778683850.411,
          "0.2482235237555631"
        ]
      }
    ]
  }
}
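The response above follows the Prometheus instant-query format, in which each sample is a `[timestamp, "value"]` pair and the value is encoded as a string. A minimal sketch of how a consumer might flatten it is shown below; `parse_instant_vector` is a hypothetical helper, not a function from the project.

```python
import json
from datetime import datetime, timezone


def parse_instant_vector(payload: dict) -> list[dict]:
    """Flatten a Prometheus instant-vector response into simple records."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")

    records = []
    for sample in data["result"]:
        timestamp, raw_value = sample["value"]
        records.append({
            "labels": sample["metric"],
            "time": datetime.fromtimestamp(timestamp, tz=timezone.utc),
            # Prometheus encodes sample values as strings.
            "value": float(raw_value),
        })
    return records


# The sample response shown above.
sample_response = json.loads("""
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {}, "value": [1778683850.411, "0.2482235237555631"]}
    ]
  }
}
""")

records = parse_instant_vector(sample_response)
cpu_percent = round(records[0]["value"] * 100, 2)
print(cpu_percent)  # 24.82
```

Converting the fraction to a percentage reproduces the `24.82` value that appears in the anomaly example.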
[
  {
    "type": "High CPU Usage",
    "pod": "node-exporter:9100",
    "value": 24.82,
    "unit": "%"
  }
]
[
  {
    "pod": "node-exporter:9100",
    "insight": "Pod node-exporter:9100 is consuming unusually high CPU resources.",
    "recommendation": "Consider scaling replicas or optimizing workload."
  }
]
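To illustrate how an anomaly record could map to an insight record of the shape above, here is a simple rule-based stand-in. It is not the project's AI logic: the `INSIGHT_RULES` table and `build_insights` function are hypothetical, and only the "High CPU Usage" wording is taken from the examples in this README.

```python
# Hypothetical rule table: anomaly type -> (insight template, recommendation).
# Only the "High CPU Usage" wording comes from the README examples.
INSIGHT_RULES = {
    "High CPU Usage": (
        "Pod {pod} is consuming unusually high CPU resources.",
        "Consider scaling replicas or optimizing workload.",
    ),
}


def build_insights(anomalies: list[dict]) -> list[dict]:
    """Turn anomaly records into insight records using fixed templates."""
    insights = []
    for anomaly in anomalies:
        rule = INSIGHT_RULES.get(anomaly["type"])
        if rule is None:
            continue  # No wording is known for this anomaly type; skip it.
        insight_template, recommendation = rule
        insights.append({
            "pod": anomaly["pod"],
            "insight": insight_template.format(pod=anomaly["pod"]),
            "recommendation": recommendation,
        })
    return insights


# The anomaly record from the example response.
anomalies = [
    {"type": "High CPU Usage", "pod": "node-exporter:9100", "value": 24.82, "unit": "%"}
]
insights = build_insights(anomalies)
print(insights[0]["insight"])
```

A lookup table keeps the wording in one place, so new anomaly types can be covered by adding an entry rather than another branch.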
Current AI functionality includes:
CPU_THRESHOLD = 0.2
MEMORY_THRESHOLD = 500000000
RESTART_THRESHOLD = 5
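A minimal sketch of how these thresholds might be applied is shown below. Several details are assumptions rather than facts from the project: the input record shape, the strict `>` comparison, memory being measured in bytes, and the "High Memory Usage" and "Frequent Restarts" type names. The CPU threshold is treated as a fraction (0.2 = 20%), which matches the example where 0.248 is reported as 24.82%.

```python
CPU_THRESHOLD = 0.2           # fraction of CPU (0.2 = 20%)
MEMORY_THRESHOLD = 500000000  # assumed to be bytes
RESTART_THRESHOLD = 5         # restart count


def detect_anomalies(samples: list[dict]) -> list[dict]:
    """Compare per-pod samples against the static thresholds.

    Each sample is assumed to look like:
    {"pod": str, "cpu": float, "memory": int, "restarts": int}
    """
    anomalies = []
    for sample in samples:
        pod = sample["pod"]
        cpu = sample.get("cpu", 0.0)
        memory = sample.get("memory", 0)
        restarts = sample.get("restarts", 0)

        if cpu > CPU_THRESHOLD:
            anomalies.append({"type": "High CPU Usage", "pod": pod,
                              "value": round(cpu * 100, 2), "unit": "%"})
        if memory > MEMORY_THRESHOLD:
            # Hypothetical type name and unit.
            anomalies.append({"type": "High Memory Usage", "pod": pod,
                              "value": memory, "unit": "bytes"})
        if restarts > RESTART_THRESHOLD:
            # Hypothetical type name and unit.
            anomalies.append({"type": "Frequent Restarts", "pod": pod,
                              "value": restarts, "unit": "restarts"})
    return anomalies


# The first sample reuses the CPU value from the example response;
# the second is made-up data for illustration only.
found = detect_anomalies([
    {"pod": "node-exporter:9100", "cpu": 0.2482235237555631,
     "memory": 120_000_000, "restarts": 0},
    {"pod": "example-pod", "cpu": 0.05, "memory": 750_000_000, "restarts": 7},
])
for anomaly in found:
    print(anomaly)
```

Static thresholds are easy to reason about but do not adapt to workload patterns, so they are best read as a baseline rather than a complete detection strategy.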
PodSage AI automatically falls back to node-level metrics when container-level Kubernetes metrics are unavailable.
This ensures monitoring continuity even in partially configured environments.
Example fallback query:
1 - avg(rate(node_cpu_seconds_total{mode="idle"}[1m]))
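The fallback behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the container-level query is a guess based on the standard cAdvisor metric `container_cpu_usage_seconds_total`, an empty result is taken to mean "unavailable", and `query_cpu_with_fallback` and `run_query` are hypothetical names. Only the node-level query is taken from this README.

```python
from typing import Callable

# Assumed container-level query; the project's actual query is not shown here.
CONTAINER_CPU_QUERY = "sum(rate(container_cpu_usage_seconds_total[1m])) by (pod)"
# Node-level fallback query from the README.
NODE_CPU_FALLBACK_QUERY = '1 - avg(rate(node_cpu_seconds_total{mode="idle"}[1m]))'


def query_cpu_with_fallback(run_query: Callable[[str], list]) -> tuple[str, list]:
    """Try the container-level query first, then fall back to node level.

    `run_query` takes a PromQL string and returns the `result` list of an
    instant query. An empty list is treated as "metrics unavailable".
    """
    result = run_query(CONTAINER_CPU_QUERY)
    if result:
        return "container", result
    return "node", run_query(NODE_CPU_FALLBACK_QUERY)


def fake_prometheus(query: str) -> list:
    """Stand-in for a cluster that exposes node metrics but not cAdvisor."""
    if "container_cpu_usage_seconds_total" in query:
        return []
    return [{"metric": {}, "value": [1778683850.411, "0.2482235237555631"]}]


source, result = query_cpu_with_fallback(fake_prometheus)
print(source)  # node
```

In a real deployment, `run_query` could wrap a GET request to the Prometheus HTTP API endpoint `/api/v1/query`; passing it in as a parameter keeps the fallback logic testable without a live Prometheus server.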
PodSage AI was developed as part of the ABB Accelerator 2026 innovation challenge focused on:
Contributions are welcome.
git checkout -b feature/my-feature
git commit -m "Add new feature"
git push origin feature/my-feature
MIT License © 2026 PodSage AI
Version: v0.1.4-alpha
Status: Active Development
PodSage AI aims to evolve into a next-generation autonomous infrastructure intelligence platform capable of understanding, predicting, and optimizing Kubernetes environments in real time.
Future versions aim to transition from observability into fully autonomous operational intelligence.
Built with ❤️ for cloud-native infrastructure intelligence