AGENT MADNESS

NiData

A multi-agent solution that governs and automates the data software delivery lifecycle, from requirements gathering to first delivery.

Builder: Cathy Kiriakos
Build Type: Agent Team
Lifecycle: Working prototype
Consensus Score: 82.1
Region: REGION 3
Categories: Automation / Workflow
Go Deeper
A complete AI platform that captures every question, validates every answer, and carries the work through to first delivery. Delivery happens in four phases (a minimal intake sketch follows this list):

1) Guided intake: a Streamlit wizard collects source systems, requirements, SLAs, stakeholders, DQ rules, and PII strategy, all stored in Databricks Unity Catalog or Postgres with a full audit trail.
2) Validation: AI agents assess completeness, flag gaps, and engage stakeholders before a single line of DDL is written.
3) Architecture: automated design grounded in the captured requirements, covering source-to-target mappings, medallion layers, and partition strategies.
4) Generation: an 8-agent pipeline produces DDL scripts, pytest suites, SDLC artifacts, and documentation, yielding a first delivery that is ready for iteration.
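As a concrete illustration of phase 1, here is a minimal sketch of one intake step: a Streamlit form that writes a requirements record plus an audit-trail row to Postgres in a single transaction. The table names, columns, and connection string are hypothetical; NiData's real wizard spans 9 steps and many more fields.

```python
import streamlit as st
import psycopg2

def save_intake(conn, source_system: str, sla_hours: int, pii_strategy: str) -> None:
    """Persist one intake record and an audit-trail entry in a single transaction."""
    with conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO intake_requirements (source_system, sla_hours, pii_strategy) "
            "VALUES (%s, %s, %s) RETURNING id",
            (source_system, sla_hours, pii_strategy),
        )
        (intake_id,) = cur.fetchone()
        cur.execute(
            "INSERT INTO audit_log (entity, entity_id, action) "
            "VALUES ('intake', %s, 'created')",
            (intake_id,),
        )

with st.form("intake"):
    source_system = st.text_input("Source system")
    sla_hours = st.number_input("Delivery SLA (hours)", min_value=1, value=24)
    pii_strategy = st.selectbox("PII strategy", ["mask", "tokenize", "exclude"])
    if st.form_submit_button("Save"):
        conn = psycopg2.connect("dbname=nidata")  # assumed DSN, for illustration
        save_intake(conn, source_system, int(sla_hours), pii_strategy)
        st.success("Requirements captured with audit trail")
```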
Stack Used
NiData is built on a dual-deployment architecture (Databricks and Docker) that shares a core foundation. Here is the full tech stack broken down by component:

**Core Languages & Frontend**
* **Python (3.8+):** The primary language across the platform, driving everything from agent orchestration to the knowledge graph.
* **SQL:** Used for both Unity Catalog DDL (Delta Lake) and standard PostgreSQL schemas.
* **Streamlit:** Powers the main 9-step wizard web UI, the artifact viewer, and the admin panel for reference data management.

**AI Models & Orchestration Engine**
* **LLMs:** The default model is Llama 3.3 70B (hosted via Databricks Model Serving). The platform can also be configured to use OpenAI GPT-4 and Anthropic Claude via their APIs.
* **Agent Orchestration:** A Python-based, config-driven engine using LangGraph-style orchestration to manage the 8-agent legacy pipeline and the 7-agent sequential delivery pipeline (a minimal sketch follows this list).
* **Job Orchestration:** Runs on Databricks Workflows or Apache Airflow.

**Deployment Option A: Databricks (Enterprise Cloud-Native)**
* **Storage & Governance:** Unity Catalog (26-table schema) and Delta Lake.
* **Compute:** Databricks Runtime / Spark.
* **Model Registry & Feature Store:** MLflow on Databricks and Databricks Feature Store.
* **Security:** OIDC-managed authentication and Databricks secrets.

**Deployment Option B: Docker (Standalone / Air-Gapped)**
* **Containerization:** Multi-service Docker Compose orchestration.
* **Database:** PostgreSQL (for platform-agnostic storage) connected via `psycopg2`.
* **Infrastructure as Code (IaC):** Terraform provisions cost-optimized GCP spot instances.
* **Web Server:** Nginx acts as a reverse proxy for production profiles.

**Business Intelligence (BI) Integration**
* **Tableau Parsers & Connectors:** Parse TWB/TWBX files (see the parsing sketch below), integrate via the Tableau Server REST API, and connect directly to the internal Tableau PostgreSQL repository on port 8060.
* **Power BI Parsers & Connectors:** Extract DAX measures and connect to the Power BI XMLA endpoint and Azure SQL.

**Knowledge Graph & Context Tools**
* **Business-Domain Graph:** Custom Python implementation (`agent_knowledge_graph.py` and `graph_query_layer.py`) with parallel internal indexing for lineage tracking and multi-hop impact analysis (sketched below).
* **Code-Structure Graph:** `CodeGraphContext` (cgc) and `Kuzu` map function calls, class hierarchies, and module dependencies in the developer tooling.

**CI/CD, Testing & Integrations**
* **CI/CD:** GitHub Actions.
* **Testing:** `pytest` is used both to test the platform itself and to generate automated data-quality and Gold-layer reconciliation tests as artifacts of the agent pipeline (an example appears below).
* **Notifications:** Microsoft Teams webhooks for automated stakeholder sign-offs and notifications.
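To make the orchestration model concrete, here is a minimal, hypothetical sketch of a config-driven sequential agent pipeline in the LangGraph style: each agent reads and amends a shared state dict, and the run order comes from configuration rather than code. The agent names and state keys are illustrative, not NiData's actual implementation.

```python
from typing import Callable, Dict, List

State = dict
Agent = Callable[[State], State]

def requirements_agent(state: State) -> State:
    # Gate: downstream agents only run against validated requirements.
    state["requirements_ok"] = bool(state.get("requirements"))
    return state

def ddl_agent(state: State) -> State:
    # Emit a DDL artifact derived from the captured requirements.
    if state.get("requirements_ok"):
        table = state["requirements"]["table"]
        state["ddl"] = f"CREATE TABLE {table} (order_id BIGINT, amount DECIMAL(18, 2));"
    return state

AGENT_REGISTRY: Dict[str, Agent] = {
    "requirements": requirements_agent,
    "ddl": ddl_agent,
}

def run_pipeline(order: List[str], state: State) -> State:
    """Run agents sequentially in the configured order, threading shared state."""
    for name in order:
        state = AGENT_REGISTRY[name](state)
    return state

result = run_pipeline(
    ["requirements", "ddl"],  # config-driven order; the real pipelines chain 7-8 agents
    {"requirements": {"table": "bronze.orders"}},
)
print(result["ddl"])
```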
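On the Tableau side, a .twbx file is a zip archive containing a .twb, which is plain XML, so a parser can be sketched with the standard library alone. The element and attribute names below (`datasource`, `column`, `caption`) reflect common TWB structure but should be treated as assumptions rather than a stable schema:

```python
import zipfile
import xml.etree.ElementTree as ET
from typing import List, Tuple

def load_twb_xml(path: str) -> ET.Element:
    # A .twbx is a zip wrapper; pull the embedded .twb and parse it as XML.
    if path.endswith(".twbx"):
        with zipfile.ZipFile(path) as zf:
            twb_name = next(n for n in zf.namelist() if n.endswith(".twb"))
            return ET.fromstring(zf.read(twb_name))
    return ET.parse(path).getroot()

def list_columns(root: ET.Element) -> List[Tuple[str, str]]:
    """Return (datasource, column) pairs for lineage capture."""
    out = []
    for ds in root.iter("datasource"):
        ds_name = ds.get("caption") or ds.get("name", "?")
        for col in ds.iter("column"):
            out.append((ds_name, col.get("name", "?")))
    return out

columns = list_columns(load_twb_xml("dashboard.twbx"))  # illustrative file name
```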
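For the knowledge graph, multi-hop impact analysis reduces to a bounded graph traversal. Here is a minimal sketch in the spirit of `agent_knowledge_graph.py` / `graph_query_layer.py`; the adjacency structure and node names are illustrative assumptions:

```python
from collections import deque
from typing import Dict, List, Set

# Illustrative downstream-lineage edges: bronze -> silver -> gold -> BI.
DOWNSTREAM: Dict[str, List[str]] = {
    "bronze.orders": ["silver.orders_clean"],
    "silver.orders_clean": ["gold.daily_revenue", "gold.customer_ltv"],
    "gold.daily_revenue": ["tableau.revenue_dashboard"],
}

def impacted(node: str, max_hops: int = 3) -> Set[str]:
    """Breadth-first walk: everything reachable within max_hops is impacted."""
    seen: Set[str] = set()
    queue = deque([(node, 0)])
    while queue:
        current, hops = queue.popleft()
        if hops == max_hops:
            continue
        for nxt in DOWNSTREAM.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return seen

# Changing bronze.orders impacts everything downstream, out to the dashboard.
print(impacted("bronze.orders"))
```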
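Finally, a hypothetical example of the kind of artifact the testing agents emit: a pytest check that a Gold-layer aggregate reconciles with its Silver source. The table names, columns, and connection string are illustrative:

```python
import psycopg2
import pytest

@pytest.fixture
def conn():
    # Assumed DSN; generated suites would inject the target environment.
    with psycopg2.connect("dbname=nidata") as c:
        yield c

def test_gold_revenue_reconciles_with_silver(conn):
    with conn.cursor() as cur:
        cur.execute("SELECT COALESCE(SUM(amount), 0) FROM silver.orders_clean")
        (silver_total,) = cur.fetchone()
        cur.execute("SELECT COALESCE(SUM(revenue), 0) FROM gold.daily_revenue")
        (gold_total,) = cur.fetchone()
    # Gold must match Silver exactly; approx tolerates numeric-type differences.
    assert gold_total == pytest.approx(silver_total)
```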