
AI-Assisted Development: ODS Spark Replatform with Agents

2025-05-14
Tags: AI, Spark, Migration, Agents, Databricks


Replatforming BT Panorama's Spark applications from legacy DSE Spark 2.4 to Databricks Cloud Runtime was a massive undertaking — 377 source files, thousands of test cases, and tight coupling to outdated dependencies. Here's how we used AI agents to dramatically accelerate the effort.

Project Overview

The migration was driven by DSE 6.9.0 approaching end-of-life, with the codebase tightly coupled to unsupported dependencies and outdated APIs. Our modernization targets:

  • Databricks Cloud Runtime 17.3 LTS
  • Spark 3.5.2 (up from 2.4)
  • Scala 2.12.17
  • Replace legacy DSE drivers with cloud-native alternatives
  • Rebuild test infrastructure on local Cassandra with dual cluster connectivity
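As a rough illustration, the version targets above could be pinned in an sbt build along these lines. This is a sketch only: the project name is hypothetical, and the actual build layout, connector dependencies, and packaging setup are not described in this post.

```scala
// build.sbt (sketch). Versions are the targets listed above; everything else is assumed.
ThisBuild / scalaVersion := "2.12.17"

lazy val root = (project in file("."))
  .settings(
    name := "ods-spark-replatform", // hypothetical project name
    libraryDependencies ++= Seq(
      // Provided scope: the cluster runtime supplies Spark, so it stays out of the packaged jar.
      "org.apache.spark" %% "spark-sql" % "3.5.2" % Provided
    )
  )
```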

Benefits

The replatform brings an evergreen platform, scaling flexibility, AI-ready data pipelines, and a significant reduction in total cost of ownership (TCO).

Codebase Statistics

| Type | Count |
|---|---|
| Batch Jobs | 37 |
| Delta Jobs | 79 |
| Migration Jobs | 7 |
| Schedule Jobs | 24 |
| Test Cases | 2,982 |
| Entity/Case Classes | 273 |
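To put these counts in perspective, the short calculation below totals the four job types and derives an average number of test cases per job. The average is only indicative: the post does not say how tests are distributed across jobs, and the object name is illustrative.

```scala
object CodebaseStats {
  // Counts copied from the table above.
  val jobsByType: Map[String, Int] = Map(
    "Batch"     -> 37,
    "Delta"     -> 79,
    "Migration" -> 7,
    "Schedule"  -> 24
  )
  val testCases: Int = 2982

  // Total number of jobs across all four types.
  val totalJobs: Int = jobsByType.values.sum

  // Average only; some tests may cover entities or shared utilities rather than jobs.
  val testsPerJob: Double = testCases.toDouble / totalJobs

  def main(args: Array[String]): Unit = {
    println(s"total jobs: $totalJobs")
    println(f"average tests per job: $testsPerJob%.1f")
  }
}
```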

The AI Agent Framework

Rather than using a single generalist AI, we built a multi-agent system with specialized roles coordinated by a master orchestrator:

[Figure: Kiro Agent Framework]

Orchestrator Agent — Coordinates phases and tracks state across the pipeline

Phase 1: Code Writer Agent

  • Analyzes existing job patterns
  • Generates migrated code following coding standards
  • Uses spec & design docs as context

Phase 2: Testing Agent

  • Creates comprehensive test suites
  • Runs tests against local Cassandra
  • Validates dual-write phases (local → dual → cloud-only)
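The dual-write progression (local → dual → cloud-only) can be modelled as a small state type that decides which clusters receive writes. The sketch below is an assumption about how such a rollout might be expressed, not the project's actual test harness; all type and value names are hypothetical.

```scala
object WritePhases {
  // The three rollout phases named in the list above.
  sealed trait WritePhase
  case object LocalOnly extends WritePhase
  case object DualWrite extends WritePhase
  case object CloudOnly extends WritePhase

  // Write destinations; names are illustrative.
  sealed trait Target
  case object LocalCassandra extends Target
  case object CloudCluster extends Target

  // Which clusters a job writes to in each phase.
  def targets(phase: WritePhase): Set[Target] = phase match {
    case LocalOnly => Set(LocalCassandra)
    case DualWrite => Set(LocalCassandra, CloudCluster)
    case CloudOnly => Set(CloudCluster)
  }

  val rollout: List[WritePhase] = List(LocalOnly, DualWrite, CloudOnly)

  // Sanity check: each step shares at least one target with the next,
  // so there is never a hard cutover between two disjoint clusters.
  def isGradual(steps: List[WritePhase]): Boolean =
    steps.zip(steps.drop(1)).forall { case (a, b) => (targets(a) & targets(b)).nonEmpty }

  def main(args: Array[String]): Unit =
    rollout.foreach(p => println(s"$p -> ${targets(p).mkString(", ")}"))
}
```

A test suite could then run the same assertions once per phase, checking that rows land only in the clusters returned by `targets`.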

Phase 3: Code Review & PR Agent

  • Reviews code quality and compatibility
  • Checks for deprecated APIs and JDK 17 compliance
  • Commits to feature branch and creates PR on Bitbucket
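One way to picture the orchestrator is as a fold over an ordered list of phases that threads shared state through and stops at the first failure. The sketch below is a minimal model under that assumption; the real agents call an LLM and external tools, which are replaced here by stub functions, and every name is hypothetical.

```scala
object OrchestratorSketch {
  // State tracked across the pipeline; field names are illustrative.
  final case class PipelineState(jobName: String, completed: Vector[String], log: Vector[String])

  // A phase either advances the state (Right) or fails with a reason (Left).
  final case class Phase(name: String, run: PipelineState => Either[String, PipelineState])

  // Runs phases in order and short-circuits on the first failure,
  // so a failed test phase never reaches the review/PR phase.
  def orchestrate(phases: Seq[Phase], initial: PipelineState): Either[String, PipelineState] =
    phases.foldLeft[Either[String, PipelineState]](Right(initial)) { (acc, phase) =>
      acc.flatMap { state =>
        phase.run(state) match {
          case Right(next)  => Right(next.copy(completed = next.completed :+ phase.name))
          case Left(reason) => Left(s"${phase.name} failed: $reason")
        }
      }
    }

  // Stub phases standing in for the three agents.
  val codeWriter: Phase =
    Phase("code-writer", s => Right(s.copy(log = s.log :+ s"generated ${s.jobName}.scala")))
  val testing: Phase =
    Phase("testing", s => Right(s.copy(log = s.log :+ "write phases verified")))
  val reviewAndPr: Phase =
    Phase("review-and-pr", s => Right(s.copy(log = s.log :+ "PR created")))

  val pipeline: Seq[Phase] = Seq(codeWriter, testing, reviewAndPr)

  def main(args: Array[String]): Unit = {
    val start = PipelineState("DemoCustomerJob", Vector.empty, Vector.empty)
    orchestrate(pipeline, start) match {
      case Right(done)  => println(s"completed: ${done.completed.mkString(" -> ")}")
      case Left(reason) => println(s"stopped: $reason")
    }
  }
}
```

Modelling failure as `Left` keeps the stop-on-failure rule in one place instead of scattering checks through each agent.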

Each agent is equipped with:

  • Skills — Tools & actions (code generation, test runner, SonarQube, Artifactory, Bamboo)
  • Steering — Rules & guidelines (coding standards, test patterns, PR standards)
  • Context — Knowledge & specs (design docs, implementation plans, review criteria)
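The three ingredients can be thought of as fields on a per-agent specification. The sketch below is illustrative only: the post lists the skills, steering rules, and context sources but not how they are assigned to each agent, so the assignment here is an assumption, as are all names.

```scala
object AgentConfigSketch {
  // One record per agent: what it may do, which rules it follows, what it knows.
  final case class AgentSpec(
    name: String,
    skills: Set[String],    // tools and actions the agent may invoke
    steering: List[String], // rules and guidelines
    context: List[String]   // knowledge and specs
  )

  // Assumed assignment of the items listed above to the three agents.
  val agents: List[AgentSpec] = List(
    AgentSpec("code-writer", Set("code generation"), List("coding standards"),
      List("design docs", "implementation plans")),
    AgentSpec("testing", Set("test runner"), List("test patterns"),
      List("implementation plans")),
    AgentSpec("code-review-pr", Set("SonarQube", "Artifactory", "Bamboo"), List("PR standards"),
      List("review criteria"))
  )

  // Guard: an agent may only invoke tools listed in its skills.
  def canUse(agent: AgentSpec, skill: String): Boolean = agent.skills.contains(skill)

  // Renders steering and context as a plain-text preamble for the agent's prompt.
  def preamble(agent: AgentSpec): String =
    (agent.steering.map(r => s"RULE: $r") ++ agent.context.map(c => s"CONTEXT: $c")).mkString("\n")

  def main(args: Array[String]): Unit =
    agents.foreach(a => println(s"${a.name}:\n${preamble(a)}\n"))
}
```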

Development Efficiency

[Figure: Development Efficiency Analysis]

Demo: Full Dev Cycle in 20 Minutes

[Figure: Demo Flow]

A single prompt — *"please refer to AccountJob and write a demo job to auto a full dev cycle"* — triggered the entire pipeline:

  • Code Writer Agent (~4 min) — Analyzed AccountJob pattern, created DemoCustomerJob.scala with transform(), transformWithJoin(), and transformWithLeftJoin() methods

  • Testing Agent (~9.5 min) — Created DemoCustomerJobTest.scala with 3 phases verified:
      - Phase 1: Local write ✅
      - Phase 2: Dual write ✅
      - Phase 3: Astra only ✅

  • Code Review & PR Agent (~6 min) — Reviewed code quality, confirmed no deprecated APIs, verified JDK 17 compatibility, committed to feature branch, pushed, and created PR on Bitbucket
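The generated DemoCustomerJob.scala is not reproduced in this post. To give a feel for what the three method names suggest, here is a hypothetical sketch that uses plain Scala collections instead of Spark DataFrames so it runs without a cluster. The record types, fields, and method signatures are all assumptions; only the method names come from the demo.

```scala
object DemoCustomerJobSketch {
  // Hypothetical record types; the real job's schema is not shown in the post.
  final case class Customer(id: Int, name: String)
  final case class Account(customerId: Int, balanceCents: Long)
  final case class CustomerView(id: Int, name: String, balanceCents: Option[Long])

  // Single-source transform: normalise the customer name.
  def transform(customers: Seq[Customer]): Seq[Customer] =
    customers.map(c => c.copy(name = c.name.trim.toUpperCase))

  // Lookup table; assumes at most one account per customer.
  private def balances(accounts: Seq[Account]): Map[Int, Long] =
    accounts.map(a => a.customerId -> a.balanceCents).toMap

  // Inner join: customers without an account are dropped.
  def transformWithJoin(customers: Seq[Customer], accounts: Seq[Account]): Seq[CustomerView] = {
    val byCustomer = balances(accounts)
    customers.flatMap { c =>
      byCustomer.get(c.id).toList.map(b => CustomerView(c.id, c.name, Some(b)))
    }
  }

  // Left join: every customer is kept; a missing account becomes None.
  def transformWithLeftJoin(customers: Seq[Customer], accounts: Seq[Account]): Seq[CustomerView] = {
    val byCustomer = balances(accounts)
    customers.map(c => CustomerView(c.id, c.name, byCustomer.get(c.id)))
  }

  def main(args: Array[String]): Unit = {
    val customers = Seq(Customer(1, " ann "), Customer(2, "bob"))
    val accounts  = Seq(Account(1, 1000L))
    println(transform(customers))
    println(transformWithJoin(customers, accounts))
    println(transformWithLeftJoin(customers, accounts))
  }
}
```

The difference between the two join variants is the kind of behaviour a generated test suite would need to pin down: the inner join silently drops unmatched rows, while the left join keeps them.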

Result: Feature branch → Tests passing (3/3) → PR created | Total: ~20 minutes

Lessons Learned

  • Spec-driven beats vibe coding — Requirements.md + design.md produces more consistent, reproducible results than ad-hoc prompting

  • Specialized agents outperform generalists — Dedicated code-writer, testing, and code-review agents each excel at their domain

  • Steering docs are critical — Code guidelines, test patterns, and context docs reduce hallucination and keep output on-track

  • Context window management matters — Large codebases (377 files) require filtered compilation and targeted context injection
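Targeted context injection can be approximated as a budgeted selection problem: rank candidate files by relevance to the task and include them until a token budget is exhausted. The sketch below is a simple greedy version under that assumption. The relevance scores, token estimates, budget, and most file names are hypothetical, and the post does not describe how the actual selection was done.

```scala
object ContextSelection {
  // A candidate file with an estimated token cost and a relevance score for the current task.
  final case class SourceFile(path: String, tokens: Int, relevance: Double)

  // Greedy selection: most relevant first, skipping any file that would exceed the budget.
  def select(files: Seq[SourceFile], budget: Int): Vector[SourceFile] = {
    val ranked = files.sortBy(f => -f.relevance)
    val (picked, _) = ranked.foldLeft((Vector.empty[SourceFile], 0)) {
      case ((acc, used), f) =>
        if (used + f.tokens <= budget) (acc :+ f, used + f.tokens)
        else (acc, used)
    }
    picked
  }

  def main(args: Array[String]): Unit = {
    val candidates = Seq(
      SourceFile("UnrelatedBatchJob.scala", 5000, 0.1), // hypothetical file
      SourceFile("AccountJob.scala", 1200, 0.9),
      SourceFile("CustomerEntity.scala", 400, 0.7)      // hypothetical file
    )
    select(candidates, budget = 2000).foreach(f => println(f.path))
  }
}
```

Greedy selection is not optimal in general (it is a knapsack problem), but it is predictable and easy to audit, which matters when debugging why an agent did or did not see a given file.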

Conclusion

AI-assisted development isn't about replacing engineers — it's about amplifying them. By combining specialized agents with well-defined specs and steering documents, we turned a multi-quarter migration into a streamlined, repeatable process where a full development cycle takes minutes instead of hours.