Runbooks for agentic coding · 2026

Your Gemini ships production code through Flownex.

Shared runbooks for engineering teams using AI agents — whatever your team ships. Flownex maps the tools and tech you pick — your agent, your build system, your distribution targets — into runbook nodes. Every engineer runs the same bounded workflow through the agent they already use. Human checkpoints, audit trail, team dashboard.

Built for every agentic-coding startup
Free up to 3 engineers · Works with any MCP host · No credit card
Gemini Agent · MCP host
flownex
run the android bug fix flow for KUR-42
flownex.start_flow_unit(ticket="KUR-42")
created wu-3df2… · scope: 4 files in app/src/main
flownex.request_checkpoint(phase="PLANNING")
awaiting approval from @tech-lead · open in flownex
approved · proceeding
generating diff via Gemini 2.x…
2 files · +8 / -4 lines · null guard in CrashReporter
flownex.record_build_result(exit_code=0)
./gradlew assembleDebug testDebugUnitTest · BUILD SUCCESSFUL
flownex.create_pr()
PR #173 opened · labeled flownex-generated
live · 1 of 13 MCP tools · 3 phases · 1 checkpoint · 0 retries · KUR-42 · wu-3df2
2026 · the unfinished seam

What we're fixing in 2026.

Every agentic-coding startup shipped an agent. Nobody shipped the runbook that tells five different engineers' agents how to do the same thing the same way. This is the seam. We live here.

01
Broken in 2025
Your best engineer is 10× faster with Cursor. The other four are guessing. Nobody ships the same way twice.
We fix it in 2026
One team-defined runbook. Every engineer's agent runs the same phases, the same checks, the same PR template.
02
Broken in 2025
Autonomous agents race to main. Regulated teams can't ship that way. Compliance can't prove who approved what.
We fix it in 2026
Human checkpoints at phase boundaries you define. Hash-chained audit log. "Who approved ticket X phase Y" answered in one query.
03
Broken in 2025
Starter templates were either web-only (Cursor / Copilot) or mobile afterthoughts. Backend and cross-platform teams got nothing.
We fix it in 2026
Starter runbooks per stack surface on signup — whether your team ships mobile, web, backend, or cross-platform. The canvas palette grows as you wire more tools. Gradle, xcodebuild, Metro, pnpm, cargo, Terraform, dbt — whatever you use becomes a node.
04
Broken in 2025
Pick one agent, buy one subscription per seat, lock the whole team in. Migrate later = pain.
We fix it in 2026
MCP-native. Gemini, Claude, Cursor, Codex, Copilot — whichever each engineer prefers. One Flownex key, every host, zero lock-in.
For every agentic-coding startup — the workflow, audit, and team layer you didn't have time to build.
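For readers curious how a hash-chained audit log can answer "who approved ticket X phase Y", here is a minimal sketch. It is an illustration only, not Flownex's implementation: the entry fields, the SHA-256 choice, and every function name below are assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry (assumption)


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + body).hexdigest()


def append_entry(log, payload):
    """Append a payload, linking it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "payload": payload,
                "hash": entry_hash(prev_hash, payload)})


def verify_chain(log):
    """Return True only if every entry still matches its recorded hash."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


def who_approved(log, ticket, phase):
    """Answer 'who approved ticket X phase Y' with a single scan of the log."""
    return [e["payload"]["approver"] for e in log
            if e["payload"].get("ticket") == ticket
            and e["payload"].get("phase") == phase
            and e["payload"].get("action") == "approved"]


# Hypothetical entries modeled on the KUR-42 demo above.
audit_log = []
append_entry(audit_log, {"ticket": "KUR-42", "phase": "PLANNING",
                         "action": "approved", "approver": "@tech-lead"})
append_entry(audit_log, {"ticket": "KUR-42", "phase": "REVIEWING",
                         "action": "approved", "approver": "@tech-lead"})
```

Because each hash covers the previous one, editing any earlier entry makes verify_chain fail from that point on. That property is what makes the log useful as evidence.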
How it works

7 phases. One shared runbook.

Every flow unit moves through a template your tech lead defined once — from ticket import to app-store distribution. Each phase has configurable nodes, agent actions, and human checkpoints. Same runbook runs from any MCP host.

Phase 1
Scope
Import a ticket from any connected source — Jira, Linear, GitHub Issues, Sentry, Crashlytics, or manual entry. Flownex detects your stack and tools and returns a scope manifest of editable + read-only paths for the agent.
Jira Import · Linear Import · Framework Detect
01
Ticket Source
Jira · Linear · Notion · Manual
TEAM-417
Add rate-limit retry to the session handler
bug · p1
Stack detected · scope: 4 editable · 12 read-only
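A scope manifest of editable and read-only paths could be consumed by an agent roughly like this. The manifest shape, the key names, and the classify helper are hypothetical; the page does not specify the real format.

```python
from pathlib import PurePosixPath

# Hypothetical manifest: "app/src/main" comes from the demo above,
# the read-only entries are made up for illustration.
manifest = {
    "editable": ["app/src/main"],
    "read_only": ["app/build.gradle", "docs"],
}


def _is_under(path, root):
    """True if path equals root or sits somewhere beneath it."""
    p, r = PurePosixPath(path), PurePosixPath(root)
    return p == r or r in p.parents


def classify(path, manifest):
    """Return 'editable', 'read_only', or 'out_of_scope' for a repo path."""
    if any(_is_under(path, root) for root in manifest["editable"]):
        return "editable"
    if any(_is_under(path, root) for root in manifest["read_only"]):
        return "read_only"
    return "out_of_scope"
```

An agent would call classify before each write and refuse anything that is not editable. Matching on whole path components keeps a sibling such as app/src/mainline out of scope.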
Phase 2
Planning
Your agent decomposes the ticket into a step-by-step plan using its native LLM. If a Figma URL is attached, Flownex returns design tokens via get_figma_frame. A checkpoint pauses the flow for human approval before any code is written.
Agent plans · Figma tokens · Checkpoint
02
Agent · Planning phase
14:23:01 [agent] Decomposing TEAM-417 via Claude…
14:23:04 [agent] Generated 6-step plan:
1. Add retry policy to SessionClient
2. Wire exponential backoff (500ms · 3 tries)
3. Handle 429 + Retry-After header
4. Unit tests for retry matrix
5. Update error-boundary docs
6. Bump minor version
14:23:05 [flownex.request_checkpoint] Waiting for tech lead…
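The pause-for-approval behaviour shown above can be pictured as a small state machine. This is a sketch under assumptions: the FlowUnit class, its fields, and the IMPLEMENTING phase name are invented for illustration and only mimic the request_checkpoint call in the log.

```python
from dataclasses import dataclass, field


class CheckpointPending(Exception):
    """Raised when a flow tries to advance past an unapproved checkpoint."""


@dataclass
class FlowUnit:
    ticket: str
    phase: str = "PLANNING"
    awaiting_approval: bool = False
    approvals: list = field(default_factory=list)

    def request_checkpoint(self):
        """Pause the flow until a human approves the current phase."""
        self.awaiting_approval = True

    def approve(self, approver):
        """Record who approved which phase, then release the pause."""
        self.approvals.append((self.phase, approver))
        self.awaiting_approval = False

    def advance(self, next_phase):
        """Move to the next phase, refusing while approval is pending."""
        if self.awaiting_approval:
            raise CheckpointPending(f"{self.ticket}: {self.phase} needs approval")
        self.phase = next_phase


unit = FlowUnit(ticket="TEAM-417")
unit.request_checkpoint()  # the flow now waits for a human
```

Calling unit.advance("IMPLEMENTING") at this point raises CheckpointPending. After unit.approve("@tech-lead") the same call succeeds.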
Phase 3
Designing
Pull design tokens and component references from Figma so your agent has them in context before it writes UI code. Output format follows whatever your stack uses — Compose, SwiftUI, React, Vue, plain CSS. Optional phase — skip if the ticket has no design work.
Figma tokens · Component refs · Checkpoint
03
Design · get_figma_frame
14:23:41 [flownex] fetching frame 12:847…
tokens: 18 colours · 6 spacings · 4 type ramps
components: 3 matched · 1 unmatched
formats: css-vars / swift-ui / compose / tw-config
14:23:44 [agent] context received · ready to author
host LLM writes UI in the format your stack uses
Phase 4
Implementing
Your agent generates the diff via its own LLM. Flownex returns the scope manifest (editable + read-only paths) — the agent writes files locally through its filesystem MCP, then reports diff stats back via record_generated_diff. Code never touches our servers.
Agent writes · Git branch · Diff stats
04
Agent diff · record_generated_diff
Branch: flownex/wu-team-417-1745368...
record_generated_diff
  files_changed: 4
  additions: +147
  deletions: -23
  summary: "added exponential-backoff retry"
diff stays local · Flownex holds stats only
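One way an agent could produce stats like these without sending any code is to summarize the output of `git diff --numstat`, which prints added lines, deleted lines, and path separated by tabs. The diff_stats helper and the sample file names are hypothetical, and the dictionary keys simply mirror the panel above.

```python
def diff_stats(numstat_text):
    """Summarize `git diff --numstat` output into counts only."""
    files_changed = additions = deletions = 0
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        added, deleted, _path = parts
        files_changed += 1
        # Binary files report "-" for both counts; count the file only.
        if added.isdigit():
            additions += int(added)
        if deleted.isdigit():
            deletions += int(deleted)
    return {"files_changed": files_changed,
            "additions": additions,
            "deletions": deletions}


# Made-up numstat output chosen to match the totals in the panel above.
sample = (
    "100\t20\tsrc/SessionClient.kt\n"
    "40\t3\tsrc/SessionClientTest.kt\n"
    "7\t0\tdocs/error-boundary.md\n"
    "-\t-\tassets/logo.png\n"
)
stats = diff_stats(sample)
```

Only the three counts leave the machine. The paths and file contents are read locally and dropped.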
Phase 5
Reviewing
The agent runs the build locally (Gradle, xcodebuild, npm, cargo, or whatever your stack needs) via its shell MCP and reports the outcome via record_build_result. If the build fails, the agent fixes the code and re-runs it. A checkpoint gates PR creation.
Local build · Agent fixes · Create PR · Checkpoint
05
Review · record_build_result
14:24:41 [shell] <your build command>
pnpm test · pytest · go test · cargo test · ./gradlew · xcodebuild · mvn …
14:25:12 [shell] exit=0 · 54 tests passed
14:25:13 [flownex] record_build_result(exit=0, tail=…)
14:25:28 [flownex.create_pr] PR #142 opened
14:25:30 [Checkpoint] Awaiting merge approval…
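The exit code and log tail reported in the panel above could be captured locally along these lines. The run_build helper, its tail_lines parameter, and the returned keys are assumptions for illustration; the stand-in command just prints a line so the sketch runs anywhere.

```python
import subprocess
import sys


def run_build(command, tail_lines=20):
    """Run a build command locally and keep only its exit code and log tail."""
    proc = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr so the tail shows failures
        text=True,
    )
    lines = proc.stdout.splitlines()
    return {"exit_code": proc.returncode,
            "tail": "\n".join(lines[-tail_lines:])}


# Stand-in for a real build command such as ./gradlew or pnpm test.
result = run_build([sys.executable, "-c", "print('BUILD SUCCESSFUL')"])
failed = run_build([sys.executable, "-c", "import sys; sys.exit(3)"])
```

A non-zero exit_code is the signal for the agent to fix and re-run before the flow reaches the PR checkpoint.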
Phase 6 -- CI
Testing
Run whatever test and static-analysis commands your runbook defines: unit, integration, E2E, lint, type-check, coverage. The agent runs them locally and reports pass/fail counts via record_test_result. CI stays separate from CD, so tests must be green before distribution begins.
Unit, Integration, E2E · Lint + type-check · Coverage gate
06
Test Results
Test Suite PASSED
Unit: 47/47
Integration: 12/12
E2E: 8/8
Lint + type-check: 0 issues
Coverage: 84% (+3.2% vs main)
Phase 7 -- CD
Distribution
Distribution nodes plug into whatever your team ships to: CI providers (GitHub Actions, CircleCI, Bitrise, GitLab CI), app stores (TestFlight, Play, Firebase App Distribution), web deploy targets (Vercel, Netlify, Cloudflare Pages), container registries (ECR, GHCR), package registries (npm, crates.io, PyPI), or custom HTTP. Notify via Slack, Teams, Discord, or email.
Trigger CI · Upload artifact · Notify
07
Distribution Pipeline
14:30:01 [CI] trigger · <your CI provider>
14:33:45 [CI] artifact ready
14:33:46 [upload] <your distribution target>
TestFlight · Play · Firebase AD · Vercel · npm · Docker · …
14:34:02 [upload] distributed
14:34:03 [Slack] #releases notified
TEAM-417: DONE — ticket to production
7 Workflow phases
13 MCP tools
ANY Agent host · any LLM
5 Team roles
0 Lines of code sent to Flownex
Platform

Everything your team needs

The workflow, audit, and team layer above whatever agent your team uses. Not another AI code assistant — the scaffolding that makes AI coding reproducible across your team, whatever stack you ship.

Visual Canvas Designer
Design flow templates on a drag-and-drop canvas with 7 swim lanes. Add, remove, and reorder 20+ node types. Configure phases per project.
Kanban Board
Track every flow unit across 9 columns -- 7 phases plus Done and Failed. Switch between kanban and list views. Filter by assignee, template, or status.
AI Code Generation
Your agent (Gemini, Claude, Cursor, Copilot) does the LLM work. Flownex orchestrates the pipeline — scope, checkpoints, build verification, PR creation — so the same workflow runs the same way for every engineer on your team.
BYOK -- Your Keys, Your Code
Your data goes to the LLM provider you choose, using your key. We never see it. PII scanning runs on every request. Choose from 8 providers.
Team Reports
Velocity metrics, per-engineer performance, LLM cost breakdowns by provider and model. PMs see who is shipping. Leads see where to optimize.
Ticket Sources
Import from Jira, Linear, Notion, Sentry, Crashlytics -- or enter manually. Recent tickets pre-loaded. AI enhances manual descriptions.
Team

Role-based access for every team member

Five roles with granular permissions. Each person sees exactly what they need -- no more, no less. Module visibility is enforced across the entire platform.

PM
Read-only kanban
Velocity reports
Comment on flow units
Engineer
Create flow units
Own reports only
Assigned work
Senior Engineer
Create templates
Any flow unit
Approve checkpoints
Tech Lead
All templates
Team reports
Team management
Principal
Org-wide access
All reports
Full admin
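The five roles above could be modeled as permission sets. This sketch makes two loud assumptions: the permission names are invented, and the engineering roles are treated as a ladder where each rung inherits the ones below it (the page does not say whether that is how it works). PM is kept separate because it is described as read-only.

```python
# Hypothetical permission names derived from the role cards above.
ROLE_PERMISSIONS = {
    "PM": {"view_kanban", "view_velocity_reports", "comment"},
    "Engineer": {"create_flow_unit", "view_own_reports"},
    "Senior Engineer": {"create_template", "approve_checkpoint"},
    "Tech Lead": {"view_team_reports", "manage_team"},
    "Principal": {"view_all_reports", "admin"},
}

# Assumed inheritance order for the engineering roles.
LADDER = ["Engineer", "Senior Engineer", "Tech Lead", "Principal"]


def permissions_for(role):
    """Collect a role's permissions, inheriting from lower ladder rungs."""
    if role not in LADDER:
        return set(ROLE_PERMISSIONS[role])
    granted = set()
    for rung in LADDER[: LADDER.index(role) + 1]:
        granted |= ROLE_PERMISSIONS[rung]
    return granted


def can(role, permission):
    """True if the role holds the permission under the assumptions above."""
    return permission in permissions_for(role)
```

Under these assumptions, can("Tech Lead", "approve_checkpoint") holds while can("Engineer", "approve_checkpoint") does not, which matches the tech-lead approval in the demo at the top of the page.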

Write the runbook once.
Every engineer ships the same way.

Free forever up to 3 engineers. No credit card. Works with every major MCP host. Your agent, our workflow.

Start free · View pricing