# Authority you reclaim is authority you run

> Editorial lab · Status: published  
> Published by The CTO Advisor LLC · Layer2C Labs

**Question:** The pitch every cloud-exit deck makes: leave the managed platform and take control back. This lab tests it on one production application. The Virtual CTO Advisor, all-in on a single cloud, migrates to the box (the DGX Spark, retained compute kept below the platform’s abstraction) until the serve path runs with no cloud credentials in the environment. The question is not whether it can run locally. It is how much decision authority actually comes home, and what it costs to hold. The one-line loss: every layer you move from Ceded to Retained is a decision you now own and a system you now operate.

**Load:** The production advisory application, migrated as a clone: the document and vector store on CloudNativePG and pgvector, embeddings on a local model, generation on vLLM, identity on Keycloak, orchestration on k3s. Production was never touched. One workload, one box, one owner sitting next to the evidence.

## Executive Summary

The application moved to retained compute one layer at a time, and each move was recorded as a shift in decision authority. The document and vector store left Firestore for CloudNativePG and pgvector. Embeddings left a hosted API for a local model. Generation left Vertex for vLLM. Session state left Firestore for Postgres. Identity left Firebase for Keycloak. At the end the serve path ran end to end with no cloud credentials present.

Every step was validated by a probe that executed, not by a claim. The vector store survived a force-kill wipe-and-restore drill with all 3,489 chunks intact and a two-second recovery. The reasoning loop returned a grounded, cited answer over the ingress. Identity issued a real signed token and the application accepted it, refusing the request without one. The migration is the evidence, and that evidence is what a bounded Fourth Cloud assessment of Kubernetes was built from: the canon row bounded-kubernetes-fourthcloud, now live, scored on twenty-six functions earned by execution.

## DAPM Table — Authority Verdict

| Layer | Was | Now |
| --- | --- | --- |
| layer0 | Ceded | Ceded |
| layer1a | Ceded | Retained |
| layer1b | Ceded | Retained |
| layer2a | Ceded | Retained |
| layer2b | Ceded | Retained |
| layer2c | Retained | Retained |

## Detailed Writeup

The migration was executed as authority accounting, not as a lift-and-shift. Each managed dependency was replaced by a self-hosted equivalent and then proven by a probe. The store swap went behind the application’s existing retrieval provider seam, so the ranking logic validated on the cloud is the same code path on pgvector. The thread and session store had no such seam and had to be given one, which surfaced the real cost of retention: an eager cloud client in the import path blocked a cloud-free boot until the coupling was made lazy behind a store interface.
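The lazy coupling described above can be sketched as follows. This is a minimal illustration, not the application's code: `ThreadStore`, `InMemoryThreadStore`, `FirestoreThreadStore`, `make_thread_store`, and the `threads` collection name are all hypothetical, and the in-memory class stands in for the Postgres backend. The point it shows is that the cloud SDK import lives inside the cloud backend's constructor, so importing the module and booting against a retained backend never touches it.

```python
from typing import Protocol


class ThreadStore(Protocol):
    """Hypothetical store interface the application codes against."""

    def save(self, thread_id: str, messages: list[str]) -> None: ...

    def load(self, thread_id: str) -> list[str]: ...


class InMemoryThreadStore:
    """Stand-in for the retained backend (Postgres in the lab)."""

    def __init__(self) -> None:
        self._threads: dict[str, list[str]] = {}

    def save(self, thread_id: str, messages: list[str]) -> None:
        self._threads[thread_id] = list(messages)

    def load(self, thread_id: str) -> list[str]:
        return list(self._threads.get(thread_id, []))


class FirestoreThreadStore:
    """Cloud backend. The SDK import is deferred until construction."""

    def __init__(self) -> None:
        # Lazy import: loading this module no longer requires the cloud
        # SDK or credentials. Only choosing this backend does.
        from google.cloud import firestore

        self._client = firestore.Client()

    def save(self, thread_id: str, messages: list[str]) -> None:
        doc = self._client.collection("threads").document(thread_id)
        doc.set({"messages": messages})

    def load(self, thread_id: str) -> list[str]:
        snapshot = self._client.collection("threads").document(thread_id).get()
        return (snapshot.to_dict() or {}).get("messages", [])


def make_thread_store(backend: str) -> ThreadStore:
    """Select a backend by name; nothing cloud-side is imported unless asked."""
    if backend == "memory":
        return InMemoryThreadStore()
    if backend == "firestore":
        return FirestoreThreadStore()
    raise ValueError(f"unknown thread store backend: {backend!r}")
```

With a module-level `from google.cloud import firestore`, the import alone would pull the SDK into every boot. Moving it behind the interface makes the dependency a choice at construction time rather than a condition of starting the process.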

The kill-the-cloud test is the load-bearing proof. With no cloud credentials in the environment and the model cache offline, the application retrieved from pgvector, embedded locally, generated on vLLM, and answered in the owner’s voice. Identity was the last dependency to come home: the verify path swapped from Firebase to OpenID Connect against a Keycloak realm, and the application accepted a signed token and refused a request without one.
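A precondition check for a test like this can be sketched in a few lines. Everything here is an assumption for illustration: the function name `cloud_free_violations` is hypothetical, the variable list is a non-exhaustive sample of names that common cloud SDKs read, and the offline check assumes the model cache is a Hugging Face cache governed by `HF_HUB_OFFLINE`, which the lab does not state.

```python
from typing import Mapping

# Assumed, non-exhaustive sample of variables that cloud SDKs read.
CLOUD_CREDENTIAL_VARS = (
    "GOOGLE_APPLICATION_CREDENTIALS",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AZURE_CLIENT_SECRET",
)


def cloud_free_violations(env: Mapping[str, str]) -> list[str]:
    """Return the reasons this environment is not cloud-free (empty if clean)."""
    violations = [
        f"credential variable set: {name}"
        for name in CLOUD_CREDENTIAL_VARS
        if env.get(name)
    ]
    # Assumption: the model cache is a Hugging Face cache pinned offline.
    if env.get("HF_HUB_OFFLINE") != "1":
        violations.append("model cache not pinned offline")
    return violations
```

Calling `cloud_free_violations(os.environ)` before the serve-path probe and failing on a non-empty result keeps the claim honest: a passing run means the answer was produced with nothing cloud-side available to fall back on, at least for the variables checked.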

The single cession is the substrate. GPU access works through a runtime-class injection; GPU accounting does not work, because the driver cannot report unified memory. That gap is closeable in the open-source device plugin, so it is Retained authority left unbuilt by choice, not a vendor lock. Every other layer is owned outright, and every owned layer added an operational bill that the assessment records as a priced gap rather than a silent assumption.
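One way to picture the split between access and accounting is a pod manifest that names a runtime class but requests no GPU resource. The sketch below builds such a manifest as plain data. It is an assumption about what "runtime-class injection" looks like, not the lab's manifest: the function name, the `nvidia` runtime class name, and the image reference are hypothetical, and whether a container sees the GPU without a resource request depends on runtime configuration the source does not detail.

```python
import json


def gpu_pod_manifest(name: str, image: str, runtime_class: str = "nvidia") -> dict:
    """Build a Pod manifest that selects a container runtime class.

    No `nvidia.com/gpu` limit is set, so the scheduler does no GPU
    accounting for this pod (the gap the text describes).
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Hypothetical class name; it must match a RuntimeClass
            # object that exists in the cluster.
            "runtimeClassName": runtime_class,
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # Hypothetical image reference, for illustration only.
    print(json.dumps(gpu_pod_manifest("vllm", "example.local/vllm:dev"), indent=2))
```

Closing the accounting gap would mean a device plugin advertising a GPU resource and the pod requesting it under `resources.limits`, at which point the scheduler, not the operator, tracks who holds the device.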

## Method and Disclosure

Self-funded, no sponsor. The application was migrated as a clone and production was never touched. The serve path was proven with no cloud credentials in the environment.

What ships: the substrate, the authority movements, the operational bill per layer, and the raw lab detail. What stays proprietary: the corpus contents, the retrieval tuning, and the owner-authored assessment thresholds.

The quantitative record is the Fourth Cloud assessment bounded-kubernetes-fourthcloud, live in the canon at cloud.layer2c.com. This lab is the story of how a workload became the boundary that made that assessment possible.

---
*Layer2C Labs · The CTO Advisor LLC · labs.layer2c.com*
