Platform Engineering in Practice: Building Your Internal Developer Platform

Author: Abdulkader Safi

Position: Software Engineer

How long does it take a new developer to make their first commit at your organization? If the answer is measured in days or weeks rather than hours, you have a platform engineering problem.

As a Lead Engineer managing multiple projects across teams, I've witnessed firsthand how infrastructure complexity crushes developer productivity. Teams spend more time fighting Kubernetes configurations, waiting for cloud resources, and navigating approval workflows than actually writing code. Platform engineering changes this equation fundamentally.

In this guide, I'll walk you through building an Internal Developer Platform (IDP) that abstracts infrastructure complexity, enables developer self-service, and maintains the governance your organization requires, all while reducing cognitive load and accelerating delivery.


What is Platform Engineering?

Platform engineering is the discipline of designing and building self-service capabilities for software engineering organizations. It creates a foundation, the Internal Developer Platform, that enables developers to provision infrastructure, deploy applications, and manage services without deep infrastructure expertise or waiting on operations teams.

Platform Engineering vs DevOps

DevOps brought developers and operations closer together, but it also shifted significant operational burden onto development teams. The "you build it, you run it" philosophy, while valuable, created cognitive overload as developers needed to master Kubernetes, Terraform, CI/CD pipelines, observability stacks, and security policies alongside their actual application code.

Platform engineering addresses this by:

  • Abstracting complexity: Developers interact with simplified interfaces, not raw infrastructure
  • Encoding best practices: Security, compliance, and operational standards are built into the platform
  • Enabling self-service: No more waiting for tickets or approvals for standard operations
  • Reducing cognitive load: Developers focus on business logic, not infrastructure details

The Business Case

Commonly cited figures make the case:

  • Organizations with mature IDPs report 60%+ reduction in developer onboarding time
  • Self-service platforms reduce infrastructure provisioning from days to minutes
  • Golden paths decrease configuration errors by standardizing deployments
  • Platform teams typically serve 10-15 development teams effectively
  • Gartner predicts that 80% of large software engineering organizations will have established platform engineering teams by 2026

Core Principles of Platform Engineering

Before diving into implementation, internalize these principles that separate successful platforms from expensive failures.

1. Treat Your Platform as a Product

Your developers are your customers. The platform exists to make them productive, not to showcase infrastructure sophistication.

What this means in practice:

  • Conduct user research with your development teams
  • Prioritize features based on developer pain points
  • Measure adoption and satisfaction, not just technical metrics
  • Iterate based on feedback, not assumptions
  • Market the platform internally: if developers don't know it exists, they won't use it

2. Enable Self-Service Without Sacrificing Governance

The tension between developer autonomy and organizational control is at the heart of every IDP conversation. The solution isn't choosing one over the other; it's building guardrails that make the right thing easy and the wrong thing hard.

Self-service with guardrails:

  • Developers can provision resources instantly
  • All resources automatically comply with security policies
  • Cost controls are enforced by default
  • Audit trails are captured automatically
  • Non-compliant configurations are rejected before deployment

3. Start with Golden Paths, Not Golden Cages

Golden paths are opinionated, supported routes to production that encode your organization's best practices. They're recommendations, not requirements.

Golden path characteristics:

  • Cover 80%+ of common use cases
  • Significantly easier than DIY alternatives
  • Well-documented with clear benefits
  • Allow escape hatches for legitimate edge cases
  • Continuously improved based on usage patterns

4. Measure Developer Experience

You can't improve what you don't measure. Implement a Developer Experience Score (DXS) that tracks:

  • Time to first commit for new developers
  • Time from code commit to production deployment
  • Self-service success rate (requests completed without human intervention)
  • Developer satisfaction surveys (quarterly NPS)
  • Platform adoption rate across teams
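
Two of these metrics are easy to compute once the platform records the underlying events. Here's a minimal TypeScript sketch; the `PlatformRequest` and `Deployment` shapes are hypothetical, so map them onto whatever your tooling actually emits.

```typescript
// Hypothetical event shapes; adapt them to what your platform records.
interface PlatformRequest {
  id: string;
  neededHumanHelp: boolean; // true if someone had to step in manually
}

interface Deployment {
  committedAt: number; // epoch milliseconds of the commit
  deployedAt: number;  // epoch milliseconds when it reached production
}

// Share of requests completed without human intervention, as a percentage.
function selfServiceSuccessRate(requests: PlatformRequest[]): number {
  if (requests.length === 0) return 0;
  const unaided = requests.filter(r => !r.neededHumanHelp).length;
  return (unaided / requests.length) * 100;
}

// Median commit-to-production lead time in minutes.
function medianLeadTimeMinutes(deployments: Deployment[]): number {
  if (deployments.length === 0) return 0;
  const minutes = deployments
    .map(d => (d.deployedAt - d.committedAt) / 60000)
    .sort((a, b) => a - b);
  const mid = Math.floor(minutes.length / 2);
  return minutes.length % 2 === 1
    ? minutes[mid]
    : (minutes[mid - 1] + minutes[mid]) / 2;
}
```

Using the median rather than the mean keeps a single slow release from distorting the picture; that's a design choice, not a requirement.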

IDP Architecture: The Platform Layer Cake

A well-architected Internal Developer Platform consists of six interconnected layers, each serving a specific purpose.

┌─────────────────────────────────────────────────────────────┐
│                   DEVELOPER INTERFACE PLANE                 │
│ ┌─────────────┐  ┌─────────────┐  ┌─────────────┐           │
│ │  Developer  │  │    CLI      │  │  IDE        │           │
│ │  Portal     │  │  Tools      │  │  Plugins    │           │
│ │  (Backstage)│  │  (kubectl+) │  │  (VS Code)  │           │
│ └─────────────┘  └─────────────┘  └─────────────┘           │
├─────────────────────────────────────────────────────────────┤
│                    PLATFORM ORCHESTRATOR                    │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │  Score / Humanitec / Crossplane / Custom Orchestration  │ │
│ │  - Workload specifications    - Dynamic configuration   │ │
│ │  - Resource dependencies      - Environment management  │ │
│ └─────────────────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────────────────┤
│                      SECURITY PLANE                         │
│ ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐      │
│ │  Secrets │  │  Policy  │  │  RBAC    │  │  Supply  │      │
│ │  Mgmt    │  │  Engine  │  │  & IAM   │  │  Chain   │      │
│ │ (Vault)  │  │  (OPA)   │  │          │  │  Security│      │
│ └──────────┘  └──────────┘  └──────────┘  └──────────┘      │
├─────────────────────────────────────────────────────────────┤
│                    DELIVERY PLANE                           │
│ ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐      │
│ │  CI/CD   │  │  GitOps  │  │  Artifact│  │  Feature │      │
│ │ Pipelines│  │  (ArgoCD)│  │  Registry│  │  Flags   │      │
│ └──────────┘  └──────────┘  └──────────┘  └──────────┘      │
├─────────────────────────────────────────────────────────────┤
│                  INFRASTRUCTURE PLANE                       │
│ ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐      │
│ │Kubernetes│  │ Databases│  │ Message  │  │ Cloud    │      │
│ │ Clusters │  │ (RDS/etc)│  │ Queues   │  │ Services │      │
│ └──────────┘  └──────────┘  └──────────┘  └──────────┘      │
├─────────────────────────────────────────────────────────────┤
│                   OBSERVABILITY PLANE                       │
│ ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐      │
│ │  Metrics │  │  Logging │  │  Tracing │  │  Alerts  │      │
│ │(Promethe)│  │  (Loki)  │  │ (Jaeger) │  │(PagerDut)│      │
│ └──────────┘  └──────────┘  └──────────┘  └──────────┘      │
└─────────────────────────────────────────────────────────────┘

1. Developer Interface Plane

This is what developers actually interact with. The interface should meet developers where they are:

Developer Portal (Backstage): A single pane of glass for service catalogs, documentation, and self-service workflows. Backstage, originally from Spotify, has become a de facto standard for open-source developer portals.

CLI Tools: Command-line interfaces for developers who prefer terminal workflows. Wrap complex operations in simple commands.

IDE Plugins: Bring platform capabilities directly into VS Code or JetBrains IDEs. Reduce context switching.

2. Platform Orchestrator

The brain of your IDP. Platform orchestrators handle the complex logic of translating developer intent into infrastructure reality.

Key capabilities:

  • Workload specifications (what the developer wants)
  • Resource matching (what infrastructure to provision)
  • Dynamic configuration (environment-specific settings)
  • Dependency management (ordering and relationships)

Popular options:

  • Score: Open-source workload specification
  • Humanitec: Commercial platform orchestrator
  • Crossplane: Kubernetes-native infrastructure abstraction
  • Custom solutions: Built on Kubernetes operators
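
To make resource matching and dependency management concrete, here's a small TypeScript sketch of what an orchestrator does at its core: match each requested resource type against a catalog, then order provisioning so dependencies come first. The `WorkloadSpec` and `ResourceDefinition` shapes are hypothetical and do not correspond to the format of Score, Humanitec, or Crossplane.

```typescript
// Hypothetical workload spec: what the developer asks for.
interface WorkloadSpec {
  name: string;
  needs: string[]; // abstract resource types, e.g. "postgres"
}

// Hypothetical catalog entry: how the platform fulfils one resource type.
interface ResourceDefinition {
  type: string;
  dependsOn: string[]; // resource types that must exist first
}

// Returns resource types in provisioning order (dependencies first).
function provisioningOrder(
  spec: WorkloadSpec,
  catalog: ResourceDefinition[],
): string[] {
  const byType: Record<string, ResourceDefinition> = {};
  for (const def of catalog) byType[def.type] = def;

  const ordered: string[] = [];
  const state: Record<string, "visiting" | "done"> = {};

  // Depth-first walk: emit a resource only after everything it depends on.
  function visit(type: string): void {
    if (state[type] === "done") return;
    if (state[type] === "visiting") {
      throw new Error(`Dependency cycle detected at "${type}"`);
    }
    const def = byType[type];
    if (!def) throw new Error(`No resource definition matches "${type}"`);
    state[type] = "visiting";
    for (const dep of def.dependsOn) visit(dep);
    state[type] = "done";
    ordered.push(type);
  }

  for (const need of spec.needs) visit(need);
  return ordered;
}
```

A request for `postgres` whose definition depends on `namespace` yields `namespace` first, and a request for an unknown type fails fast instead of being silently skipped.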

3. Security Plane

Security must be embedded, not bolted on. The security plane ensures every resource provisioned through the platform meets organizational standards.

Components:

  • Secrets Management: Vault, AWS Secrets Manager, or similar
  • Policy Engine: Open Policy Agent (OPA) for policy-as-code
  • RBAC & IAM: Fine-grained access control
  • Supply Chain Security: SBOM generation, image signing, vulnerability scanning

4. Delivery Plane

Automates the path from code to production:

  • CI/CD Pipelines: GitHub Actions, GitLab CI, Jenkins, or cloud-native options
  • GitOps Controllers: ArgoCD or Flux for declarative deployments
  • Artifact Registries: Container registries with scanning
  • Feature Flags: LaunchDarkly, Unleash, or similar

5. Infrastructure Plane

The underlying compute, storage, and services:

  • Kubernetes Clusters: EKS, GKE, AKS, or self-managed
  • Managed Databases: RDS, Cloud SQL, managed Redis
  • Message Queues: SQS, Pub/Sub, Kafka
  • Cloud Services: Integrated via Crossplane or Terraform

6. Observability Plane

You can't manage what you can't see:

  • Metrics: Prometheus + Grafana
  • Logging: Loki, Elasticsearch, or cloud-native
  • Tracing: Jaeger, Tempo, or commercial APM
  • Alerting: PagerDuty, Opsgenie integration

Building Your First Golden Path

Let's build a practical golden path for deploying a web service. This covers 80% of what development teams need.

Step 1: Define the Developer Experience

What should the developer do to deploy a new service?

# Ideal developer experience
$ platform create service my-api \
    --template nodejs-api \
    --environment staging

✓ Repository created: github.com/org/my-api
✓ CI/CD pipeline configured
✓ Kubernetes namespace provisioned
✓ Database (PostgreSQL) provisioned
✓ Secrets configured in Vault
✓ Monitoring dashboards created
✓ Service registered in Backstage

Your service is ready!
Dashboard: https://backstage.internal/catalog/my-api
Deployment URL: https://my-api.staging.internal
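
Behind a command like this, the CLI's first job is to turn arguments into a structured request the platform can act on. A minimal sketch in TypeScript follows; the `platform` command is the illustrative one from above, and `CreateServiceRequest` and `parseCreateService` are hypothetical names.

```typescript
interface CreateServiceRequest {
  serviceName: string;
  template: string;
  environment: string;
}

const USAGE =
  "usage: platform create service <name> --template <t> --environment <e>";

// Parses the arguments that follow `platform`, for example:
// ["create", "service", "my-api", "--template", "nodejs-api", "--environment", "staging"]
function parseCreateService(args: string[]): CreateServiceRequest {
  if (args[0] !== "create" || args[1] !== "service" || !args[2]) {
    throw new Error(USAGE);
  }

  // Collect `--flag value` pairs after the service name.
  const flags: Record<string, string> = {};
  for (let i = 3; i < args.length; i += 2) {
    const flag = args[i];
    const value = args[i + 1];
    if (flag.indexOf("--") !== 0 || value === undefined) {
      throw new Error(`invalid flag or missing value near "${flag}"`);
    }
    flags[flag.slice(2)] = value;
  }

  if (!flags["template"] || !flags["environment"]) {
    throw new Error(USAGE);
  }
  return {
    serviceName: args[2],
    template: flags["template"],
    environment: flags["environment"],
  };
}
```

Keeping parsing separate from provisioning lets the portal and the CLI submit the same request shape to the orchestrator.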

Step 2: Create the Service Template

Using Backstage scaffolder:

# templates/nodejs-api/template.yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: nodejs-api-template
  title: Node.js API Service
  description: Create a production-ready Node.js API with all infrastructure
  tags:
    - recommended
    - nodejs
    - api
spec:
  owner: platform-team
  type: service

  parameters:
    - title: Service Information
      required:
        - name
        - description
        - owner
      properties:
        name:
          title: Service Name
          type: string
          description: Unique name for your service
          pattern: '^[a-z0-9-]+$'
        description:
          title: Description
          type: string
        owner:
          title: Owner Team
          type: string
          ui:field: OwnerPicker
          ui:options:
            allowedKinds:
              - Group

    - title: Infrastructure Options
      properties:
        database:
          title: Database
          type: string
          default: postgresql
          enum:
            - postgresql
            - mysql
            - none
        cacheEnabled:
          title: Enable Redis Cache
          type: boolean
          default: false

  steps:
    # Create repository from template
    - id: fetch-base
      name: Fetch Base Template
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
          description: ${{ parameters.description }}
          owner: ${{ parameters.owner }}

    # Create GitHub repository
    - id: publish
      name: Create Repository
      action: publish:github
      input:
        repoUrl: github.com?repo=${{ parameters.name }}&owner=org
        description: ${{ parameters.description }}
        defaultBranch: main
        protectDefaultBranch: true

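    # Provision infrastructure via Crossplane.
    # Note: kubernetes:apply is not a built-in scaffolder action; this assumes a
    # custom or community action with that id is installed in your backend.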
    - id: provision-infra
      name: Provision Infrastructure
      action: kubernetes:apply
      input:
        manifest:
          apiVersion: platform.company.io/v1alpha1
          kind: ServiceEnvironment
          metadata:
            name: ${{ parameters.name }}
          spec:
            database:
              type: ${{ parameters.database }}
              size: small
            cache:
              enabled: ${{ parameters.cacheEnabled }}
            namespace:
              create: true

    # Register in catalog
    - id: register
      name: Register in Catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps.publish.output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml

  output:
    links:
      - title: Repository
        url: ${{ steps.publish.output.remoteUrl }}
      - title: Service Dashboard
        url: https://backstage.internal/catalog/${{ parameters.name }}
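
Because the service name ends up as a Kubernetes namespace name, it's worth checking more than the template's `pattern`. The sketch below applies that pattern plus two Kubernetes DNS-label rules (at most 63 characters, no leading or trailing hyphen). `validateServiceName` is a hypothetical helper, not part of Backstage.

```typescript
// Pattern copied from the `name` parameter in the template above.
const TEMPLATE_PATTERN = /^[a-z0-9-]+$/;
// Kubernetes namespace names are DNS labels, limited to 63 characters.
const MAX_LABEL_LENGTH = 63;

// Returns a list of problems; an empty list means the name is acceptable.
function validateServiceName(candidate: string): string[] {
  const problems: string[] = [];
  if (!TEMPLATE_PATTERN.test(candidate)) {
    problems.push("must contain only lowercase letters, digits, and hyphens");
  }
  if (candidate.length > MAX_LABEL_LENGTH) {
    problems.push(`must be at most ${MAX_LABEL_LENGTH} characters`);
  }
  if (/^-|-$/.test(candidate)) {
    problems.push("must not start or end with a hyphen");
  }
  return problems;
}
```

Returning every problem at once, rather than failing on the first, lets the form show the developer everything that needs fixing in one pass.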

Step 3: Define Infrastructure as Code

Using Crossplane for Kubernetes-native infrastructure:

# crossplane/composite-resource-definition.yaml
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
  name: serviceenvironments.platform.company.io
spec:
  group: platform.company.io
  names:
    kind: ServiceEnvironment
    plural: serviceenvironments
  versions:
    - name: v1alpha1
      served: true
      referenceable: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                database:
                  type: object
                  properties:
                    type:
                      type: string
                      enum: [postgresql, mysql, none]
                    size:
                      type: string
                      enum: [small, medium, large]
                cache:
                  type: object
                  properties:
                    enabled:
                      type: boolean
                namespace:
                  type: object
                  properties:
                    create:
                      type: boolean
---
# crossplane/composition.yaml
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: serviceenvironment-aws
spec:
  compositeTypeRef:
    apiVersion: platform.company.io/v1alpha1
    kind: ServiceEnvironment
  
  resources:
    # Kubernetes Namespace
    - name: namespace
      base:
        apiVersion: kubernetes.crossplane.io/v1alpha1
        kind: Object
        spec:
          forProvider:
            manifest:
              apiVersion: v1
              kind: Namespace
              metadata:
                labels:
                  platform.company.io/managed: "true"
      patches:
        - fromFieldPath: metadata.name
          toFieldPath: spec.forProvider.manifest.metadata.name

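    # PostgreSQL Database (RDS)
    # Note: as written, this always creates a PostgreSQL instance;
    # spec.database.type is not patched into `engine` in this example.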
    - name: database
      base:
        apiVersion: database.aws.crossplane.io/v1beta1
        kind: RDSInstance
        spec:
          forProvider:
            region: us-east-1
            dbInstanceClass: db.t3.micro
            engine: postgres
            engineVersion: "15"
            masterUsername: admin
            skipFinalSnapshotBeforeDeletion: true
            publiclyAccessible: false
          writeConnectionSecretToRef:
            namespace: crossplane-system
      patches:
        - fromFieldPath: metadata.name
          toFieldPath: metadata.name
          transforms:
            - type: string
              string:
                fmt: "%s-db"
        - fromFieldPath: spec.database.size
          toFieldPath: spec.forProvider.dbInstanceClass
          transforms:
            - type: map
              map:
                small: db.t3.micro
                medium: db.t3.small
                large: db.t3.medium
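
The two patches on the database resource are easier to reason about as plain functions. This TypeScript sketch models them outside Crossplane: the `fmt: "%s-db"` string transform and the `map` transform from size to instance class. It is only a mental model of the Composition above, and treating an unknown size as an error is a choice made for this sketch.

```typescript
// The size-to-instance-class table from the Composition's `map` transform.
const INSTANCE_CLASS_BY_SIZE: Record<string, string> = {
  small: "db.t3.micro",
  medium: "db.t3.small",
  large: "db.t3.medium",
};

interface DatabaseSettings {
  instanceName: string;
  dbInstanceClass: string;
}

// Applies both patches to the values a developer supplied.
function resolveDatabase(serviceName: string, size: string): DatabaseSettings {
  if (!Object.prototype.hasOwnProperty.call(INSTANCE_CLASS_BY_SIZE, size)) {
    throw new Error(`unknown database size "${size}"`);
  }
  return {
    instanceName: `${serviceName}-db`, // mirrors fmt: "%s-db"
    dbInstanceClass: INSTANCE_CLASS_BY_SIZE[size], // mirrors the map transform
  };
}
```

The value of the pattern is that developers pick from `small`, `medium`, or `large`, while the platform team stays free to change what each size means.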

Step 4: CI/CD Pipeline Template

# .github/workflows/deploy.yaml (generated for each service)
name: Build and Deploy

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  SERVICE_NAME: ${{ github.event.repository.name }}
  REGISTRY: ghcr.io/${{ github.repository_owner }}

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      security-events: write

    steps:
      - uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run tests
        run: npm test

      - name: Run security scan
        # Tracks the action's default branch; pin to a release tag or commit SHA for reproducible builds
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'fs'
          scan-ref: '.'
          format: 'sarif'
          output: 'trivy-results.sarif'

      - name: Upload scan results
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: 'trivy-results.sarif'

      - name: Build and push container
        if: github.ref == 'refs/heads/main'
        run: |
          echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u ${{ github.actor }} --password-stdin
          docker build -t $REGISTRY/$SERVICE_NAME:${{ github.sha }} .
          docker push $REGISTRY/$SERVICE_NAME:${{ github.sha }}

  deploy-staging:
    needs: build
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: staging

    steps:
      - name: Update deployment
        uses: company/platform-deploy-action@v1
        with:
          service: ${{ env.SERVICE_NAME }}
          environment: staging
          image-tag: ${{ github.sha }}
          argocd-server: ${{ secrets.ARGOCD_SERVER }}
          argocd-token: ${{ secrets.ARGOCD_TOKEN }}

  deploy-production:
    needs: deploy-staging
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production

    steps:
      - name: Update deployment
        uses: company/platform-deploy-action@v1
        with:
          service: ${{ env.SERVICE_NAME }}
          environment: production
          image-tag: ${{ github.sha }}
          argocd-server: ${{ secrets.ARGOCD_SERVER }}
          argocd-token: ${{ secrets.ARGOCD_TOKEN }}
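
It helps to be explicit about which jobs this workflow runs for which trigger. The TypeScript sketch below models the `if:` conditions and the image tag from the workflow above. It is a reasoning aid rather than anything GitHub Actions executes, and it ignores approval rules that may be attached to the `staging` and `production` environments.

```typescript
type JobName = "build" | "deploy-staging" | "deploy-production";

interface TriggerEvent {
  eventName: "push" | "pull_request";
  ref: string; // e.g. "refs/heads/main" or "refs/pull/42/merge"
}

// Lists the jobs that run for an event, assuming every job succeeds.
function jobsToRun(trigger: TriggerEvent): JobName[] {
  const jobs: JobName[] = ["build"]; // build has no job-level `if:`, so it always runs
  if (trigger.ref === "refs/heads/main") {
    jobs.push("deploy-staging", "deploy-production");
  }
  return jobs;
}

// Image reference pushed by the build job: ghcr.io/<owner>/<service>:<commit sha>.
function imageRef(owner: string, service: string, sha: string): string {
  return `ghcr.io/${owner}/${service}:${sha}`;
}
```

Pull requests therefore get tests and a security scan but never deploy, and each deployment is traceable to the commit that produced its image.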

Policy as Code: Guardrails That Scale

Security and compliance can't rely on manual reviews. Implement policy-as-code to enforce standards automatically.

Open Policy Agent (OPA) Example

# policy/kubernetes.rego
package kubernetes.admission

# Deny containers running as root
deny[msg] {
    input.request.kind.kind == "Pod"
    container := input.request.object.spec.containers[_]
    container.securityContext.runAsUser == 0
    msg := sprintf("Container '%v' must not run as root", [container.name])
}

# Require resource limits
deny[msg] {
    input.request.kind.kind == "Pod"
    container := input.request.object.spec.containers[_]
    not container.resources.limits.memory
    msg := sprintf("Container '%v' must specify memory limits", [container.name])
}

# Enforce approved registries
deny[msg] {
    input.request.kind.kind == "Pod"
    container := input.request.object.spec.containers[_]
    not startswith(container.image, "ghcr.io/company/")
    not startswith(container.image, "gcr.io/company/")
    msg := sprintf("Container '%v' uses unapproved registry", [container.name])
}

# Require labels
deny[msg] {
    input.request.kind.kind == "Deployment"
    not input.request.object.metadata.labels["app.kubernetes.io/name"]
    msg := "Deployments must have 'app.kubernetes.io/name' label"
}

# Enforce cost allocation tags
deny[msg] {
    input.request.kind.kind == "Namespace"
    not input.request.object.metadata.labels["cost-center"]
    msg := "Namespaces must have 'cost-center' label for cost allocation"
}
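
Before wiring rules like these into an admission controller, it can help to pin down the expected behavior in ordinary code. The TypeScript sketch below mirrors the three container rules so you can agree on test cases first. `ContainerSpec` is a flattened, hypothetical shape, not the real Kubernetes Pod schema.

```typescript
// Flattened, hypothetical view of the fields the policies inspect.
interface ContainerSpec {
  name: string;
  image: string;
  runAsUser?: number;
  memoryLimit?: string;
}

// Registry prefixes taken from the Rego policy above.
const APPROVED_REGISTRIES = ["ghcr.io/company/", "gcr.io/company/"];

// Returns one message per violated rule, in the same wording as the policy.
function violations(containers: ContainerSpec[]): string[] {
  const messages: string[] = [];
  for (const c of containers) {
    if (c.runAsUser === 0) {
      messages.push(`Container '${c.name}' must not run as root`);
    }
    if (!c.memoryLimit) {
      messages.push(`Container '${c.name}' must specify memory limits`);
    }
    const approved = APPROVED_REGISTRIES.some(
      prefix => c.image.indexOf(prefix) === 0,
    );
    if (!approved) {
      messages.push(`Container '${c.name}' uses unapproved registry`);
    }
  }
  return messages;
}
```

Note that, like the Rego rule, the root check only catches an explicit `runAsUser` of 0; a container that leaves the field unset passes it.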

Integrating with Kubernetes

# gatekeeper-config.yaml
apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: gatekeeper-system
spec:
  sync:
    syncOnly:
      - group: ""
        version: "v1"
        kind: "Namespace"
      - group: "apps"
        version: "v1"
        kind: "Deployment"
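---
# Assumes the K8sRequiredLabels ConstraintTemplate (for example, from the
# Gatekeeper policy library) is already installed in the cluster.
apiVersion: constraints.gatekeeper.sh/v1beta1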
kind: K8sRequiredLabels
metadata:
  name: require-cost-center
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels:
      - key: cost-center
      - key: owner

Building the Developer Portal with Backstage

Backstage serves as the front door to your platform. Here's how to set it up effectively.

Installation and Configuration

# Create a new Backstage app
npx @backstage/create-app@latest

cd my-backstage-app

# Install useful plugins
yarn add --cwd packages/backend @backstage/plugin-kubernetes-backend
yarn add --cwd packages/app @backstage/plugin-kubernetes
yarn add --cwd packages/backend @backstage/plugin-scaffolder-backend

Service Catalog Configuration

# catalog-info.yaml (in each service repo)
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: my-api
  description: My API service
  annotations:
    github.com/project-slug: company/my-api
    backstage.io/kubernetes-id: my-api
    backstage.io/techdocs-ref: dir:.
  tags:
    - nodejs
    - api
  links:
    - url: https://my-api.staging.internal
      title: Staging
    - url: https://my-api.production.internal
      title: Production
spec:
  type: service
  lifecycle: production
  owner: team-a
  system: ecommerce
  dependsOn:
    - resource:default/my-api-db
  providesApis:
    - my-api
---
apiVersion: backstage.io/v1alpha1
kind: Resource
metadata:
  name: my-api-db
  description: PostgreSQL database for my-api
spec:
  type: database
  owner: team-a
  system: ecommerce
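
The `dependsOn` entry above uses Backstage's entity reference format, `[kind:][namespace/]name`. Backstage's own catalog-model package provides a parser for this; the TypeScript sketch below is a simplified stand-in to show how a reference such as `resource:default/my-api-db` breaks down. `splitEntityRef` is a hypothetical name, and the default kind is passed in because it depends on which field the reference appears in.

```typescript
interface EntityRef {
  kind: string;
  namespace: string;
  name: string;
}

// Splits "[kind:][namespace/]name", filling in defaults for omitted parts.
function splitEntityRef(ref: string, defaultKind: string): EntityRef {
  const colon = ref.indexOf(":");
  const kind = colon === -1 ? defaultKind : ref.slice(0, colon);
  const rest = colon === -1 ? ref : ref.slice(colon + 1);

  const slash = rest.indexOf("/");
  const namespace = slash === -1 ? "default" : rest.slice(0, slash);
  const name = slash === -1 ? rest : rest.slice(slash + 1);

  if (!kind || !namespace || !name) {
    throw new Error(`invalid entity reference "${ref}"`);
  }
  return { kind, namespace, name };
}
```

So `providesApis: [my-api]` is shorthand for an API named `my-api` in the `default` namespace, while the database dependency spells out all three parts.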

Custom Actions for Scaffolder

// packages/backend/src/plugins/scaffolder/actions/provision-infrastructure.ts
import { createTemplateAction } from '@backstage/plugin-scaffolder-node';
import { Config } from '@backstage/config';

export const createProvisionInfraAction = (config: Config) => {
  return createTemplateAction<{
    serviceName: string;
    environment: string;
    database: string;
    cacheEnabled: boolean;
  }>({
    id: 'platform:provision-infrastructure',
    description: 'Provisions infrastructure for a new service',
    schema: {
      input: {
        required: ['serviceName', 'environment'],
        type: 'object',
        properties: {
          serviceName: {
            type: 'string',
            title: 'Service Name',
          },
          environment: {
            type: 'string',
            title: 'Environment',
            enum: ['development', 'staging', 'production'],
          },
          database: {
            type: 'string',
            title: 'Database Type',
            enum: ['postgresql', 'mysql', 'none'],
          },
          cacheEnabled: {
            type: 'boolean',
            title: 'Enable Redis Cache',
          },
        },
      },
    },
    async handler(ctx) {
      const { serviceName, environment, database, cacheEnabled } = ctx.input;

      ctx.logger.info(`Provisioning infrastructure for ${serviceName}`);

      // Call your platform orchestrator API
      const response = await fetch(
        `${config.getString('platform.orchestrator.url')}/provision`,
        {
          method: 'POST',
          headers: {
            'Content-Type': 'application/json',
            Authorization: `Bearer ${config.getString('platform.orchestrator.token')}`,
          },
          body: JSON.stringify({
            serviceName,
            environment,
            database,
            cacheEnabled,
          }),
        }
      );

      if (!response.ok) {
        throw new Error(`Failed to provision infrastructure: ${response.statusText}`);
      }

      const result = await response.json();
      
      // Log only non-sensitive details: the response carries the database connection string
      ctx.logger.info(`Infrastructure provisioned for ${serviceName} (${environment})`);
      
      // Output for subsequent steps
      ctx.output('namespaceUrl', result.namespaceUrl);
      ctx.output('databaseConnectionString', result.databaseConnectionString);
    },
  });
};

Measuring Platform Success

Key Metrics Dashboard

Track these metrics to measure platform effectiveness:

# grafana-dashboard.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: platform-metrics-dashboard
  labels:
    grafana_dashboard: "1"
data:
  platform-metrics.json: |
    {
      "title": "Platform Engineering Metrics",
      "panels": [
        {
          "title": "Developer Onboarding Time",
          "type": "stat",
          "targets": [
            {
              "expr": "avg(developer_first_commit_time_hours)"
            }
          ]
        },
        {
          "title": "Self-Service Success Rate",
          "type": "gauge",
          "targets": [
            {
              "expr": "sum(rate(platform_requests_success[24h])) / sum(rate(platform_requests_total[24h])) * 100"
            }
          ]
        },
        {
          "title": "Time to Production",
          "type": "stat",
          "targets": [
            {
              "expr": "avg(deployment_lead_time_minutes)"
            }
          ]
        },
        {
          "title": "Platform Adoption",
          "type": "graph",
          "targets": [
            {
              "expr": "count(kube_namespace_labels{label_platform_company_io_managed='true'})"
            }
          ]
        }
      ]
    }

Developer Experience Score (DXS)

Implement quarterly surveys with these questions:

  1. How easy is it to deploy a new service? (1-10)
  2. How often do you wait for infrastructure provisioning? (Never/Sometimes/Often)
  3. How clear is the platform documentation? (1-10)
  4. Would you recommend the platform to other teams? (NPS)
  5. What's the biggest pain point in your development workflow? (Open text)

Calculate DXS as a weighted average, with deployment ease and wait time weighted higher.
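
One way to turn that into a number is sketched below in TypeScript. The weights, the 0-100 scale, and the mapping of Never/Sometimes/Often to scores are all assumptions for illustration, and the recommendation question is treated as a 0-10 rating. Tune them to your own survey.

```typescript
type WaitFrequency = "Never" | "Sometimes" | "Often";

interface SurveyResponse {
  deployEase: number;           // question 1, rated 1-10
  waitFrequency: WaitFrequency; // question 2
  docsClarity: number;          // question 3, rated 1-10
  recommend: number;            // question 4, treated here as 0-10
}

// Assumed weights: deployment ease and wait time count more than the rest.
const WEIGHTS = { deployEase: 0.35, wait: 0.35, docsClarity: 0.15, recommend: 0.15 };

// Assumed mapping of wait frequency onto a 0-100 score.
const WAIT_SCORE: Record<WaitFrequency, number> = { Never: 100, Sometimes: 50, Often: 0 };

// Maps a 1-10 rating onto 0-100.
function scaleRating(rating: number): number {
  return ((rating - 1) / 9) * 100;
}

// Weighted score for a single response, on a 0-100 scale.
function responseScore(r: SurveyResponse): number {
  return (
    WEIGHTS.deployEase * scaleRating(r.deployEase) +
    WEIGHTS.wait * WAIT_SCORE[r.waitFrequency] +
    WEIGHTS.docsClarity * scaleRating(r.docsClarity) +
    WEIGHTS.recommend * r.recommend * 10
  );
}

// DXS for a survey round: the mean of the per-response scores.
function developerExperienceScore(responses: SurveyResponse[]): number {
  if (responses.length === 0) return 0;
  const total = responses.reduce((sum, r) => sum + responseScore(r), 0);
  return total / responses.length;
}
```

Whatever weights you choose, keep them fixed between quarters; the trend matters more than the absolute value.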


Common Pitfalls and How to Avoid Them

1. Building for Power Users First

Mistake: Designing the platform for your most sophisticated engineers.

Solution: Start with the 80% case. Build for the developer who just wants to deploy a standard web service. Add complexity only when clearly needed.

2. Mandatory Adoption

Mistake: Forcing teams to use the platform through mandates.

Solution: Make the platform so good that teams choose it voluntarily. If they don't, you have a product problem.

3. Ignoring Existing Workflows

Mistake: Requiring developers to completely change how they work.

Solution: Meet developers where they are. Support their existing tools (Git, VS Code, CLI) and gradually introduce new capabilities.

4. Over-Engineering the MVP

Mistake: Trying to build the perfect platform before launching.

Solution: Start with a Thinnest Viable Platform (TVP). Solve one painful problem well before expanding scope.

5. No Escape Hatches

Mistake: Forcing all use cases through standardized paths.

Solution: Provide escape hatches for legitimate edge cases. Document when and why to use them.


Getting Started: Your 90-Day Plan

Days 1-30: Foundation

  1. Interview developers: Identify top 3 pain points
  2. Choose your first golden path: Pick the most common service type
  3. Set up Backstage: Basic installation with service catalog
  4. Implement one self-service workflow: Service creation from template

Days 31-60: Expansion

  1. Add CI/CD integration: Automated pipelines for golden path services
  2. Implement basic policies: Security guardrails via OPA
  3. Set up observability: Basic dashboards for platform metrics
  4. Onboard pilot team: Get real feedback from early adopters

Days 61-90: Scale

  1. Add second golden path: Based on pilot team feedback
  2. Implement cost visibility: Show infrastructure costs per service
  3. Create documentation: Golden path guides and troubleshooting
  4. Expand adoption: Onboard 2-3 additional teams

Conclusion

Platform engineering isn't about building the most sophisticated infrastructure. It's about removing friction from developer workflows. The best platforms are invisible; developers simply get their work done without thinking about the underlying complexity.

Start small, measure everything, and treat your developers as customers. The organizations that master platform engineering will have a significant competitive advantage: their developers will spend more time building products and less time fighting infrastructure.

The tools are mature. The patterns are proven. The only question is whether you'll be the one who builds the platform your organization needs.


Resources and Further Reading

Books

  • "Team Topologies" by Matthew Skelton & Manuel Pais
  • "Platform Strategy" by Gregor Hohpe
  • "Building Evolutionary Architectures" by Neal Ford et al.

Frequently Asked Questions

How many engineers do I need for a platform team? Start with 2-3 engineers. A mature platform team typically serves 10-15 development teams, so plan for roughly 1 platform engineer per 5 dev teams.
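
As a quick worked version of that rule of thumb, here's a TypeScript sketch. The floor of two engineers reflects the suggested starting size; `platformEngineersNeeded` is a hypothetical helper, and real staffing depends on far more than team count.

```typescript
// About one platform engineer per five development teams,
// never fewer than the suggested starting size of two.
function platformEngineersNeeded(devTeams: number): number {
  if (devTeams < 0 || Math.floor(devTeams) !== devTeams) {
    throw new Error("devTeams must be a non-negative whole number");
  }
  return Math.max(2, Math.ceil(devTeams / 5));
}
```

With 10 development teams this gives 2 engineers, and with 15 teams it gives 3, which matches the 10-15 teams a mature platform team of 2-3 is said to serve.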

Should we build or buy? Start with open-source tools (Backstage, Crossplane, ArgoCD) and build integrations. Consider commercial platform orchestrators only when you hit scale limitations.

How do we handle legacy applications? Don't force legacy apps onto the platform. Focus on new services and migrations that teams choose voluntarily.

What if developers resist the platform? Resistance usually means the platform doesn't solve real problems. Go back to user research and fix the product.

How do we measure ROI? Track developer hours saved, infrastructure costs, deployment frequency, and time-to-production. Most organizations see 30-50% improvement in developer productivity within the first year.


Have questions about platform engineering? Connect with me on LinkedIn or explore more at abdulkadersafi.com.

