How I Built This

A technical deep-dive into the infrastructure and CI/CD pipeline powering this portfolio

Architecture Overview

Developer → git push

GitHub Actions CI/CD (OIDC-authenticated, no static AWS keys):
Lint & Audit → Build Image → Push to Hub → Trivy Scan → Update Helm
In parallel, build-ami.yml: Packer Build → Tag AMI → Retain 3

Docker Hub: multi-arch image
argocd-app-of-apps: parent Application
eks-helm-charts: 4 child charts
ArgoCD: GitOps engine

AWS EC2 t4g.medium (ARM Graviton), k3s, Packer-built AMI, IMDSv2 required

Envoy Gateway "public": TLS termination
Multi-SAN cert: fuhriman.org · www.fuhriman.org · argocd.fuhriman.org
HTTPRoute fuhriman-website
HTTPRoute argocd-server

Kubernetes Workloads
cert-manager: Let's Encrypt
envoy-gateway: controller
external-dns: Route53 sync
fuhriman-website: Next.js

Visitor → Route53 (fuhriman.org) → EIP 52.37.95.130 → Envoy Gateway → HTTPRoute → Workload
Admin → AWS SSM Session Manager → EC2 shell / kubectl tunnel (no SSH, no public k8s API)
https://fuhriman.org

Cost-Optimized Design: A single t4g.medium EC2 instance (ARM Graviton, 4 GB RAM, ~$24/mo) runs k3s instead of managed EKS. An Elastic IP, Route53 zone, S3 state backend, monthly EBS snapshots, and Packer-baked AMI storage make up the rest. Total: ~$31/mo, well under the $40/mo budget alert and ~60% cheaper than EKS at this scale.
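The savings figure can be sanity-checked with quick arithmetic. This sketch uses only the rounded numbers quoted on this page (~$31/mo total, the $40/mo alert, and ~$80/mo as the low end of the EKS estimate in the Cost Optimization principle); the constant and function names are illustrative, and none of this is billing data.

```python
# Rounded monthly figures quoted on this page (USD); not billing data.
K3S_TOTAL = 31.0      # single t4g.medium plus supporting resources
EKS_ESTIMATE = 80.0   # low end of the "~$80+" managed-EKS estimate
BUDGET_ALERT = 40.0   # AWS Budget alert threshold


def savings_fraction(actual: float, baseline: float) -> float:
    """Return the fraction saved relative to the baseline."""
    return 1.0 - actual / baseline


savings = savings_fraction(K3S_TOTAL, EKS_ESTIMATE)
headroom = BUDGET_ALERT - K3S_TOTAL

print(f"savings vs EKS: {savings:.0%}")
print(f"headroom under alert: ${headroom:.0f}/mo")
```

With these inputs the saving comes out a little over 60%, which matches the rounded claim, and the budget alert leaves roughly $9/mo of headroom.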

Technology Stack

Frontend

Next.js

React framework for the website

Container

Docker

Multi-stage build → distroless runtime, multi-arch (amd64 + arm64)

Orchestration

k3s

Lightweight Kubernetes on a single EC2

Networking

Envoy Gateway

Gateway API implementation handling TLS termination and routing

TLS

cert-manager

Let's Encrypt via gatewayHTTPRoute solver

DNS

ExternalDNS

Writes Route53 records from Gateway/HTTPRoute annotations

GitOps

ArgoCD

App-of-Apps continuous deployment from Git

IaC

Terraform

AWS infrastructure with S3 state + native locking

Immutable Infra

Packer

Pre-bakes k3s + helm into a versioned AMI; ~60s cold-start

CI/CD

GitHub Actions

OIDC-authenticated builds; multi-arch image + AMI pipelines

Admin Access

AWS SSM

Session Manager: no SSH, no inbound 22/6443

Compute

AWS Graviton

t4g.medium (ARM) — ~20% cheaper than equivalent x86

Backups

AWS DLM

Native EBS snapshot lifecycle — monthly × 3 retention

DNS

Route53

Public zone; Squarespace delegates via NS records

Infrastructure as Code

The entire AWS infrastructure is defined in Terraform, organized into reusable modules:

terraform/
terraform/
├── tf-modules/
│   ├── aws-vpc/                # VPC, public subnet, IGW, route table
│   ├── aws-k3s/                # EC2 t4g.medium, EIP, SG, IAM role w/ SSM,
│   │                           #   user_data.sh runtime bootstrap
│   └── aws-dns/                # Route53 public hosted zone for fuhriman.org
├── packer/
│   ├── k3s-portfolio.pkr.hcl   # AL2023 arm64 + k3s + helm + ssm-agent
│   └── scripts/                # Provisioner scripts
├── .github/workflows/
│   └── build-ami.yml           # OIDC-auth Packer builds + 3-AMI retention
├── docs/plans/                 # Architecture design + manual-steps docs
├── main.tf                     # Module composition + IAM policies
├── backend.tf                  # S3 + native use_lockfile (no DynamoDB)
├── budget.tf                   # $40/mo AWS budget alert
├── dlm.tf                      # Monthly EBS snapshots × 3 retention
├── oidc.tf                     # GitHub Actions OIDC trust + Packer IAM role
├── providers.tf                # AWS provider ~> 6.31 with default_tags
└── variables.tf                # Configuration with validation blocks

VPC Module

Simple VPC (10.0.0.0/16) with a single public subnet in one AZ. No NAT Gateway needed — everything runs in the public subnet.

k3s Module

Single t4g.medium (4 GB ARM Graviton) launched from a Packer-baked AMI with k3s, helm, and the SSM Agent pre-installed. The runtime user_data.sh is just ~55 lines: fetch the public IP from IMDSv2, wire k3s's --tls-san, install ArgoCD, hand off to App-of-Apps. Cold-start to argocd-server Running: ~60 seconds.

DNS Module

A single Route53 public hosted zone for fuhriman.org. Squarespace is the registrar only — NS records delegate to Route53. ExternalDNS in-cluster manages records automatically based on HTTPRoute hostnames.

Zero-Trust Admin Access

There's no SSH server reachable from the internet. There's no public kube-apiserver. Admin happens entirely through AWS Systems Manager Session Manager.

Security Group

Inbound: only 80 and 443. No 22 (SSH), no 6443 (k8s API), no NodePort 30443. The EC2 instance has no aws_key_pair resource at all.

Interactive Shell

aws ssm start-session --target $INSTANCE_ID. IAM-authenticated, CloudTrail-audited. The EC2 instance role has AmazonSSMManagedInstanceCore attached.

kubectl via SSM Tunnel

start-session --document-name AWS-StartPortForwardingSession forwards localhost:6443 over SSM to the in-cluster API server. Local kubectl then runs against a kubeconfig that points at localhost. No public k8s API needed.
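A minimal sketch of how the tunnel command could be assembled, assuming the standard portNumber and localPortNumber parameters of the AWS-StartPortForwardingSession document. The helper name and the instance ID in the usage note are placeholders, not values from this project.

```python
import json


def port_forward_command(instance_id: str, port: int = 6443) -> list[str]:
    """Build the argv for an SSM port-forwarding session to the k3s API port."""
    # The SSM document expects each parameter as a list of strings.
    parameters = {"portNumber": [str(port)], "localPortNumber": [str(port)]}
    return [
        "aws", "ssm", "start-session",
        "--target", instance_id,
        "--document-name", "AWS-StartPortForwardingSession",
        "--parameters", json.dumps(parameters),
    ]
```

Passing the result to subprocess.run (for example with a placeholder ID such as i-0123456789abcdef0) would hold the tunnel open in the foreground; kubectl then talks to https://localhost:6443 from a second terminal. Building the argv as a list avoids shell-quoting problems with the JSON parameter.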

IMDSv2 Enforced

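Instance metadata requires http_tokens=required with http_put_response_hop_limit=2. Because IMDSv2 needs a session token obtained through a PUT request, this defeats the simple SSRF-style GET requests that could otherwise reach IMDS via a compromised pod; the hop limit of 2 keeps the token handshake reachable from pods, which sit one network hop beyond the host.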
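The two-step IMDSv2 exchange looks roughly like this. It is a Python illustration of the protocol, not the project's user_data.sh (which is a shell script); the function names are made up for the example, while the endpoint paths and header names are the documented IMDSv2 ones.

```python
import urllib.request

IMDS = "http://169.254.169.254"


def token_request(ttl_seconds: int = 60) -> urllib.request.Request:
    """IMDSv2 step 1: a PUT that asks for a short-lived session token."""
    return urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def metadata_request(path: str, token: str) -> urllib.request.Request:
    """IMDSv2 step 2: a GET that presents the token from step 1."""
    return urllib.request.Request(
        f"{IMDS}/latest/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_public_ipv4(timeout: float = 2.0) -> str:
    """Run both steps; this only succeeds when executed on an EC2 instance."""
    with urllib.request.urlopen(token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    request = metadata_request("public-ipv4", token)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode()
```

An attacker who can only coerce a service into issuing a plain GET never gets past step 1, which is the point of requiring tokens.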

Routing with Gateway API

The cluster doesn't use Ingress resources at all. Routing is handled by the Kubernetes Gateway API (GA since Gateway API v1.0, October 2023), implemented by Envoy Gateway. ExternalDNS reads HTTPRoute resources and publishes Route53 records automatically; cert-manager issues Let's Encrypt certs via the gatewayHTTPRoute HTTP-01 solver.

GatewayClass + Gateway

Single GatewayClass named envoy, controlled by Envoy Gateway. One shared Gateway named public in envoy-gateway-system with HTTP :80 and HTTPS :443 listeners that terminate TLS using a multi-SAN cert.

HTTPRoute per Service

The website chart declares fuhriman.org + www.fuhriman.org as HTTPRoute hostnames attaching to the public Gateway. The ArgoCD chart adds argocd.fuhriman.org the same way. ExternalDNS picks both up.

Multi-SAN Let's Encrypt Cert

One Certificate resource covers all three hostnames. cert-manager issues via HTTP-01, creating a temporary HTTPRoute through the public Gateway for the ACME challenge. Auto-renews 30 days before expiry. The issued chain uses Let's Encrypt's R13 intermediate.

klipper-lb + EIP Override

k3s ships klipper-lb as its default Service LoadBalancer, which advertises the node's private IP, which is not what we want ExternalDNS publishing to Route53. The fix is one annotation on the Gateway: external-dns.alpha.kubernetes.io/target: 52.37.95.130 (the Elastic IP). One line, no extra LoadBalancer controller.

GitOps with ArgoCD

ArgoCD implements the GitOps pattern where Git is the single source of truth for the desired cluster state.

1

App of Apps Pattern

A parent Application bootstrapped by user_data.sh manages four child Applications: cert-manager, envoy-gateway, external-dns, fuhriman-website.

2

Sync Waves

cert-manager (-2) installs first (it owns the cert CRDs). envoy-gateway (-1) follows. external-dns + fuhriman-website (0) deploy together. Wave numbers guarantee dependency order.
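To make the ordering concrete, here is a small model of how wave numbers translate into rollout groups. The wave values are the ones listed above; the function is illustrative only and is not how ArgoCD is implemented (in the charts the number would live in the argocd.argoproj.io/sync-wave annotation).

```python
from itertools import groupby

# Wave numbers as listed above; lower waves sync first.
SYNC_WAVES = {
    "cert-manager": -2,
    "envoy-gateway": -1,
    "external-dns": 0,
    "fuhriman-website": 0,
}


def rollout_order(waves: dict[str, int]) -> list[list[str]]:
    """Group applications by wave, lowest wave first."""
    ordered = sorted(waves.items(), key=lambda item: (item[1], item[0]))
    return [
        [name for name, _ in group]
        for _, group in groupby(ordered, key=lambda item: item[1])
    ]


for step, apps in enumerate(rollout_order(SYNC_WAVES), start=1):
    print(step, ", ".join(apps))
```

Applications that share a wave form one group, so external-dns and fuhriman-website are applied together once the two earlier waves have finished.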

3

Auto-Sync & Self-Heal

ArgoCD automatically applies Git changes and reverts any manual cluster modifications back to the declared state.

CI/CD Pipeline

Every push to main triggers a fully automated build and deployment pipeline:

1

Quality Gates

Six jobs run in parallel: Biome 2 (lint + format), TypeScript, Vitest with a 95% coverage gate, Next.js build, Playwright smoke, and Lighthouse CI. All must pass.

2

Build & Push (Multi-Arch)

Multi-stage Docker build with QEMU emulation produces a multi-arch image (linux/amd64 + linux/arm64) → distroless runtime. Pushed to Docker Hub with a timestamp tag (ga-YYYY.MM.DD-HHMM) and latest.
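The tag scheme is easy to reproduce. This sketch mirrors the ga-YYYY.MM.DD-HHMM format in Python (the workflow itself uses the shell date command); the function name is made up for the example.

```python
from datetime import datetime, timezone


def image_tag(now: datetime) -> str:
    """Format a build timestamp as ga-YYYY.MM.DD-HHMM."""
    return now.strftime("ga-%Y.%m.%d-%H%M")


print(image_tag(datetime.now(timezone.utc)))
```

Because every field is zero-padded and ordered from year down to minute, tags sort lexicographically in build order. The trade-off is minute-level resolution: two builds finishing within the same minute would produce the same tag.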

3

Scan

Trivy v0.69.3 (SHA-pinned) scans the pushed image for CRITICAL and HIGH CVEs with ignore-unfixed enabled. The pipeline fails if any fixable vulnerabilities surface.
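The gate's effect can be approximated over a Trivy JSON report as follows. This is a simplified model of severity filtering plus ignore-unfixed, not Trivy's own logic; it assumes the report's Results / Vulnerabilities layout with Severity, FixedVersion, and VulnerabilityID fields, and the CVE IDs in the sample are placeholders.

```python
GATED_SEVERITIES = {"CRITICAL", "HIGH"}


def blocking_findings(report: dict) -> list[str]:
    """Return IDs of CRITICAL/HIGH findings that have a fixed version available."""
    blocking = []
    for result in report.get("Results", []):
        # "Vulnerabilities" may be absent or null for clean targets.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in GATED_SEVERITIES and vuln.get("FixedVersion"):
                blocking.append(vuln["VulnerabilityID"])
    return blocking


# Placeholder report: the IDs and versions are invented for illustration.
sample_report = {
    "Results": [
        {
            "Target": "example-layer",
            "Vulnerabilities": [
                {"VulnerabilityID": "CVE-0000-0001", "Severity": "HIGH", "FixedVersion": "1.2.3"},
                {"VulnerabilityID": "CVE-0000-0002", "Severity": "CRITICAL", "FixedVersion": ""},
                {"VulnerabilityID": "CVE-0000-0003", "Severity": "MEDIUM", "FixedVersion": "2.0.0"},
            ],
        },
        {"Target": "clean-layer"},
    ]
}

failures = blocking_findings(sample_report)
print("pipeline fails" if failures else "pipeline passes", failures)
```

Only the first finding blocks the build: the second has no fix to apply yet, and the third is below the severity threshold.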

4

Update

yq updates fuhriman-chart/values.yaml in eks-helm-charts with the new image tag; ArgoCD detects the commit and syncs the change to the k3s cluster.

.github/workflows/build-deploy.yaml
name: Build and Deploy
on:
  push:
    branches: [main]

permissions:
  contents: read           # Least-privilege security

jobs:
  # Six parallel quality gates — all must pass before docker runs
  lint:        # biome check (lint + format)
  typecheck:   # tsc --noEmit
  test:        # vitest run --coverage  (95/95/95/95 gate)
  build:       # next build (output: 'standalone' set in next.config)
  e2e:         # playwright against built standalone
  lighthouse:  # perf >= 90, a11y >= 95, BP >= 95, SEO >= 95

  docker:
    needs: [lint, typecheck, test, build, e2e, lighthouse]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@11bd7190...           # SHA-pinned
      - uses: pnpm/action-setup@a7487c7e...          # corepack-pinned pnpm
      - id: tag                                      # referenced as steps.tag.outputs.tag
        run: echo "tag=ga-$(date +'%Y.%m.%d-%H%M')" >> "$GITHUB_OUTPUT"

      - uses: docker/login-action@650006c6...
      - uses: docker/setup-qemu-action@49b3bc8e...   # arm64 emulation
      - uses: docker/setup-buildx-action@d7f5e7f5...
      - uses: docker/build-push-action@f9f3042f...
        with:
          push: true
          platforms: linux/amd64,linux/arm64         # Multi-arch for Graviton
          tags: furryman/fuhriman-website:${{ steps.tag.outputs.tag }},furryman/fuhriman-website:latest

      # Trivy v0.69.3 (binary pinned; addresses GHSA-69fq-xp46-6x23)
      - uses: aquasecurity/trivy-action@a9c7b0f0...  # SHA-pinned
        with:
          image-ref: furryman/fuhriman-website:${{ steps.tag.outputs.tag }}
          severity: CRITICAL,HIGH
          ignore-unfixed: true
          exit-code: 1

  deploy:
    needs: docker
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@11bd7190...
        with:
          repository: furryman/eks-helm-charts
          token: ${{ secrets.GH_PAT }}              # repo scope — fires downstream workflows
      - run: yq -i '.image.tag = "..."' fuhriman-chart/values.yaml
      - run: git commit -am "Bump image" && git push   # ArgoCD picks this up

Immutable Infrastructure with Packer

The EC2 instance launches from a custom AMI built by Packer. k3s, helm, the SSM Agent, and Helm repo caches are pre-baked. user_data.sh only does what depends on the running instance's identity.

~60-Second Cold-Start

Because everything static is already baked into the AMI, first boot is purely runtime-specific: fetch the public IP via IMDSv2 for k3s's --tls-san, install ArgoCD, and hand off to App-of-Apps. Instance launch to argocd-server Running: ~60 seconds. Full convergence with all certs issued: ~3 minutes.

OIDC, Not Long-Lived Keys

The build-ami.yml workflow assumes an IAM role via GitHub OIDC. The github-actions-packer role has a trust policy scoped to repo:furryman/terraform:*. No AWS access keys live in GitHub Secrets.
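To see what that scope admits, the subject check can be approximated with glob matching. This is a rough stand-in for IAM's StringLike condition on the token's sub claim, not IAM itself; the helper name is invented, and the example subjects follow GitHub's repo:OWNER/REPO:ref:refs/heads/BRANCH convention.

```python
from fnmatch import fnmatchcase

# Subject pattern quoted above from the role's trust policy.
ALLOWED_SUBJECT = "repo:furryman/terraform:*"


def subject_allowed(sub: str, pattern: str = ALLOWED_SUBJECT) -> bool:
    """Approximate IAM StringLike matching for the OIDC token's `sub` claim."""
    return fnmatchcase(sub, pattern)


for sub in (
    "repo:furryman/terraform:ref:refs/heads/main",
    "repo:furryman/eks-helm-charts:ref:refs/heads/main",
    "repo:someone-else/terraform:ref:refs/heads/main",
):
    print(sub, "->", subject_allowed(sub))
```

Any branch, tag, or pull request in furryman/terraform can assume the role, while other repositories (including the author's own) cannot. Tightening the pattern to a single ref would narrow it further.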

3-AMI Retention

The workflow's last step lists Packer-tagged AMIs and deregisters everything beyond the 3 most recent (snapshot cleanup included). Storage stays bounded at ~$0.30/mo for AMI snapshots, dedup'd against existing DLM snapshots.
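The selection half of that retention step can be sketched as a pure function. It assumes image records shaped like EC2 describe-images output (ImageId plus an ISO 8601 CreationDate); the function name and sample IDs are placeholders, and the actual deregistration and snapshot deletion calls are left out.

```python
def amis_to_deregister(images: list[dict], keep: int = 3) -> list[str]:
    """Return ImageIds older than the `keep` most recent, by CreationDate."""
    # ISO 8601 timestamps in a single timezone sort correctly as plain strings.
    newest_first = sorted(images, key=lambda img: img["CreationDate"], reverse=True)
    return [img["ImageId"] for img in newest_first[keep:]]


# Placeholder records for illustration only.
sample_images = [
    {"ImageId": "ami-aaaa", "CreationDate": "2025-01-05T04:00:00.000Z"},
    {"ImageId": "ami-bbbb", "CreationDate": "2025-02-05T04:00:00.000Z"},
    {"ImageId": "ami-cccc", "CreationDate": "2025-03-05T04:00:00.000Z"},
    {"ImageId": "ami-dddd", "CreationDate": "2025-04-05T04:00:00.000Z"},
    {"ImageId": "ami-eeee", "CreationDate": "2025-05-05T04:00:00.000Z"},
]

print(amis_to_deregister(sample_images))
```

Keeping the selection separate from the API calls makes the destructive part easy to dry-run before anything is deregistered.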

Backups & Observability Tradeoffs

A single-node portfolio cluster doesn't need everything a production fleet does. Two deliberate calls: keep backups cheap and visible, and don't pay for observability that nothing acts on.

AWS DLM — Monthly × 3 EBS Snapshots

aws_dlm_lifecycle_policy (Data Lifecycle Manager, native AWS — no third-party scheduler) takes a snapshot of the root EBS volume on the 1st of each month at 04:00 UTC and retains the 3 most recent. Cost: pennies per month. Recovery time: a few minutes to launch a new instance from a chosen snapshot. DLM was picked over AWS Backup for cost (DLM has no per-protected-resource pricing) and over Velero/restic for simplicity (no in-cluster moving parts).
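The schedule and retention rule are simple enough to model directly. This is an illustration of the behaviour described above (1st of the month, 04:00 UTC, keep 3), not DLM's implementation; both function names are invented for the example.

```python
from datetime import datetime, timezone


def next_snapshot(after: datetime) -> datetime:
    """Next 1st-of-month 04:00 UTC strictly after the given UTC time."""
    candidate = after.replace(day=1, hour=4, minute=0, second=0, microsecond=0)
    if candidate <= after:
        # This month's slot has passed, so roll over to the next month.
        if after.month == 12:
            candidate = candidate.replace(year=after.year + 1, month=1)
        else:
            candidate = candidate.replace(month=after.month + 1)
    return candidate


def retained(snapshot_times: list[datetime], keep: int = 3) -> list[datetime]:
    """Count-based retention: keep only the most recent `keep` snapshots."""
    ordered = sorted(snapshot_times)
    return ordered[max(len(ordered) - keep, 0):]


print(next_snapshot(datetime.now(timezone.utc)))
```

With monthly snapshots and a count of 3, the oldest restore point is between two and three months old at any given time.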

$40/mo Budget Alert

An AWS Budget watches actual spend against the cost model (~$31/mo target, $40/mo alert). If anything regresses — orphaned EIPs, runaway DLM snapshots, an instance-type drift — email lands before the AWS bill does.

No Prometheus (by choice)

A Prometheus + Grafana stack would add ~512 MiB of memory pressure to a 4 GB node and would never be acted on for a portfolio site. Chart values disable Prometheus metric emitters across envoy-gateway and external-dns. If something genuinely breaks, kubectl logs and CloudWatch Container Insights for the EC2-level signals are enough. This is a deliberate tradeoff, not negligence.

Kubernetes Resources

The website runs as a Deployment with associated Service and HTTPRoute resources:

Deployment

  • 1 replica (sufficient for single-node cluster)
  • Resource limits: 100m CPU, 128Mi memory
  • Liveness and readiness probes on port 3000
  • Rolling update strategy with health checks
  • Multi-arch image — runs on the Graviton instance

Service

  • ClusterIP type for internal access
  • Port 80 → target port 3000
  • Label selector for pod discovery

HTTPRoute (Gateway API)

  • Attaches to the shared public Gateway via parentRefs
  • Hostnames: fuhriman.org, www.fuhriman.org
  • TLS terminates at the Gateway (not the Service)
  • Path prefix / → backend Service port 80
  • ExternalDNS reads the hostnames and writes Route53 A records
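Put together, the route described by these bullets would look something like the manifest below, expressed as a Python dict and printed as JSON (which kubectl accepts as well as YAML). The hostnames, Gateway name, Gateway namespace, and route name come from this page; the route's own namespace and the backend Service name are assumptions, and the real chart may differ.

```python
import json

http_route = {
    "apiVersion": "gateway.networking.k8s.io/v1",
    "kind": "HTTPRoute",
    "metadata": {
        "name": "fuhriman-website",
        "namespace": "default",  # assumption: the chart's namespace is not stated
    },
    "spec": {
        # Attach to the shared Gateway described earlier.
        "parentRefs": [{"name": "public", "namespace": "envoy-gateway-system"}],
        "hostnames": ["fuhriman.org", "www.fuhriman.org"],
        "rules": [
            {
                "matches": [{"path": {"type": "PathPrefix", "value": "/"}}],
                # assumption: the Service shares the route's name
                "backendRefs": [{"name": "fuhriman-website", "port": 80}],
            }
        ],
    },
}

print(json.dumps(http_route, indent=2))
```

Since the route and the Gateway live in different namespaces in this sketch, the Gateway's listeners would need allowedRoutes configured to admit routes from other namespaces; by default only same-namespace routes attach.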

Repository Structure

The project is organized across 4 repositories, following separation of concerns.

Key DevOps Principles

Infrastructure as Code

All infrastructure is version-controlled in Terraform, enabling reproducible deployments and peer review of changes. Variables have validation blocks; providers are pinned with ~> constraints.

GitOps

Git is the single source of truth. All changes flow through commits, providing audit trails and rollback capabilities. ArgoCD's self-heal reverts manual cluster edits automatically.

Immutable Infrastructure

Both container images and the host AMI are immutable artifacts with versioned tags. Packer rebuilds the AMI; user_data_replace_on_change=true means a new bootstrap script always lands on a fresh instance.

Declarative Configuration

Desired state is declared in YAML (HTTPRoutes, Applications, Certificates, Gateways). Kubernetes and ArgoCD continuously reconcile actual state to match.

Cost Optimization

k3s on a single t4g.medium (ARM Graviton) keeps the bill at ~$31/mo vs ~$80+ for managed EKS at equivalent scale. ARM saves ~20% over x86 at the same memory tier with no observable performance loss for this workload.

Least Privilege & Zero Trust

SSM-only admin (no SSH, no public k8s API). IMDSv2 enforced. GitHub Actions OIDC instead of long-lived keys. IAM policies scoped narrowly (ExternalDNS to one zone, Packer to specific EC2 actions).