About the role:
We’re hiring our first SRE to build the operational foundation of our platform. You’ll own compliance readiness, security posture, and production reliability across our AWS + Kubernetes environment and our application stack (Next.js/Vercel, Sentry, Postgres). We deploy and manage services using Porter (porter.run) and infrastructure-as-code via Terraform.

This is a hands-on role for someone who can set direction, implement guardrails, and build scalable systems and processes without slowing product delivery.

Core responsibilities

Compliance (Core)
- Lead audit readiness for frameworks such as SOC 2 (and HIPAA-aligned controls as needed): define controls, implement them, and run evidence collection.
- Establish repeatable processes for access reviews, change management, incident management, vendor risk management, and secure SDLC practices.
- Automate compliance workflows where possible (continuous controls monitoring, evidence generation, audit trails, policy templates).

Security (Core)
- Own cloud security architecture in AWS and Kubernetes: least-privilege IAM/RBAC, network segmentation, encryption standards, secrets management, and secure defaults.
- Harden Kubernetes workloads: cluster baseline security, namespace boundaries, pod security standards, image provenance/scanning, and secure service-to-service communication.
- Implement and tune security monitoring and incident response: centralized logging, actionable alerts, runbooks, on-call workflows, and post-incident reviews.
- Drive vulnerability management across infra and app dependencies: patching, dependency scanning, container image scanning, and configuration drift detection.
- Partner with engineering on threat modeling for major features and high-risk changes.

Reliability (Core)
- Define and own SLIs/SLOs, establish operational KPIs, and introduce error budgets where appropriate.
- Improve observability across AWS + Kubernetes + apps using Sentry and monitoring best practices (metrics, logs, tracing, dashboards, alert routing).
- Own production operations for Postgres: backups/restores, replication strategy, migration safety, performance tuning, and capacity planning.
- Build resilience: disaster recovery planning, recovery testing, high-availability patterns, and graceful degradation.

Infrastructure, Kubernetes & Delivery Enablement
- Own infrastructure-as-code using Terraform: module standards, environment structure, state management, reviews, and guardrails.
- Own the platform layer around Kubernetes and Porter (porter.run): cluster lifecycle practices, environment management, deployment workflows, and reliability of the delivery pipeline.
- Improve CI/CD and deployment safety: progressive delivery, rollbacks, environment parity, and release observability.
Job Type: Full-time
Career Level: Mid Level
Education Level: No Education Listed