About The Position

We're looking for a Senior Infrastructure Engineer to lead the design, implementation, and ongoing evolution of our cloud infrastructure. This is a role for someone who brings both technical depth and genuine curiosity — someone who sees an opportunity to improve something and goes after it, who takes problems end-to-end, and who makes the engineers around them better just by being present. You'll partner closely with engineering leadership and cross-functional teams to ensure our infrastructure strategy stays ahead of our product ambitions. You'll also be a key voice in shaping how we build — our standards, our reliability posture, and our operational culture.

Requirements

  • 5–7+ years of hands-on experience in cloud infrastructure, DevOps, or Site Reliability Engineering (SRE)
  • Expert-level AWS knowledge: EC2, ELB, ASG, RDS, S3, SQS, Lambda, IAM, VPC, CloudFormation, CDK, and Route 53
  • Solid understanding of Linux administration and networking concepts (VPNs, VPC peering, NAT, DNS, firewalls)
  • Deep experience with Infrastructure as Code (IaC) and container orchestration at scale, using CloudFormation, AWS CDK, Docker, and Kubernetes
  • Excellent written and verbal English communication — you can translate tradeoffs for engineers and stakeholders alike
  • Comfortable working remotely and independently
  • Moderate-level Azure experience, with comfort operating across multi-cloud and multi-region environments
  • Expertise in CI/CD pipelines (AWS CodePipeline, GitLab CI, or similar)
  • Strong proficiency in scripting and automation using Python, Bash, and Ansible
  • Deep understanding of monitoring and logging systems and strategies (Amazon CloudWatch, Datadog, Azure Monitor)
  • Hands-on experience with high-availability architectures and auto-scaling strategies
  • Solid grasp of AWS security best practices: IAM, encryption, Secrets Manager, and security auditing
  • Experience with databases (MySQL, Postgres, Redshift)
  • Familiarity with serverless architectures (AWS Lambda, Fargate)
  • Knowledge of database replication strategies

Nice To Haves

  • Experience in a high-growth, regulated industry (e.g., fintech), specifically architecting and scaling infrastructure to maintain reliability and compliance under rapid growth in users and transaction volume
  • Proven technical leadership experience, including improving infrastructure processes
  • AWS Professional-level certifications (Solutions Architect Professional, DevOps Engineer Professional)
  • Knowledge of event-driven architectures (SNS, SQS, EventBridge)
  • Experience in cost optimization strategies for AWS environments
  • Experience with compliance frameworks and regulations (SOC, NIST, ISO, CCPA, GDPR)
  • Experience integrating AWS services with third-party tools for observability and security
  • Experience with single-tenant and multi-tenant architectures, as well as client on-premises deployment systems
  • Familiarity with scalable, reproducible ML pipelines, tools, and frameworks (Kubeflow, MLflow, Amazon SageMaker)
  • Data engineering experience, including large-scale data processing and storage
  • Knowledge of Jira and Confluence, including best practices for ticket management and KPI tracking

Responsibilities

  • Lead the design and implementation of scalable, secure, and resilient cloud infrastructure across AWS and Azure, supporting both Candidly's AI and SaaS products
  • Drive the architectural vision and strategy, ensuring alignment with long-term business goals and surfacing risks before they become problems
  • Own and enforce best practices for infrastructure as code (IaC), CI/CD, and automated deployments
  • Take the lead on automating and accelerating SDLC processes — identifying bottlenecks in how we build and ship, and designing solutions that make the whole pipeline faster and smoother, whether that involves AI-assisted tooling or traditional automation
  • Serve as a subject matter expert on cloud architecture, containerization, and observability
  • Lead incident response and post-mortems with a focus on systemic improvement, not just immediate fixes
  • Proactively identify and close gaps before they compound