The ideal candidate will have 7+ years of experience in Linux systems and software management, along with expertise in Terraform, Ansible, and cloud platforms such as AWS, Azure, and GCP. Experience with large-scale distributed systems, monitoring and alerting systems (Prometheus, Grafana), CI/CD pipelines, containers and orchestration (Docker, Kubernetes), and programming languages (Go, Java, Python) is essential. Because we are an AI-first company, this role also heavily involves engineering scalable infrastructure for machine learning workloads, including GPU provisioning and MLOps integrations. A background in implementing security controls, automating deployments, and troubleshooting complex systems is also required.
Job Type: Full-time
Career Level: Mid Level
Number of Employees: 101-250 employees