DevOps Engineer

Centific

About The Position

Centific is a frontier AI data foundry that curates diverse, high-quality data, using our purpose-built technology platforms to empower the Magnificent Seven and our enterprise clients with safe, scalable AI deployment. Our team includes more than 150 PhDs and data scientists, along with more than 4,000 AI practitioners and engineers. We harness the power of an integrated solution ecosystem—comprising industry-leading partnerships and 1.8 million vertical domain experts in more than 230 markets—to create contextual, multilingual, pre-trained datasets; fine-tuned, industry-specific LLMs; and RAG pipelines supported by vector databases. Our zero-distance innovation™ solutions for GenAI can reduce GenAI costs by up to 80% and bring solutions to market 50% faster. Our mission is to bridge the gap between AI creators and industry leaders by bringing best practices in GenAI to unicorn innovators and enterprise customers. We aim to help these organizations unlock significant business value by deploying GenAI at scale, helping to ensure they stay at the forefront of technological advancement and maintain a competitive edge in their respective markets.

Requirements

  • Strong experience operating and supporting Azure-based production platforms.
  • Proven expertise in Site Reliability Engineering (SRE) or DevOps practices.
  • Advanced PowerShell scripting skills.
  • Hands-on experience designing and operating CI/CD pipelines, using Azure DevOps or equivalent tooling.
  • Experience designing and implementing automation for infrastructure and operations at enterprise scale.
  • Strong background in Windows-based platforms and infrastructure.
  • Experience with SQL Server and Azure SQL.
  • Ability to design and build internal tools or services.
  • Experience with monitoring, logging, and telemetry systems.
  • Strong collaboration and communication skills.

Responsibilities

  • Operating and supporting Azure-based production platforms, including ownership of mission-critical services.
  • Participating in on-call rotations, live-site incident response, and post-incident root cause analysis (RCA) and remediation.
  • Managing SLAs and SLOs, and driving reliability engineering, performance tuning, and proactive resiliency improvements.
  • Building secure, scalable automation frameworks and tooling used across large environments.
  • Designing and operating CI/CD pipelines, using Azure DevOps or equivalent tooling, to support deployment, patching, remediation, and operational workflows.
  • Designing and implementing automation for infrastructure and operations, such as VM patching, system remediation, service validation, or data fixes, at enterprise scale.
  • Supporting Windows-based platforms and infrastructure, including Windows Server, IIS, system patching, monitoring, and troubleshooting in production environments.
  • Working with SQL Server and Azure SQL, including availability groups, managed instance migrations, and operational support or remediation workflows.
  • Designing and building internal tools or services, using technologies such as PowerShell, WPF, or web frameworks, to reduce operational toil and improve engineering efficiency.
  • Working with monitoring, logging, and telemetry systems, including diagnosing issues using logs and metrics and reducing operational noise through alert optimization or suppression.
  • Collaborating cross-functionally with Security, Compliance, and partner engineering teams to deliver secure, compliant, and reliable solutions.