DevOps Engineer

4MindsAI | Dallas, TX
$140,000 - $160,000 | Onsite

About The Position

Mission

4Minds is an enterprise AI fine-tuning platform that transforms how organizations build and operate private, domain-specific AI. Unlike static systems, 4Minds’s AI platform learns continuously from live data in real time and can be deployed on-prem or with your cloud provider. Our patented technologies scale existing engineering teams and empower new AI teams, enabling rapid AI deployment, adaptation, and ROI. Through 4Minds’s automated data pipeline and proprietary knowledge graph, enterprises can connect all their data sources, including Microsoft, Databricks, AWS, and Google, creating adaptive AI that surpasses the capabilities of conventional RAG-based systems.

Role Overview

We're seeking a DevOps Engineer to build and maintain the infrastructure that powers our enterprise AI platform across cloud and on-premises environments. You'll design scalable deployment pipelines, ensure system reliability, and enable our engineering teams to ship faster while maintaining enterprise-grade security and compliance standards. You'll take on the infrastructure lifecycle, from provisioning through monitoring, for the frontend and backend of our platform, and support our AI teams in optimizing how we build, deploy, and run AI workloads at scale. Our hybrid deployment model, supporting both cloud and on-prem installations, creates unique challenges that require creative solutions. Reporting to our CTO, you'll have significant autonomy to collaborate on and establish DevOps practices, select tooling, and shape how 4Minds delivers reliable, secure AI infrastructure to enterprise customers.

Requirements

  • BS in Computer Science, Engineering, or related technical field
  • 5+ years of experience in DevOps, SRE, or infrastructure engineering roles
  • Strong proficiency with cloud platforms (AWS, GCP, or Azure), including compute, networking, and security services
  • Hands-on experience with Kubernetes in production environments, including deployment, scaling, and troubleshooting
  • Expertise with infrastructure-as-code tools (Terraform, Pulumi, CloudFormation, or similar)
  • Experience building and maintaining CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, or similar)
  • Strong scripting skills in Python, Bash, or Go for automation
  • Solid understanding of networking fundamentals, including DNS, load balancing, and firewalls
  • Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, or similar)
  • Ability to work autonomously and drive technical decisions in a fast-paced environment
  • Clear technical communication with both technical and non-technical stakeholders
  • Deep ownership mindset: you care about outcomes, not job titles

Nice To Haves

  • MS in Computer Science, Engineering, or related technical field
  • 7+ years of experience in DevOps, SRE, or infrastructure engineering roles
  • Experience supporting AI/ML infrastructure, including GPU clusters and model serving
  • Background with on-premises or hybrid cloud deployments for enterprise customers
  • Experience with data pipeline infrastructure (Kafka, Airflow, or similar)
  • Familiarity with security compliance frameworks (SOC 2, HIPAA, FedRAMP)
  • Track record of establishing DevOps practices and culture on engineering teams
  • Experience with service mesh technologies (Istio, Linkerd)
  • Contributions to open-source infrastructure projects
  • Previous enterprise software or B2B SaaS experience

Responsibilities

  • Design, implement, and maintain CI/CD pipelines for automated building, testing, and deployment of AI platform components
  • Manage infrastructure-as-code across AWS, GCP, Azure, and on-premises environments using Terraform, Pulumi, or similar tools
  • Build and maintain Kubernetes clusters optimized for AI/ML workloads, including GPU scheduling and resource management
  • Implement monitoring, logging, and alerting systems to ensure platform reliability and rapid incident response
  • Develop and enforce security best practices, including secrets management, access controls, and compliance automation
  • Collaborate with engineering teams to containerize applications and optimize deployment workflows
  • Create and maintain documentation for infrastructure, deployment procedures, and runbooks
  • Automate operational tasks to reduce toil and improve team velocity
  • Support enterprise customer deployments, including on-premises installations with unique infrastructure requirements
  • Optimize infrastructure costs while maintaining performance and reliability standards

Benefits

  • Comprehensive medical, dental, and vision coverage (80% employer-paid)
  • 401(k) plan with company match
  • Unlimited PTO policy with 15 days minimum
  • 11 paid company holidays
  • Flexible Spending Account (FSA) and Health Savings Account (HSA) options
  • Annual training and certification budget
  • Access to online learning platforms
  • Conference attendance opportunities
  • Regular internal technical workshops and knowledge sharing sessions