About The Position

Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The Enterprise Information Security (EIS) team is responsible for cybersecurity across the organization, supporting the business and members by reducing risk, rapidly responding to threats, focusing on business resiliency, and securing new acquisitions. The Principal AI / Machine Learning Data Engineer role focuses on designing and building scalable data platforms that enable advanced analytics, machine learning, and AI-driven solutions. This role will support the development of intelligent systems that process large-scale event and operational data, enabling faster insights, automation, and decision-making across the organization. The position sits at the intersection of data engineering, machine learning, and AI, with an emphasis on building modern data pipelines and enabling production-grade AI capabilities.

Requirements

  • Bachelor’s degree or equivalent experience
  • 5+ years of experience designing, building, and operating production data pipelines and platforms
  • 5+ years of hands-on development with Python (preferred) and/or Java, including code reviews, packaging, and deployment
  • 5+ years of experience with Spark (PySpark) and Databricks (or similar distributed data processing platform)
  • 2+ years of experience building and deploying Generative AI use cases in production environments
  • Solid SQL skills and experience working with data lakes and warehouses (e.g., Databricks, Snowflake)
  • Experience building ingestion frameworks for structured and unstructured data (e.g., event/log, semi-structured JSON), including parsing and enrichment patterns
  • Experience designing and scaling ELT/ETL frameworks with orchestration tools such as Airflow (or equivalent)
  • Experience implementing data quality, observability, and monitoring practices (e.g., data quality checks, pipeline SLAs/SLOs, alerting)
  • Experience with metadata, lineage, and governance concepts and tooling (e.g., data catalogs, lineage, access controls)
  • Experience with data modeling best practices for analytics and ML use cases
  • Experience with DevOps and CI/CD practices and tools (e.g., GitHub Actions), containerization, and infrastructure-as-code (e.g., Docker, Kubernetes, Terraform)
  • Experience supporting ML/AI workflows (feature engineering, data preparation, and model deployment enablement); exposure to MLOps practices is a plus
  • Demonstrated ability to partner with cross-functional stakeholders, translate requirements into technical solutions, and lead through influence

Nice To Haves

  • Experience with cloud platforms such as AWS, Azure, or Google Cloud, including managed data services
  • Experience with streaming and event-driven architectures (e.g., Kafka, Kinesis, Event Hubs)
  • Experience with data quality and validation frameworks (e.g., Great Expectations, Deequ) and/or data observability tooling
  • Experience enabling MLOps practices (e.g., feature stores, model registries, experiment tracking, deployment automation)
  • Experience with lakehouse architectures, Delta Lake, and advanced Spark optimization/performance tuning
  • Experience with data visualization tools and libraries such as Plotly, seaborn, and Chart.js
  • Experience with machine learning and predictive analytics
  • Familiarity with security and privacy concepts for data platforms (e.g., least privilege, PII/PHI handling) and working with compliance partners

Responsibilities

  • Design, develop, and maintain scalable data pipelines and data platforms supporting analytics, machine learning, and AI use cases
  • Build and optimize ingestion frameworks for large-scale structured and unstructured data, including streaming and event-driven sources
  • Partner with cross-functional stakeholders to understand evolving data and AI needs and define long-term technical solutions
  • Enable and support machine learning and AI workflows, including feature engineering, data preparation, and model deployment support
  • Drive strategic initiatives around Generative AI, data quality, observability, lineage, and governance
  • Develop and maintain frameworks that support rapid experimentation and deployment of AI/ML solutions
  • Introduce and evolve best practices in data modeling, orchestration, testing, and monitoring
  • Identify and champion opportunities for platform scalability, performance optimization, and cost efficiency
  • Collaborate with product, analytics, and infrastructure teams to deliver high-impact data and AI solutions
  • Build and maintain reusable parsing, enrichment, analytic, and service libraries to accelerate delivery across teams
  • Work comfortably under time-sensitive conditions while ensuring thoroughness
  • Maintain high ethical standards, objectivity, and confidentiality

Benefits

  • Comprehensive benefits package
  • Incentive and recognition programs
  • Equity stock purchase
  • 401(k) contribution

What This Job Offers

  • Job Type: Full-time
  • Career Level: Principal
  • Number of Employees: 5,001-10,000 employees
