About The Position

The GEHC Advanced Visualization Solutions (AVS) segment, a fast-growing business in GE HealthCare, is the global leader in ultrasound medical devices and solutions. The portfolio spans the continuum of care, supporting customers in ultrasound screening, diagnosis, treatment, and monitoring of diseases. Customer demand to improve efficiency in radiology and beyond, and to increase user confidence in order to deliver better clinical outcomes, continues to grow. Consequently, the need for AI, digital solutions, and automation that connect devices and software in one seamless ecosystem continues to grow as well.

The Lead DevOps Engineer architects, secures, and operates the multi-cloud infrastructure (GCP and AWS) that powers ML research, model training/inference, and production software for ultrasound image analysis. This engineer is the technical owner of our cloud platform: designing scalable environments, enabling high-throughput data operations, optimizing cost/performance, and partnering closely with ML researchers, data engineers, and application teams. The role combines hands-on engineering with technical leadership, with a strong emphasis on data governance, security/compliance (e.g., HIPAA), and ML platform reliability.

Requirements

  • 7+ years in DevOps/SRE/Platform roles, including multi-cloud (AWS/Azure/GCP) experience.
  • Deep proficiency with Terraform, CI/CD (GitHub Actions/GitLab/CodeBuild/Cloud Build), and Kubernetes (EKS/GKE).
  • Hands-on experience with GPU workloads for ML training/inference and object storage patterns for large image datasets.
  • Proven track record in data migration (cloud-to-cloud), structured data ingestion (e.g., BigQuery/Redshift/Postgres), and schema/governance.
  • Strong security mindset: IAM, secrets, KMS, network isolation, private endpoints, encryption, auditability.
  • Demonstrated cost optimization (FinOps) across compute/storage/networking with measurable savings.
  • Excellent cross-functional communication; ability to lead architectural direction and mentor engineers.

Nice To Haves

  • Experience with Vertex AI and/or SageMaker.
  • Knowledge of medical imaging formats (DICOM), de-identification, and regulated environments (HIPAA, SOC 2).
  • Observability stacks: Cloud Monitoring/Logging, Prometheus/Grafana, OpenTelemetry.
  • Container security and supply chain: SBOMs, image signing (Cosign), policy enforcement (OPA/Gatekeeper).
  • Proven ability to sunset legacy environments and perform compliant archival and data retention.
  • Scripting and tooling in Python; CLIs and SDK automation for AWS/GCP.

Responsibilities

  • Partner with ML research, data engineering, and application teams to translate requirements into reliable, secure, and cost-effective platform capabilities.
  • Lead design reviews, RFCs, and proof-of-concepts; mentor team members on cloud, Kubernetes, and data best practices.
  • Own incident response for platform components and drive continuous improvement through automation and standards.
  • Design and implement secure, scalable, multi-cloud (GCP + AWS) configurations.
  • Establish and maintain infrastructure as code (IaC) standards with Terraform.
  • Lead cloud-to-cloud data migration (e.g., GCS ↔ S3) including secure transfer planning, checksum/manifest validation, parallelization, and cutover strategy.
  • Implement robust ingestion pipelines for medical images and metadata into structured data stores (e.g., BigQuery/Redshift/Postgres) with schema management, versioning, and data lineage.
  • Create tools/services for dataset definition, preprocessing, curation, de-identification, and data quality checks.
  • Architect and manage GPU/CPU clusters for distributed training and batch inference using managed services (e.g., SageMaker) and/or Kubernetes (EKS with autoscaling).
  • Optimize storage tiers (S3/GCS, Glacier/Archive, Filestore/FSx, EBS/PersistentDisk) and caching strategies for high-throughput image workloads.
  • Establish cost observability (per team/project/workload) with budgets, alerts, showback/chargeback, and automated idle resource cleanup.
  • Right-size compute/storage, leverage reserved/committed usage, spot/preemptible strategies, and data lifecycle policies.
  • Partner with ML teams to optimize training job efficiency (e.g., mixed precision, checkpointing strategies, data locality, sharding) and autoscaling.
  • Own permissions and access management across clouds (AWS IAM, GCP IAM) with least privilege, role/attribute-based access, and service identities.
  • Implement secrets management (e.g., AWS Secrets Manager, GCP Secret Manager, HashiCorp Vault) and key management (KMS).
  • Support compliance and security controls relevant to healthcare/PHI (e.g., HIPAA, SOC 2): encryption in transit/at rest, audit logging, VPC Service Controls, private endpoints, and incident response runbooks.
  • Plan and execute the wind-down of and exit from prior cloud providers: data egress, dependency mapping, app cutover, contract/savings plan termination, and archival with retention policies.
  • Validate post-migration integrity and performance; document the final state and reduce operational surface area.
  • Stand up and maintain managed ML platforms (Vertex AI, SageMaker) or managed Kubernetes clusters (GKE/EKS) with CI/CD for pipelines, images, and deployments.
  • Provide platform abstractions (templates, Helm charts, Terraform modules) for ML engineering and app teams to self-serve safely.
  • Partner with data/ML teams to codify data management practices: versioned datasets, reproducible preprocessing, clear lineage, and documentation.
  • Build internal tools/CLIs to automate data prep, dataset validation, and catalog updates; integrate with governance/catalog platforms where applicable.

Benefits

  • GE HealthCare offers a competitive benefits package, including but not limited to medical, dental, vision, paid time off, a 401(k) plan with employee and company contribution opportunities, life, disability, and accident insurance, and tuition reimbursement.