Principal Data Engineer – Safety Analytics (Global Medical Safety)

Johnson & Johnson Innovative Medicine
Horsham, PA
Hybrid

About The Position

We are seeking a Principal Data Engineer to provide technical leadership within Global Medical Safety (GMS), supporting the Safety Analytics organization. This role is focused on building and enabling modern safety analytics tools using AI, Machine Learning, and GenAI, underpinned by robust, compliant, and scalable data engineering on Google Cloud Platform (GCP). The Principal Data Engineer is responsible for end-to-end ownership of safety analytics data engineering, spanning data intake, data quality and continuity, pipeline and architecture design, automation, performance optimization, and compliance. The role enables advanced analytical, machine learning, and predictive capabilities for pharmacovigilance and serves as a technical data engineering leader within Global Medical Safety. This is a Principal-level individual contributor role with broad technical influence, working closely with safety scientists, analytics teams, data scientists, IT, and platform partners to deliver trusted, production-grade analytics capabilities for safety decision-making.

Requirements

  • Master’s degree in Computer Science, Engineering, or a related field (or equivalent experience) is required.
  • 5+ years of experience in data engineering or analytics engineering with increasing responsibilities.
  • Proficiency in Python and SQL.
  • Deep understanding of data architecture for analytics and ML (e.g., batch/streaming, modeling, performance optimization).
  • Proven ability to translate complex problems into clear, concise, and testable programming code/tools.
  • Experience implementing data contracts, data validation, schema versioning, and governance practices, as well as a solid understanding of leading cloud concepts (GCP preferred).
  • Experience designing and operating APIs and microservices-based architectures.
  • Excellent written and verbal communication, customer service, interpersonal, and teamwork skills to foster a collaborative team environment.
  • Solid understanding of SDLC and Agile methodologies, alongside basic project management skills.

Nice To Haves

  • Experience building production workloads on Google Cloud Platform (GCP) is preferred.
  • Experience provisioning infrastructure using Terraform (Infrastructure as Code) and building CI/CD pipelines (e.g., Jenkins) is preferred.
  • Experience in pharmaceuticals, life sciences, healthcare, or a related regulated domain is preferred.
  • GCP certification is preferred.
  • Experience enabling AI/ML and GenAI workflows (e.g., feature engineering, RAG patterns, semantic retrieval) for analytical applications is preferred.

Responsibilities

  • Design and maintain production-grade data pipelines and curated datasets that directly support pharmacovigilance activities, including safety monitoring, analytics, and regulatory reporting.
  • Ensure data engineering solutions produce reproducible, explainable, and trusted analytics outputs suitable for safety decision support and inspection readiness.
  • Enable AI/ML and GenAI workflows for safety analytics, including feature engineering and feature store enablement; embeddings, vectorized representations, and semantic retrieval; and Retrieval-Augmented Generation (RAG) patterns for safety analytics tools.
  • Own the end-to-end data lifecycle for safety analytics, from source system intake through transformation, serving, and downstream analytical consumption, ensuring data continuity, traceability, and integrity.
  • Lead architectural decisions across ingestion, transformation, storage, and serving layers on GCP (e.g., BigQuery, Dataform, object storage).
  • Design, implement, and automate scalable, reusable data pipelines and architectures to support evolving safety analytics needs.
  • Establish and enforce data quality, validation, lineage, and observability standards for safety analytics datasets.
  • Define and implement data governance practices, including data contracts, schema versioning, access control, stewardship, and lifecycle management.
  • Ensure safety analytics data and systems meet Global Medical Safety requirements for reliability, auditability, and regulatory use.
  • Apply GxP validation expertise to data pipelines, analytics services, and supporting infrastructure.
  • Partner with quality and compliance teams to implement CSV/CSA-aligned controls, audit trails, documentation, and organizational change.
  • Balance delivery velocity and innovation with the rigor required for regulated pharmacovigilance systems.
  • Design and build APIs and microservices-based architectures to operationalize safety analytics and ML capabilities (e.g., feature serving, retrieval services, analytics backends).
  • Deploy and operate services on GCP (e.g., Cloud Run, GKE) with a strong focus on security, scalability, and observability.
  • Enforce contract-first integration patterns between producing and consuming systems to ensure reliability and safe evolution.
  • Provision and manage cloud infrastructure using Terraform (Infrastructure as Code) on GCP.
  • Build and maintain CI/CD pipelines (e.g., Jenkins) for data pipelines, analytics services, feature pipelines, and ML data assets.
  • Continuously optimize the performance and cost efficiency of data and analytics infrastructure while maintaining compliance and reliability standards.
  • Serve as a technical authority and data engineering leader for Safety Analytics within Global Medical Safety.
  • Review and influence designs across pipelines, services, feature stores, and AI/ML integrations to maintain a high technical bar.
  • Collaborate closely with safety scientists, epidemiologists, biostatisticians, analytics teams, IT, and platform partners to translate safety needs into scalable technical solutions.
  • Communicate complex technical concepts and tradeoffs clearly to both technical and non-technical stakeholders.
  • Enable and upskill teams through mentorship, guidance, and knowledge sharing on modern data, cloud, and AI technologies.

Benefits

  • Vacation – 120 hours per calendar year
  • Sick time – 40 hours per calendar year; for employees who reside in the State of Colorado – 48 hours per calendar year; for employees who reside in the State of Washington – 56 hours per calendar year
  • Holiday pay, including Floating Holidays – 13 days per calendar year
  • Work, Personal and Family Time – up to 40 hours per calendar year
  • Parental Leave – 480 hours within one year of the birth/adoption/foster care of a child
  • Bereavement Leave – 240 hours for an immediate family member; 40 hours for an extended family member per calendar year
  • Caregiver Leave – 80 hours (10 days) in a 52-week rolling period
  • Volunteer Leave – 32 hours per calendar year
  • Military Spouse Time-Off – 80 hours per calendar year