Staff Data Engineer - AIOps

American Express - Phoenix, AZ

About The Position

At American Express, our culture is built on a 175-year history of innovation, shared values and Leadership Behaviors, and an unwavering commitment to back our customers, communities, and colleagues. From delivering differentiated products to providing world-class customer service, we operate with a strong risk mindset, ensuring we continue to uphold our brand promise of trust, security, and service. As part of Team Amex, you'll experience this powerful backing with comprehensive support for your holistic well-being and many opportunities to learn new skills, develop as a leader, and grow your career. Here, your voice and ideas matter, your work makes an impact, and together, you will help us define the future of American Express.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, Data Science, or equivalent practical experience; advanced degree preferred
  • Strong knowledge of machine learning fundamentals, including supervised/unsupervised learning, time-series, NLP, and model evaluation techniques
  • Hands-on knowledge of Generative AI and LLM ecosystems, including transformers, embeddings, vector databases, prompt engineering, RAG patterns, and agentic frameworks
  • Deep understanding of data platform and storage technologies, including relational, NoSQL, columnar, graph, and vector stores
  • Knowledge of distributed systems and cloud-native architectures, including containerization, orchestration, and service-based design
  • Familiarity with model governance, explainability, bias detection, and AI risk management in enterprise environments
  • Strong understanding of data formats and APIs (JSON, Parquet, Avro, XML), schema management, and metadata systems
  • Significant experience in data engineering, ML engineering, or AI platform engineering roles
  • Strong hands-on programming experience in Python (required); experience with Java, Scala, or similar languages is a plus
  • Experience building and operating ML pipelines and AI platforms using tools such as Airflow, Kubeflow, MLflow, SageMaker, Vertex AI, or equivalent
  • Experience with GenAI frameworks and tooling (e.g., LangChain, LlamaIndex, OpenAI/Vertex APIs, vector databases like Pinecone, FAISS, or similar)
  • Experience designing and scaling large-scale data systems across technologies such as BigQuery, Spanner, Hive, HBase, NoSQL stores, relational databases, and streaming platforms
  • Experience with cloud-based data and AI platforms (AWS, GCP, Azure), including cost optimization and performance tuning for AI workloads
  • Proven experience leading, mentoring, and influencing senior engineers and cross-functional teams
  • Experience integrating AI solutions into infrastructure, observability, reliability engineering, or operational platforms is strongly preferred
  • Experience with production-grade CI/CD, monitoring, and automation for data and AI systems

Responsibilities

  • Leads and mentors engineers across Data Engineering, ML Engineering, and AI Ops, fostering a culture of technical excellence, experimentation, and production-grade AI delivery at scale
  • Designs, builds, and operates end-to-end AI Ops platforms supporting machine learning, generative AI, and agentic workflows, from data ingestion and feature engineering through model training, deployment, monitoring, and lifecycle management
  • Develops AI-enabled systems hands-on, including ML pipelines, LLM-based applications, retrieval-augmented generation (RAG), prompt pipelines, agent orchestration, and model inference services
  • Defines and implements scalable data and feature pipelines optimized for AI/ML workloads, ensuring high data quality, lineage, reproducibility, and compliance with enterprise governance standards
  • Leads MLOps and LLMOps practices, including CI/CD for models, automated testing and validation, model versioning, experiment tracking, drift detection, performance monitoring, and rollback strategies
  • Oversees integration of diverse structured and unstructured data sources (batch and streaming) to support analytics, ML, and GenAI use cases across global infrastructure operations
  • Partners closely with infrastructure, platform, security, and product teams to embed AI capabilities into operational systems, observability platforms, reliability engineering, and automation workflows
  • Conducts architecture and design reviews for AI platforms, data systems, and ML pipelines, ensuring solutions meet scalability, reliability, security, and cost-efficiency requirements
  • Drives AI Ops automation initiatives, leveraging ML and GenAI to improve incident detection, root cause analysis, capacity forecasting, anomaly detection, and self-healing infrastructure
  • Monitors and optimizes AI and data workflows, ensuring adherence to delivery timelines, sprint commitments, and best practices in DevOps, DataOps, and AI Ops
  • Influences enterprise AI strategy by evaluating emerging AI/ML technologies, frameworks, and platforms, and guiding their adoption in a regulated, production environment

Benefits

  • Competitive base salaries
  • Bonus incentives
  • 6% Company Match on retirement savings plan
  • Free financial coaching and financial well-being support
  • Comprehensive medical, dental, vision, life insurance, and disability benefits
  • Flexible working model with hybrid, onsite or virtual arrangements depending on role and business need
  • 20+ weeks paid parental leave for all parents, regardless of gender, offered for pregnancy, adoption or surrogacy
  • Free access to global on-site wellness centers staffed with nurses and doctors (depending on location)
  • Free and confidential counseling support through our Healthy Minds program
  • Career development and training opportunities