Senior Data/ML Engineer (AWS)

CapNexus
New York, NY
Remote

About The Position

CapNexus is looking for a highly skilled Senior AWS Data/ML Engineer to lead data architecture, pipeline development, and data integrations. This is an opportunity to apply advanced cloud data engineering skills on a platform that uses generative AI to automate and modernize enterprise workflows.

Requirements

  • 5+ years of data engineering or ML engineering experience, including at least 2 years in AWS cloud environments.
  • Strong proficiency in Python and SQL; experience with AWS data services including S3, Glue, Athena, Kinesis, and Step Functions.
  • Hands-on experience with Amazon SageMaker for model development, training, tuning, and endpoint deployment.
  • Working knowledge of Amazon Bedrock for integrating and applying foundation models in production-grade pipelines.
  • Experience designing and implementing multi-zone data lake architectures on Amazon S3, including lifecycle policies and Lake Formation governance.
  • Familiarity with Kiro CLI or comparable AI-assisted/agentic development tooling.
  • Experience with entity resolution, deduplication, or master data management concepts and tools.
  • Solid understanding of data modeling, feature engineering, data quality practices, and ML integration testing.
  • Experience with AWS Lambda and AWS Step Functions for serverless workflow orchestration.
  • Familiarity with Amazon API Gateway for exposing data services and model endpoints.
  • Strong analytical, problem-solving, and communication skills; comfortable working in Agile/Scrum teams alongside AWS Professional Services.

Nice To Haves

  • Experience with Azure Data Lake, Azure Data Factory, or Azure Synapse, particularly in cloud-to-cloud migration contexts.
  • Familiarity with Amazon Entity Resolution for customer identity and deduplication use cases.
  • Experience with MLOps practices including model monitoring, drift detection, and automated retraining on SageMaker.
  • Experience with LLM prompt engineering, RAG architectures, or fine-tuning workflows on Amazon Bedrock.
  • Knowledge of Amazon QuickSight for analytics dataset preparation and embedded dashboard development.
  • AWS Certification (Machine Learning Specialty, Data Analytics Specialty, or Solutions Architect).
  • Background in real estate, property management, marketing technology, or insurance industries.

Responsibilities

  • Participate in data discovery workshops to inventory source systems including property management platforms, marketing channels, and CRM data, and translate findings into data lake architecture requirements.
  • Design and implement a multi-zone enterprise data lake on Amazon S3 (raw, conformed, enriched, aggregated) with ingest, cleansing, and business layers aligned to the statement of work (SOW) architecture.
  • Build batch and streaming data ingestion pipelines using AWS Glue, Amazon Kinesis, and AWS Data Pipeline across Customer Data Platform (CDP), marketing, and property management data sources.
  • Implement data transformation and orchestration frameworks using AWS Glue ETL and AWS Step Functions, including AWS Glue Data Catalog for metadata management and discovery.
  • Configure Amazon Athena for serverless SQL querying across the data lake; support QuickSight integration with curated datasets for business analytics.
  • Develop and deploy ML models on Amazon SageMaker for lead scoring, predictive maintenance, intelligent underwriting risk scoring, and AI-powered audience segmentation.
  • Integrate Amazon Bedrock foundation models to enable generative AI capabilities including customer profile enrichment, hyper-personalization, and intelligent marketing automation.
  • Use Kiro CLI to accelerate AI-assisted development workflows, spec-driven pipeline implementation, and automated code generation tasks.
  • Design and implement entity resolution pipelines using Amazon Entity Resolution to identify, deduplicate, and merge customer records into unified golden records.
  • Implement real-time and batch data synchronization pipelines between source systems and the Customer Data Platform (CDP).
  • Support Azure data lake migration: conduct discovery, assess schemas and transformation logic, provision AWS target environments, execute migration via AWS DataSync, and perform data validation and reconciliation.
  • Implement data lake security using AWS Lake Formation, including row-level security and column-level encryption.
  • Build and maintain data models to support Customer 360 views, ML feature stores, and executive analytics dashboards.
  • Ensure data quality, validation, and integrity across all pipeline stages and ML model outputs; support user acceptance testing (UAT) for data-dependent features.
  • Collaborate with Full Stack, DevOps/MLOps, and AWS engagement teams; contribute to architecture documentation, pipeline runbooks, and data governance documentation.

Benefits

  • Remote work