Sr. Data Infrastructure Engineer

Evolv Technologies Holdings · Waltham, MA
Onsite

About The Position

Join Evolv as a Senior Data Infrastructure Engineer in the Machine Learning & Sensors organization, where you will build and operate the scalable, secure, and reliable data pipelines that power our AI/ML research and production systems. In this role, you will own the end‑to‑end data lifecycle: from collection on thousands to millions of edge devices, through cloud ingestion and processing, into a centralized data factory that enables model training, evaluation, and continuous improvement.

Data is the backbone of our mission to deliver best‑in‑class AI‑based weapon detection systems. You will ensure that data flows seamlessly across geographies, devices, and cloud systems while meeting strict requirements for quality, privacy, security, and scale. This role is ideal for someone who thrives at the intersection of distributed systems, cloud pipelines, and ML‑driven data needs.

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, Software Engineering, or related field.
  • 2-3+ years of experience building production data pipelines and data platforms that support AI/ML models.
  • Strong proficiency in Python, C++, and distributed data processing frameworks.
  • Hands‑on experience with AWS services including S3, EC2, SageMaker, and Glue.
  • Experience designing data systems that support large‑scale ML training and experimentation.
  • Knowledge of data governance, access control, and lifecycle management.
  • Experience collaborating with ML, data science, operations, and cloud teams.

Nice To Haves

  • Experience building pipelines spanning edge devices and cloud systems.
  • Background working with large‑scale sensor, image or IoT data.
  • Familiarity with data labeling tools and annotation workflows.
  • Experience implementing dataset versioning, lineage, and reproducibility systems.
  • Understanding of privacy, compliance, or regulated data environments.
  • Experience supporting global, multi‑region data platforms.

Responsibilities

  • End‑to‑End Data Pipeline Ownership
      ◦ Design, build, and maintain both research and production data pipelines spanning edge devices, cloud services, and centralized data platforms.
      ◦ Own the full data lifecycle: collection, ingestion, processing, obfuscation, versioning, access, retention, and retirement.
  • Edge‑to‑Cloud Data Flow
      ◦ Develop resilient ingestion pipelines capable of handling variable connectivity and device heterogeneity.
      ◦ Support secure data transfer from the field to cloud storage systems.
      ◦ Collaborate with field operations to enhance data coverage, observability, and operational robustness.
  • Data Quality, Governance & Compliance
      ◦ Implement privacy‑preserving transformations and obfuscation pipelines.
      ◦ Build automated cleaning and validation steps to remove duplicates, detect corruption, and validate metadata.
      ◦ Establish data lineage, retention policies, and access controls that ensure compliance and traceability.
  • Data Services for AI/ML
      ◦ Provide scalable data services for model training, evaluation, and research experimentation.
      ◦ Support continuous data refresh and retraining workflows.
      ◦ Integrate with data labeling services and annotation workflows.
      ◦ Enable efficient access patterns for large‑scale ML workloads.
  • AWS‑Based Cloud Infrastructure
      ◦ Build and optimize pipelines using AWS services (S3, EC2, SageMaker, Lambda, Glue, Step Functions).
      ◦ Design for cost‑efficiency, performance, and reliability at scale.
  • Collaboration & Feedback Loops
      ◦ Partner with AI/ML engineers, scientists, and data scientists to understand data requirements.
      ◦ Translate feedback into automated improvements in data collection, labeling, and consumption.
      ◦ Support cross‑functional teams in exploratory analysis and debugging of data issues.
  • Scaling the Data Factory
      ◦ Design and manage data schemas, data versioning, and data factory updates.
      ◦ Architect systems that scale globally across millions of devices.
      ◦ Ensure the data platform remains flexible for research and reliable for production operations.

Benefits

  • Equity as part of your total compensation package
  • Medical, dental, and vision insurance
  • Health Savings Account (HSA)
  • A 401(k) plan with a 2% company match
  • Flexible Paid Time Off (PTO): take the time you need to recharge, with manager approval and business needs in mind
  • Quarterly stipend for perks and benefits that matter most to you
  • Tuition reimbursement to support your ongoing learning and development
  • Subscription to Calm