Data Infrastructure & ML Engineer (Hybrid Role)

Axcelis Technologies, Beverly, MA
$122,133 - $183,200, Hybrid

About The Position

We are seeking a Senior Data Infrastructure & Machine Learning Engineer to design and implement scalable data systems and pipelines that support advanced analytics and machine learning workflows. This is a hybrid role where the primary focus is on data pipeline engineering and Python-based data processing, supported by strong database design and management expertise.

Role Focus (Approximate Split)

  • Data Pipeline Engineering & Data Flow (Critical): ~50%
  • Python & Machine Learning Data Processing: ~30%
  • Database Design & Management: ~20%

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or related field with 5+ years of experience.
  • Strong experience in database design and SQL-based systems.
  • Hands-on experience with distributed systems, partitioning, and sharding.
  • Proven experience building data pipelines (ETL/ELT).
  • Strong proficiency in Python for data processing.
  • Experience working with log-based and semi-structured data (e.g., JSON).
  • Understanding of data traceability, validation, and governance.

Nice To Haves

  • Experience with time-series or log analytics systems.
  • Exposure to real-time/streaming architectures (e.g., Kafka).
  • Experience with cloud platforms (Azure, AWS, or GCP).
  • Familiarity with machine learning workflows and lifecycle.
  • Domain experience in semiconductor or high-throughput systems.

Responsibilities

  • Design and build end-to-end data pipelines (ETL/ELT) for ingesting, processing, and transforming data.
  • Handle multiple data sources, including tool-generated logs (e.g., AT log files) and JSON and other semi-structured data.
  • Ensure full data traceability, so that every data point can be traced back to its source.
  • Implement validation, monitoring, and error handling to ensure data quality and reliability.
  • Design and manage scalable database schemas.
  • Support both single-node and distributed database environments.
  • Implement tablespaces, partitioning, and sharding strategies to ensure performance and scalability.
  • Optimize queries and maintain high performance for large-scale datasets.
  • Develop data processing workflows using Python.
  • Work extensively with dataframes for transformation and analysis.
  • Utilize libraries such as Pandas and NumPy for data manipulation, and Plotly (or similar) for visualization and exploratory analysis.
  • Automate data workflows and integrate them into pipelines.
  • Prepare and transform datasets for machine learning models.
  • Collaborate with data scientists and engineers to support model training and deployment workflows.
  • Enable scalable data foundations for AI/ML integration into production systems.

Benefits

  • Axcelis Team Incentive bonus plan
  • Comprehensive benefits package (for regular employees working 20+ hours a week)