Staff Engineer

Newfire Global Partners · Boston, MA
$163,547 - $208,287 · Remote

About The Position

Newfire Global Partners is a leading technology firm that specializes in building transformative software solutions for some of the world’s most innovative companies. With a presence across four continents, Newfire Global brings deep expertise in digital healthcare, AI-driven analytics, and enterprise technology. The firm’s track record of delivering scalable, high-impact solutions has made it a trusted partner for organizations seeking to drive meaningful change through technology. We are passionate about our purpose-driven mission to help improve the quality of care for patients and are building a collaborative, innovative, and inclusive culture. We are a fully funded company founded by serial entrepreneurs with a stable client base.

Opportunity For Impact

Newfire Global Partners, a leader in developing disruptive healthcare technology, collaborates with Fortune 500 companies and start-ups to drive transformation. Newfire is seeking a Staff Engineer with SaaS experience to drive the modernization and spearhead the development of a unified, next-generation clinical technology platform for a current healthcare client. This new platform will serve as the foundation for all clinical data and operations across every line of business. By creating a robust and innovative solution, this leader will enable enhanced care management, utilization management, and data-driven decision-making, ultimately working to improve healthcare outcomes for millions of Americans.

Role & Responsibilities

The Staff Engineer on this transformative project is responsible for building and leading a collaborative team of engineers. The ideal candidate brings deep expertise in Python and Apache Spark, a strong foundation in cloud-based data infrastructure, and a proven ability to architect the pipelines and platforms that power machine learning model training, deployment, and monitoring. Based in the US with East Coast availability preferred, this engineer will play a foundational role.

Requirements

  • Deep expertise in Python — including data engineering libraries, pipeline development, testing, and production-grade code quality.
  • Strong hands-on experience with Apache Spark for large-scale distributed data processing, optimization, and performance tuning.
  • Proven experience designing and maintaining data platforms including data lakes, lakehouses, or data warehouse architectures (e.g., Delta Lake, Iceberg, Hudi).
  • Experience building and orchestrating data pipelines using tools such as Apache Airflow, Prefect, Dagster, or equivalent.
  • Solid understanding of ML platform concepts — feature stores, training data pipelines, model registries, and experiment tracking (e.g., MLflow, Feast).
  • Proficiency with cloud data platforms, preferably Azure (Azure Data Factory, Azure Databricks, Azure Synapse, ADLS) or equivalent AWS/GCP services.
  • Strong knowledge of data modeling, schema design, and data warehousing principles for both analytical and ML workloads.
  • Experience with data quality frameworks and observability tooling (e.g., Great Expectations, Monte Carlo, dbt tests).
  • Familiarity with infrastructure as code and DevOps practices — Terraform, Docker, Kubernetes, or equivalent.
  • Solid understanding of data security, access controls, and compliance requirements in regulated industries.

Responsibilities

  • Design, build, and maintain scalable, reliable data pipelines using Python and Apache Spark to support data science and ML workflows.
  • Architect and own the data platform infrastructure—including data lakes, data warehouses, and feature stores—ensuring performance, quality, and governance at scale.
  • Partner closely with data scientists and ML engineers to build and maintain the data foundations required for model training, validation, and deployment.
  • Define and implement data engineering best practices including pipeline orchestration, data quality frameworks, lineage tracking, and observability.
  • Lead the design of reusable data assets—feature engineering pipelines, curated datasets, and domain-specific data models—that accelerate ML experimentation and production readiness.
  • Collaborate with platform and DevOps teams to operationalize data infrastructure through CI/CD pipelines, infrastructure as code, and automated testing.
  • Evaluate and introduce modern data tooling and frameworks, driving continuous improvement in the data engineering ecosystem.
  • Establish and enforce data governance, security, and compliance standards aligned with HIPAA and healthcare data requirements.
  • Conduct design reviews and technical mentorship for senior and mid-level data engineers across the organization.
  • Serve as a cross-functional technical authority, aligning data engineering direction with product, clinical, and analytics stakeholders.

Benefits

  • Medical, dental & vision coverage
  • Health spending accounts
  • Voluntary benefits
  • Leave of absence policies
  • Employee Assistance Program
  • 401(k) program with employer contribution
  • Flexible work schedules and time-off policy
  • Company equipment for all new full-time US-based remote employees