Data Engineer: W2 Onsite Role

Haramain Systems · Jersey City, NJ · Onsite

About The Position

We are looking for a highly skilled Senior Data Engineer with strong expertise in PySpark, Python, SQL, and database technologies, along with exposure to data science and AI/ML techniques. The ideal candidate will design and optimize scalable data pipelines, collaborate with cross-functional teams, and contribute to the development of analytical and machine learning-driven solutions.

Requirements

  • 5+ years of experience in Data Engineering with strong hands-on work in PySpark.
  • Strong proficiency in Python, including libraries for data processing.
  • Advanced knowledge of SQL and performance optimization techniques.
  • Experience with distributed data systems (Spark, Databricks, Hive, or similar).
  • Exposure to AI/ML workflows, including model deployment or MLOps.
  • Solid understanding of data modeling, warehousing concepts, and ETL/ELT architectures.

Nice To Haves

  • US Healthcare domain experience (HIPAA, claims data, EHR/EMR, HL7, FHIR, etc.).
  • Experience with cloud platforms (Azure, AWS, GCP).
  • Knowledge of MLflow, Airflow, or similar tools.

Responsibilities

  • Design, develop, and optimize large-scale ETL/ELT pipelines using PySpark and distributed data processing frameworks.
  • Build high-performance data ingestion workflows from diverse structured and unstructured sources.
  • Implement scalable data models, data marts, and warehousing solutions.
  • Write clean, modular, and optimized code using Python for data processing and automation.
  • Develop complex SQL queries, stored procedures, and performance-tuned database operations.
  • Work with relational and NoSQL databases (e.g., MySQL, PostgreSQL, SQL Server, MongoDB).
  • Partner with Data Science teams to productionize ML models and enable ML-driven pipelines.
  • Contribute to model deployment, feature engineering, and ML workflow optimization.
  • Integrate ML models into scalable data platforms.
  • Ensure data quality, reliability, lineage, and governance across data workflows.
  • Drive best practices in coding, testing, CI/CD, and cloud-based deployments.
  • Work with cross-functional teams to translate business requirements into robust data solutions.