Data Scientist/Analyst

MedReview
Austin, TX · Remote

About The Position

MedReview is seeking a talented Data Scientist/Analyst to join our team and help build a robust, scalable feature store. This individual will bridge the gap between data analysis, feature engineering, and MLOps, ensuring that features for machine learning models are discoverable, consistent, and readily available for both training and real-time serving. The ideal candidate is organized, self-motivated, and able to work in a dynamic environment with a global workforce and a rapidly expanding business. This position is based in Austin, Texas; however, for the right fit, we may consider remote candidates.

Requirements

  • Bachelor's or Master's degree in Computer Science, Data Science, Statistics, or a related quantitative field.
  • 3+ years of experience in data science, data engineering, or MLOps roles, with a focus on data infrastructure and machine learning workflows.
  • 1+ years of experience at a health care company, or familiarity with health care claims, coding, and clinical decision-making.
  • Strong programming skills in Python, R and SQL.
  • Experience with data processing frameworks (e.g., Spark, Flink, Airflow).
  • Hands-on experience with feature store technologies (e.g., Feast, SageMaker Feature Store, Tecton, or custom implementations) is preferred.
  • Familiarity with cloud platforms (AWS, Azure, or GCP) and related data services.
  • Knowledge of machine learning fundamentals, statistical modeling, and data visualization tools.
  • Experience working in an Agile environment.

Nice To Haves

  • Familiarity with ClickHouse or similar technologies.

Responsibilities

  • Feature Store Development & Management: Collaborate with ML engineers and data platform teams to design, implement, and maintain the feature store architecture for both offline (batch) and online (real-time) use cases.
  • Feature Engineering & Analysis: Identify, define, and engineer high-quality features from various raw data sources (databases, APIs, streaming data) using statistical analysis and domain knowledge.
  • Collaboration & Governance: Partner with database administrators, clinical experts, data scientists, ML engineers, and business stakeholders to promote feature reuse, define governance standards, track feature lineage, and ensure data consistency across models.
  • Data Pipeline Integration: Build and optimize ingestion and transformation pipelines using distributed data frameworks to populate the feature store, ensuring data accuracy, reliability, and freshness.
  • Model Support: Generate training and testing datasets from the feature store and work with ML engineers to ensure seamless feature serving for model inference in production environments.
  • Monitoring & Quality Assurance: Develop monitoring and alerting frameworks to track feature data quality, integrity, and latency, proactively identifying and resolving issues.
  • Documentation & Communication: Document feature definitions, data sources, and usage best practices, and effectively communicate complex technical concepts and insights to technical and non-technical audiences.

Benefits

  • This is a high-impact role where you will directly contribute to accelerating our machine learning capabilities and data-driven decision-making processes. You will work on cutting-edge technologies and collaborate with a talented team to solve complex, real-world problems.