Senior Machine Learning Engineer

C the Signs

52d•Hybrid

About The Position

The Machine Learning Engineer will be responsible for the end-to-end development and deployment of large language models (LLMs) and other machine learning models, with a primary focus on data preprocessing, model training, and fine-tuning using large-scale healthcare datasets. This role requires a strong understanding of large language models, machine learning principles, and data engineering, as well as experience working with sensitive healthcare data.

Requirements

5+ years of experience in Machine Learning Engineering or a similar role.
Proven experience with large-scale data preprocessing, LLM/model training, and fine-tuning.
Experience with distributed training (PyTorch Distributed, DeepSpeed, Ray, Hugging Face Accelerate).
Experience with GPU/TPU optimization and memory management for large language models.
Proficiency in Python and relevant ML libraries (e.g., TensorFlow, PyTorch, Scikit-learn, Pandas, NumPy).
Strong understanding of various machine learning algorithms, large language models, and deep learning architectures.
Excellent problem-solving and analytical skills.
Strong communication and collaboration abilities.
Ability to work independently and as part of a team in a fast-paced environment.
Must be a US citizen, a Green Card holder, or currently in the US with a valid H-1B visa.
Bachelor's or Master's degree in Computer Science, Machine Learning, Artificial Intelligence, or a related quantitative field.

Nice To Haves

Experience working with healthcare data is highly desirable.
Experience with cloud platforms (e.g., GCP, AWS) and distributed computing frameworks (e.g., Spark) is a plus.
Familiarity with MLOps practices and tools.

Responsibilities

Data Preprocessing: Clean, transform, and prepare large, complex healthcare datasets for machine learning model development. This includes handling missing values, outlier detection, feature engineering, and data normalization. Identify, collect, and curate relevant, industry-specific datasets for model retraining. Format data appropriately for the chosen LLM and training pipeline.
Model Training & Fine-Tuning: Design, train, and fine-tune various LLMs on extensive healthcare data to solve specific clinical or operational problems. Set up and manage the training environment, including GPU instances and required software. Train and fine-tune pre-trained LLMs on the custom dataset to achieve specific goals. Experiment with and tune hyperparameters such as learning rate, batch size, and number of training epochs to optimize model performance. Integrate structured and unstructured data (multi-modal/multi-input models).
Model Evaluation & Optimization: Evaluate model performance using appropriate metrics, identify areas for improvement, and implement optimization strategies.
Pipeline Development: Develop and maintain robust and scalable data and ML pipelines for model training, inference, and deployment.
Collaboration: Work closely with data scientists, clinicians, and software engineers to understand requirements, integrate models into production systems, and ensure data privacy and security compliance.
Research & Development: Stay up-to-date with the latest advancements in machine learning and healthcare AI, and explore new technologies and methodologies to enhance our solutions.
Documentation: Maintain clear and comprehensive documentation of models, data pipelines, and experimental results.

Benefits

Competitive salary and benefits package.
Flexible working arrangements (remote or hybrid options available).
The opportunity to work on life-changing AI technology that directly impacts patient outcomes.
Join a team that combines cutting-edge innovation with a mission to save lives and improve health equity.
Continuous learning opportunities with access to the latest tools and advancements in AI and healthcare.
