Data Scientist I - Cancer Center

Cleveland Clinic

1d•Remote

About The Position

At Cleveland Clinic Health System, we believe in a better future for healthcare. And each of us is responsible for honoring our commitment to excellence, pushing the boundaries and transforming the patient experience, every day. We all have the power to help, heal and change lives, beginning with our own. That’s the power of the Cleveland Clinic Health System team, and The Power of Every One.

Job Details

This position is open to Ohio-based candidates only.

Join Cleveland Clinic’s Main Campus where research and surgery are advanced, technology is leading-edge, patient care is world-class, and caregivers are family. Cleveland Clinic is recognized as one of the top hospitals in the country. You will work alongside passionate and dedicated caregivers, receive endless support and appreciation, and build a rewarding career with one of the most respected healthcare organizations in the world.

The Data Scientist applies statistical and machine learning techniques, along with modern data architectures, to solve moderately complex analytical problems involving large, unstructured datasets. This role supports advanced analytics initiatives with a strong focus on large language models (LLMs), contributing to the development of AI agents, LLM pipelines, and reusable data abstractions. The Data Scientist partners with senior and lead data scientists to deliver high-impact analytical solutions that support clinical, operational, and research goals within the Cancer Center. This role collaborates closely with other data scientists and software engineers, and occasionally program managers, to translate analytical concepts into scalable, production-ready systems.

Top performers bring strong Python development skills, hands-on experience in data science and LLM-based workflows, and healthcare domain expertise, preferably in oncology, combined with the ability to work effectively in cross-functional, highly technical teams.

A caregiver in this position works days, from 8:00am to 5:00pm.

Requirements

Bachelor’s Degree in Statistics, Actuarial Science, Econometrics, Physics, Biostatistics, Computer Science, Applied Mathematics, Engineering, Business Analytics, Economics, Finance or related field
Excellent written, verbal, and presentation skills in English and ability to explain the value of Machine Learning (ML) and Artificial Intelligence (AI) to business leaders
18 months of related experience working with relational databases and/or distributed computing platforms, and their query interfaces, such as SQL, Teradata, MapReduce, Pig, and Hive; a Master’s Degree can substitute for experience
Experience working with a variety of statistical languages/packages, e.g., SAS, R, Python, Spark, and/or SPSS
Knowledge applying advanced statistics to complex business problems required (e.g., modeling, AI, ML, Deep Learning [DL], and/or Natural Language Processing [NLP])
Familiarity with additional programming languages, including Python, Java, or C/C++
Experience leveraging visualization software and techniques and business intelligence (BI) software
Technical knowledge of distributed computing platforms, and common data process flows from data instrumentation & generation, to ETL, to the data warehouse itself
Demonstrated leadership qualities, including presentation, influencing and negotiation

Nice To Haves

Master’s or Ph.D.
Certification or fellowship in analytics, big data, data science or related subject
Technology partner certification (in technology, Big data, business or advanced analytics, data science) – Microsoft, Oracle, Teradata, IBM, EMC, Cloudera, Hortonworks, Informatica, Tableau, SAS, R, Python
Healthcare or life science experience
Lead experience with project management and change management methodologies
Experience building and optimizing machine learning or large language model (LLM)-enabled pipelines within modern data platforms such as Snowflake or Databricks
Experience fine-tuning, evaluating, and deploying large language models (LLMs) for domain-specific use cases
Python experience

Responsibilities

Gather requirements and program model development under the direction of other Data Scientists to inform problem formulation.
Participate in model building.
Apply methods, particularly in modeling, Artificial Intelligence (AI), Machine Learning (ML), Deep Learning (DL), and Natural Language Processing (NLP), and draw on experience designing and implementing solutions.
Provide insight and recommendations to solve business problems and help inform business decisions.
Document best practices and solution frameworks.
Replicate and scale solutions to business units with common/similar business needs.
Use expertise in defining (with internal clients) the business challenge that needs to be solved, conducting the analysis/modeling/experiment and then providing the “answer” in a clear and business-friendly manner for our clients to take actionable steps.
Participate with more senior data scientists or managers in peer review model assessment and feedback communication.