GlaxoSmithKline • Posted 9 months ago
$90,750 - $151,250/Yr
Full-time • Entry Level
Chemical Manufacturing

The Onyx Research Data Tech organization is GSK's Research data ecosystem, with the capability to bring together, analyze, and power the exploration of data at scale. We partner with scientists across GSK to define and understand their challenges and to develop tailored solutions that meet their needs. The goal is to ensure scientists have the right data and insights when they need them, giving them a better starting point for medical discovery and helping to accelerate it. Ultimately, this helps us get ahead of disease in more predictive and powerful ways.

We are a full-stack shop consisting of product and portfolio leadership, data engineering, infrastructure and DevOps, data/metadata/knowledge platforms, and AI/ML and analysis platforms, all geared toward:

  • Building a next-generation, metadata- and automation-driven data experience for GSK's scientists, engineers, and decision-makers, increasing productivity and reducing time spent on “data mechanics”.
  • Providing best-in-class AI/ML and data analysis environments to accelerate our predictive capabilities and attract top-tier talent.
  • Aggressively engineering our data at scale, as one unified asset, to unlock the value of our unique collection of data and predictions in real time.

Our Compute Platform Engineering team is building a first-in-class platform of toolchains and workflows that accelerate application development, scale up computational experiments, and integrate all computation with project metadata, logs, experiment configuration, and performance tracking over abstractions that encompass Cloud and High-Performance Computing (HPC). This metadata-forward, CI/CD-driven platform represents and enables the entire application and analysis lifecycle, including interactive development and exploration (notebooks), large-scale batch processing, observability, and production application deployments.

Responsibilities:
  • Designs, builds, and operates tools, services, workflows, and related components that deliver high value by solving key business problems.
  • Develops key components of a hybrid on-prem/cloud compute platform for both interactive and scalable batch computing, and establishes processes and workflows to transition existing HPC users and teams to this platform.
  • Responsible for code-driven environment, application, and container/image builds, as well as CI/CD-driven application deployments.
  • Consults science users on scaling applications to petabytes of data, drawing on a deep understanding of software engineering, algorithms, and the underlying hardware infrastructure and their impact on performance.
  • Confidently optimizes design and execution of complex solutions within large-scale distributed computing environments.
  • Produces well-engineered software, including appropriate automated test suites, technical documentation, and operational strategy.
  • Applies platform abstractions consistently to ensure quality and consistency with respect to logging and lineage.
  • Fully versed in coding best practices and ways of working; participates in code reviews and partners with teammates to improve the team's standards.
  • Adheres to the QMS framework and CI/CD best practices, and helps guide improvements to both that strengthen ways of working.

Qualifications:
  • Bachelor's degree in Data Engineering, Computer Science, Software Engineering, or a related discipline
  • 2+ years of professional experience, or
  • 1+ years of professional experience with a Master's degree, or
  • A newly completed PhD
  • Experience with Python
  • Experience with cloud platforms
  • Experience with High-Performance Computing (HPC)
  • Knowledge and use of at least one common programming language (e.g., Python, C++, Scala, Java), including toolchains for documentation, testing, and operations/observability
  • Expertise in modern software development tools and ways of working (e.g., git/GitHub, DevOps tools, metrics/monitoring, …)
  • Cloud expertise (e.g., AWS, Google Cloud, Azure), including infrastructure-as-code tools and scalable compute technologies, such as Google Batch and Vertex
  • Experience with CI/CD implementations using git and a common CI/CD stack (e.g., Azure DevOps, CloudBuild, Jenkins, CircleCI, GitLab)
  • Expertise with Docker, Kubernetes, and the larger CNCF ecosystem including experience with application deployment tools such as Helm
  • Experience with low-level application build tools (make, CMake) as well as automated build systems such as Spack or EasyBuild
  • Experience in workflow orchestration with tools such as Argo Workflows and Airflow, and with scientific workflow tools such as Nextflow, Snakemake, VisTrails, or Cromwell
  • Experience with application performance tuning and optimization, including parallel and distributed computing paradigms and communication libraries such as MPI, OpenMP, and Gloo, with a deep understanding of the underlying systems (hardware, networks, storage) and their impact on application performance
  • Demonstrated excellence with agile software development environments using tools like Jira and Confluence.
  • Deep familiarity with the tools, techniques, and optimizations in the high-performance applications space, including engagement with the open-source community (and potentially contributing to such tools)

Benefits:
  • Health care and other insurance benefits (for employee and family)
  • Retirement benefits
  • Paid holidays
  • Vacation
  • Paid caregiver/parental and medical leave