Senior Infrastructure Engineer

GSK, Upper Providence, PA

About The Position

The Onyx Research Data Platform organization represents a major investment by GSK R&D and Digital & Tech, designed to deliver a step-change in our ability to leverage data, knowledge, and prediction to find new medicines. We are a full-stack shop consisting of product and portfolio leadership, data engineering, infrastructure and DevOps, data / metadata / knowledge platforms, and AI/ML and analysis platforms, all geared toward:

  • Building a next-generation data experience for GSK’s scientists, engineers, and decision-makers, increasing productivity and reducing time spent on “data mechanics”
  • Providing best-in-class AI/ML and data analysis environments to accelerate our predictive capabilities and attract top-tier talent
  • Aggressively engineering our data at scale to unlock the value of our combined data assets and predictions in real time

A Senior Infrastructure Engineer is a leading technical contributor who can consistently take a loosely defined business or technical requirement, architect and build it to a well-defined specification, and execute on it at a high level. They have a strong focus on metrics, both for the impact of their work and for its inner workings / operations. A Senior Infrastructure Engineer should be deeply familiar with the tools of their specialization and of their customers and engaged with the open-source community surrounding them, potentially even to the level of contributing pull requests.

Requirements

  • Bachelor’s degree in Computer Science, Software Engineering, or related discipline
  • 4+ years of Linux Systems Administration experience
  • Experience with modern software development tools / ways of working (e.g. git/GitHub, DevOps tools, metrics / monitoring…)
  • Experience with enterprise Linux hardware and software, or with troubleshooting and maintaining them

Nice To Haves

  • Deep knowledge and use of at least one common programming or scripting language (e.g. Python, Bash), including toolchains for documentation, testing, and operations / observability
  • Applied experience with CI/CD implementations using git and a common CI/CD stack (e.g. GitLab, Azure DevOps)
  • Demonstrated excellence with agile software development environments using tools like Jira and Confluence

Responsibilities

  • Apply broad and deep knowledge of Linux systems administration, storage (e.g. Weka/ZFS), and network configuration
  • Design, build, and operate tools, services, and workflows that deliver high value by solving key business problems
  • Work to transition from classical HPC ways of working to a modern "private cloud" approach that focuses on Infrastructure-as-Code, DevOps-driven workflows, and containerization, and helps put site reliability engineering and automation at the core of every running Onyx service
  • Implement and maintain the observability stack and processes for infrastructure (Grafana, Prometheus, InfluxDB) across all HPC and other computing components
  • Own the schedulers (e.g. Slurm) and processes that continually improve users' access to resources in the most efficient way
  • Provide input into the roadmaps of teams representing upstream dependencies, to help improve the overall program of work
  • Be fully versed in coding best practices and ways of working, and participate in code reviews/partnering to improve the team's standards
  • Design innovative strategies that go beyond the current enterprise way of working to create a better environment for end users, and construct a coordinated, stepwise plan to bring others along the change curve
  • Provide thought leadership to team members to help others get the job done right the first time