Staff Platform Engineer

University of Chicago, Chicago, IL
$100,000 - $140,000

About The Position

The Center for Translational Data Science (CTDS) at the University of Chicago is a research center whose mission is to develop the discipline of translational data science and apply it to impactful problems in biology, medicine, healthcare, and the environment. We envision a world in which researchers have ready access to the data needed and the tools required to make data-driven discoveries that increase our scientific knowledge and improve the quality of life. We architect ecosystems of large-scale commons of research data, computing resources, applications, tools, and services for the broader research community to use data at scale to pursue scientific inquiry and accelerate discovery. Learn more at https://gdc.cancer.gov/, https://gen3.org/, https://stats.gen3.org/, and https://ctds.uchicago.edu/.

The job works independently to perform a variety of activities relating to software support and/or development. Analyzes, designs, develops, debugs, and modifies computer code for end user applications, beta general releases, and production support. Guides development and implementation of applications, web pages, and user interfaces using a variety of software applications, techniques, and tools. Solves complex problems in administration, maintenance, integration, and troubleshooting of the code and application ecosystem currently in production.

Staff Platform Engineers provide production support, production monitoring, CI/CD design and implementation, and security automation across the open-source software platforms CTDS develops and operates for translational data science. Production support includes triaging, researching, communicating, and addressing production incidents. For monitoring, staff wrangle disparate system monitoring assets and develop common analytics to inform optimization, to define benchmarks and confidence intervals, and to forecast and proactively mitigate production incidents. CI/CD pipelines span a hybrid cloud architecture, both on-premises and in commercial cloud providers such as Amazon, Google, and Microsoft.

This at-will position is wholly or partially funded by contractual grant funding which is renewed under provisions set by the grantor of the contract. Employment will be contingent upon the continued receipt of these grant funds and satisfactory job performance.

Requirements

  • Minimum requirements include a college or university degree in related field.
  • Minimum requirements include knowledge and skills developed through 5-7 years of work experience in a related job discipline.

Nice To Haves

  • Advanced degree in computer science, mathematics, statistics, engineering, or a relevant quantitative field strongly preferred.
  • 6+ years professional experience as a system or DevOps engineer or demonstrated skills and qualifications through projects, initiatives, or outstanding performance.
  • Hands-on scripting experience (Bash, Python, or other dynamic language).
  • Unix/Linux programming or system administration experience.
  • Experience with OpenStack and AWS (EC2/S3) cloud technologies.
  • Experience with a configuration management utility (Chef, Puppet, Ansible).
  • Experience with F5 or other load balancing technologies (Nginx, AWS ELB/ALB, etc.).
  • Experience with source control and build systems (SVN, Git, Jenkins, etc.).
  • Experience with container-based deployment (Docker, Kubernetes).
  • Experience with log aggregation tools (ELK stack, Splunk).
  • Experience with security frameworks (FISMA, NIST, FIPS).
  • Experience with cloud platforms (AWS, GCP, OpenStack), CI/CD, and Agile methodologies.
  • Experience leading DevOps initiatives and process improvement.
  • Experience provisioning and managing GPU-enabled infrastructure (NVIDIA GPUs, CUDA, multi-GPU systems) in cloud and/or on-prem environments.
  • Familiarity with GPU orchestration in Kubernetes (e.g., NVIDIA device plugin, GPU scheduling, MIG, node affinity).
  • Experience optimizing GPU utilization, memory management, and cost efficiency for compute-intensive workloads.
  • Ability to collaborate with team members, help define guidelines and best practices, and ensure accountability for deliverables and outcomes.
  • Ability to take and provide constructive and helpful input and feedback on technical issues.
  • Ability to negotiate complex decisions, present options and persuasively advocate for optimal technical solutions, internally and externally.
  • In-depth knowledge in most technical areas of major projects and the core DevOps technology scope.
  • Considered a Subject Matter Expert in most DevOps technologies and solutions.
  • Ability to take multiple complex tasks and break them into smaller ones, estimating the effort needed to complete them, prioritizing them appropriately, and ensuring the completion of each task, meeting the required level of quality.
  • Ability to work in a collaborative team and ensure accountability for deliverables and outcomes.
  • Ability to prioritize and manage workload to meet project milestones and deadlines.

Responsibilities

  • Responsible for design and implementation of top priority technical tasks and timely delivery of such tasks, meeting and helping define the required level of quality.
  • Participation in complex and challenging activities, including design and implementation.
  • Responsible for a scope of significant size critical to the team’s success.
  • Provide technical leadership and effectively mentor interns and less experienced members.
  • Actively participate in the hiring process and provide fair and productive interview feedback.
  • Negotiate complex decisions, present options, and persuasively advocate for optimal technical solutions, internally and externally.
  • Designs new systems, features, and tools.
  • Solves complex problems and identifies opportunities for technical improvement and performance optimization.
  • Reviews and tests code to ensure appropriate standards are met.
  • Utilizes technical knowledge of existing and emerging technologies, including public cloud offerings from Amazon Web Services, Microsoft Azure, and Google Cloud.
  • Acts as a technical consultant and resource for faculty research, teaching, and/or administrative projects.
  • Performs other related work as needed.

Benefits

  • The University of Chicago offers a wide range of benefits programs and resources for eligible employees, including health, retirement, and paid time off. Information about the benefit offerings can be found in the Benefits Guidebook.