Staff Software Engineer

Pfizer•New York City, NY

Hybrid

About The Position

ROLE SUMMARY

Pfizer is committed to applying computational science to drug discovery and development and has recently begun a large-scale migration of its computational infrastructure to the cloud. This role draws on extensive experience in scientific computing to deliver robust, high-performance solutions supporting computational workloads across the organization.

We are seeking an experienced individual contributor to own project migrations, user support, onboarding, documentation, communication, and training efforts related to the High-Performance Computing (HPC) environment. You will work with a team of engineers to ensure robust, scalable, high-performance, cloud-native infrastructure that underpins modernization of the scientific computing platform. You will be the primary bridge between our R&D scientists and cloud infrastructure.

Critical business capabilities that use Pfizer HPC resources include in silico drug discovery, protein folding/structure prediction, quantum chemistry, bioinformatics, pharmacokinetics and pharmacodynamics (PK/PD) modeling, ML/AI, and fluid dynamics. Experience in one or more of these scientific domains is highly desirable.

Requirements

B.S. with 7+ years or Ph.D. with 3+ years of experience in high-performance computing, cloud computing, and life sciences.
Deep Linux systems expertise supporting the design, standardization, automation, and reliable operation of scientific computing platforms and services.
Excellent written and verbal communication skills with the ability to clearly communicate complex technical concepts to scientific, technical, and platform stakeholders.
Demonstrated ability to lead resolution of complex technical issues while providing clear status updates and communication back to scientific stakeholders.
Deep foundational technical expertise across HPC and cloud platforms, including Linux, Slurm, Kubernetes, GitOps, Google Cloud Platform, Spack, Cluster Toolkit, infrastructure-as-code tooling, GPU architectures, and related scientific software ecosystems.

Nice To Haves

Advanced experience with AWS or GCP, including knowledge of core compute and storage services relevant to HPC.
Experience designing, operating, or supporting distributed computing environments, including Kubernetes-based environments such as GKE.
Prior experience with HPC deployment utilities, including Google Cluster Toolkit.
A breadth of leadership experience and capabilities, including the ability to influence and collaborate with peers, develop and coach others, and oversee and guide the work of colleagues to achieve meaningful outcomes and create business impact.

Responsibilities

Lead solution design and migration strategy for R&D teams transitioning legacy scientific workloads to cloud-based HPC platforms, ensuring alignment with performance, scalability, security, and cost objectives.
Partner with scientific stakeholders to translate research needs into platform-level infrastructure requirements, providing senior technical guidance on compute, storage, and parallelization approaches.
Serve as the senior technical authority for complex HPC operational issues, defining troubleshooting frameworks, escalation paths, and long-term remediation strategies for scheduler, dependency, and workflow failures.
Own the strategy, quality, and governance of HPC documentation and knowledge assets, ensuring documentation remains accurate, accessible, and aligned with platform standards, onboarding needs, and evolving best practices.
Lead platform-level communications and stakeholder engagement related to HPC operations, including maintenance, capacity changes, and upgrades, ensuring transparency, predictability, and minimal disruption to scientific workloads.
Define and oversee user enablement and training strategy for HPC platforms, ensuring researchers are equipped to use cloud resources efficiently, responsibly, and in accordance with platform best practices.
Own the end-to-end lifecycle strategy for scientific software platforms, including selection, deployment models, upgrade planning, and deprecation, to ensure reliability, reproducibility, and broad usability across research domains.
Establish containerization standards and adoption models for scientific workflows, overseeing the transition of complex applications to container-based execution environments and ensuring consistency across teams and platforms.
Set and govern application performance optimization standards across cloud instance types, guiding workload placement decisions to maximize performance, scalability, and cost efficiency.

Benefits

Participation in Pfizer’s Global Performance Plan with a bonus target of 17.5% of the base salary
Eligibility to participate in our share-based long-term incentive program
A 401(k) plan with Pfizer Matching Contributions and an additional Pfizer Retirement Savings Contribution
Paid vacation, holiday, and personal days
Paid caregiver/parental and medical leave
Health benefits, including medical, prescription drug, dental, and vision coverage
