Platform (aka DevOps) Engineers provide production support, production monitoring, CI/CD design and implementation, security automation, and AI/ML infrastructure management across the open-source software platforms CTDS develops and operates for translational data science. Production support includes triaging, researching, communicating, and addressing production incidents. For monitoring, staff wrangle disparate system monitoring assets and develop common analytics to inform optimization, define benchmarks and confidence intervals, and forecast and proactively mitigate production incidents. CI/CD pipelines support a hybrid cloud architecture spanning on-premises infrastructure and commercial cloud providers such as Amazon, Google, and Microsoft. Additionally, the position is responsible for AI/ML research infrastructure, including managing and optimizing on-premises GPU resources and AWS cloud services such as Bedrock and SageMaker. Responsibilities include deploying, monitoring, and maintaining machine learning models for inference, optimizing model and hardware performance, troubleshooting AI/ML solutions, and integrating them within the broader application environment to support research and production workflows.
Job Type: Full-time
Career Level: Mid Level
Education Level: Associate degree