Senior Software Engineer, Machine Learning Platform

PlayStation Global • San Diego, CA

8d • $187,363 - $265,900 • Hybrid

About The Position

Sony Interactive Entertainment LLC seeks a Senior Software Engineer, Machine Learning Platform in San Diego, CA to architect and lead the design of next-generation ML inference infrastructure to support globally distributed, multi-tenant model serving with high availability, scalability, and cost efficiency.

Requirements

Master’s degree in Computer Science or a related field, or equivalent, and five (5) years of experience in each of the following:
Designing and deploying low-latency, high-throughput inference systems supporting computer vision and multimodal models, including CNNs, segmentation, and object detection.
Building large-scale distributed systems in Scala using reactive frameworks, multithreading, and Guice dependency injection.
Integrating enterprise feature stores to ensure consistent, low-latency feature delivery.
Developing ML inference services within Bazel-based monorepos, ensuring hermetic builds and seamless integration with secure enterprise CI/CD pipelines (Prow).
Extending GitHub Prow workflows by writing Golang plugins/controllers to automate reviews, enforce policies, and coordinate multi-repo build/test/deploy pipelines.
Utilizing Pulumi to provision and manage scalable, reproducible, and cloud-agnostic inference environments.
Applying formal methods and computational theory to enhance scalability, reliability, and performance of distributed inference algorithms.
Optimizing ML algorithms in Scala, Java, and Go to improve memory efficiency, hardware utilization, and runtime speed.
Designing advanced observability frameworks with distributed tracing, model drift detection, and performance analytics.
Standardizing ML platform practices while mentoring engineers and aligning architecture with MLOps, compliance (GDPR, SOC 2), and scalability standards.

Responsibilities

Architect and lead the design of next-generation ML inference infrastructure to support globally distributed, multi-tenant model serving with high availability, scalability, and cost efficiency.
Design and deploy low-latency, high-throughput inference systems supporting computer vision and multimodal models, including CNNs, segmentation, and object detection.
Build large-scale distributed systems in Scala using reactive frameworks, multithreading, and Guice dependency injection.
Integrate enterprise feature stores to ensure consistent, low-latency feature delivery.
Develop ML inference services within Bazel-based monorepos, ensuring hermetic builds and seamless integration with secure enterprise CI/CD pipelines (Prow).
Extend GitHub Prow workflows by writing Golang plugins/controllers to automate reviews, enforce policies, and coordinate multi-repo build/test/deploy pipelines.
Use Pulumi to provision and manage scalable, reproducible, and cloud-agnostic inference environments.
Apply formal methods and computational theory to enhance scalability, reliability, and performance of distributed inference algorithms.
Optimize ML algorithms in Scala, Java, and Go to improve memory efficiency, hardware utilization, and runtime speed.
Design advanced observability frameworks with distributed tracing, model drift detection, and performance analytics.
Standardize ML platform practices while mentoring engineers and aligning architecture with MLOps, compliance (GDPR, SOC 2), and scalability standards.