Senior Software Engineer - Cloud Infrastructure & Automation

Sony PlayStation Network
San Diego, CA

About The Position

PlayStation isn't just the Best Place to Play - it's also the Best Place to Work. Today, we're recognized as a global leader in entertainment, producing the PlayStation family of products and services, including PlayStation 5, PlayStation 4, PlayStation VR, PlayStation Plus, acclaimed PlayStation software titles from PlayStation Studios, and more. PlayStation also strives to create an inclusive environment that empowers employees and embraces diversity. We welcome and encourage everyone who has a passion and curiosity for innovation, technology, and play to explore our open positions and join our growing global team. The PlayStation brand falls under Sony Interactive Entertainment, a wholly-owned subsidiary of Sony Group Corporation.

Ready to take your career to the next level? Join PlayStation as a Senior Software Engineer - Cloud Infrastructure & Automation and be a key player in driving innovation in the dynamic field of interactive entertainment. You'll collaborate with an elite engineering team dedicated to crafting seamless, scalable experiences for a worldwide player base. At PlayStation, we're recognized not only for providing outstanding gaming experiences but also for fostering a top-tier engineering environment centered on innovation, creativity, and technical excellence. We welcome passionate engineers who thrive on solving complex problems and are aligned with our vision of crafting the future of play.

A Senior Software Engineer - Cloud Infrastructure & Automation is crucial for developing, automating, and maintaining scalable data platforms, emphasizing Infrastructure as Code (IaC) and cloud-native technologies. This role centers on the reliability and automation of NoSQL, Streaming, and Caching services within AWS and GCP environments. You'll build durable automation frameworks, maintain high availability, and collaborate with product and platform teams to provide resilient, high-performance infrastructure for real-time data services.

Embrace Dev & SRE principles, prioritize automation, and use AI/ML to boost system performance. Work closely with platform and product teams to ensure the seamless integration and delivery of high-performance solutions for global PlayStation experiences. Your efforts will directly impact the scalability and reliability of our global data solutions platform.

Requirements

  • Bachelor's degree in Computer Science or a related field, or relevant experience.
  • 6+ years of software development and SRE experience, including at least 3 years specializing in Go and Infrastructure as Code with a focus on automation.
  • Deep proficiency in Go (Golang), with the ability to write performant, idiomatic, and maintainable code for production-scale systems.
  • Established track record crafting modular, architecture-focused frameworks in Go, supporting large and complex backend services.
  • Expertise with infrastructure-as-code tools such as Terraform and Ansible.
  • Expertise in database operations: scaling, consistency tuning, compaction, repair, and backup/recovery.
  • Familiarity with NoSQL, caching, and streaming platforms (e.g., Apache Kafka, Redis, AWS MSK).
  • Solid understanding of Linux internals, networking, and storage systems.
  • Experience with containerization and orchestration technologies such as Docker and Kubernetes.
  • Cloud experience (AWS, GCP, or Azure), with knowledge of managed services (e.g., DynamoDB, ElastiCache, MSK) or equivalent experience.
  • Strong problem-solving and analytical skills, with a passion for automation and distributed systems reliability.
  • Excellent communication and collaboration skills, with experience mentoring and influencing peers across diverse teams.

Nice To Haves

  • Prior experience using Go for infrastructure automation, control plane services, or SRE-focused tooling.
  • Experience applying AI/ML to IaC and automation, provisioning, anomaly detection, predictive scaling, or intelligent incident response.
  • Certification in relevant technologies (e.g., AWS Certified Database - Specialty).

Responsibilities

  • Develop and implement Infrastructure as Code (IaC) and automate the provisioning, monitoring, scaling, and lifecycle management of NoSQL, Streaming, and Caching platforms (e.g., Cassandra, Kafka, Redis).
  • Drive end-to-end automation to enable repeatable, reliable, and self-service deployment of data services across cloud and hybrid environments.
  • Ensure the platform's data solutions are highly available, scalable, and resilient.
  • Define and enforce SLIs, SLOs, and error budgets for data platforms to drive reliability engineering practices.
  • Develop highly efficient, self-healing systems, automated redundancy, and scalability solutions for databases and streaming platforms.
  • Develop observability solutions (metrics, logging, tracing) for Cassandra, Redis, and Kafka/MSK to ensure proactive issue detection.
  • Collaborate with engineering and platform teams to deliver dependable, scalable, and high-performing data services.
  • Lead incident response for critical database/caching/streaming issues and drive root cause analysis with permanent automated fixes.
  • Explore and apply automation methods (e.g., anomaly detection, predictive scaling, automated remediation) to improve operational efficiency.
  • Drive and implement guidelines, procedures, and operational playbooks to facilitate knowledge sharing and support continuous improvement across global teams.
  • Mentor junior engineers and influence guidelines in automation, distributed systems, and database reliability.

Benefits

  • medical
  • dental
  • vision
  • matching 401(k)
  • paid time off
  • wellness program
  • employee discounts for Sony products
  • bonus package