Member of Technical Staff (Software Engineer)

Cerebras Systems, Sunnyvale, CA (Remote)

About The Position

Cerebras Systems builds the world's largest AI chip, 56 times larger than a GPU. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach lets Cerebras deliver industry-leading training and inference speeds and empowers machine learning users to run large-scale ML applications effortlessly, without the hassle of managing hundreds of GPUs or TPUs.

Cerebras' current customers include top model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras to deploy 750 megawatts of compute capacity, transforming key workloads with ultra-high-speed inference. Thanks to this groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This order-of-magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.

Cerebras Systems Inc. has multiple openings for Member of Technical Staff (Software Engineer).

Requirements

  • Master’s degree (or foreign equivalent) in Computer Science or a related field, plus 1 year of experience as a Software Developer, Student/Intern (Software Developer), Member of Technical Staff (Software Engineer), Software Engineer, or in a related occupation, is required.
  • Docker and Kubernetes
  • Java or C++
  • ActiveMQ and Kafka
  • Python or Groovy
  • JavaScript or TypeScript
  • Linux
  • SQL, OracleDB, and Redis
  • Git

Responsibilities

  • Implement infrastructure to support high-performance, low-latency inference service.
  • Deploy and configure Kubernetes services to ensure scalability and reliability of inference workloads.
  • Optimize resource allocation and auto-scaling policies to handle variable inference demand while minimizing operational costs (see the first Python sketch following this list).
  • Integrate inference services with containerized environments using Docker and Kubernetes for orchestration.
  • Ensure high availability and fault tolerance by implementing multi-region deployments and disaster recovery strategies.
  • Develop Python-based scripts and APIs to streamline data preprocessing, inference execution, and post-processing for real-time inference tasks (see the second sketch following this list).
  • Collaborate with machine learning engineers to validate inference accuracy and performance against functional and latency requirements.
  • Triage and resolve defects in the service by analyzing logs, metrics, and distributed traces.
  • Debug issues related to model deployment, container orchestration, or networking configurations, documenting steps to reproduce and root-cause defects.
  • Collaborate with cross-functional teams to address performance regressions, scalability issues, or integration failures in the inference pipeline.
  • Develop automated scripts to detect and mitigate common failure modes, improving system reliability (see the third sketch following this list).
  • Author detailed technical documentation for infrastructure configurations, inference workflows, and APIs, ensuring clarity for internal teams and external customers.
  • Work with product management and user experience teams to define requirements for inference service interfaces, including configuration, monitoring, and event logging.
  • Document and track defects, enhancements, and release notes using tools like Jira and Git, ensuring version control and traceability.
  • Participate in release planning and prioritization discussions to align infrastructure development with customer needs and business objectives.
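
To give a concrete flavor of the auto-scaling responsibility above, the sketch below uses the official kubernetes Python client to create a CPU-based HorizontalPodAutoscaler for a hypothetical inference Deployment. The deployment name, namespace, and thresholds are illustrative assumptions, not details from this posting.

    # Illustrative sketch only: creates an autoscaling/v1 HPA for a
    # hypothetical "inference-server" Deployment. Names and thresholds
    # are assumptions, not details from this posting.
    from kubernetes import client, config

    def create_inference_hpa(namespace: str = "default") -> None:
        config.load_kube_config()  # use load_incluster_config() when running in a pod
        hpa = client.V1HorizontalPodAutoscaler(
            metadata=client.V1ObjectMeta(name="inference-server-hpa"),
            spec=client.V1HorizontalPodAutoscalerSpec(
                scale_target_ref=client.V1CrossVersionObjectReference(
                    api_version="apps/v1",
                    kind="Deployment",
                    name="inference-server",
                ),
                min_replicas=2,   # keep headroom for baseline traffic
                max_replicas=20,  # cap replica count to bound cost under bursts
                target_cpu_utilization_percentage=70,
            ),
        )
        client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
            namespace=namespace, body=hpa
        )

    if __name__ == "__main__":
        create_inference_hpa()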
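The bullet on Python-based scripts and APIs might, in practice, look something like the minimal FastAPI sketch below: one endpoint wrapping preprocessing, a stubbed model call, and post-processing. The endpoint path, request fields, and the run_model stub are hypothetical placeholders.

    # Minimal sketch of a real-time inference endpoint. The model call is a
    # stub; a real service would dispatch to the actual inference backend.
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class InferenceRequest(BaseModel):
        prompt: str
        max_tokens: int = 128

    def preprocess(prompt: str) -> str:
        # Placeholder preprocessing step before tokenization.
        return prompt.strip()

    def run_model(text: str, max_tokens: int) -> str:
        # Hypothetical stub standing in for the inference backend.
        return f"[completion for {text!r}, up to {max_tokens} tokens]"

    def postprocess(raw: str) -> str:
        # Placeholder post-processing before returning to the caller.
        return raw.strip()

    @app.post("/v1/infer")
    def infer(req: InferenceRequest) -> dict:
        text = preprocess(req.prompt)
        raw = run_model(text, req.max_tokens)
        return {"completion": postprocess(raw)}

Run locally with, for example, uvicorn to serve the endpoint.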
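Finally, the automated failure-mode mitigation bullet could start as simply as the sketch below, which scans a namespace for crash-looping pods and deletes them so their Deployment reschedules fresh replicas. The restart threshold and label selector are assumptions for illustration.

    # Illustrative mitigation script: delete pods whose containers have
    # restarted more than a threshold, letting the controller replace them.
    from kubernetes import client, config

    RESTART_THRESHOLD = 5  # assumed threshold; tune per service

    def reap_crashlooping_pods(namespace: str = "default") -> None:
        config.load_kube_config()
        core = client.CoreV1Api()
        pods = core.list_namespaced_pod(
            namespace, label_selector="app=inference-server"
        )
        for pod in pods.items:
            statuses = pod.status.container_statuses or []
            if any(s.restart_count > RESTART_THRESHOLD for s in statuses):
                # Deleting the pod lets its Deployment schedule a clean replica.
                core.delete_namespaced_pod(pod.metadata.name, namespace)

    if __name__ == "__main__":
        reap_crashlooping_pods()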