Senior Software Engineer (Python + Distributed Systems)

Scribd • San Francisco, CA

49d • Hybrid

About The Position

Scribd, Inc. is seeking a Senior Software Engineer with deep experience building event-driven, distributed, scalable systems in Python. This role is part of the ML Data Engineering team, which powers metadata extraction, enrichment, and content understanding across all Scribd brands. The team processes hundreds of millions of documents and billions of images to deliver high-quality metadata for content discovery and trust, across datasets that include user-generated content, ebooks, and audiobooks.

The work lies at the intersection of machine learning, data engineering, and distributed systems. The engineer will design and optimize large-scale data and service pipelines on AWS that support content enrichment and metadata systems, collaborating with applied research, product, and other cross-functional teams to build reliable backend services that integrate ML and LLM components. This is an opportunity to work on generative AI and metadata enrichment problems at a global scale.

Requirements

7+ years of professional software engineering experience with a focus on backend or distributed systems development.
Strong proficiency in Python (5+ years).
Expertise in designing and architecting large-scale event-driven and distributed systems.
Strong cloud expertise with AWS services (ECS, Lambda, SQS, SNS, CloudWatch, etc.).
Experience with infrastructure-as-code tools like Terraform.
Solid understanding of system performance, profiling, and optimization.
Experience leading technical projects and mentoring engineers.
Bachelor’s degree in Computer Science or equivalent professional experience.

Nice To Haves

Experience with Scala.
Familiarity with data processing frameworks (Spark, Databricks) and workflow orchestration tools.
Experience integrating ML or LLM-based models into production systems.

Responsibilities

Provide technical leadership, mentorship, and guidance to engineers across the organization, driving secure coding best practices.
Lead the design, implementation, and scaling of event-driven, distributed systems to extract, enrich, and process metadata from large-scale document and media datasets.
Partner with Data Science, Infrastructure, ML Engineering, and Product teams to architect and deliver robust systems that balance scalability, high performance, and rapid iteration.
Contribute to the team’s engineering strategy by identifying gaps, proposing new initiatives, and improving existing frameworks.
Build and maintain scalable APIs and backend services for high-throughput content processing.
Leverage AWS services (ECS, Lambda, SQS, ElastiCache, CloudWatch) to design and deploy resilient, high-performance systems.
Optimize and refactor existing backend systems for scalability, reliability, and performance.
Ensure system health and data integrity through monitoring, observability, and automated testing.

Benefits

Scribd Flex (flexible work model)
Comprehensive health, dental, and vision coverage
Mental health support and disability coverage
Generous paid time off, including vacation, sick time, holidays, winter break, volunteer time, and sabbaticals
Paid parental leave and family support benefits
Retirement matching and employee equity
Learning and development programs and professional growth opportunities
Wellness and home office stipends
Complimentary access to the Scribd, Inc. suite of products
Enterprise access to leading AI tools