Software Engineer, Big Data Infrastructure

Benchling · San Francisco, CA (Hybrid)

About The Position

Benchling's mission is to unlock the power of biotechnology. The world's most innovative biotech companies use Benchling's R&D Cloud to power the development of breakthrough products and accelerate time to milestone and market. Benchling's customers generate a rich variety of scientific data. To keep up its pace of innovation, Benchling needs a highly scalable and extensible data platform that can serve both its customers and its internal application teams.

As one of Benchling's Data Platform engineers, you'll join a rapidly growing, premier engineering team and form the foundation of our data pillar, encompassing customer-facing data products, internal analytics, and the customer-facing data warehouse. The Big Data Infrastructure team is responsible for enabling customers to access their Benchling data for analytics & AI. You will build the next generation of Data Platform services that enable ingress and egress data access so that Benchling can integrate seamlessly with customer data lakes. Benchling is growing quickly, and you'll be setting the bar for high-quality data and a metrics-driven culture as we scale. You'll serve as a key contributor and thought leader, working closely with product teams to deliver data-driven capabilities to our internal and external customers.

Requirements

  • Have 2+ years of experience or a proven track record in software engineering
  • Strong experience in backend engineering and distributed systems
  • Strong experience with a scripting language (such as Python)
  • Experience with deployment and configuration management frameworks such as Terraform, Ansible, or Chef and container management systems such as Kubernetes or Amazon ECS
  • Driven by creating positive impact for our customers and Benchling's business, and ultimately accelerating the pace of research in the Life Sciences
  • Comfortable with complexity in the short term but can build towards simplicity in the long term
  • Strong communicator with both words and data - you understand what it takes to go from raw data to something a human understands
  • Willing to work onsite in our SF office 3 days a week

Nice To Haves

  • Experience with data analytics and warehouse solutions (e.g. Snowflake, Delta Lake), data processing technologies (e.g. Kafka, Spark), schema design, and SQL is a plus!

Responsibilities

  • Build the next-generation Data Platform with scalable data ingress/egress for internal and external customers
  • Define and design data transformations and pipelines for cross-functional datasets, while ensuring that data integrity and data privacy are first-class concerns addressed proactively rather than reactively
  • Define the right Service Level Objectives for the batch & streaming pipelines, and optimize their performance
  • Design and build CI/CD pipelines for platform provisioning and full lifecycle management, and build the platform control plane to operate the fleet of systems efficiently
  • Work closely with teams across Application and Platform to establish best practices for using our data platform

Benefits

  • Competitive total rewards package
  • Broad range of medical, dental, and vision plans for employees and their dependents
  • Fertility healthcare and family-forming benefits
  • Four months of fully paid parental leave
  • 401(k) + Employer Match
  • Commuter benefits for in-office employees and a generous home office set up stipend for remote employees
  • Mental health benefits, including therapy and coaching, for employees and their dependents
  • Monthly wellness stipend
  • Learning and development stipend
  • Generous and flexible vacation
  • Company-wide Winter holiday shutdown
  • Sabbaticals for 5-year and 10-year anniversaries