GEICO Insurance
$110,000 - $260,000/Yr
Full-time • Mid Level
Richardson, TX
5,001-10,000 employees
Insurance Carriers and Related Activities

GEICO is seeking an experienced engineer with a passion for building high-performance, low-maintenance, zero-downtime platforms and core data infrastructure. You will help drive our insurance business transformation as we transition from a traditional IT model to a tech organization with engineering excellence as its mission, while co-creating a culture of psychological safety and continuous improvement. Our Staff/Senior Staff Engineer is a key member of the engineering staff, working across the organization to innovate and bring the best open-source data infrastructure and practices into GEICO as we embark on a greenfield project to implement a core Data Lakehouse for all of GEICO's core data use cases across each of the company's business verticals.

Responsibilities:

  • Scope, design, and build scalable, resilient Data Lakehouse components
  • Lead architecture sessions and reviews with peers and leadership
  • Spearhead new software evaluations and innovate with new tooling
  • Design and lead the development and implementation of compute efficiency projects such as the Smart Spark Auto-Tuning feature
  • Drive performance regression testing, benchmarking, and continuous performance profiling
  • Be accountable for the quality, usability, and performance of the solutions
  • Determine and support resource requirements, evaluate operational processes, measure outcomes to ensure desired results, and demonstrate adaptability while sponsoring continuous learning
  • Collaborate with customers, team members, and other engineering teams to solve our toughest problems
  • Be a role model and mentor, helping to coach and strengthen the technical expertise and know-how of our engineering community
  • Consistently share best practices and improve processes within and across teams
  • Share your passion for staying on top of the latest open-source projects, experimenting with and learning new technologies, participating in internal and external OSS technology communities, and mentoring other members of the engineering community

Qualifications:

  • Deep knowledge of Spark internals, including Catalyst, Tungsten, AQE, CBO, scheduling, shuffle management, and memory tuning
  • Proven experience in tuning and optimizing Spark jobs on hyper-scale Spark compute platforms
  • Mastery of Spark configuration parameters, resource tuning, partitioning strategies, and job execution behaviors
  • Experience building automated optimization systems, from config auto-tuners to feedback loops and adaptive pipelines
  • Strong software engineering skills in Scala, Java, and Python
  • Ability to build tooling that surfaces meaningful performance insights at scale
  • Deep understanding of auto-scaling and cost-efficiency strategies in cloud-based Spark environments
  • Exemplary ability to design and develop solutions, perform experiments, and influence engineering direction and product roadmap
  • Advanced experience developing new open-source-based Data Lakehouse platform components and enhancing existing ones
  • Experience cultivating relationships with, and contributing to, open-source software projects
  • Experience with open-source table formats (Apache Iceberg, Delta, Hudi or equivalent)
  • Advanced experience with open-source compute engines (Apache Spark, Apache Flink, Trino/Presto, or equivalent)
  • Experience with cloud computing (AWS, Microsoft Azure, Google Cloud, Hybrid Cloud, or equivalent)
  • Expertise in developing distributed systems that are scalable, resilient, and highly available
  • Experience with container technology such as Docker, and with Kubernetes platform development
  • Experience with continuous delivery and infrastructure as code
  • In-depth knowledge of DevOps concepts and cloud architecture
  • Experience with Azure networking (subscriptions, security zoning, etc.) or equivalent
  • 10+ years of professional experience in data software development, programming languages, and big data technologies
  • 8+ years of experience with architecture and design
  • 6+ years of experience with distributed systems, with at least 3 years focused on Apache Spark
  • 6+ years of experience with open-source frameworks
  • 4+ years of experience with AWS, GCP, Azure, or another cloud service
  • Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field such as physics or mathematics

Preferred Qualifications:

  • Active or past Apache Spark committer (or significant code contributions to OSS Apache Spark)
  • Experience with ML-based optimization techniques (e.g., reinforcement learning, Bayesian tuning, predictive models)
  • Contributions to other big data/open-source projects (e.g., Delta Lake, Iceberg, Flink, Presto, Trino)
  • Background in designing performance regression frameworks and benchmarking suites
  • Deep understanding of Spark accelerators (Spark RAPIDS, Apache Gluten, Apache Comet, Apache Auron, etc.); committer status in one or more of these projects is a plus
  • Skilled in documenting methodologies and producing publication-style papers, whitepapers, and internal research briefs

Benefits:

  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family's overall well-being
  • Financial benefits including market-competitive compensation; a 401(k) savings plan, vested from day one, that offers a 6% match; performance- and recognition-based incentives; and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Workplace flexibility, including our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year