Software Engineer

State Farm•Dunwoody, GA

Hybrid

About The Position

State Farm is hiring a Software Engineer to design and build end-to-end data pipelines that pull from source systems including Entra ID, ServiceNow, Dynatrace, GitLab, Agility, and more, transforming messy, inconsistent tool data into clean canonical entities using bronze/silver/gold (medallion-style) layers. You’ll work on the problems at the core of our platform: delivering deterministic identity resolution (linking the same person/service/deployment across different names, IDs, and schemas) and building and populating a knowledge graph that connects commits → deployments → incidents → service ownership. You’ll also compute trusted DORA metrics (with confidence scores), define and evolve durable entity models across a 14-domain canonical data model, and build the retrieval surface (APIs, query interfaces, and AI-agent access patterns) so both humans and AI can reliably consume what the platform knows.

This is a green-field implementation of an architecture that’s already been validated end-to-end. While the core architectural approach is established, you’ll have significant ownership in building and evolving the implementation from the ground up.

You’ll need strong engineering fundamentals with hands-on experience in event streaming (e.g., Kafka) and lakehouse/data platform tooling (e.g., Databricks), along with a strong background in data modeling, schema design, and transformation pipelines. Java and Python are both explicitly needed for the platform’s pipelines and services, and you should be comfortable across the stack (ingestion, transformation, storage, computation, and serving).

Bonus experience includes orchestrating event-driven architectures, working with graph/knowledge graph technologies, integrating with CI/CD and observability tools as data sources, and building retrieval/grounding patterns for AI. Most importantly, you’ll bring the problem-solving mindset required to handle ghost records, stale data, and mismatched org structures, and to keep schemas durable through source system migrations (e.g., GitLab→GitHub, Agility→Jira).

Requirements

Strong software and data engineering background: You’ve designed and shipped end-to-end pipelines that ingest from multiple source systems, handle messy/inconsistent data, and produce results that are correct and dependable (not just fast).
Hands-on expertise in data modeling and transformations: You’re comfortable defining and evolving canonical entity models using patterns like bronze/silver/gold (medallion architecture), and building transformation pipelines that can survive upstream schema changes and migrations.
Full-stack data platform experience: You can work across the pipeline end-to-end (ingestion, transformation, storage, computation, and serving), including building reliable interfaces for downstream consumers.
Proficiency in core engineering languages and architectures: Experience with Java and/or Python (or Scala) plus event-driven architectures and pipeline orchestration tools.
Knowledge graph / retrieval experience: You’ve built or operated systems involving graph databases or knowledge graph technologies, and you understand how to power query and retrieval surfaces for both humans and AI.
Proven problem-solving with identity resolution and data quality: You’ve solved challenges where the same entity appears with different names/IDs/schemas across tools, and where data is incomplete, inconsistent, or wrong.
Clear communication and cross-team collaboration: You work effectively across multiple tracks and stakeholders because this role requires coordination across the organization.
Experience integrating engineering lifecycle and observability sources: You’ve used Git platforms, CI/CD, incident management, and observability tools as data inputs to pipelines, and you understand their data models, edge cases, and failure modes.
Must live within a 180-mile radius of a hub location: Bloomington, IL; Richardson, TX; Atlanta, GA; or Tempe, AZ.
Must be eligible to lawfully work in the U.S. immediately; employer will not sponsor applicants for U.S. work authorization.

Nice To Haves

AI/ML integration experience: Building retrieval surfaces, grounding LLMs in structured data, and/or using RAG patterns to reduce hallucinations.
Lakehouse and streaming experience: Familiarity with lakehouse architectures and data platform tooling (e.g., Databricks), event streaming (e.g., Kafka), and large-scale data platform engineering.
Platform mindset: You’ve worked on a system where the data model and metrics definitions mattered more than the UI.
Orchestrating event-driven architectures.
Working with graph/knowledge graph technologies.
Integrating with CI/CD and observability tools as data sources.
Building retrieval/grounding patterns for AI.

Responsibilities

Design and build end-to-end data pipelines that pull from source systems including Entra ID, ServiceNow, Dynatrace, GitLab, Agility, and more, transforming messy, inconsistent tool data into clean canonical entities using bronze/silver/gold (medallion-style) layers.
Deliver deterministic identity resolution, a problem at the core of our platform (linking the same person/service/deployment across different names, IDs, and schemas).
Build and populate a knowledge graph that connects commits → deployments → incidents → service ownership.
Compute trusted DORA metrics (with confidence scores).
Define and evolve durable entity models across a 14-domain canonical data model.
Build the retrieval surface (APIs, query interfaces, and AI-agent access patterns) so both humans and AI can reliably consume what the platform knows.
Take significant ownership of building and evolving the implementation from the ground up.
Handle ghost records, stale data, and mismatched org structures, and keep schemas durable through source system migrations (e.g., GitLab→GitHub, Agility→Jira).

Benefits

Compensation is based on our standard work week of 38 hours, 45 minutes.
Potential starting salary range: $105,000 - $130,000
Starting salary will be based on skills, background, and experience.
High end of the range limited to applicants with significant relevant experience.
Potential yearly incentive pay up to 15% of base salary.
Annual raise and bonus.
Robust health and wellbeing programs.
State Farm pays most of your healthcare premium.
Multiple healthcare plan options, including a high deductible plan.
All medical plans provide 100% coverage for in-network preventative care.
Access to vision, dental, telemedicine, 24/7 mental health professionals, and much more.
Educational benefits like industry-leading training programs.
Top-notch tuition assistance programs.
Employee resource groups.
Mentoring.
Fertility/IVF/adoption assistance.
College coaching.
National discount programs.
Interactive monthly financial workshops.
Free financial coaching.
Ability to start a savings account or consider financing through our State Farm Federal Credit Union.
Generous time off policies.
Opportunity to initially earn up to 20 days annually plus parental leave.
Paid holidays.
Celebration day.
Life leave (40 hours/year).
Bereavement leave.
Community service/education support days.
Matching Gift Program.
Good Neighbor Grant Program.
Employee Assistance Fund.
Free financial advisors.
401(k) plan with company contributions of up to 7% of your salary.