Senior Infrastructure Kafka Engineer

Technologent
Phoenix, AZ
Hybrid

About The Position

The Opportunity: We are seeking a Senior Infrastructure - Kafka Engineer to join a high-performing data engineering team supporting large-scale, event-driven data platforms. This role is ideal for a seasoned engineer with deep experience in Apache Kafka/Confluent Kafka, messaging platforms, SQL/NoSQL databases, and cloud infrastructure who can lead engineering, operations, and automation efforts across complex enterprise environments.

This is a 6-month contract-to-hire opportunity supporting a hybrid work model in Phoenix, AZ. The ideal candidate is a hands-on infrastructure engineer with strong experience designing resilient Kafka environments, building real-time data pipelines, and supporting production systems in fast-paced enterprise settings.

Role: Senior Infrastructure - Kafka Engineer
Experience: 7+ Years
Work Location: Phoenix, AZ (Hybrid - 4 days onsite / 1 day remote)
Project Duration: 6-Month Contract-to-Hire

Requirements

  • 7+ years of experience in infrastructure engineering with a strong focus on:
      • Kafka administration across on-prem and cloud environments
      • Kafka ecosystem components including brokers, topics, consumer groups, replication, and failover
      • Messaging systems such as MQ
      • SQL and NoSQL database integration
  • Proven experience designing, deploying, and scaling Kafka clusters and connector infrastructure in production and DR environments.
  • Hands-on experience building real-time data pipelines using Kafka producers and streaming consumers such as Spark Streaming.
  • Strong proficiency with at least one major cloud platform: AWS, GCP, or Azure.
  • Experience with event-driven architectures, containerization, and DevOps practices.
  • Experience with observability and monitoring tools such as Splunk, Datadog, and Grafana.
  • Solid understanding of networking, Linux/Windows operating systems, and core diagnostic tools.
  • Proficiency with source control tools such as SVN and Git.
  • Scripting and programming experience with tools such as PowerShell, Bash, Python, or Perl.
  • Demonstrated ability to analyze complex issues, make sound decisions with limited information, and drive issues through resolution.
  • Strong communication, customer service, and collaboration skills with the ability to work effectively across cross-functional technical teams.

Nice To Haves

  • Experience with additional enterprise monitoring and infrastructure support tools.
  • Experience working in highly regulated enterprise environments.
  • Prior exposure to large-scale data engineering or integration platforms.

Responsibilities

  • Administer, configure, and troubleshoot Kafka clusters across on-prem and cloud environments, including broker and cluster configuration, partitioning, and performance tuning.
  • Design and implement scalable, highly available Kafka infrastructure, including disaster recovery and multi-environment strategies.
  • Integrate Kafka with upstream and downstream systems using Kafka Connect and related connectors, including MQ, MongoDB, Oracle, SQL Server, PostgreSQL, and MySQL.
  • Build and support real-time data pipelines using Kafka producers and streaming consumers such as Spark Streaming and Kafka Streams.
  • Automate infrastructure provisioning and configuration across environments using Terraform and modern DevOps practices.
  • Deploy and manage Kafka components and clients in production and disaster recovery environments, ensuring resilience and recoverability.
  • Lead a small team of engineers and technicians in monitoring, diagnosis, and remediation of infrastructure issues.
  • Implement and maintain comprehensive monitoring, logging, and alerting using tools such as Splunk, Datadog, and Grafana.
  • Perform proactive health checks and capacity planning to identify and resolve issues before they impact service.
  • Serve as a primary point of contact for daily operations, major incidents, and escalations related to Kafka and associated infrastructure.
  • Develop, maintain, and continuously improve runbooks and playbooks for incident response, maintenance, and recurring operational tasks.
  • Analyze support trends and incident patterns to reduce downtime and drive root-cause resolution.
  • Ensure infrastructure and platform changes comply with internal standards, security policies, and applicable regulatory requirements.
  • Partner with security, networking, application, and data engineering teams to design and operate secure, compliant, event-driven architectures.
  • Contribute to standards, best practices, and technical documentation for Kafka, messaging, and integration patterns.
  • Participate in agile ceremonies and help influence technical direction for streaming and integration platforms.