Production Support Engineer (Kafka)

Fiserv
Berkeley Heights, NJ
Onsite

About The Position

Calling all innovators - find your future at Fiserv. We're Fiserv, a global leader in fintech and payments, and we move money and information in a way that moves the world. We connect financial institutions, corporations, merchants and consumers to one another millions of times a day - quickly, reliably, and securely. Any time you swipe your credit card, pay through a mobile app, or withdraw money from the bank, we're involved. If you want to make an impact on a global scale, come make a difference at Fiserv.

About your role

You will be a technology professional contributing across multiple stages of the Software Development Lifecycle. Your primary focus will be production support, covering incidents, change management, problem management, certification support, and monthly patching for production environments. You will support and maintain Kafka‑based data pipelines and job scheduling workflows to ensure reliable and timely processing. Experience with Chronos is preferred but not required; candidates may alternatively bring experience with other enterprise schedulers (e.g., Airflow, Control‑M, Autosys, Kubernetes CronJobs), along with a willingness to learn Chronos. You will help create production run books, maintain change registers, perform impact analysis, and ensure the overall stability of platforms built on Kafka and scheduling technologies.

Requirements

  • 5+ years of experience in an enterprise IT environment.
  • 4+ years of experience with Unix and Windows, including scheduling tools (Chronos optional; equivalents accepted).
  • 4+ years of experience with monitoring tools such as Splunk, Dynatrace, Grafana, and Moogsoft.
  • 4+ years of experience working with incident, change, and problem management processes and ticketing systems such as ServiceNow, Remedy, or CA Service Desk.
  • 3+ years of hands‑on experience with Kafka.
  • 3+ years of experience in production management, application/platform monitoring, and participation in on‑call rotations.
  • Bachelor’s degree in a related field or an equivalent combination of education, military, and work experience.

Nice To Haves

  • Knowledge of ITIL frameworks and best practices.
  • Experience with automation and scripting languages (e.g., Shell, Python).
  • Strong analytical and troubleshooting skills.
  • Experience with ARO, PCF, AWS, Azure, or Google Cloud.
  • Experience with version control tools such as VSS, Git, and Bitbucket.

Responsibilities

  • Participate in core production management functions including incident, change, and problem management, collaborating closely with cross‑functional teams to support metrics‑based tracking, planning, risk mitigation, and certification readiness for top‑tier clients.
  • Support Kafka clusters, topics, consumer groups, and message flows to maintain high availability and consistent data processing across production systems.
  • Oversee and troubleshoot job scheduling workflows—with Chronos preferred but optional—ensuring SLA adherence, timely batch execution, dependency handling, and automated recovery for failed tasks.
  • Perform plan reviews and submit high‑quality changes that improve platform resiliency, change hygiene, and operational success for both Kafka and scheduling workloads.
  • Collaborate with production assurance teams to maintain production run books, change registers, automation assets, and process improvements; conduct detailed impact and root‑cause analysis for issues involving data streaming or scheduling components.
  • Provide production support for innovative, high‑quality, large‑scale financial solutions in partnership with engineering and operations professionals.
  • Execute proactive monitoring using tools such as Splunk, Dynatrace, Grafana, and Moogsoft, including building alerts and dashboards to monitor Kafka lag, consumer health, broker performance, and scheduling job status.

Benefits

  • Fuel Your Life program to support your physical, financial, social, and emotional well-being.
  • Paid holidays and generous time away policies.
  • No-cost mental health support through Employee Assistance Programs.
  • Living Proof program to recognize your peers’ extra effort with points redeemable for rewards.
  • Eight Employee Resource Groups to foster a collaborative culture and expand your network.
  • Unparalleled professional growth with training, development, and internal mobility opportunities.
  • Medical, dental, vision, life, and disability insurance options available from day one.
  • Retirement planning and discounted shares with the Employee Stock Purchase Plan.
  • Tuition assistance and reimbursement program.
  • Paid parental, caregiver, and military leave.