Senior Associate - Workload Automation Engineer

New York Life
New York, NY
$90,000 - $128,500
Hybrid

About The Position

Serve as the engineering owner for New York Life’s enterprise workload automation ecosystem. You’ll operate and harden scheduling platforms and calendars, design resilient restart/rerun patterns, and standardize job definitions, logging, and audit evidence across environments. Your work will ensure critical batch chains run predictably, meet SLAs, and support a consistent, automation-first operating model.

Requirements

  • 5–8+ years of experience in enterprise workload automation, SRE, or production operations supporting mission-critical batch processing.
  • Hands-on experience with Stonebranch or at least one other major enterprise scheduler (e.g., ESP, Control-M, AutoSys, IBM Workload Scheduler/TWS, Redwood), including:
      • Operating controllers/agents across environments.
      • Managing calendars/holiday tables and SLA jeopardy configurations.
  • Strong scripting and automation skills in PowerShell, Bash, or Python, plus familiarity with YAML/JSON and REST APIs.
  • Experience with Git-based workflows and CI/CD pipelines for job-as-code and configuration promotion.
  • Proven design and implementation of restart/rerun patterns, dependency modeling, and idempotent batch frameworks.
  • Experience integrating schedulers with observability platforms (logs/metrics/dashboards) and defining SLIs/SLOs.
  • Excellent coordination skills across incident and change processes, with clear, concise communication to technical and non-technical stakeholders.

Nice To Haves

  • Experience in financial services or other highly regulated industries.
  • Background standardizing multiple schedulers and creating common audit schemas and evidence-capture patterns.
  • Relevant certifications such as ITIL, cloud architect/operations, DR/BC (e.g., DRII/BCI), or security (e.g., CISSP).

Responsibilities

  • Operate and maintain scheduling controllers and agents across environments.
  • Manage calendars and holiday tables; configure SLA jeopardy thresholds, alerting, and escalation paths.
  • Implement platform upgrades, patches, and configuration changes in line with standards and change governance.
  • Design restart/rerun patterns (checkpointing, idempotent wrappers) and failure-handling flows for critical batches.
  • Model dependencies and schedules as code (job-as-code) in version control with CI/CD-based promotion.
  • Reduce single points of failure and improve consistency across job chains and environments.
  • Define and maintain standard naming conventions, templates, parameters, and calendars across schedulers.
  • Engineer common audit-evidence and log schemas to support internal and external reviews.
  • Ensure data retention, traceability, and segregation of duties align with policies and regulatory requirements.
  • Implement pre/post checks, synthetic probes, and health validations for batch workflows.
  • Define and maintain SLIs/SLOs for batch completion, success rates, and recovery times.
  • Build safeguards that detect anomalies and misconfigurations before they impact downstream processes.
  • Integrate schedulers with observability tools (logs, metrics, dashboards) to improve visibility.
  • Tune job concurrency, execution windows, and resource usage for performance and cost efficiency.
  • Reduce noisy alerts and improve the signal-to-noise ratio for incident responders.
  • Align scheduler changes, maintenance, and releases with APSO/Change Management processes.
  • Lead incident triage and resolution for batch failures, including rapid root-cause analysis and safe restarts/reruns.
  • Contribute to post-incident reviews and drive remediation actions into platform and pattern improvements.
  • Collaborate with Application Owners/Developers, DBAs/Data teams, SRE/Observability, Security, and Vendors to keep batch chains healthy and compliant.
  • Provide guidance on best practices for job design, scheduling windows, dependencies, and error handling.
  • Document patterns, playbooks, and standards; mentor peers and junior engineers in workload automation.

Benefits

  • We provide a full package of benefits for employees – and have unique offerings for a modern workforce, including leave programs, adoption assistance, and student loan repayment programs.
  • Based on feedback from our employees, we continue to refine and add benefits to our offering, so that you can flourish both inside and outside of work.