About The Position

The Associate Principal of Infrastructure Engineering is responsible for OCC's infrastructure engineering and drives continuous improvement across the production environment. The role combines hands-on technical expertise with strategic operational excellence, and includes production administration of application scheduling and automation using Automic/UC4 software across AWS and on-premises platforms (Windows, MVS, Linux). The position serves as a technical evangelist for operational best practices, working cross-functionally with Production Support, Release Management, Platform Services, Development, Testing, and Business Operations teams to shift the organization from reactive to proactive operations.

Requirements

  • Strong customer-service orientation and an understanding of the urgency of service level agreements
  • Excellent consultative, communication, analytical, and judgment skills
  • Strong problem-solving and decision-making abilities with capacity to perform well under pressure
  • Highly detail-oriented with strong time management skills
  • Ability to prioritize and align efforts with production availability requirements and business priorities
  • Ability to work effectively and interact with clients, team members, technical staff, consultants, and vendors
  • Adaptability to shifting priorities and willingness to embrace change
  • Natural curiosity with strong desire to share knowledge, cross-train, and mentor others
  • Pragmatic approach to problem-solving that optimizes available resources
  • Sound analytical and diagnostic skills for complex, ill-defined issues
  • Experience working in the financial services or clearing industry
  • Demonstrated ability to multi-task in a high-intensity environment with constantly changing priorities
  • Advanced proficiency with Automic/UC4 Scheduler and Automic scripting language
  • Strong knowledge of OS/390 JCL and IBM utilities
  • Proficiency with modern scripting languages (Python, PowerShell, Bash) and regular expressions (regex) for pattern matching and validation
  • Experience with REST API integration and development
  • Proficiency with file transmission tools and protocols: Connect:Direct, Sterling Integrator, FTP
  • Experience with ServiceNow for incident, problem, and change management
  • Working knowledge of ITIL principles and processes
  • Proficiency with Unix/Linux commands and remote-access tools (PuTTY, WinSCP)
  • Experience across multiple operating systems (Windows, OS/390, Linux) in both cloud and on-premises environments
  • Strong understanding of technical monitoring, metrics, and alerting systems
  • Knowledge of application lifecycle and deployment processes
  • Strong PC skills and proficiency with the Microsoft Office suite, including Visio
  • Sophisticated understanding of enterprise technology architecture
  • Mature understanding of the relationship between software and hardware
  • Bachelor's degree in Computer Science, Engineering, Telecommunications, or related technical discipline (or equivalent combination of education and experience)
  • 5+ years of experience in infrastructure engineering and production systems operations

Nice To Haves

  • AWS cloud infrastructure experience and/or AWS certifications
  • Problem/Incident management certifications
  • Background in disaster recovery planning and execution
  • Experience leading continuous improvement initiatives
  • Experience with hybrid cloud/on-premises infrastructure management
  • Familiarity with legacy scripting languages (Perl, REXX)

Responsibilities

  • Infrastructure Engineering & Production Administration
      • Development Support: Code and maintain UC4 objects for development and testing teams; provide first-level support for non-production environments
      • Production Support: Provide second-level troubleshooting for scheduling issues in production environments
      • Automic Administration: Perform basic Automic/UC4 system administration, including starting and stopping the Automic system and agents
      • Security Management: Maintain an accurate inventory of all department-owned logon IDs and passwords in CyberArk
      • Disaster Recovery: Provide support for all disaster recovery tests and exercises
  • Enterprise Scheduling & Automation
      • Complex Scheduling: Develop, maintain, and optimize job schedules using Automic/UC4 software across multiple AWS cloud and on-premises platforms (Windows, MVS, Linux), managing multi-job dependencies and automated workflows
      • Scripting & Automation: Use the Automic scripting language, Python, PowerShell, and other scripting tools to automate variable passing, job execution, and test environment setup; use regular expressions (regex) for pattern matching and validation
      • File Transmission: Coordinate and test new file transmissions with exchanges and members using Connect:Direct, Sterling Integrator, and FTP
      • Batch Processing: Prepare and maintain batch jobs and REST API calls
  • Operational Excellence & Continuous Improvement
      • Proactive Operations: Lead the shift from reactive to proactive operational posture by identifying and addressing issues before they impact production
      • Performance Analysis: Analyze and report on production performance, capacity planning, and critical-path processing opportunities
      • Metrics & KPIs: Develop, monitor, and report Key Performance Indicators to maintain compliance and drive measurable improvements through trend analysis
      • Problem Resolution: Identify and diagnose complex problems affecting production performance; stand up SWAT teams when needed to drive resolution of ongoing issues
      • Cross-Domain Collaboration: Work across network, database, storage, and application teams to assist with tuning and optimization
      • Risk Management: Surface environmental and operational risks; analyze repeating alerts to proactively identify issues
      • ITIL Leadership: Actively participate in and shepherd Incident, Problem, and Change Management processes using ServiceNow; ensure adherence to ITIL best practices
      • Process Automation: Evangelize for and implement repeatable, scalable automated processes
      • Capacity Planning: Forecast system demands and recommend upgrades, expansions, and reconfigurations
  • Documentation, Communication & Stakeholder Management
      • Documentation: Maintain up-to-date procedures for all supported products; create comprehensive process documentation
      • Status Reporting: Provide daily status reports to management; attend project and status meetings as required
      • Knowledge Sharing: Cross-train team members and stakeholders; deliver training on new product releases and best practices
      • Vendor Coordination: Manage vendor support relationships and drive issue resolution
      • Consultation: Serve as a consultant and evangelist for operational best practices across the organization
  • Additional Responsibilities
      • On-Call Support: Provide on-call and/or on-site support for installs, production issues, and system availability
      • Off-Hours Work: Participate in after-hours and weekend maintenance windows as required
      • Troubleshooting: Perform complex hardware and software troubleshooting, taking corrective actions or coordinating with IT staff and vendors

Benefits

  • A highly collaborative and supportive environment designed to encourage work-life balance and employee wellness
  • A hybrid work environment with up to 2 days per week of remote work
  • Tuition Reimbursement to support your continued education
  • Student Loan Repayment Assistance
  • Technology Stipend allowing you to use the device of your choice to connect to our network while working remotely
  • Generous PTO and parental leave
  • 401(k) Employer Match
  • Competitive health benefits including medical, dental and vision