Principal Systems Engineer

DTCC
Jersey City, NJ
Hybrid

About The Position

The Principal Architect – Systems Engineer (Linux OS Operations, Private Cloud) is a critical role responsible for the stability, security, and long‑term evolution of the enterprise Linux private cloud platform. This role serves as the technical authority for Linux operating system architecture, ensuring highly available, resilient, and compliant platforms that support mission‑critical business applications. By driving architectural standards, modernization initiatives, and operational best practices, the role directly reduces infrastructure risk, improves platform reliability, and enables predictable, high‑quality service delivery in a regulated enterprise environment.

In addition, this role shapes the future of Linux operations through automation, standardization, and cross‑functional technical leadership. The Principal Architect influences how Linux platforms are designed, built, secured, and operated, leading efforts in OS lifecycle management, security hardening, capacity planning, and disaster recovery readiness. Acting as a trusted advisor to senior leadership and engineering teams, the role translates business and regulatory requirements into scalable technical solutions while mentoring senior engineers and establishing reference architectures that drive operational efficiency, cost optimization, and continuous improvement across the private cloud ecosystem.

Requirements

  • Minimum of 8 years of related experience
  • Bachelor's degree preferred, or equivalent experience
  • Strong hands-on experience administering enterprise-scale Linux (RHEL or equivalent) environments.
  • Solid expertise in OS system administration (systemd, networking, filesystems, permissions).
  • Proven experience supporting mission-critical, high-availability platforms in regulated environments.
  • Strong knowledge of ITIL processes, including Incident, Problem, Change, and Release Management.
  • Demonstrated ability to lead and resolve Critical and Major production incidents across multi‑platform environments.
  • Proficiency in Ansible automation and in scripting with PowerShell, Bash, and Python.
  • Strong understanding of virtualization technologies (VMware ESXi or equivalent) and virtual infrastructure operations.
  • Experience supporting cloud and hybrid environments, preferably AWS, including Windows and Linux workloads.
  • Excellent troubleshooting, analytical, and communication skills with the ability to operate effectively under pressure.

Responsibilities

  • Provide advanced Level 2 / Level 3 production support for Linux server environments across the enterprise.
  • Manage patching, upgrades, and lifecycle management of Windows and Linux platforms in alignment with security, compliance, and enterprise standards.
  • Troubleshoot and resolve OS-level, hardware, and platform issues, including CPU, memory, disk, network, storage, and kernel/system services.
  • Lead or contribute to automation and scripting initiatives using Ansible, PowerShell, Bash, Python, or similar tools to improve operational efficiency and reduce manual effort.
  • Ensure availability, performance, resilience, and recoverability of production environments through proactive monitoring, maintenance, and capacity planning.
  • Act as a key responder for Critical and Major production incidents, driving end‑to‑end restoration, root cause analysis, remediation, and post‑incident reviews.
  • Support application deployments and collaborate closely with application, database, middleware, storage, network, and security teams.
  • Develop and deliver operational metrics, dashboards, and KPIs across infrastructure platforms, providing actionable insights to leadership.
  • Drive platform standardization, best practices, and continuous improvement initiatives across distributed environments.
  • Serve as a technical escalation point and mentor for engineers across Linux and platform operations teams.

Benefits

  • Competitive compensation, including base pay and annual incentive
  • Comprehensive health and life insurance and well-being benefits, based on location
  • Pension / Retirement benefits
  • Paid Time Off, Personal/Family Care, and other leaves of absence when needed to support your physical, financial, and emotional well-being.
  • DTCC offers a flexible/hybrid model of 3 days onsite and 2 days remote (onsite Tuesdays, Wednesdays and a third day unique to each team or employee).