SRE Consultant

NTT DATA, Jersey City, NJ
Posted 1d ago, Hybrid

About The Position

Site Reliability Engineering Consultant – New Jersey, Hybrid (2 days/wk)

The Site Reliability Engineering Consultant will be responsible for the development and overall implementation of software in a complex, critical, and large cross-departmental and multi-disciplinary area. This role is part of a multi-year transformation journey that requires the successful candidate to establish best practices and to motivate and promote a cultural shift that ensures the successful adoption of Engineering Principles and Practices within Production Management.

The role:

  • Requires a comprehensive understanding of multiple areas within a function and how they interact to achieve the objectives of the function.
  • Applies an in-depth understanding of the business impact of technical contributions.
  • Is accountable for delivery of a full range of end-to-end projects.
  • Requires excellent communication skills to negotiate internally.
  • Involves short- to medium-term planning of actions and resources for own area.

Requirements

  • Relevant experience in a critical software development role with high business impact, and the ability to understand how software delivers business value
  • Excellent engineering skills and senior-level architecture experience
  • Excellent working knowledge of key computer science concepts (networking, operating systems, virtualization, containerization, etc.)
  • Polyglot full-stack developer mentality and ability to pick up new languages and skills
  • Excellent understanding of Software Engineering concepts like Software Development Life Cycle and GitOps
  • Excellent debugging and analytical skills: ability to isolate root cause across networking/infrastructure, application and database stacks
  • Operational experience deploying and running services at scale on top of a Docker/Kubernetes stack and a service mesh, such as Istio, is highly desirable
  • Operational experience with orchestration tools for CI/CD and Infrastructure-as-Code tooling (Terraform, CloudFormation, etc.) is highly desirable
  • Experience delivering software using Agile delivery methodologies (Scrum/Kanban) is a must
  • Operational experience of using middleware technologies (MQ, Apache Kafka, etc.) to run services at scale is desirable
  • Strong experience with end-to-end observability stacks (Datadog, AppDynamics, Dynatrace, etc.) is desirable
  • Degree in computer science/mathematics/physics or related technical subject is desirable
  • Experience of senior stakeholder management
  • Consistently demonstrates clear and concise written and verbal communication skills
  • 9+ years in a site reliability engineering-related role, with proven hands-on expertise and the ability to demonstrate technical proficiency in the following:
      • Programming (Java, Python, or equivalent)
      • Containerization
      • Kubernetes
      • GitOps
      • High Availability Systems
      • Infrastructure as Code
      • Configuration Management
      • Observability (tools and implementation)
      • Hyperscale Systems
      • Middleware configuration

Responsibilities

  • Demonstrate an in-depth understanding of the Software Development Lifecycle and how it integrates within the overall technology landscape to deliver scalable, reliable and resilient applications.
  • Operate in a global environment with on-, near-, and off-shore matrix reporting structures.
  • Operate in a highly regulated environment that requires an in-depth understanding of the regulatory requirements and the industry implications for our technologies.
  • Improve the service level the team provides to our end users, including maximizing operational efficiency and strengthening incident management, problem management, and knowledge-sharing practices.
  • Drive Continuous Delivery and Automation efforts across the supported applications by means of Root Cause Analysis reviews, Knowledge management, Performance tuning, and user training.
  • Foster a culture that promotes transparency and innovation for increased team productivity.
  • Coach team members and colleagues outside the immediate reporting line on best practices, and recognize anti-patterns so that they are addressed quickly.
  • Implement the Agile framework through one of its implementations, such as Scrum or Kanban, and ensure it integrates with overall organization processes.
  • Proactively communicate progress and project status across the organization and ensure that stakeholders are managed appropriately throughout the execution period.

Benefits

  • This position may also be eligible for incentive compensation based on individual and/or company performance.
  • This position is eligible for company benefits including medical, dental, and vision insurance with an employer contribution, flexible spending or health savings account, life and AD&D insurance, short- and long-term disability coverage, paid time off, employee assistance, participation in a 401(k) program with company match, and additional voluntary or legally required benefits.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Number of Employees: 5,001-10,000 employees