Site Reliability Engineer

TATA Consulting Services | Atlanta, GA
40d | $100,000 - $125,000

Requirements

  • Monitoring solutions - CloudWatch, Dynatrace, PagerDuty
  • DevOps - GitLab, GitLab CI/CD, AWS Cloud Development Kit (CDK), CloudFormation (CFT) and CodePipeline
  • Languages, IDEs, Tools & Architectures - Node.js, TypeScript, YAML, VSCode, IntelliJ, Eclipse, REST API, Postman, Docker
  • AWS Technologies - API Gateway, Route 53, Lambda, Kafka, ElastiCache, PostgreSQL, SNS, Quarkus, EventBridge, Secrets Manager

Responsibilities

  • Build and support a reliable application suite for the environment, meeting the development and maintenance requirements of systems/platforms
  • Implement Service Reliability Engineering by working as part of the development team to evaluate the health, stability, and reliability of applications
  • Lead the team in best practices for incident, problem, and change management
  • Use monitoring, alerts, dashboards, and management tools to ensure the availability, reliability, cost, and performance of applications and services
  • Continuously improve and implement automation of application tasks
  • Provide technical support for systems/platforms according to application SLAs
  • Design and develop resiliency in the application code, troubleshoot incidents, engage with squads to address failure patterns, and participate in incident management
  • Develop delivery pipelines and automated deployment scripts
  • Configure services such as databases and monitoring

Benefits

  • Discretionary Annual Incentive.
  • Comprehensive Medical Coverage: Medical & Health, Dental & Vision, Disability Planning & Insurance, Pet Insurance Plans.
  • Family Support: Maternal & Parental Leaves.
  • Insurance Options: Auto & Home Insurance, Identity Theft Protection.
  • Convenience & Professional Growth: Commuter Benefits, Certification & Training Reimbursement.
  • Time Off: Vacation, Time Off, Sick Leave & Holidays.
  • Legal & Financial Assistance: Legal Assistance, 401(k) Plan, Performance Bonus, College Fund, Student Loan Refinancing.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Industry: Professional, Scientific, and Technical Services
  • Education Level: No Education Listed
  • Number of Employees: 5,001-10,000 employees

© 2024 Teal Labs, Inc