Senior DevOps Engineer - MN

RAZR Marketing
Minnetonka, MN
Onsite

About The Position

We are seeking an experienced Senior DevOps Engineer to join our engineering team at RAZR Marketing. In this role, you will be responsible for maintaining and optimizing our AWS infrastructure, ensuring system reliability, and implementing disaster recovery strategies. Working with our enterprise-scale Fibonacci NX monorepo containing 50+ applications and libraries, you will focus on operational excellence, system maintenance, and AWS best practices to support our customer-facing loyalty platform and banking integrations. We are looking for someone who is passionate about maintaining reliable and resilient systems in AWS, enjoys implementing operational best practices, and thrives in a collaborative environment supporting enterprise-scale applications.

Requirements

  • Bachelor's degree in Computer Science, Information Technology, or related field.
  • 7+ years of experience in DevOps, systems administration, or cloud operations roles.
  • Proficiency with infrastructure as code tools, particularly Pulumi.
  • Strong hands-on experience with AWS services and cloud architecture.
  • Proven experience designing and implementing disaster recovery solutions.
  • Solid understanding of database administration, particularly PostgreSQL/RDS.
  • Experience with containerization using Docker and orchestration with ECS or Kubernetes.
  • Proficiency in scripting languages: Python, Bash, or Ruby for automation tasks.
  • Strong understanding of networking concepts, VPCs, security groups, and load balancers.
  • Experience with monitoring and logging tools (CloudWatch, ELK stack, or similar).
  • Knowledge of security best practices and compliance requirements.
  • Excellent problem-solving skills and ability to troubleshoot complex system issues.
  • Strong communication skills and ability to document technical procedures clearly.

Nice To Haves

  • Experience with banking or financial services systems is a plus.
  • AWS certifications (e.g., AWS Certified Solutions Architect, AWS Certified SysOps Administrator) are highly desirable.

Responsibilities

  • Infrastructure as Code: Design, implement, and maintain infrastructure using Pulumi. Automate infrastructure provisioning and configuration management. Manage environment configurations across dev, staging, and production. Implement version control and code review processes for infrastructure changes. Develop reusable infrastructure components and modules.
  • AWS Infrastructure Management: Manage and optimize AWS services including ECS, Lambda, RDS, S3, CloudFront, and VPC configurations. Implement AWS best practices for security, performance, and cost optimization. Monitor and maintain system health across multiple environments (dev, staging, production). Conduct regular infrastructure audits and implement improvements.
  • Disaster Recovery and Business Continuity: Design, implement, and maintain disaster recovery plans for critical systems and databases. Develop and execute backup strategies for RDS databases, S3 data, and application configurations. Conduct regular DR drills and validate recovery procedures. Document and maintain runbooks for disaster recovery scenarios. Implement multi-region failover strategies for high-availability services.
  • System Maintenance and Operations: Perform routine system maintenance including patching, updates, and security hardening. Manage database maintenance tasks including backups, performance tuning, and capacity planning. Monitor system performance and proactively address potential issues. Coordinate maintenance windows and communicate with stakeholders. Maintain and rotate secrets, certificates, and access credentials.
  • Deployment Support: Support application deployments for Angular containers, NestJS servers, and Java Spring Boot services. Manage Docker container deployments to AWS ECS and serverless environments. Coordinate releases and provide rollback support when needed. Maintain deployment documentation and standard operating procedures.
  • Security and Compliance: Implement and maintain AWS security best practices including IAM policies, security groups, and encryption. Conduct regular security assessments and vulnerability remediation. Ensure compliance with industry standards for banking and financial services integrations. Manage access controls and audit logging across AWS environments.
  • Monitoring and Alerting: Maintain comprehensive monitoring solutions using CloudWatch, application logs, and custom metrics. Configure and tune alerting thresholds for infrastructure and application health. Participate in on-call rotation for production incidents. Conduct post-incident reviews and implement preventative measures.
  • Cost Optimization: Monitor and optimize AWS costs through resource right-sizing and reserved capacity planning. Identify and eliminate unused or underutilized resources. Implement cost allocation tags and provide cost reporting to stakeholders.
  • Documentation and Knowledge Sharing: Create and maintain operational documentation, runbooks, and architecture diagrams. Document system configurations, procedures, and troubleshooting guides. Share knowledge with development and operations teams through training and mentorship.
  • Collaboration: Work closely with the SRE team on infrastructure reliability and incident response. Partner with development teams to understand operational requirements. Collaborate with the security team on compliance and vulnerability management. Participate in Scrum ceremonies and contribute to operational planning.