Sr. Reliability Engineer, Digital Marketing

Skechers
Manhattan Beach, CA
Hybrid

About The Position

The Sr. Reliability Engineer, Digital Marketing is responsible for ensuring the reliability, observability, and continuous improvement of Skechers' global marketing technology ecosystem. This role owns end-to-end reliability across customer data, audiences, campaigns, journeys, loyalty events, and digital experience signals — building a more intelligent, resilient, and increasingly self-healing marketing stack. Working closely with analytics, data engineering, and service provider teams, this engineer prevents issues before they impact customers, reduces operational toil through automation, and continuously monitors and validates the accuracy, completeness, and timeliness of customer data across marketing and loyalty platforms.

Requirements

  • Hands-on experience supporting complex SaaS platforms in production, ideally including Salesforce Data Cloud, Salesforce Marketing Cloud, and/or enterprise CRM or marketing technology platforms with high business criticality.
  • Strong understanding of customer data flows, segmentation, audience activation, marketing journeys, campaign operations, and loyalty-related business processes.
  • Experience with Google Analytics, Quantum Metric, or similar digital analytics platforms used to diagnose customer and business impact.
  • Strong troubleshooting skills across data, integrations, APIs, workflows, and application behavior, with hands-on experience building and operating monitoring, alerting, dashboards, runbooks, and incident management processes.
  • Strong SQL skills and working knowledge of scripting or automation languages such as Python, JavaScript, or Bash, with experience leveraging AI-assisted engineering tools such as Claude Code, ChatGPT Codex, or Cursor to improve operational efficiency and automation.
  • Strong understanding of email deliverability, including operational drivers of inbox placement, sender health, and remediation practices.
  • Ability to communicate clearly with both technical and business stakeholders during normal operations and high-severity incidents, with a demonstrated ability to identify repetitive manual work and replace it with durable engineering solutions.
  • Bachelor's degree in Computer Science, Engineering, or related field, or equivalent experience.
  • 7+ years of experience in reliability engineering, site reliability engineering, platform engineering, production engineering, application support engineering, or marketing technology operations.

Responsibilities

  • Own service reliability across the full marketing platform flow, including data ingestion, identity resolution, audience creation, segmentation, activation, campaign and journey execution, triggered and scheduled communications, loyalty program events, and downstream measurement.
  • Ensure Salesforce Data Cloud audiences are accurate, timely, and operationally dependable, with strong controls around data freshness, segmentation quality, publish success, and downstream activation.
  • Ensure Salesforce Marketing Cloud campaigns, journeys, automations, and email operations execute as designed, with clear operational thresholds, monitoring, and recovery playbooks.
  • Ensure Salesforce Loyalty Management processes, including member activity, accrual, redemption, promotions, and related integration points, are reliable, traceable, and aligned with customer experience expectations.
  • Build and maintain observability across platform health, business process health, and customer-impact signals, including dashboards, alerts, trend reporting, and escalation paths.
  • Leverage Google Analytics and Quantum Metric to connect technical incidents to customer and business impact, including conversion degradation, journey drop-off, landing page friction, loyalty enrollment issues, and campaign experience problems.
  • Define and manage SLIs, SLOs, and operational thresholds for business-critical marketing services, with proactive detection of platform and data issues before they impact customers or campaigns.
  • Own end-to-end operational response for priority incidents, including detection, triage, severity assessment, stakeholder communication, vendor engagement, mitigation, recovery, and post-incident review — translating technical issues into clear business impact across audience activation, journeys, loyalty activity, deliverability, and digital experience.
  • Lead incident coordination including bridge calls, cross-functional alignment, vendor escalation, recovery communication, and closure messaging, with defined update cadences for marketing operations, CRM, loyalty, analytics, leadership, and Salesforce support.
  • Own release readiness and go/no-go support for all product changes, including risk assessment, dependency checks, and rollback readiness.
  • Maintain steady-state operational risk reporting covering platform health trends, recurring failure patterns, deliverability risks, and proactive recommendations.
  • Design and implement automation that reduces manual work, speeds recovery, and enables safer scale, including AI-assisted alert enrichment, knowledge retrieval, incident summarization, runbook execution, and low-risk self-healing patterns under human oversight.
  • Define and maintain operational standards including runbooks, change controls, release readiness checks, and problem management processes for business-critical marketing services.
  • Drive adoption of reliability engineering best practices across delivery and marketing technology teams.
  • Partner with marketing operations, CRM, data engineering, eCommerce, loyalty, analytics, and vendor teams to ensure reliability considerations are built into new initiatives from the start, serving as a reliability advocate during architecture design and solution reviews.
  • Collaborate with data engineering on proactive monitoring and validation of data accuracy, completeness, timeliness, and consistency across ingestion, identity resolution, transformation, and activation layers.
  • Serve as the primary engineering partner for Salesforce Signature Success, incorporating Proactive Monitoring alerts and recommendations, managing escalations, and converting vendor insights into permanent improvements.
  • Participate in a global support and escalation model while continuously reducing after-hours operational load through better monitoring, smarter automation, and stronger engineering discipline.

Benefits

  • Equal employment opportunities for all employees and applicants for employment without regard to race, color, religion, gender, gender identification and expression, national origin, marital status, age, disability, genetic information, military status, sexual orientation, or any other protected characteristic established by local, state or federal law.
  • Reasonable accommodation may be made to enable individuals with disabilities, who are otherwise qualified for the job position, to perform the essential functions.