Director, Software Engineering: Retail Platform Delivery (ELERA)

Toshiba Global Commerce Solutions - External, Durham, NC

About The Position

Toshiba Global Commerce Solutions is hiring a Director of Software Engineering to own end-to-end delivery of the ELERA retail platform across our strategic global retail partners, including Fortune 100 grocery, warehouse club, and general merchandise accounts. This role collapses Development, Test, and Professional Services Engineering into a single accountable function and operates as the customer-facing engineering authority for strategic programs. You own what ships, how it scales, how it recovers, and how quickly the team learns.

This role runs on Toshiba’s agentic AI development ecosystem as the default way of working, not a side experiment. Agents participate in requirement decomposition, code generation, test authoring, PR review, defect triage, and release scoring. You are expected to measure and compound that leverage.

This is not an engineering-only or QA-only role. You own the outcome, the telemetry that proves it, and the workflows that compound it.

Requirements

  • 12+ years in software engineering
  • 6+ years leading engineering organizations at scale (100+ engineers)
  • Proven ownership of production systems serving Fortune 500 customers
  • Delivered complex, distributed, multi-tenant platforms under strict SLOs
  • Direct customer-facing engineering leadership at executive level
  • Distributed systems, event-driven architectures, microservices, API design
  • Cloud-native (AWS, Azure, or GCP) AND edge/store deployment models
  • Observability engineering: OpenTelemetry, SLOs, structured logging, distributed tracing
  • Progressive delivery: feature flags, canary, blue/green, shadow traffic
  • CI/CD at scale, GitOps, supply chain security
  • Performance engineering: load, stress, endurance, chaos
  • Modern quality practices: TDD, contract testing, mutation testing
  • Hands-on experience with LLM-based engineering tooling (Claude Code, agentic IDE plugins, MCP servers)
  • Working understanding of agent orchestration, MCP, and RAG architectures
  • Ability to design AI-leveraged workflows that compound team output, not just assist individuals
  • Comfortable measuring AI leverage in dollar-equivalent and headcount-equivalent terms
  • Executive customer-facing credibility; holds the room with CIOs and CTOs
  • Decides under ambiguity with clear tradeoff articulation
  • Balances speed vs quality, customization vs platform integrity, innovation vs stability
  • Execution-focused, data-driven, low tolerance for process theater

Nice To Haves

  • Retail domain: POS, payments, loyalty, store systems, edge computing at store
  • ELERA platform familiarity or comparable enterprise retail platform experience
  • Background spanning both engineering AND quality disciplines
  • PCI DSS and SOX compliance experience in a payments context

Responsibilities

  • Customer-Facing Engineering Leadership
      • Primary engineering authority for customer programs spanning POS, payments, loyalty, and store systems
      • Partner with customer leaders to define success criteria, operational acceptance, rollout strategy, and risk posture
      • Translate real-world retail workflows into architecture decisions, API contracts, and system behavior
      • Maintain architecture decision records (ADRs) and a visible technical debt register tied to delivery plans
  • End-to-End Delivery Ownership (DORA-Native)
      • Own the four DORA metrics as published, measurable targets:
          • Deployment Frequency: on-demand for services, scheduled-safe for store edge
          • Lead Time for Changes: under 24 hours for services
          • Change Failure Rate: under 10%, trending down
          • MTTR: under 1 hour for customer-impacting incidents
      • AI-built feature branches with pre-merge release testing: agents build the full feature on an isolated branch; release-grade testing (functional, performance, security, integration) runs on that branch and must pass before merge. Main stays releasable at all times.
      • Progressive delivery is mandatory: feature flags, canary, blue/green, shadow traffic. Decouple deploy from release.
      • Zero scheduled-downtime windows for cloud services; store edge deploys are non-disruptive to live lanes
  • Unified Engineering + Quality Model
      • Test-Driven Development is the baseline. Tests exist before code. No exceptions for new code paths.
      • Quality gates block merges: coverage thresholds, mutation score, static analysis, SAST, dependency and license scanning, SBOM generation
      • Contract testing (consumer-driven) at every service boundary; OpenAPI and AsyncAPI versioned and governed
      • Chaos engineering in staging, quarterly game days, production-like load testing ahead of peak retail events
      • Shift-right validation: synthetic monitoring, canary analysis, error budget enforcement
      • Own metrics: defect leakage, change failure rate, cycle time, escape rate, production incident rate
  • AI-Native Development (Agentic Delivery)
      • Leverage Toshiba’s agentic development ecosystem as the default delivery model:
          • AI-assisted engineering: requirement decomposition, design-to-code, test generation, refactoring at scale, PR review
          • Agent-driven workflows: defect triage and clustering, root cause analysis from telemetry, release readiness scoring, customer-reported incident classification
          • MCP-integrated tooling: every engineer works with context-aware agents wired to source, tickets, and telemetry
          • Agents as team members: RaaS (RAG-as-a-Service), Sentinel AI for test authoring and execution, Agentic TPM for program management, CloudOps Sandbox for environment provisioning, ELERA Agent Registry for discovery
          • Measured leverage: agent-authored code accepted, test cases generated, incidents auto-triaged, AI-equivalent headcount by role tracked quarterly
          • Feedback loops: production telemetry feeds AI-ranked backlog and refactor targets
  • Scalability, Performance, and Reliability (SRE Discipline)
      • SLOs and error budgets defined for every customer-facing service; error budget exhaustion blocks feature work
      • Observability-as-code: OpenTelemetry instrumentation is required before merge; service.version tagged; traces, metrics, and logs correlated by default
      • Performance engineering: load, stress, endurance, and soak testing sized for peak retail events (Black Friday, holiday, payroll cycles)
      • Edge resilience: stores survive network partitions; offline transaction handling; store-and-forward retry with exponential backoff and jitter; blue/green at the edge
      • Horizontal scale to tens of thousands of lanes across multi-tenant, multi-region deployments
      • Capacity planning tied to FinOps: cost per transaction tracked and optimized, not assumed
  • Platform Engineering & Developer Experience
      • Treat the engineering org as the customer of an internal developer platform (AI Dev Portal as the front door)
      • Golden paths: paved roads for new services, standard observability, standard security posture, standard CI/CD
      • Self-service environments: CloudOps Sandbox, on-demand synthetic test data, ephemeral preview environments per PR
      • Developer productivity instrumented: DORA plus SPACE metrics; time-to-first-commit for new hires; PR review latency; local build time
  • Professional Services Engineering Model
      • Customer-specific extensions delivered without fragmenting the platform
      • Config-driven first, code-driven by exception. Every bespoke extension has a sunset plan or a promotion path to core.
      • Reusable solution accelerators for common retail integrations (payments, peripherals, back office, tax, loyalty, CRM)
      • Integration contracts versioned and governed; breaking changes require deprecation windows and migration tooling
      • Clear engineering handoff to customer operations and support, with runbooks, dashboards, and on-call rotations live at go-live
  • Release Governance and Operational Readiness
      • Engineering-driven release gates, not process theater:
          • Functional completeness, validated by automated acceptance
          • Performance validation against published SLOs
          • Integration stability, with contract tests green
          • Observability live (dashboards, alerts, runbooks) before traffic
          • Security posture verified (SAST, DAST, SCA, SBOM, secrets scanning, threat model reviewed)
          • Rollback tested, not assumed
      • Post-release: automated canary analysis, 24-hour watch, blameless incident reviews with 48-hour turnaround and action items tracked to closure
  • Security and Compliance Engineering
      • Shift-left security: threat modeling at design time, SAST in every PR, DAST in staging
      • Compliance built into the pipeline: PCI DSS, SOX, and GDPR controls are code, not audit artifacts
      • Software supply chain: SBOM generation, signed artifacts, SLSA build-level targets
      • Secrets management, zero-trust service-to-service authentication, least-privilege IAM
  • Leadership and Organizational Impact
      • Lead cross-functional teams spanning Engineering, embedded Quality, and Professional Services
      • Develop engineering leaders who:
          • Own outcomes, not tasks
          • Think in systems and data, not features
          • Hold the room at the customer executive table
      • Drive a culture of:
          • Accountability without blame
          • Evidence-based decisions (telemetry, metrics, experiments)
          • Continuous improvement with measurable deltas

Benefits

  • Group health coverage (medical, dental, & vision)
  • Employee Assistance Programs
  • Pre-tax spending accounts
  • 401(k) plan (with company match)
  • Company provided life insurance
  • Pet Insurance
  • Employee discounts
  • Generous paid holiday schedule, paid vacation & sick/personal days

What This Job Offers

  • Job Type: Full-time
  • Career Level: Director
  • Education Level: No Education Listed
  • Number of Employees: 1,001-5,000 employees
