About The Position

As a Staff Software Engineer at Trase, you will be responsible for the core execution model and platform architecture of Trase OS, the central platform powering all Trase deployments in regulated environments. This role is critical for defining the system's abstractions and APIs and for ensuring its correctness, scalability, and extensibility. You will set the technical direction for the platform, acting as a force multiplier for engineering teams. The position requires a strong focus on clean abstractions and correctness under failure, because the system operates long-lived agents in the healthcare and defense sectors, where auditability and reliability are paramount. The role addresses risks associated with system design quality, such as poor abstractions, difficulty reasoning about workflow execution under failure, and fragmentation of platform capabilities. You will be instrumental in defining durable abstractions, ensuring deterministic workflow execution, translating product requirements into coherent platform architecture, and enabling teams to build on Trase OS without introducing systemic complexity.

Requirements

  • 10+ years of experience building distributed/platform systems, including significant experience defining architecture across teams or domains
  • Experience building mission-critical runtimes or workflow/orchestration systems
  • Deep expertise with durable execution (e.g., state machines, event sourcing, saga/compensation, idempotency, exactly-once and at-least-once semantics)
  • Proven track record with security & governance in production systems (auth, RBAC, audit, policy)
  • Hands-on experience with observability (Grafana or equivalent), including trace correlation across async boundaries
  • Strong systems design across storage, queues, schedulers, and evented architectures; performance tuning under load
  • Excellence in a modern language (e.g., Go, Rust, Java, or TypeScript) and cloud-native stacks (containers, CI/CD, IaC)
  • Comfortable operating in regulated or high-assurance environments; bias toward correctness, clarity, and documentation
  • Proven ability to influence technical direction across an organization and drive adoption of architectural standards
  • Ability to incorporate advanced LLM capabilities into system design and platform architecture decisions where appropriate

Nice To Haves

  • Prior work on workflow engines (Temporal, Cadence, AWS Step Functions, Argo, Airflow) or serverless runtimes
  • Experience with policy engines (OPA), secrets/KMS, or data-handling controls (PII/PHI)
  • ML/LLM evaluation frameworks, tool/plugin architectures, or embedding model governance into execution
  • Government or healthcare experience (HIPAA, audit readiness) and multi-tenant isolation

Responsibilities

  • Develop the core execution model (state machine, lifecycle, resource model, failure semantics)
  • Design platform APIs/SDKs connecting workflows, agents, tools, and product surfaces; drive versioning & compatibility
  • Guarantee correctness via idempotency, deterministic replays, compensating actions, and data integrity
  • Engineer reliability at scale: concurrency controls, rate limits, backpressure, sharding/partitioning, and workload isolation
  • Build security & governance into the core: RBAC/ABAC, policy enforcement, fine-grained audit & lineage
  • Deliver observability: distributed tracing, structured logs, metrics, and evaluation hooks; build an “explainable trail” of agent actions
  • Own quality: design reviews, test strategy (unit, property, chaos), performance baselines, SLOs, incident response, and postmortems
  • Mentor & unblock senior engineers; partner with Product, Security, and Customer teams to translate requirements into durable primitives
  • Make pragmatic choices on storage, queueing, and compute; create paved roads that accelerate all other teams
  • Define system boundaries and reduce cross-service coupling through clear architectural patterns
  • Drive platform-wide standards for correctness, reliability, and API design across teams
  • Balance short-term delivery with long-term architectural integrity, ensuring the platform evolves without accumulating systemic risk

Benefits

  • Career-track opportunity with potential for rapid advancement, based on strong performance, as the firm grows.
  • 100% employer-paid, comprehensive health care, including medical, dental, and vision, for you and your family.
  • Paid maternity and paternity leave for 14 weeks at the employee's normal pay.
  • Unlimited PTO, with management approval.
  • Opportunities for professional development and continued learning.
  • Optional 401(k), FSA, and equity incentives available.
  • Mental health benefits are available through Tara Mind.
  • Cost-effective GLP-1 solutions available through Crux.
© 2026 Teal Labs, Inc