Decision Intelligence Engineer - Next Best Action

Humana

4d • $129,300 - $177,800 • Remote

About The Position

Humana is seeking a skilled Decision Intelligence Engineer to design, train, and improve the reinforcement learning policy at the core of its Next Best Action platform. This role is hands-on and research-oriented: it involves designing and evaluating decision-making algorithms and instrumenting training pipelines. The engineer will collaborate with data and platform engineers to ensure the system operates correctly within clinical eligibility rules and program-specific objectives.

Requirements

8+ years of software engineering or quantitative research experience building and operating large-scale production systems, with emphasis on data-intensive platforms, recommendation systems, optimization engines, or simulation frameworks serving millions of users.
3+ years of hands-on experience implementing reinforcement learning, operations research methods, or simulation-driven decision systems in production.
Relevant backgrounds include policy gradient and value-based RL (PPO, A3C, DQN, CQL), stochastic dynamic programming, discrete-event simulation, or large-scale combinatorial or constrained optimization.
Deep familiarity with Markov Decision Processes, Bellman-equation-based value estimation, reward or objective shaping, exploration-exploitation tradeoffs, and constraint formulation in real-world decision systems.
Demonstrated ability to diagnose failure modes in learned or optimized policies: instability, poor credit assignment across long horizons, and distributional shift across large populations.
Proficiency in Python 3.x.
Experience with PyTorch or TensorFlow for policy network or learned model implementation.
Experience with Ray RLlib or equivalent distributed computation frameworks for large-scale training or optimization.
Experience with Databricks, PySpark, and Delta Lake for large-scale ML or data pipelines processing tens of millions of records.
Experience with MLflow for experiment tracking, model registry, and artifact management.
Experience shipping systems that operate reliably under production load, not just research or prototype work.

Nice To Haves

Experience with multi-agent RL frameworks (PettingZoo or equivalent) or multi-agent simulation and coordination methods.
Familiarity with operations research methods applicable to constrained sequential decisioning: linear programming, mixed-integer programming, Lagrangian relaxation, or constraint programming.
Experience operating decision or optimization systems in regulated domains (healthcare, finance, or insurance) where member safety, auditability, and explainability are requirements.
Experience building simulation environments using Gymnasium, SimPy, AnyLogic, or equivalent frameworks for policy evaluation and backtesting.
Familiarity with event-driven feedback loops and how disposition signals feed retraining or re-optimization pipelines.
OpenTelemetry instrumentation experience for ML or optimization pipeline observability.

Responsibilities

Design, implement, and evaluate algorithms suited to long-horizon, sparse-reward sequential decision-making in healthcare, including reinforcement learning methods (PPO, A3C, DQN, CQL, Decision Transformer), dynamic programming, and constrained optimization.
Frame member decisioning problems as Markov Decision Processes (MDPs) or Partially Observable MDPs, defining state representations, action spaces, transition dynamics, and reward structures.
Apply Bellman-equation-based value estimation, reward shaping, and constraint formulations to encode clinical eligibility, suppression rules, and program-specific objectives.
Manage exploration-exploitation tradeoffs appropriate for a production healthcare environment.
Model member journey dynamics using tools from stochastic processes, simulation, or probabilistic graphical models.
Build simulation and backtesting environments (discrete-event simulation, Monte Carlo methods) to evaluate policy or decision quality before production promotion.
Diagnose and remediate failure modes specific to learned or optimized policies, such as policy collapse, credit assignment errors, distributional shift, and constraint violations.
Define performance threshold criteria and automated evaluation gates within the nightly Databricks training workflow.
Instrument training and optimization runs with MLflow tracking.
Own the nightly Databricks training workflow, including feature engineering, distributed training (Ray RLlib), and batch scoring.
Collaborate with the Data Engineering team to ensure training inputs, reward signals, and feature pipelines are reproducible and auditable.
Write production-quality PySpark feature engineering jobs and maintain data lineage through Databricks Unity Catalog.
Manage model artifacts, versioning, and lifecycle in the MLflow Model Registry, ensuring rollback capability.
Apply multi-agent decision-making concepts, such as multi-agent reinforcement learning (MARL), where household-level or population-level coordination across members is required.
Implement constraint handling to enforce hard business rules (member caps, cooldown periods, clinical eligibility) within the optimization objective.
Collaborate with rules engine stakeholders to ensure eligibility guards and policy priorities are correctly aligned.
Partner with decision engine and rules engine teams to integrate model outputs cleanly with the real-time decisioning hot path.
Collaborate with platform architects to define feedback loop contracts for disposition outcomes.
Document model behavior, known limitations, and failure modes for clinical and compliance stakeholders.
Use AI-assisted engineering tools for scaffolding, testing, and documentation, ensuring core model logic remains human-authored and peer-reviewed.

Benefits

Medical benefits
Dental benefits
Vision benefits
401(k) retirement savings plan
Time off (including paid time off, company and personal holidays, volunteer time off, paid parental and caregiver leave)
Short-term disability
Long-term disability
Life insurance
Bonus incentive plan
