Sr AIOps Engineer

Honeywell · Phoenix, AZ
Hybrid

About The Position

As a Sr AIOps Engineer at Honeywell, you will play a crucial role in designing and implementing advanced data solutions that drive business insights, enhance decision-making, and power AI applications. Your expertise will support critical AI development activities across all AI modalities (classic, generative, and agentic) and all data types (structured and unstructured). You will report directly to our AI Director and work out of our Phoenix, AZ location on a hybrid schedule. Note: new hires will work onsite Monday through Friday for the first 90 days. In this role, you will impact the organization by leveraging your technical skills to develop innovative software solutions that support strategic initiatives and improve operational efficiency.

Requirements

  • Bachelor’s degree from an accredited institution in a technical discipline such as science, technology, engineering, or mathematics.
  • 6–8 years of strong experience developing and deploying models using ML, deep learning, NLP, and related AI techniques.
  • 6–8 years of hands-on experience with Python and PySpark in a production, programming-intensive role.
  • 4–6 years of industry experience with ML frameworks such as TensorFlow, Keras, PyTorch, HuggingFace Transformers, and scikit-learn.
  • 4 years of experience building, deploying, and supporting ML models in production, with ownership over model lifecycle and automation.
  • 4–6 years of hands-on experience with Databricks (preferably on AWS), including MLflow, Delta Lake, Jobs, and MLOps architecture patterns.
  • Strong experience with distributed computing frameworks (Spark, Kubernetes ecosystem, or similar).
  • Proficiency with GitHub Actions and CI/CD pipelines for ML/AI workloads.
  • Experience integrating ML workloads with data platforms such as Snowflake and orchestrating workflows in Dataiku.
  • Expertise in at least two AI/ML domains (e.g., classic ML, deep learning, NLP, GenAI).
  • Excellent communication skills, with the ability to partner across engineering, data science, and product teams.

Nice To Haves

  • Bachelor’s or advanced degree in Computer Science, Mathematics, Statistics, Engineering, or a related field.
  • Experience operationalizing GenAI systems including prompt pipelines, fine‑tuning, LLM evaluation, and RAG orchestration.
  • Familiarity with agentic AI patterns, including memory stores, tool registries, multi‑step reasoning evaluation, and safety/guardrail integration.
  • Hands-on experience with vector databases (Databricks Vector Search, OpenSearch, Pinecone, Chroma).
  • Experience building automated quality gates: data validation, model audits, bias checks, content safety tests, and regression testing for LLMs and agents.
  • Knowledge of scalable serving architectures: Model Serving, serverless inference, API Gateways, and low‑latency microservices.
  • Passion for automation, reproducibility, observability, and building resilient AI systems.
  • Curiosity, experimentation mindset, and desire to continuously improve the AI development lifecycle across classic, GenAI, and agentic AI.

Responsibilities

  • Lead the design, automation, and operation of end‑to‑end MLOps pipelines supporting classic ML, GenAI/LLM systems, and agentic AI workloads across Databricks and Dataiku.
  • Build, maintain, and optimize training, evaluation, and deployment pipelines, ensuring reliability, reproducibility, and alignment with business objectives.
  • Collaborate with data scientists, AI software developers, data engineers, and platform engineers to operationalize models, LLMs, RAG workflows, and agentic AI capabilities.
  • Architect and implement solutions for distributed training, hyperparameter optimization, accelerated inference, and performance‑tuned model serving.
  • Develop automated testing, validation, governance, and monitoring frameworks for ML/LLM/agentic workflows, including drift detection, model quality, and guardrail coverage.
  • Own CI/CD pipelines for model assets, prompts, embeddings, vector search updates, and agent tool registries using GitHub Actions and modern ML deployment frameworks.
  • Manage MLflow experiment tracking, model registry lifecycle, lineage, and promotion flows across multiple environments in Databricks and Dataiku.
  • Optimize integration between ML frameworks (PyTorch, TensorFlow, scikit‑learn) and cloud‑based compute ecosystems including Spark, Kubernetes, and serverless runtimes.
  • Ensure production-grade reliability, scalability, performance, and observability of all deployed AI workloads (classic → GenAI → agentic).
  • Establish best practices, patterns, reusable templates, and standards for MLOps across the AI delivery lifecycle.

Benefits

  • In addition to a competitive salary, leading-edge work, and developing solutions side-by-side with dedicated experts in their fields, Honeywell employees are eligible for a comprehensive benefits package.
  • This package includes employer-subsidized Medical, Dental, Vision, and Life Insurance; Short-Term and Long-Term Disability; a 401(k) match; Flexible Spending Accounts; Health Savings Accounts; an EAP; Educational Assistance; Parental Leave; Paid Time Off (for vacation, personal business, and sick time); and 12 Paid Holidays.