Principal Architect - Machine Learning

United Airlines•Chicago, IL

$147,060 - $191,516

About The Position

United's Digital Technology team comprises many talented individuals working together with cutting-edge technology to build the best airline in the history of aviation. Our team designs, develops, and maintains massively scalable technology solutions brought to life with innovative architectures, data analytics, and digital solutions. United Airlines is seeking talented people to join the Data and Machine Learning Engineering team. The organization is responsible for leading data-driven insights and innovation to support the machine learning needs of commercial and operational projects with a digital focus. This role will frequently collaborate with ML engineers, data scientists, and data engineers. It will design, architect, implement, and lead key components of the Machine Learning Platform and Gen AI/ML business use cases, and establish processes and best practices.

Requirements

Bachelor's degree in Computer Science, Data Science, Generative AI, Engineering, or a related discipline, or Mathematics experience, required
5+ years of software engineering experience with languages such as Python, Go, Java, or C/C++
5+ years of experience in machine learning, deep learning, and natural language processing
Strong software engineering experience with Python and at least one additional language such as Go, Java, or C/C++
Strong technical leadership and familiarity with data science methodologies and frameworks (e.g., PyTorch, TensorFlow); experience building and deploying production ML pipelines preferred
Experience with ML model life cycle development; experience with common algorithms such as XGBoost, CatBoost, and deep learning preferred
Experience setting up and optimizing data stores (RDBMS/NoSQL) for production use in the ML app context
Cloud-native DevOps, CI/CD experience using tools such as Jenkins or AWS CodePipeline; preferably experience with GitOps using tools such as ArgoCD, Flux, or Jenkins X
Experience with generative models such as GANs, VAEs, and autoregressive models
Prompt engineering: Ability to design and craft prompts that evoke desired responses from LLMs
LLM evaluation: Ability to evaluate the performance of LLMs on a variety of tasks, using criteria such as accuracy, fluency, creativity, and diversity
LLM debugging: Ability to identify and fix errors in LLMs, such as bias, factual errors, and logical inconsistencies
LLM deployment: Ability to deploy LLMs in production environments and ensure that they are reliable and secure
Experience with LLMOps (Large Language Model Operations) or AgenticOps (Agentic Operations) to manage the end-to-end lifecycle of large language models
Experience with generative AI methods such as retrieval-augmented generation (RAG) and instruction fine-tuning
Must be legally authorized to work in the United States for any employer without sponsorship
Successful completion of interview required to meet job qualification
Reliable, punctual attendance is an essential function of the position

Nice To Haves

Master's/PhD degree in Computer Science or related STEM field
5+ years of experience working in cloud environments (AWS preferred): Kubernetes, Docker, ECS, and EKS
5+ years of experience with Big Data technologies such as Spark and Flink, and with SQL programming
5+ years of experience with cloud-native DevOps and CI/CD
3 to 5+ years of relevant enterprise architecture experience
1+ years of experience with Generative AI/LLMs

Responsibilities

Build high-performance, cloud-native machine learning infrastructure and services to enable rapid innovation across United
Set up containers and serverless platforms on cloud infrastructure
Design and develop tools and apps to enable ML automation using the AWS ecosystem
Build data pipelines to enable ML models for batch and real-time data
Apply hands-on development expertise in Spark and Flink to both real-time and batch applications
Support large-scale model training and serving pipelines in a distributed and scalable environment
Stay current with the latest developments in cloud-native and MLOps/ML engineering, and experiment with and learn new technologies such as NumPy, data science packages like scikit-learn, and microservices architecture
Evaluate the performance of LLMs and implement LLMOps processes to manage the end-to-end lifecycle of large language models
Develop, optimize, and fine-tune generative AI models and LLMs to improve performance and accuracy, and deploy them