Lead Infrastructure Engineer (HOAi)

Vantaca • posted about 2 months ago

Full-time • Mid Level

Remote • Wilmington, NC

251-500 employees

Professional, Scientific, and Technical Services

HOAi is a fast-growing startup revolutionizing the community association management industry. Our AI workforce platform integrates machine learning technology to streamline labor-heavy processes, eliminating inefficiencies and driving scalability. With rapid growth in the AI space, we are pushing boundaries to redefine industry standards.

HOAi is the leading AI solution for the community association management industry, enabling organizations to deploy AI Agents that function like experienced managers. These AI Agents go beyond traditional AI by proactively executing complex, multi-step processes with human-like reasoning, working autonomously, 24/7, across your entire operation. This transformation optimizes labor costs, enables growth without additional hires, and ensures faster, higher-quality service for residents and board members.

HOAi was acquired by Vantaca in the fall of 2024. Vantaca just achieved unicorn status with a $1.25B valuation, so it's safe to say we're past the "scrappy startup phase." We're not just building a successful company - we're building the category-defining platform that will transform how an entire industry operates.

Here's the reality of our trajectory:

Growing 100% year-over-year
Our AI product (HOAi) went from $0 to millions in months
Backed by Cove Hill Partners and JMI Private Equity
6M+ doors on our platform, displacing legacy systems

The Lead AI Infrastructure Engineer at HOAi is responsible for scaling and maintaining the infrastructure that powers our AI-driven products and services. This role sits at the intersection of infrastructure engineering, machine learning operations, and product development, ensuring our AI systems operate with exceptional reliability, performance, and efficiency. The ideal candidate is someone who gets excited about making AI systems fundamentally faster and more scalable. You'll work directly with our engineering and product teams to build the foundational infrastructure that enables HOAi to deliver the most advanced AI product in the community association management industry.

Profile and optimize database queries, API endpoints, and ML inference pipelines
Implement caching strategies, connection pooling, and distributed systems for scale
Monitor and optimize GPU utilization, memory usage, and compute costs
Design load balancing and auto-scaling policies for variable AI workloads
Build disaster recovery systems with redundancy
Build and maintain CI/CD pipelines specifically for model deployment
Implement model versioning, A/B testing infrastructure, and rollout mechanisms
Create automated testing frameworks for model quality and performance regression
Develop infrastructure for model monitoring, drift detection, and retraining workflows
Manage experiment tracking and model registry systems
Implement comprehensive monitoring, logging, and alerting across the AI stack
Refine dashboards for real-time visibility into system health and performance
Conduct post-mortems and implement reliability improvements
Design circuit breakers, retry logic, and graceful degradation for critical services
Refine security best practices for AI infrastructure and data handling
Ensure compliance with data privacy regulations and industry standards
Manage credentials and access control across infrastructure
Support security audits and vulnerability assessments
Work closely with the Product & Engineering team to understand infrastructure needs and enable fast, safe feature deployment
Document infrastructure architecture, runbooks, and operational procedures
Mentor team members on infrastructure best practices and tooling
Contribute to technical strategy and architectural decisions

3-7 years of experience in infrastructure engineering, DevOps, or SRE
Strong cloud platform expertise
Experience building and maintaining deployment pipelines
Experience with PostgreSQL, Redis, or other production databases
Experience with APM tools, metrics, logging, and alerting
Familiarity with vector databases, model serving frameworks and cross-system observability and traceability
Managing and optimizing GPU workloads
Real-time inference with low-latency serving infrastructure
LLM deployment
Track record of achieving 10x performance improvements
Able to debug complex distributed systems and find root causes
Obsessed with latency, throughput, and resource efficiency
Defaults to automating repetitive tasks and building scalable solutions
Understands security implications and implements best practices
Able to explain complex technical concepts clearly
Works effectively across teams and functions
Takes initiative to identify and solve problems before they become critical
Comfortable with ambiguity and changing priorities in a fast-moving startup
Supporting A/B deployment strategy

Medical, Dental, and Vision kick in day one.
Unlimited PTO (with a requirement for employees to take a minimum of one continuous week per year).
401K with Company Match.
Remote Flexible - come to the office when needed.
Great parental leave benefits.
