About The Position

Vultr is seeking a highly skilled and experienced Infrastructure Capacity Analytics Engineer who can turn massive streams of global telemetry into predictive insights. In this role, you'll design advanced forecasting models, build scalable Python data pipelines, and create executive-ready dashboards that directly influence engineering, operations, and financial strategy. You'll partner across the business to anticipate growth, optimize infrastructure, and shape the future of our global compute, storage, and network footprint, ensuring that our infrastructure remains efficient, reliable, and ready for growth. This is a highly visible role in a high-growth technology company. It is your opportunity to join our fast-growing team and leave your mark on Vultr and the future of cloud infrastructure.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or related discipline.
  • 6+ years of professional experience in Python development and data engineering.
  • 4+ years in infrastructure capacity planning, performance analysis, or related fields.
  • Strong expertise in time‑series forecasting, statistical modeling, and Python libraries (pandas, NumPy, scikit‑learn, statsmodels, XGBoost).
  • Proficiency with SQL scripting and column-oriented SQL databases (e.g., ClickHouse); experience designing scalable ETL/ELT pipelines.
  • Advanced proficiency in Grafana, Tableau, or Power BI (DAX, Power Query, modeling, custom visuals).
  • Experience working with infrastructure telemetry and systems (servers, storage, networking).
  • Prior experience managing capacity at a cloud service provider or large‑scale distributed environment.
  • Excellent communication and executive‑level presentation skills.

Nice To Haves

  • Experience with cloud‑native data platforms (Azure Data Lake/Synapse, AWS Redshift, Google BigQuery).
  • Familiarity with containerized environments (Docker) and CI/CD pipelines.
  • Knowledge of RESTful APIs and microservices architecture.
  • Experience with version control (Git) and agile engineering practices.
  • Exposure to machine learning, anomaly detection, or ensemble forecasting methods.
  • Strong spreadsheet skills (advanced formulas, modeling workflows).

Responsibilities

  • Develop and maintain capacity models for compute, storage, and network infrastructure across global environments.
  • Build and productionize advanced time‑series forecasts (e.g., ARIMA/ETS, Prophet, XGBoost/LightGBM) to predict demand, saturation points, and runway.
  • Conduct scenario modeling (“what‑if”) on deployment plans, workload changes, demand spikes, and hardware refresh strategies.
  • Analyze historical utilization to identify emerging risks, inefficiencies, and optimization opportunities.
  • Design, build, and maintain Python‑based data pipelines for ingesting, transforming, and validating large‑scale infrastructure telemetry.
  • Create ETL/ELT workflows to support analytics, modeling, and reporting.
  • Integrate data from observability platforms (e.g., Prometheus/Grafana), CMDB/asset systems, and internal services.
  • Develop APIs/services to expose forecast results and capacity signals to dashboards and tooling.
  • Build executive‑ready dashboards in Power BI (DAX, Power Query, custom visuals) and integrate real‑time forecasting outputs.
  • Deliver clear, compelling insights to engineering, operations, and finance leaders to support both strategic and tactical decision‑making.
  • Automate reporting workflows and ensure up‑to‑date visibility into runway, utilization, and risk posture.
  • Partner with engineering, operations, and finance teams to align capacity plans with growth, reliability, and cost objectives.
  • Establish standards for model governance, documentation, and data quality.
  • Drive continuous improvement of capacity planning systems, tooling, and analytics frameworks.

Benefits

  • 100% company-paid insurance premiums for employee medical, dental and vision plans.
  • 401(k) plan that matches 100% up to 4%, with immediate vesting
  • Professional Development Reimbursement of $2,500 each year
  • 11 Holidays + Paid Time Off Accrual + Rollover Plan
  • Commitment matters to Vultr! Increased PTO at 3-year and 10-year anniversaries + 1 month paid sabbatical every 5 years + Anniversary Bonus each year
  • $500 stipend for remote office setup in first year + $400 each following year
  • Internet reimbursement up to $75 per month
  • Gym membership reimbursement up to $50 per month
  • Company paid Wellable subscription