Software Python Engineer (Cloud Inference)

Gcore•Town of Poland, NY

47d•Hybrid

About The Position

The world’s digital experiences run on something invisible: the infrastructure and software that keep them fast, reliable, and secure. At Gcore, you’ll help design and deliver that foundation for an AI-driven world.

We’re a global provider of infrastructure and software solutions for AI, cloud, network, and security, powering everything from real-time communication and streaming to enterprise AI and secure web applications. With 210+ edge locations, 50+ cloud regions, and thousands of GPUs, your work here can reach users and businesses across the globe. You’ll collaborate with leading technology partners such as Intel, NVIDIA, Dell, and Equinix, and work on platforms that power digital products used around the world. Our vision is simple: to connect the world to AI, anywhere, anytime.

Want to work on technology that goes beyond a single product or industry? Join a global team of 550+ professionals building infrastructure and software that supports the entire digital ecosystem. We are currently looking for a Software Python Engineer to join our Edge Cloud Team.

Requirements

Proficiency with Python, especially in the context of ML tooling or backend development.
Experience with AI/ML pipelines or integrating machine learning frameworks like TensorFlow or PyTorch into production environments.
Hands-on experience with vLLM and SGLang.
Familiarity with cloud-native tooling such as Docker, Helm, and related CNCF technologies.
A problem-solving mindset and genuine interest in working on distributed systems and platform-level challenges.
Clear communication skills and a collaborative attitude: you enjoy working closely with others to build great solutions.

Nice To Haves

Solid experience with Go programming, particularly in the context of Kubernetes, including building controllers and operators and working with custom resources and their definitions (CRDs).
Strong understanding of Kubernetes architecture, container orchestration, and resource management at scale.
Understanding of GPU scheduling and performance optimization in Kubernetes.
Awareness of Kubernetes security practices, including RBAC and container hardening.
Contributions to open-source projects or involvement in cloud-native or MLOps communities.

Responsibilities

Contribute to the development of the Everywhere Inference platform, a Kubernetes-based solution enabling scalable and portable AI inference across a wide range of environments.
Design and implement APIs and developer tools to simplify deployment, management, and monitoring of AI applications.
Focus on packaging and integrating new ML models into the platform, using Python and common ML frameworks.
Optimize serverless container workflows for AI workloads, ensuring performance, scalability, and seamless autoscaling.
Collaborate with customers to fine-tune ML model performance and support their unique use cases.
Work with cross-functional teams to improve the AI applications marketplace and ensure smooth model onboarding and lifecycle management.
Stay current with trends in Kubernetes, machine learning, and MLOps, and help drive innovation within the platform.