Platform Software Engineer, Kubernetes

Comcast
Willistown Township, PA
1d | $114,986 - $172,479 | Hybrid

About The Position

Make your mark at Comcast -- a Fortune 30 global media and technology company. From the connectivity and platforms we provide, to the content and experiences we create, we reach hundreds of millions of customers, viewers, and guests worldwide. Become part of our award-winning technology team that turns big ideas into cutting-edge products, platforms, and solutions that our customers love. We create space to innovate, and we recognize, reward, and invest in your ideas, while ensuring you can proudly bring your authentic self to the workplace. Join us. You’ll do the best work of your career right here at Comcast.

(In most cases, Comcast prefers to have employees on-site collaborating unless the team has been designated as virtual due to the nature of their work. If a position is listed with both office locations and virtual offerings, Comcast may be willing to consider candidates who live greater than 100 miles from the office for the remote option.)

Job Summary

As a Platform Engineer, you will be responsible for building, managing, and optimizing the underlying infrastructure and tools that enable efficient, scalable, and reliable execution of large-scale data processing workloads. This role is a specialized subset of data platform engineering, ensuring the environment where data engineers and data scientists run their Spark jobs is robust and cost-efficient.

Requirements

  • Bachelor's degree in computer science or a related field, or equivalent experience, typically 5 years in a DevOps or Systems Engineering role.
  • Expertise in Apache Spark: Deep understanding of Spark architecture, including RDDs, DataFrames, execution hierarchy, lazy evaluation, shuffling, and fault tolerance.
  • Proficiency in languages used for Spark development and automation, such as Python, PySpark, and Scala/Java.
  • Proficient in Linux scripting (Bash).
  • Proficient in writing SQL.
  • Experience with CI/CD tools and GitHub.
  • Experience setting up and using observability tools such as Prometheus and Grafana.
  • Knowledge of networking concepts such as TCP/IP, DNS, and load balancers.
  • Experience with automation via Terraform/Ansible.
  • Hands-on experience with on-prem environments and major cloud providers (AWS, Azure, GCP), and with container and orchestration tools like Docker and Kubernetes.
  • Hands-on experience setting up IAM, VPC, EC2, etc.
  • Familiarity with related technologies and formats like Delta Lake, Apache Iceberg, Apache Kafka, Hadoop, and various data storage systems (S3, HDFS, etc.).

Nice To Haves

  • Experience with catalogs like Hive Metastore and Unity Catalog (Databricks and open source).
  • Experience setting up and maintaining caching layers like Alluxio.

Responsibilities

  • Architecting and managing the platforms where Spark runs, such as Kubernetes clusters or cloud services like AWS EKS.
  • Packaging Spark workloads (often via Docker/Kubernetes) and integrating them with orchestration systems like Flyte.
  • Deploying infrastructure via Terraform/Ansible.
  • Troubleshooting and resolving job failures, memory/resource issues, and execution anomalies. This includes optimizing Spark configurations to reduce cloud compute and storage costs.
  • Building automation and tools in languages like Python, Java, Scala, or Bash to increase the productivity of development teams.
  • Writing moderately to highly complex SQL queries as needed.
  • Implementing and maintaining systems for monitoring, logging, and alerting (e.g. Prometheus, Grafana) to ensure platform stability and reliability.
  • Working closely with data engineers, data scientists, and other engineering teams to define requirements, advise on best practices, and ensure successful delivery of data objectives.
  • Engaging with open-source communities (like Apache Spark, Delta Lake, or Apache Iceberg) to discuss technical challenges and contribute improvements.
  • Creating and maintaining comprehensive documentation for Kubernetes infrastructure, processes, and procedures.
  • Providing training and support to team members as needed.