Senior Data Engineer with DevOps

CGI
Dallas, TX
Onsite

About The Position

We are seeking a Data Engineer with 5+ years of experience to design and maintain scalable data pipelines supporting analytics, reporting, and operational needs. The role involves collaborating with cross-functional teams to ensure data aligns with business requirements and enterprise standards. This role requires working at our client site 5 days a week in Pittsburgh, PA; Cleveland, OH; or Dallas, TX. For this role on this particular client engagement, employer sponsorship of immigration-related visa and/or green card status as part of the PERM process will not be available.

Requirements

  • 5+ years of experience in data engineering and big data processing
  • Strong expertise in Apache Spark (Spark Core, Spark SQL) and PySpark for large scale batch processing
  • Experience working with structured and semi-structured data, including complex transformations and performance tuning
  • Proficiency in data ingestion and integration from sources such as Oracle, SQL Server, Hive, HDFS, and S3, and in transforming data into curated data models
  • Experience writing data to Hive tables, Data Lakes (Iceberg), and downstream reporting systems
  • Strong knowledge of SQL and data modeling concepts
  • Hands-on experience with Apache Airflow for workflow orchestration (DAG design, scheduling expectations, monitoring)
  • Proficiency in shell scripting for job automation: triggering Spark jobs, performing file checks and validation, archiving and purging data, and managing job dependencies, logging, and error handling
  • Strong understanding of batch processing and batch job scheduling frameworks
  • Experience migrating from CA7/Control-M to Airflow (daily, hourly, weekly schedules)
  • CI/CD for data pipelines
  • Fundamentals in Linux and Networking
  • Containerization with Docker, OpenShift Container Platform (OCP), and Kubernetes
  • Knowledge of CI/CD pipeline and build tools such as Jenkins, GitHub Actions, Azure DevOps, GitLab CI, Maven, and Gradle
  • Automate operational tasks using Python, Bash/Shell, and PowerShell
  • Implement monitoring and alerting (for example, with Application Insights) and enable centralized logging with tools such as ELK
  • Experience ensuring data quality, reliability, and compliance in regulated environments
  • Good communication and documentation skills

Responsibilities

  • Design and build scalable data pipelines aligned with business needs
  • Process large datasets (batch and, at times, near real-time)
  • Ensure data quality, consistency, and governance standards across systems
  • Support data integration and transformation efforts for analytics and reporting platforms
  • Maintain data dictionaries, metadata, and documentation
  • Participate in data architecture reviews and model validation processes
  • Support analytics, reporting, and risk platforms

Benefits

  • Competitive compensation
  • Comprehensive insurance options
  • Matching contributions through the 401(k) plan and the share purchase plan
  • Paid time off for vacation, holidays, and sick time
  • Paid parental leave
  • Learning opportunities and tuition assistance
  • Wellness and well-being programs