About The Position

As a Senior Data Engineer at iHorizons, you will design, develop, and maintain scalable data pipelines and architectures that power our AI and advanced analytics initiatives across government and private clients. Working closely with the AI & Data Science team, you will ensure high-quality, reliable, and secure data flow across batch and streaming systems, enabling data-driven model development, deployment, and business intelligence at scale for iHorizons and its clients.

Requirements

  • Bachelor’s degree in Computer Science, Software Engineering, Information Systems, Data Science, or a related field.
  • 6+ years of experience in data platform architecture design, enterprise-scale data ecosystems, cloud cost optimization, and mentoring junior engineers.
  • Strong foundational knowledge in data structures and algorithms, database systems, and distributed computing principles, forming the basis for building scalable and high-performance data platforms.
  • Advanced proficiency in programming languages such as Python, Java, and Scala, along with shell scripting and especially SQL, which is essential for this role.
  • Strong hands-on experience working with relational database systems such as PostgreSQL, MySQL, and Oracle, including expertise in writing complex SQL queries, indexing strategies, query optimization, and data modelling techniques such as 3NF, star schema, and snowflake schema.
  • Familiarity with NoSQL database technologies such as MongoDB, Cassandra, Redis, and DynamoDB, applied according to project needs.
  • Proven ability to design and implement scalable ETL/ELT pipelines, with solid understanding of data warehousing concepts and experience building both batch and streaming data workflows.
  • Experience using industry-standard tools and platforms such as Apache Airflow, Informatica, Snowflake, Google BigQuery, and Azure Synapse to support enterprise data integration and analytics (a brief orchestration sketch follows this list).
  • Strong knowledge of big data and distributed systems, including frameworks such as Apache Spark, the Hadoop ecosystem, and streaming platforms like Apache Kafka, with an understanding of distributed computing principles, scalability, and fault tolerance.
  • Hands-on expertise with modern cloud data platforms, particularly Google Cloud Platform (BigQuery, Dataflow, Pub/Sub) and Microsoft Azure (Data Factory, Synapse, Databricks), which are critical for today’s data engineering environments.
  • Infrastructure-level understanding of containerization and orchestration tools such as Docker and Kubernetes.
  • Demonstrated ability to design data lakes and enterprise data architectures, including implementing medallion architecture (bronze, silver, gold layers), applying dimensional modelling approaches such as Kimball methodology, and ensuring strong practices in data governance, quality, and observability.
  • Strong working knowledge of supporting engineering practices including version control (Git), schema design, and end-to-end data pipeline architecture.
  • Strong analytical and problem-solving skills, with the ability to troubleshoot complex data and pipeline issues.
  • Excellent communication skills to explain technical concepts clearly to both technical and non-technical stakeholders.
  • Strong collaboration mindset and interpersonal skills.
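
To illustrate the batch orchestration work referenced in the requirements above, here is a minimal sketch assuming Apache Airflow 2.x; the DAG name, schedule, and extract/load callables are hypothetical placeholders rather than part of any iHorizons codebase.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract_orders(**context):
        """Pull one day's worth of records from a source system (placeholder)."""
        ...

    def load_orders(**context):
        """Write the extracted batch into the warehouse (placeholder)."""
        ...

    with DAG(
        dag_id="daily_orders_etl",        # hypothetical pipeline name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",                # one batch run per day (Airflow 2.4+ keyword)
        catchup=False,
    ) as dag:
        extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
        load = PythonOperator(task_id="load_orders", python_callable=load_orders)

        extract >> load                   # load runs only after a successful extract

The same dependency pattern scales to larger DAGs, where sensors, data quality checks, and alerting slot in as additional tasks.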

Nice To Haves

  • A Master’s degree is an advantage, particularly in Data Engineering, AI, or Cloud Computing.
  • Professional certifications are considered a strong advantage, particularly:
      • Google Professional Data Engineer (highly valuable)
      • Azure Data Engineer Associate
      • Databricks Certified Data Engineer
      • Apache Spark Certification

Responsibilities

  • Design, develop, and deploy scalable ETL/ELT pipelines for structured and unstructured datasets.
  • Implement batch and real-time data processing solutions using modern frameworks.
  • Build data ingestion systems from multiple sources such as APIs, databases, logs, IoT devices, and streaming platforms.
  • Ensure data pipelines support AI/ML feature engineering and training workflows.
  • Automate pipeline execution, monitoring, and orchestration using tools such as Apache Airflow.
  • Build and manage data transformation workflows using modern tools such as dbt to support SQL-based data modelling within cloud data warehouses.
  • Develop distributed processing jobs using Apache Spark and Hadoop ecosystem tools.
  • Work with streaming platforms such as Apache Kafka for real-time data delivery.
  • Apply distributed computing principles including scalability, partitioning, and fault tolerance.
  • Optimize workloads for performance and reliability across large-scale datasets.
  • Build and manage cloud-native pipelines and warehousing solutions on GCP and Azure.
  • Work with services such as BigQuery, Dataflow, Pub/Sub, Azure Synapse, Databricks, and Data Factory.
  • Implement containerized deployments using Docker and Kubernetes.
  • Support cost optimization and performance tuning of cloud-based data platforms.
  • Design and implement enterprise-grade data lakes and data warehouses.
  • Apply medallion architecture principles across bronze, silver, and gold data layers (see the sketch after this list).
  • Develop dimensional data models using Kimball methodology, including star and snowflake schemas.
  • Ensure strong governance, data quality, lineage, and observability practices.
  • Build reusable, scalable data models for analytics and AI feature stores.
  • Work extensively with relational databases such as PostgreSQL, MySQL, and Oracle.
  • Write complex SQL queries with advanced proficiency.
  • Apply indexing strategies, query optimization, and performance tuning.
  • Design efficient schemas aligned with normalization and warehousing standards.
  • Support NoSQL database solutions where required, including MongoDB, Cassandra, Redis, and DynamoDB.
  • Develop and maintain clear technical documentation for data pipelines, architectures, and implementations.
  • Write high-quality, maintainable code aligned with established engineering standards and best practices.
  • Ensure all solutions comply with iHorizons’ data security, privacy, and governance policies.
  • Troubleshoot and resolve data pipeline and system issues through structured root-cause analysis.
  • Collaborate with cross-functional teams to continuously improve platform reliability and delivery outcomes.
  • Provide technical guidance and mentorship to junior engineers, supporting skill development and excellence.
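
As a sketch of the medallion-style refinement step listed above, the snippet below promotes raw bronze events to a cleaned silver table using PySpark; the paths, column names, and partitioning scheme are assumptions for illustration only.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("bronze_to_silver").getOrCreate()

    # Bronze: raw events landed as-is by the ingestion layer (hypothetical path).
    bronze = spark.read.parquet("/lake/bronze/events")

    # Silver: deduplicated, typed, quality-checked records ready for modelling.
    silver = (
        bronze
        .dropDuplicates(["event_id"])                        # drop replayed events
        .withColumn("event_ts", F.to_timestamp("event_ts"))  # enforce a timestamp type
        .withColumn("event_date", F.to_date("event_ts"))     # derive a partition column
        .filter(F.col("event_id").isNotNull())               # basic quality gate
    )

    (silver.write
           .mode("overwrite")
           .partitionBy("event_date")
           .parquet("/lake/silver/events"))

A gold layer would then aggregate such silver tables into the dimensional models consumed by BI tools and AI feature stores.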