AI Enablement Data Engineer

IDEXX · Westbrook, ME

About The Position

As an AI Enablement Data Engineer in the Data and AI Centre of Excellence (DAICOE), you will be embedded within mixed project teams and responsible for building robust, fault-tolerant data pipelines that collect, assemble, transform, and aggregate unorganized, distributed data into modern data platforms. You will take full ownership of planning, decision-making, and execution of end-to-end data pipeline development, quickly embedding within project teams with minimal support required.

You will compile and install database systems; write complex queries and data tests using dbt, SQL, and Python; scale solutions across distributed platforms; and implement disaster recovery systems. You will lay the groundwork for data consumers (software or human) to easily retrieve the data they need for evaluations and experiments. You will also support operational data use cases, such as moving large volumes of data across applications via operational data stores, data hubs, and data lakes, and build private, segregated data pipelines between specific applications.

Our technology stack: Databricks, dbt Core, GitHub, Python, SQL, Spark, Snowflake, Iceberg, AWS, and PySpark.

Requirements

  • Bachelor’s degree in engineering, analytics, statistics, computer science, information systems, or a related field is required; a Master’s degree, or an equivalent combination of education and experience, is preferred.
  • 4-6 years of experience in data engineering, with demonstrated ability to own end-to-end data pipeline development.
  • Expert proficiency in SQL and Python programming.
  • Strong experience with dbt (dbt Core or dbt Cloud) for data transformation, testing, and documentation.
  • Deep understanding of data warehousing concepts, solutions, and architectural patterns.
  • Proven experience with modern distributed data platforms such as Databricks, Snowflake, or equivalent data warehouse technologies.
  • Expertise in Apache Spark for distributed data processing.
  • Extensive experience building distributed and cloud-based data pipelines (AWS, Azure, or GCP).
  • Experience with CI/CD practices for data pipelines and infrastructure as code.
  • Demonstrated ability to quickly embed within new teams and take ownership of technical decision-making with minimal support.
  • Proven track record of balancing multiple projects and priorities simultaneously while maintaining high-quality delivery.
  • Excellent collaboration and communication skills with ability to lead technical discussions in cross-functional embedded teams.
  • Strong organization and time management skills with ability to independently prioritize and manage complex workloads.
  • Ability to function independently in a fast-paced, high-energy environment with changing requirements.
  • Strong commitment to data quality, system functionality, and user satisfaction in rapidly growing and changing environments.
  • Demonstrated initiative in resolving complex problems and balancing conflicting requirements in partnership with project stakeholders.
  • Familiarity with Agile and Scrum methodologies.
  • Applicants must be authorized to work for ANY employer in the U.S. We are unable to sponsor or take over sponsorship of an employment visa for this role.

Responsibilities

  • Lead data engineering efforts on one primary project while supporting planning, decision-making, or execution on 1-2 additional project teams, providing backup coverage and cross-project collaboration as needed.
  • Take ownership of planning, decision-making, and execution for end-to-end data pipeline development as the data engineering lead for your primary project team, adhering to established technical standards and best practices set by data engineering leadership while driving continuous improvement and innovation.
  • Build and maintain sophisticated data pipelines using dbt Core, SQL, and Python on Databricks.
  • Implement measures to ensure data accuracy and accessibility, constantly monitoring and refining the performance of data management systems.
  • Monitor structural performance and utilization, identify problems, and implement solutions independently.
  • Define, design, and implement data management, storage, backup, and recovery solutions that ensure high performance of the organization's enterprise data.
  • Design automated software deployment functionality that allows efficient management of data applications across distributed platforms.
  • Define standards for how data will be stored, consumed, integrated, and managed, understanding structural and business requirements across project contexts.
  • Lead the creation of standards, best practices, and new processes for operational integration of new technology solutions within your project context.
  • Ensure environments are compliant with defined standards and operational procedures.
  • Collaborate with data engineering technical leads and management to align project work with broader organizational objectives and technical standards.
  • Quickly embed within new project teams with minimal ramp-up time, establishing yourself as the technical owner of the data engineering stack.
  • Collaborate closely with embedded project team members including ML engineers, data scientists, product owners, and other technical staff to deliver data solutions aligned with project objectives.
  • Participate in planning and decision-making across multiple project lines simultaneously while maintaining execution excellence.
  • Complete complex problem tickets including bug fixes, design modifications, and enhancements based on customer requirements, often requiring architectural consideration.