Data Engineering Intern - REMOTE

First American, Santa Ana, CA
Remote

About The Position

First American builds and delivers industry-leading real estate data products, including the National Title Plant and the National Starter Exchange. These platforms power critical title, settlement, underwriting, and risk decisions across the real estate ecosystem. The Data Engineering team designs, builds, and maintains the large-scale data platforms behind these external, customer-facing products, using modern cloud technologies to deliver scalable, secure, high-performance data pipelines for both real-time and batch analytics at enterprise scale.

This internship supports First American's cloud modernization initiatives, with a focus on migrating data engineering workloads from Azure to Google Cloud Platform (GCP). The intern will contribute to ETL development, data pipeline optimization, and platform modernization efforts that directly support First American's national data products.

Requirements

  • Strong SQL skills
  • Proficiency in Python for data engineering and transformation
  • Experience with data manipulation using Pandas and NumPy
  • Understanding of basic machine learning concepts, including regression, classification, clustering, model evaluation, and overfitting
  • Ability to build, train, and evaluate ML models using tools like Scikit-learn (see the sketch following this list)
  • Familiarity with data visualization tools such as Matplotlib or Seaborn
  • Understanding of basic statistics and probability
  • Experience with data cleaning, preprocessing, and exploratory data analysis (EDA) to understand patterns and trends
  • Knowledge of ETL processes and experience with ETL tools such as Informatica; familiarity with GCP tools (e.g., BigQuery, Dataflow, Cloud Composer, Cloud Storage) is an added advantage
  • Familiarity with cloud platforms (Azure and/or GCP preferred)
  • Understanding of distributed data processing concepts
  • Experience with version control (e.g., Git/GitHub)
  • Knowledge of data warehousing concepts and data modeling
  • Strong problem-solving and analytical skills
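
As a rough illustration of the Pandas and Scikit-learn skills listed above, here is a minimal, self-contained sketch of a cleaning, EDA, and model-evaluation pass. The dataset, column names, and the lien-flag target are invented for illustration only and do not reflect First American data or systems:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Hypothetical property records; values are randomly generated.
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "sale_price": rng.normal(400_000, 75_000, 500),
    "sq_feet": rng.normal(1_800, 400, 500),
    "has_lien": rng.integers(0, 2, 500),  # synthetic binary target
})

# Basic cleaning and EDA: drop rows with missing values, inspect summary stats.
df = df.dropna()
print(df.describe())

# Train and evaluate a simple classifier; scaling keeps the solver stable.
X = df[["sale_price", "sq_feet"]]
y = df["has_lien"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```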

Responsibilities

  • Assisting in the migration of data pipelines from Azure-based infrastructure to GCP
  • Designing, building, and testing ETL workflows in cloud-native environments (a minimal sketch follows this list)
  • Refactoring and optimizing existing data transformation processes for scalability and performance
  • Supporting data validation, reconciliation, and quality assurance efforts
  • Contributing to technical documentation and architectural diagrams
  • Collaborating with product engineering teams to ensure seamless integration of data services
  • Participating in code reviews and version control workflows
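
The ETL responsibilities above center on cloud-native pipelines. Below is a minimal batch sketch using Apache Beam, the open-source SDK that Dataflow executes. The bucket paths, record layout, and parsing logic are hypothetical placeholders, not an actual First American workflow:

```python
import apache_beam as beam

def parse_record(line: str) -> dict:
    # Hypothetical CSV layout: county, parcel_id, amount.
    county, parcel_id, amount = line.split(",")
    return {"county": county, "parcel_id": parcel_id, "amount": float(amount)}

with beam.Pipeline() as pipeline:
    (
        pipeline
        | "Read" >> beam.io.ReadFromText("gs://example-bucket/raw/records.csv")
        | "Parse" >> beam.Map(parse_record)
        | "Validate" >> beam.Filter(lambda r: r["amount"] > 0)  # basic data quality check
        | "Format" >> beam.Map(lambda r: f"{r['county']},{r['parcel_id']},{r['amount']}")
        | "Write" >> beam.io.WriteToText("gs://example-bucket/clean/records")
    )
```

The same transform code runs unchanged on Dataflow by supplying DataflowRunner pipeline options; only the runner configuration differs between local testing and managed execution on GCP.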

Benefits

  • Medical
  • Dental
  • Vision
  • 401(k)
  • PTO / paid sick leave
  • Employee stock purchase plan