Data Engineer Intern (49553)

HEADWAY TECHNOLOGIES INC, Milpitas, CA
Onsite

About The Position

Under the direction of the Manager of Application Development, the Data Engineer Intern will work on various development projects within the application development group using cloud-related techniques, the Python programming language, and MS SQL databases. The Intern must possess a working knowledge of data engineering, such as data handling, cleaning, and extraction, as well as statistical and machine-learning-based analysis. This position is located in Milpitas, CA.

Requirements

  • Actively pursuing a Bachelor’s, Master’s, or PhD degree in Data Science, Machine Learning, Computer Science, or Computer Engineering, or equivalent relevant experience
  • Programming experience in Python and SQL
  • Experience with software development and project life cycle
  • Knowledge of basic database and data handling processes
  • Knowledge of software languages such as Python and SQL
  • Knowledge of data engineering processes and techniques
  • Knowledge of machine learning techniques
  • Knowledge of the basic principles of software programming and database/data handling
  • Knowledge and ability to use Microsoft Office applications to create spreadsheets, Word documents, and presentations
  • Able to communicate effectively, both verbally and in writing, with employees and management
  • Able to comply with all safety policies and procedures
  • Demonstrated prioritization and organizational skills
  • Demonstrated time management skills
  • Demonstrated problem-solving and troubleshooting skills
  • Flexible and able to prioritize

Responsibilities

  • Use scripting languages such as SQL to deploy pipelines on cloud services such as Azure Data Factory
  • Use cloud notebooks to develop data transformation and anomaly detection algorithms on time series datasets
  • Work with relational databases such as MS SQL
  • Understand the fundamentals of data handling, cleaning, and extraction for data files and processes
  • Assist in deploying and maintaining data pipelines that move and synchronize data between cloud storage/services and on-premises SQL Server instances
  • Develop, run and maintain automated data quality tests and monitoring (e.g., row counts, schema checks, null/uniqueness constraints, referential integrity, validations); triage and remediate pipeline failures or data issues
  • Set up automatic jobs on cloud services for daily data engineering tasks and utilize DevOps to develop data pipeline self-updates and monitoring
  • Apply knowledge of machine learning techniques and deploy algorithms to detect model drift
  • Perform other duties of a similar nature or level