Data Engineer Intern (49553)

HEADWAY TECHNOLOGIES INC•Milpitas, CA

Onsite

About The Position

Under the direction of the Manager of Application Development, the Data Engineer Intern will work on various development projects within the application development group using cloud-related techniques, the Python programming language, and MS SQL databases. The Intern must possess a working knowledge of data engineering, including data handling, cleaning and extraction, as well as statistical and machine-learning-based analysis. This position is located in Milpitas, CA.

Requirements

Actively pursuing a Bachelor’s, Master’s or PhD degree in Data Science, Machine Learning, Computer Science or Computer Engineering, and/or equivalent relevant experience
Experience programming in Python and SQL
Experience with software development and project life cycle
Knowledge of basic database and data handling processes
Knowledge of software languages such as Python and SQL
Knowledge of data engineering processes and techniques
Knowledge of machine learning techniques
Knowledge of the basic principles of software programming and database/data handling
Knowledge and ability to use Microsoft Office applications to create spreadsheets, Word documents, and presentations
Able to communicate effectively, both verbally and in writing, with employees and management
Able to comply with all safety policies and procedures
Demonstrated prioritization and organizational skills
Demonstrated time management skills
Demonstrated problem-solving and troubleshooting skills
Flexible and able to prioritize

Responsibilities

Use scripting languages such as SQL to deploy pipelines on cloud services such as Azure Data Factory
Use notebooks in the cloud to develop data transformation and anomaly detection algorithms on time series datasets
Work with relational databases such as MS SQL
Apply the fundamentals of data handling, cleaning and extraction to data files and processes
Assist in deploying and maintaining data pipelines that move and synchronize data between cloud storage/services and on-premises SQL Server instances
Develop, run and maintain automated data quality tests and monitoring (e.g., row counts, schema checks, null/uniqueness constraints, referential integrity, validations); triage and remediate pipeline failures or data issues
Set up automated jobs on cloud services for daily data engineering tasks and use DevOps practices to develop self-updating data pipelines and monitoring
Apply knowledge of machine learning techniques and deploy algorithms to detect model drift
Perform other duties of a similar nature or level
