The Data Engineer III is responsible for designing, building, and optimizing scalable big data pipelines, architectures, and datasets that enable advanced analytics and data-driven decision-making. This role involves developing efficient data transformation and processing frameworks; managing data structures, metadata, dependencies, and workloads; and ensuring the reliability and performance of the data ecosystem. The engineer will also work extensively with unstructured datasets, applying analytical techniques to extract insights and improve data accessibility across the organization.

What you'll do...

Data Modeling: Designing and implementing data models to support structured and unstructured datasets, ensuring data integrity and efficiency.
Data Extraction: Developing and optimizing data extraction processes from various sources, including databases, APIs, and logs.
Data Cleaning: Preprocessing and cleaning data to remove inconsistencies and improve data quality.
Data Screening: Implementing data validation and quality checks to ensure the accuracy and completeness of data.
Data Exploration: Conducting exploratory data analysis to understand patterns, trends, and correlations in the data.
Data Visualization: Creating visualizations with tools such as Tableau, Power BI, or Looker to communicate insights and findings effectively.
Big Data Technologies: Utilizing tools and frameworks such as Spark, Spark SQL, PySpark, HDFS, and MapReduce to process large datasets efficiently.
Cloud Services: Leveraging cloud platforms and services such as GCP, Azure, AWS, Databricks, Azure HDInsight, and Azure Data Factory (ADF) for data storage, processing, and analytics.
Data Querying: Writing advanced SQL queries to extract and manipulate data from relational databases and other data stores.
Data Pipeline Development: Building and optimizing scalable data pipelines and architectures to move and transform data across systems.
Data Transformation: Developing processes for data transformation, structure, metadata, dependency, and workload management.
Enterprise Software Development: Contributing to the development of enterprise-level software products related to data engineering and analytics.

What you'll bring:

Cross-functional Collaboration: Working closely with cross-functional teams, including data scientists, analysts, and software engineers, to achieve common goals.
Programming Languages: Proficiency in at least one scripting language, such as Python or Scala, for automation, data manipulation, and tool development.
Agile Environment: Collaborating effectively in an Agile environment, participating in sprints, and adapting to changing project requirements.
Analytical Skills: Applying strong analytical skills to work with complex and unstructured datasets, extracting valuable insights and actionable information.
Big Data Stores: Implementing and managing highly scalable big data stores to store and access large volumes of data efficiently.
Data Value Extraction: Manipulating, processing, and extracting value from large, diverse datasets to drive business decisions and innovation.
Big Data Technologies: Experience using tools and frameworks such as Spark, Spark SQL, PySpark, HDFS, and MapReduce to process large datasets efficiently.
Cloud Services: Experience with cloud platforms and services such as GCP, Azure, AWS, Databricks, Azure HDInsight, and Azure Data Factory (ADF) for data storage, processing, and analytics.
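To make the "Data Cleaning" and "Data Screening" duties above concrete, here is a minimal sketch of a record-screening step in plain Python. It is illustrative only, not the employer's actual stack or pipeline: the record fields (user_id, event) and the screen_records helper are hypothetical, and a production version would typically run inside a framework such as PySpark.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationResult:
    """Outcome of a screening pass: rows kept vs. rows rejected."""
    clean: list = field(default_factory=list)
    rejected: list = field(default_factory=list)


def screen_records(records, required_fields):
    """Split records into clean and rejected rows.

    A row is rejected if any required field is missing, None, or an
    empty string -- a simple completeness check of the kind a data
    validation step enforces before loading data downstream.
    """
    result = ValidationResult()
    for row in records:
        if all(row.get(f) not in (None, "") for f in required_fields):
            result.clean.append(row)
        else:
            result.rejected.append(row)
    return result


# Hypothetical input: the second row is missing its user_id.
rows = [
    {"user_id": 1, "event": "click"},
    {"user_id": None, "event": "view"},
    {"user_id": 2, "event": "purchase"},
]
result = screen_records(rows, required_fields=["user_id", "event"])
# result.clean holds two rows; result.rejected holds the incomplete one.
```

Keeping rejected rows rather than silently dropping them is a common design choice: they can be logged or routed to a quarantine table so data-quality issues stay visible.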
Job Type: Full-time
Career Level: Mid Level