Data Engineer

COFENSE

About The Position

Reporting to the Manager, BI Development, the Data Engineer will be responsible for integrating enterprise-wide data from Cofense’s cybersecurity applications, microservices, and other disparate data sources. The role’s requirements and responsibilities are detailed below.

Requirements

  • Expertise in SQL for data transformations, statistical analysis, and troubleshooting across more than one database platform (MySQL, PostgreSQL, Redshift, Azure SQL Data Warehouse, etc.)
  • Expert at writing complex SQL scripts and automating them using Python
  • Strong analytical skills: adept at finding data trends, outliers, and anomalies, and at articulating complex information or data points to business users, management, and individual contributors
  • Enthusiasm for working with large volumes of data across disparate data sources and databases
  • Has a strong sense of engineering craftsmanship and takes pride in the code they write
  • Has intellectual curiosity and a strong desire to learn; is self-driven, actively looks for ways to contribute, and knows how to get things done
  • Is intensely customer-focused, serving both internal and external customers
  • Sees big-picture impact and relationships among and across work units
  • Identifies complex technical problems and resolves them with minimal help
  • Bachelor’s degree in Computer Science, Math, Data Analytics, Data Science, or BI preferred, or demonstrated industry experience
  • Over 5 years of proven experience in data architecture, data modeling, and lifecycle management
  • Hands-on experience with relational and cloud databases, including Azure SQL Database, Microsoft SQL Server, Amazon Aurora, and Amazon Redshift
  • Strong background in Python development and data technologies
  • Practical experience designing and developing ETL data pipelines and applications using SQL and Python
  • Strong expertise in writing complex SQL queries for data transformation and automating processes using Python
  • Experience building and consuming RESTful APIs using Python libraries
  • Proficient in integrating data platforms with BI tools such as Power BI and Splunk, including dashboard and report development
  • Solid experience working with Unix/Linux environments, including SSH tunneling and writing/interpreting Bash scripts
  • Advanced proficiency in Python 3, with hands-on experience using NumPy, SciPy, Scikit-learn, and Pandas
  • Experience leveraging APIs to extract data and load it into databases using Python
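Several of the requirements above center on writing complex SQL and automating it with Python. A minimal sketch of that pattern, using the stdlib sqlite3 driver purely as a stand-in for the MySQL/PostgreSQL/Redshift connections named in the posting (table and column names are illustrative, not from the posting):

```python
import sqlite3

def run_transformation(conn, sql, params=()):
    """Run a parameterized SQL transformation and return the result rows."""
    cur = conn.execute(sql, params)
    return cur.fetchall()

# In-memory database stands in for a real warehouse connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("A", 10.0), ("A", 5.0), ("B", 7.5)],
)

# A "complex" script reduced to its essence: aggregate per product.
summary = run_transformation(
    conn,
    "SELECT product, SUM(amount) FROM events GROUP BY product ORDER BY product",
)
print(summary)  # [('A', 15.0), ('B', 7.5)]
```

In practice the SQL would live in versioned script files and the Python wrapper would handle scheduling, parameterization, and error reporting across database platforms.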

Nice To Haves

  • Knowledge of data management on NoSQL databases like DynamoDB and MongoDB, plus familiarity with big-data tools such as Hadoop and Spark and with messaging systems such as Kafka, Kinesis, SQS, or Azure Queues, is a huge plus

Responsibilities

  • Work with Architects & Cloud Systems Engineers to design the data platform and architecture
  • Substantial experience with SQL and NoSQL OLTP databases and OLAP data warehousing technologies, especially Amazon Aurora
  • Experience with data modeling and building data pipelines for multi-product and/or multi-department organizations
  • Develop data pipelines on cloud platforms like Azure/AWS using well-defined tool frameworks
  • Able to develop ETL code to stream data from disparate (structured and semi-structured) SaaS product data stores to a Data Lake/Data Warehouse using Python and Azure/AWS data lake services
  • Ability to write complex SQL scripts and automate them using Python
  • Develop test cases and unit tests for key implementations of Data Platform by adhering to software engineering best practices and standards
  • Secure data end-to-end by complying with data privacy rules when developing processes to move data across applications/data sources and the Data Lake/Data Warehouse, as well as when delivering data through SQL clients and BI tools
  • Experience building ad hoc reporting APIs and a service layer on top of underlying OLTP and OLAP databases
  • Help integrate the data platform with BI tools like Power BI, Tableau, Splunk, etc.
  • Ability to develop and interpret Entity Relationship Diagrams (ERDs) across data sets in relational database systems as well as non-relational data stores
  • Able to perform data mining, identify trends, patterns, and anomalies in complex data sets across multiple data sources/systems, and present results without ambiguity
  • Develop data transformations to generate facts, summaries, and key metrics by applying business rulesets and aggregations using Python, SQL, and other transformation tools
  • Able to review and re-engineer current processes related to data ingestion, transformation, and statistical analysis
  • Collaborate with business users across Cofense’s departments to define requirements, prioritize project work, and deliver on time
  • Other duties as assigned
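The ETL responsibilities above (move semi-structured SaaS data, apply business rulesets, generate facts) can be sketched end-to-end in a few lines. This is a toy illustration under assumed names: the JSON payload, the `click_facts` table, and the per-department ruleset are hypothetical, with stdlib sqlite3 standing in for a warehouse:

```python
import json
import sqlite3

# Extract: a JSON payload stands in for a semi-structured SaaS export
# (in production this might come from an API or an object store).
raw = json.loads(
    '[{"dept": "sales", "clicks": 3},'
    ' {"dept": "sales", "clicks": 2},'
    ' {"dept": "ops", "clicks": 4}]'
)

# Transform: apply a business ruleset (here, summing clicks per department).
facts = {}
for rec in raw:
    facts[rec["dept"]] = facts.get(rec["dept"], 0) + rec["clicks"]

# Load: write the fact table to the warehouse (in-memory SQLite stand-in).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE click_facts (dept TEXT PRIMARY KEY, clicks INTEGER)")
conn.executemany("INSERT INTO click_facts VALUES (?, ?)", sorted(facts.items()))

rows = conn.execute("SELECT dept, clicks FROM click_facts ORDER BY dept").fetchall()
print(rows)  # [('ops', 4), ('sales', 5)]
```

A production pipeline would add the pieces the posting also asks for: unit tests around the transformation, privacy-rule enforcement on the data in transit, and delivery of the fact table to BI tools.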