Big Data Engineer, AVP

Barclays•Jefferson, CO

1d•$160,805 - $170,000•Hybrid

About The Position

The purpose of the role is to build and maintain the systems that collect, store, process, and analyse data, such as data pipelines, data warehouses and data lakes to ensure that all data is accurate, accessible, and secure. This role involves building and maintenance of data architectures and pipelines, designing and implementing data warehouses and data lakes, developing processing and analysis algorithms, and collaborating with data scientists to build and deploy machine learning models. As an Assistant Vice President, expectations include advising and influencing decision-making, contributing to policy development, taking responsibility for operational effectiveness, and collaborating closely with other functions/business divisions. This role may involve leading a team or leading collaborative assignments and guiding team members. It requires consulting on complex issues, identifying ways to mitigate risk, and performing complex analysis of data from multiple sources. Communication of complex information and influencing stakeholders are key aspects of this role. All colleagues are expected to demonstrate Barclays Values and Mindset.

Requirements

Analyze system requirements to automate and improve functionality and test, maintain, monitor, and enhance radial and BDH environment engineering capabilities.
Identify gaps in existing systems and deliver improvements to configuration, data, quality, applications, and processes.
Oversee IT process flows, Hadoop, UNIX, development processes, RFT architecture, clearing system data, among others, to carry out the sophisticated job duties.
Analyze and review log files and assist application team with performance tuning for local and cluster level configuration, queries, and table partitioning.
Collaborate with admin and Unix teams to install, configure, and support Hadoop Clusters.
Continue to generate and collect metrics, proactively plan infrastructure usage.
Utilize automated scripts and chef scripts to monitor, and control local file system disk space usage, user groups, quotas, logfiles, and log file cleaning.
Devise, implement, and monitor systems, automations, notification tools, access control, and data governance.
Initiate and perform capacity planning and data replication setup, replace, and recover disks, commission and decommission data nodes, perform disk balancing, and conduct file directory activities.
Monitor cluster health status, workload and Hadoop services, schedule automated alerts, and tune system related configuration parameters.
Coordinate and propagate application and cluster level changes in non-production and production environments.
Create root cause analysis (RCA) efforts for high severity incidents.
Analyze system failures, identify root causes, create, and prioritize cases with Cloudera vendor, and recommend and follow courses of action.
Perform release management, team handover, and sprint to pipeline activities and follow Information Technology Infrastructure Library (ITIL) procedures.

Nice To Haves

May telecommute pursuant to company policies.

Responsibilities

Build and maintenance of data architectures pipelines that enable the transfer and processing of durable, complete and consistent data.
Design and implementation of data warehoused and data lakes that manage the appropriate data volumes and velocity and adhere to the required security measures.
Development of processing and analysis algorithms fit for the intended data complexity and volumes.
Collaboration with data scientist to build and deploy machine learning models.
Analyze system requirements to automate and improve functionality and test, maintain, monitor, and enhance radial and BDH environment engineering capabilities.
Identify gaps in existing systems and deliver improvements to configuration, data, quality, applications, and processes.
Oversee IT process flows, Hadoop, UNIX, development processes, RFT architecture, clearing system data, among others, to carry out the sophisticated job duties.
Analyze and review log files and assist application team with performance tuning for local and cluster level configuration, queries, and table partitioning.
Collaborate with admin and Unix teams to install, configure, and support Hadoop Clusters.
Continue to generate and collect metrics, proactively plan infrastructure usage.
Utilize automated scripts and chef scripts to monitor, and control local file system disk space usage, user groups, quotas, logfiles, and log file cleaning.
Devise, implement, and monitor systems, automations, notification tools, access control, and data governance.
Initiate and perform capacity planning and data replication setup, replace, and recover disks, commission and decommission data nodes, perform disk balancing, and conduct file directory activities.
Monitor cluster health status, workload and Hadoop services, schedule automated alerts, and tune system related configuration parameters.
Coordinate and propagate application and cluster level changes in non-production and production environments.
Create root cause analysis (RCA) efforts for high severity incidents.
Analyze system failures, identify root causes, create, and prioritize cases with Cloudera vendor, and recommend and follow courses of action.
Perform release management, team handover, and sprint to pipeline activities and follow Information Technology Infrastructure Library (ITIL) procedures.