COSMOS - Data Engineer III

University of Arkansas at Little Rock · Little Rock, AR
$85,000 · Onsite

About The Position

The Project/Program Specialist (Data Engineer III) position is a full-time, provisional position with the COSMOS Research Center (cosmos.ualr.edu) at the University of Arkansas at Little Rock and is funded through a grant or contract. Renewal of the position is contingent upon continued grant funding and satisfactory job performance.

The Data Engineer III will collect, manage, and convert raw data into usable information for analytics and decision-making. The role requires comprehensive data analysis skills: developing and maintaining datasets, improving data quality and efficiency by leveraging data systems and pipelines, interpreting trends and patterns in data, creating complex data reports, and building algorithms and prototypes. Additional responsibilities include leading the development of social media data collection pipelines, inference methods and infrastructure, classification methods and infrastructure, and visualization dashboards.

The Data Engineer III will collaborate with researchers, participate in research projects, and interact with other developers at the COSMOS Research Center and partner organizations to achieve the best possible performance across projects. Excellent communication and problem-solving skills are essential for this on-campus position, as the data engineer's role is to find opportunities to contribute to cutting-edge research in the exciting field of social computing.

This position is governed by state and federal laws and agency/institution policy. The position reports to Dr. Nitin Agarwal ([email protected]), Maulden-Entergy Endowed Chair, Distinguished Professor, and Director of the COSMOS Research Center at UA Little Rock.

Requirements

  • The candidate must have a master's degree in Computer Science, Information Science, or a related discipline
  • The candidate must have 2+ years of experience as a data engineer, software developer, software engineer, database administrator, or other similar roles
  • The candidate must have 1+ years of experience leading a team of data engineers
  • Advanced proficiency level in working with data models, data pipelines, ETL processes, data stores, data mining, and segmentation techniques
  • Advanced proficiency level in working with programming/scripting languages (e.g. Java and Python)
  • Advanced proficiency level in working with data integration platforms and SQL database design
  • Advanced numerical, analytical, and data security skills
  • Advanced proficiency level in collecting raw data from various social media platforms
  • Advanced proficiency level in creating CI/CD pipelines
  • Advanced proficiency level with front-end development (HTML/CSS, JavaScript, Node.js, etc.)
  • Advanced proficiency level with training and deploying machine learning (ML) models on datasets
  • Ability to lead a team of data engineers
  • Advanced proficiency level with Kafka and MongoDB for NoSQL storage across Kubernetes clusters
  • Advanced proficiency level with microservices, Python, Golang, FlaskAPI, GraphQL, and Docker
  • Advanced proficiency level with Elasticsearch, Grafana, Prometheus, and Kibana

Responsibilities

  • Lead a team of data engineers
  • Collect and analyze raw data from various sources, including social media platforms
  • Organize and maintain datasets
  • Improve data quality and process efficiency
  • Design and manage data ETL pipelines that encompass the journey of data from source to destination systems, processing 10 million+ data points daily, utilizing Kafka for real-time data streaming and MongoDB for NoSQL storage across Kubernetes clusters
  • Design and deploy scalable microservices in Python and Golang, leveraging FlaskAPI, GraphQL, and Docker, ensuring sub-second response times and efficient concurrency with goroutines; migrate data from legacy databases to MongoDB to achieve sub-second access latencies and optimize storage for unstructured data through Elasticsearch integration
  • Set up and manage the infrastructure required for ingestion, processing, and storage of data
  • Evaluate the model needs and objectives, interpret trends and patterns of data
  • Conduct complex data analysis and report on results
  • Prepare data for analysis and reporting by transforming and cleansing it
  • Combine raw information from different sources
  • Explore ways to enhance data quality and reliability
  • Identify opportunities for data acquisition
  • Develop analytical tools and programs
  • Collaborate with teams at COSMOS on several projects
  • Manage services and operational infrastructure for system reliability and resiliency
  • Create continuous integration and continuous deployment (CI/CD) pipelines with Jenkins and GitLab CI for automating service/system deployment
  • Integrate Prometheus for monitoring, Grafana for real-time dashboarding/visualization, and log analysis with Kibana sourced from Elasticsearch
  • Perform front-end development (HTML/CSS, JavaScript, Node.js, etc.)
  • Train machine learning (ML) models on datasets
  • Deploy machine learning (ML) models
  • Enhance the system’s fault tolerance by incorporating alert mechanisms
  • Perform other duties as assigned
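For context on the ETL responsibilities above, the following is a minimal sketch of the transform step in a social-media ingestion pipeline. All record fields and function names here are hypothetical illustrations, not specifications from the posting; the Kafka and MongoDB wiring is indicated only in comments.

```python
# Minimal sketch of an ETL transform step (hypothetical field layout).
# In the full pipeline described above, records would arrive from a
# Kafka topic and land in MongoDB, roughly:
#   consumer = KafkaConsumer("raw-posts", ...)      # kafka-python
#   collection.insert_one(normalize_post(msg.value))  # pymongo
import json
from datetime import datetime, timezone

def normalize_post(raw: bytes) -> dict:
    """Convert one raw JSON record into a storage-ready document."""
    post = json.loads(raw)
    return {
        "platform": post.get("platform", "unknown"),
        "author_id": str(post["author"]["id"]),   # normalize IDs to strings
        "text": post.get("text", "").strip(),
        "collected_at": datetime.now(timezone.utc).isoformat(),
    }

raw = b'{"platform": "x", "author": {"id": 42}, "text": " hello "}'
doc = normalize_post(raw)
```

Keeping the transform a pure function like this makes it unit-testable independently of the Kafka consumer loop or the MongoDB client.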