COSMOS - Data Engineer III

University of Arkansas at Little Rock · Little Rock, AR
$85,000 · Onsite

About The Position

The Project/Program Specialist (Data Engineer III) position is a full-time, provisional position with the COSMOS Research Center (cosmos.ualr.edu) at the University of Arkansas at Little Rock and is funded through a grant or contract. Renewal of the position is contingent upon continued grant funding and satisfactory job performance.

The Data Engineer III will collect, manage, and convert raw data into usable information for analytics and decision-making. The role requires comprehensive data analysis skills: developing and maintaining datasets, improving data quality and efficiency by leveraging data systems and pipelines, interpreting trends and patterns in data, creating complex data reports, and building algorithms and prototypes. Additional responsibilities include leading the development of social media data collection pipelines, inference methods and infrastructure, classification methods and infrastructure, and visualization dashboards.

The Data Engineer III will collaborate with researchers, participate in research projects, and interact with other developers at the COSMOS Research Center and partner organizations to achieve the best possible performance across projects. Excellent communication and problem-solving skills are essential for this on-campus position, as the data engineer's role is to find opportunities to contribute to cutting-edge research in the exciting field of social computing.

This position is governed by state and federal laws and agency/institution policy. The position reports to Dr. Nitin Agarwal ([email protected]), Maulden-Entergy Endowed Chair, Distinguished Professor, and Director of the COSMOS Research Center at UA Little Rock.

Requirements

  • The candidate must have a master's degree in Computer Science, Information Science, or a related discipline
  • The candidate must have 2+ years of experience as a data engineer, software developer, software engineer, database administrator, or other similar roles
  • The candidate must have 1+ years of experience leading a team of data engineers
  • Advanced proficiency level in working with data models, data pipelines, ETL processes, data stores, data mining, and segmentation techniques
  • Advanced proficiency level in working with programming/scripting languages (e.g. Java and Python)
  • Advanced proficiency level in working with data integration platforms and SQL database design
  • Advanced numerical, analytical, and data security skills
  • Advanced proficiency level in collecting raw data from various social media platforms
  • Advanced proficiency level in creating CI/CD pipelines
  • Advanced proficiency level with front-end development (HTML/CSS, JavaScript, Node.js, etc.)
  • Advanced proficiency level with training and deploying machine learning (ML) models on datasets
  • Ability to lead a team of data engineers
  • Advanced proficiency level with Kafka and MongoDB for NoSQL storage across Kubernetes clusters
  • Advanced proficiency level with microservices, Python, Golang, FlaskAPI, GraphQL, and Docker
  • Advanced proficiency level with Elasticsearch, Grafana, Prometheus, and Kibana

Responsibilities

  • Lead a team of data engineers
  • Collect and analyze raw data from various sources, including social media platforms
  • Organize and maintain datasets
  • Improve data quality and process efficiency
  • Design and manage data ETL pipelines that encompass the journey of data from source to destination systems, processing 10 million+ data points daily, utilizing Kafka for real-time data streaming and MongoDB for NoSQL storage across Kubernetes clusters
  • Design and deploy scalable microservices in Python and Golang, leveraging FlaskAPI, GraphQL, and Docker, ensuring sub-second response times and efficient concurrency with goroutines; migrate data from legacy databases to MongoDB to achieve sub-second access latencies and optimize storage for unstructured data through Elasticsearch integration
  • Set up and manage the infrastructure required for ingestion, processing, and storage of data
  • Evaluate the model needs and objectives, interpret trends and patterns of data
  • Conduct complex data analysis and report on results
  • Prepare data for analysis and reporting by transforming and cleansing it
  • Combine raw information from different sources
  • Explore ways to enhance data quality and reliability
  • Identify opportunities for data acquisition
  • Develop analytical tools and programs
  • Collaborate with teams at COSMOS on several projects
  • Manage services and operational infrastructure for system reliability and resiliency
  • Create continuous integration and continuous deployment (CI/CD) pipelines with Jenkins and GitLab CI for automating service/system deployment
  • Integrate Prometheus for monitoring, Grafana for real-time dashboarding/visualization, and log analysis with Kibana sourced from Elasticsearch
  • Perform front-end development (HTML/CSS, JavaScript, Node.js, etc.)
  • Train machine learning (ML) models on datasets
  • Deploy machine learning (ML) models
  • Enhance the system’s fault tolerance by incorporating alert mechanisms
  • Perform other duties as assigned
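For context on the ETL responsibilities above, the following is a minimal sketch of the transform step in a social-media ingestion pipeline. All record fields and function names here are hypothetical illustrations, not specifications from the posting; the Kafka and MongoDB wiring is indicated only in comments.

```python
# Minimal sketch of an ETL transform step (hypothetical field layout).
# In the full pipeline described above, records would arrive from a
# Kafka topic and land in MongoDB, roughly:
#   consumer = KafkaConsumer("raw-posts", ...)      # kafka-python
#   collection.insert_one(normalize_post(msg.value))  # pymongo
import json
from datetime import datetime, timezone

def normalize_post(raw: bytes) -> dict:
    """Convert one raw JSON record into a storage-ready document."""
    post = json.loads(raw)
    return {
        "platform": post.get("platform", "unknown"),
        "author_id": str(post["author"]["id"]),   # normalize IDs to strings
        "text": post.get("text", "").strip(),
        "collected_at": datetime.now(timezone.utc).isoformat(),
    }

raw = b'{"platform": "x", "author": {"id": 42}, "text": " hello "}'
doc = normalize_post(raw)
```

Keeping the transform a pure function like this makes it unit-testable independently of the Kafka consumer loop or the MongoDB client.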