University of Arkansas · Posted 14 days ago
Full-time • Mid Level
Onsite • Little Rock, AR

The Data Engineer IV position is a full-time, provisional appointment with the COSMOS Research Center (cosmos.ualr.edu) at the University of Arkansas and is funded through a grant or contract; renewal is contingent upon continued grant funding and satisfactory job performance. The Data Engineer IV will collect, manage, and convert raw data into usable information for analytics and decision-making.

The role requires comprehensive data analysis skills: developing and maintaining datasets, improving data quality and efficiency by leveraging data systems and pipelines, interpreting trends and patterns in data, producing complex data reports, and building algorithms and prototypes. Additional responsibilities include leading the development of social media data collection pipelines, inference methods and infrastructure, classification methods and infrastructure, and visualization dashboards. The Data Engineer IV will collaborate with researchers, participate in research projects, and interact with other developers at the COSMOS Research Center and partner organizations to achieve the best possible performance metrics across projects. Excellent communication and problem-solving skills are essential for this on-campus position, as the data engineer is expected to find opportunities to contribute to cutting-edge research in the exciting field of social computing. This position is governed by state and federal laws and agency/institution policy. The position reports to Dr. Nitin Agarwal ([email protected]), Maulden-Entergy Endowed Chair and Distinguished Professor and Director, COSMOS Research Center, UA-Little Rock.

  • Lead a team of data engineers
  • Collect and analyze raw data from various sources, including social media platforms
  • Organize and maintain datasets
  • Improve data quality and process efficiency
  • Design and manage ETL pipelines that move data from source to destination systems, processing 10 million+ data points daily and using Kafka for real-time streaming and MongoDB for NoSQL storage across Kubernetes clusters
  • Design and deploy scalable microservices in Python and Golang, leveraging FlaskAPI, GraphQL, and Docker, ensuring sub-second response times and efficient concurrency with goroutines
  • Migrate large volumes of data from legacy databases to MongoDB to achieve sub-second access latencies, and optimize storage for unstructured data through Elasticsearch integration
  • Set up and manage the infrastructure required for ingestion, processing, and storage of data
  • Evaluate model needs and objectives; interpret trends and patterns in data
  • Conduct complex data analysis and report on results
  • Prepare data for analysis and reporting by transforming and cleansing it
  • Combine raw information from different sources
  • Explore ways to enhance data quality and reliability
  • Identify opportunities for data acquisition
  • Develop analytical tools and programs
  • Collaborate with teams at COSMOS on several projects
  • Manage services and operational infrastructure for system reliability and resiliency
  • Create continuous integration/continuous deployment (CI/CD) pipelines with Jenkins and GitLab CI to automate service/system deployment
  • Integrate Prometheus for monitoring, Grafana for real-time dashboards and visualization, and Kibana for log analysis of data sourced from Elasticsearch
  • Perform front-end development (HTML/CSS, JavaScript, Node.js, etc.)
  • Train machine learning (ML) models on datasets
  • Deploy machine learning (ML) models
  • Enhance the system’s fault tolerance by incorporating alerting mechanisms
  • Develop applications with frameworks such as Spring Boot and React
  • Perform other tasks as assigned
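
Much of the responsibility list above centers on preparing raw social-media data for analysis. As an illustrative sketch only (the record schema and field names below are hypothetical, not COSMOS's actual data format), a minimal cleansing and transformation step might look like:

```python
from datetime import datetime, timezone

def clean_posts(raw_posts):
    """Normalize raw social-media records for downstream analysis.

    Keeps only records with an id, non-empty text, and a parseable
    ISO-8601 timestamp; deduplicates by id; normalizes timestamps
    to UTC. Field names ("id", "text", "created_at") are assumed
    for illustration.
    """
    seen = set()
    cleaned = []
    for post in raw_posts:
        pid = post.get("id")
        text = (post.get("text") or "").strip()
        ts = post.get("created_at")
        # Drop malformed or duplicate records.
        if not pid or not text or pid in seen:
            continue
        try:
            when = datetime.fromisoformat(ts).astimezone(timezone.utc)
        except (TypeError, ValueError):
            continue
        seen.add(pid)
        cleaned.append({"id": pid, "text": text,
                        "created_at": when.isoformat()})
    return cleaned
```

In a production pipeline of the kind described above, a step like this would typically sit between the ingestion layer (e.g., a Kafka consumer) and the storage layer (e.g., MongoDB), rather than run as a standalone function.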
  • The candidate must have a master's degree in Computer Science, Information Science, or a related discipline
  • The candidate must have 4+ years of experience as a data engineer/software developer/software engineer/database administrator, or other similar roles; or a PhD degree in Computer Science, Information Science, or a related discipline
  • The candidate must have 2+ years of experience leading a team of data engineers
  • Expert proficiency level in working with data models, data pipelines, ETL processes, data stores, data mining, and segmentation techniques
  • Expert proficiency level in working with programming/scripting languages (e.g., Java and Python)
  • Expert proficiency level in working with data integration platforms and SQL database design
  • Expert-level numerical, analytical, and data security skills
  • Expert proficiency level in collecting raw data from various social media platforms
  • Expert proficiency level in creating CI/CD pipelines
  • Expert proficiency level with front-end development (HTML/CSS, JavaScript, Node.js, etc.)
  • Expert proficiency level with training and deploying machine learning (ML) models on datasets
  • Ability to lead a large team of data engineers (5+ members)
  • Expert proficiency level with Kafka and MongoDB for NoSQL storage across Kubernetes clusters
  • Expert proficiency level with microservices, Python, Golang, FlaskAPI, GraphQL, and Docker
  • Expert proficiency level with Elasticsearch, Grafana, Prometheus, and Kibana
  • Expert proficiency level in data modeling concepts (ERD, Dimensional Modeling, Data Vault) and data APIs (RESTful API)
  • Expert proficiency level in data processing software (e.g., Hadoop, Spark, TensorFlow, Pig, Hive) and algorithms (e.g., MapReduce, Flume)
  • Expert proficiency level in cloud platforms (AWS, Azure, GCP) and data warehousing solutions (Snowflake, Amazon Redshift, Google BigQuery, Azure Synapse)
  • Expert proficiency level in technical communications
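
The qualifications above list MapReduce among the expected algorithms. As a minimal, library-free illustration of the map-shuffle-reduce pattern (a teaching sketch, not any specific Hadoop or Spark API), the classic word count can be written as:

```python
from collections import defaultdict
from itertools import chain

def map_phase(doc):
    # Map: emit a (word, 1) pair for each token in the document.
    return [(word.lower(), 1) for word in doc.split()]

def shuffle(pairs):
    # Shuffle: group all values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word.
    return {key: sum(values) for key, values in groups.items()}

def word_count(docs):
    pairs = chain.from_iterable(map_phase(d) for d in docs)
    return reduce_phase(shuffle(pairs))
```

In a real framework, the map and reduce phases run in parallel across a cluster and the shuffle moves data between nodes; the single-process version here only shows the data flow.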