Data Engineer, Self-Service Analytics and Real-Time Data Platforms

Paramount • Burbank, CA

2d • $99,000 - $147,000 • Hybrid

About The Position

The Data Engineering team is seeking a Data Engineer – Self-Service Analytics & Real-Time Data Platforms. In this role, you will help build scalable data products, semantic layers, and real-time data platforms. These platforms enable trusted, governed, and self-service access to data. You will develop solutions that power BI, analytics, experimentation, AI applications, agents, and conversational analytics experiences.

Requirements

2–4+ years of experience building and scaling ETL/ELT pipelines in production environments.
Proven experience with workflow orchestration tools such as Airflow, Composer, or similar platforms.
Working knowledge of distributed data processing concepts.
Expert-level SQL skills for large-scale transformation and analytics.
Experience designing scalable warehouse schemas and ML-ready data layers.
Proven experience optimizing complex queries across multi-terabyte datasets.
Proficiency in Python (or similar language) for data processing and ML pipeline integration.
Experience with distributed processing frameworks such as Spark.
Experience integrating data pipelines with ML platforms such as Vertex AI (preferred), Databricks ML, or equivalent. This includes model training, batch/online inference, and pipeline orchestration.
Experience building real-time data pipelines using Kafka, Pub/Sub, or similar technologies.
Knowledge of feature streaming, low-latency data processing, and event-driven architectures.
Ability to work closely with the streaming team to architect and build real-time dashboards using Superset.
Experience designing cloud-native data architectures (GCP preferred).
Experience with lakehouse architectures and cloud data warehouses.
Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience).
2–4+ years of experience in data engineering, data pipeline development, or related fields.
Solid foundation in modern data engineering principles, distributed systems design, and cloud-native architectures.
Demonstrated ability to design and operate large-scale production data systems.
Excellent problem-solving skills with the ability to work in dynamic, high-velocity environments.
Motivated, thorough, and committed to engineering excellence and ongoing improvement.

Nice To Haves

Knowledge of vector databases, embedding pipelines, and AI-serving infrastructure.

Responsibilities

Design, develop, and maintain scalable batch (ETL/ELT) and near real-time streaming data pipelines that process large-scale structured and unstructured datasets.
Design and maintain semantic layers, metrics frameworks, and curated data products.
Enable self-service analytics through governed and reusable business data models.
Implement monitoring, observability, and operational best practices.
Develop governed data access patterns for AI, conversational analytics, and MCP-based applications.
Build AI-ready data products that support machine learning, GenAI, AI agents, and chatbot applications.
Partner with Product, Analytics, BI, and Engineering stakeholders to deliver trusted data solutions.
Design scalable data models optimized for analytics, real-time reporting, and AI use cases.
Develop reusable semantic and transformation layers that provide consistent business definitions.
Drive best practices for data quality, governance, metadata, and discoverability.