We’re building a world of health around every individual, shaping a more connected, convenient and compassionate health experience. At CVS Health®, you’ll be surrounded by passionate colleagues who care deeply, innovate with purpose, hold themselves accountable and prioritize safety and quality in everything they do. Join us and be part of something bigger: helping to simplify health care one person, one family and one community at a time.

Position Summary: Caremark LLC, a CVS Health company, is hiring for the following role in Hartford, CT: Principal Data Engineer to develop large-scale data structures and pipelines that organize, collect and standardize data to generate insights and address reporting needs.

Collaborate with the data science team to transform data and integrate algorithms and models into automated processes. Build data marts and data models to support Data Science and other internal customers. Integrate data from a variety of sources, ensuring that they adhere to data quality and accessibility standards. Analyze current information technology environments to identify and assess critical capabilities and recommend solutions. Build high-performance data processing frameworks leveraging cloud and/or on-premises data platforms. Design and build large-scale data structures, pipelines, and efficient Extract/Transform/Load (ETL) workflows. Write ETL processes, design database systems, and develop tools for real-time and offline analytic processing. Analyze and synthesize data to meet insight, reporting dashboard, and descriptive, predictive, and prescriptive analytics requirements. Design conformed, aggregated, and semantic data layers, and manipulate large datasets to support insights and analytics using SQL, BTEQ, SAS, and similar tools, as applicable.
Manage data while building data layers in a sandbox or production environment for reporting and analytical use cases. Work on "big data" platforms, including Hadoop (Azure or GCP preferred) and Spark, as applicable. Design data models and solutions for analytical and reporting use cases. Apply knowledge of Hadoop architecture and HDFS commands, along with experience designing and optimizing queries, as applicable, to build data pipelines. Use strong programming skills in Python, Java, and/or any of the major languages to build robust data pipelines and dynamic systems. Experiment with available software tools and advise on new tools to determine the optimal solution given the requirements dictated by the model or use case. Support modeling and diagramming, and build design specifications for data objects and the surrounding data processing logic. Collaborate with business solution strategists and support the onboarding of new data sources through data discovery, profiling, and mapping. Participate in proofs of concept to build data layers and derive analytical insights. Leverage multiple tools and programming languages to analyze and manipulate data sets from disparate data sources.

Telecommuting available. Multiple positions.
Job Type
Full-time
Career Level
Principal