Senior Data Engineer (1043) - DataSF

City and County of San Francisco
San Francisco, CA (Hybrid)

About The Position

DataSF is seeking a Senior Data Engineer with 3+ years of experience to join our growing team. Reporting to the Principal Data Engineer, you will be instrumental in designing, building, and maintaining the City's data infrastructure, enabling robust data pipelines and reliable data access for analytical and operational needs. This is an exciting position for someone eager to apply advanced data engineering techniques to complex urban challenges, contributing directly to San Francisco's commitment to efficient, equitable, and ethical service delivery. Learn more about DataSF's recent work on our blog. If you are an entrepreneurial and passionate data enthusiast, join our team to improve government through good use of data!

Requirements

  • An associate degree in computer science, computer engineering, information systems, or a closely related field from an accredited college or university OR its equivalent in terms of total course credits/units [i.e., at least sixty (60) semester or ninety (90) quarter credits/units with a minimum of twenty (20) semester or thirty (30) quarter credits/units in one of the fields above or a closely-related field].
  • Three (3) years of experience analyzing, installing, configuring, enhancing, and/or maintaining the components of an enterprise network.
  • Additional experience as described above may be substituted for the required degree on a year-for-year basis (up to a maximum of two (2) years). One (1) year is equivalent to thirty (30) semester units / forty-five (45) quarter units, with a minimum of ten (10) semester / fifteen (15) quarter units in one of the fields above or a closely related field.
  • Completion of the 1010 Information Systems Trainee Program may be substituted for the required degree.

Nice To Haves

  • Hands-on experience administering and developing on managed cloud data platforms such as Snowflake, BigQuery, or Databricks.
  • Demonstrated expertise in writing advanced, performant SQL and using tools like dbt for SQL-based data transformation and modeling.
  • Strong programming skills in Python (with libraries like pandas, PySpark) for data processing and automation.
  • Proficiency with an Infrastructure as Code tool, with a preference for Terraform.
  • Experience building and deploying data pipelines using orchestration tools like Azure Data Factory, Airflow, Dagster, or similar technologies.
  • Deep understanding of data warehousing concepts, data modeling, and modern ELT principles.
  • Understanding of data governance, data security, and data privacy principles.
  • Experience with real-time data streaming technologies (e.g., Kafka, Kinesis, Snowpipe).
  • Experience deploying and managing data pipelines for machine learning models.
  • Strong problem-solving skills with the ability to design practical and effective data solutions.
  • Excellent verbal and written communication skills, including the ability to explain technical concepts to non-technical stakeholders.
  • A collaborative mindset with enthusiasm for working across diverse, cross-functional teams.
  • Commitment to equity, transparency, and ethical data use.
  • Passion for public service and using data to improve government services.
  • Empathy for San Francisco's diverse communities and a drive to make data and services more accessible for SF residents.
  • Interest or experience in public sector data or social impact work.

Responsibilities

  • Platform Administration: Manage our central Snowflake data warehouse, including access control, security policies, resource monitoring, performance tuning, and cost optimization. Administer our platform with a focus on data democratization and accessibility, while protecting privacy and security.
  • Pipeline Development: Build and maintain scalable and resilient pipelines to ingest and structure data from diverse sources. Design infrastructure to support both streaming and batch processes, and both structured and unstructured data sources.
  • Infrastructure as Code (IaC): Use Terraform to define, deploy, and manage data infrastructure, ensuring our pipelines are reproducible, version-controlled, and production-ready.
  • Best Practices & Innovation: Champion and implement best practices for documentation, data modeling, warehouse architecture, SQL optimization, and testing. Think creatively to find new ways to improve our data platform's capabilities and efficiency. Provide guidance to department partners on data engineering best practices.
  • Collaboration: Work closely with data scientists, analysts, product managers, software engineers, and non-technical stakeholders in diverse domains to understand data requirements and build solutions that meet their needs.
  • Monitoring & Support: Proactively monitor the health of the data platform and pipelines, troubleshoot issues, and ensure high standards of data quality and availability.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Education Level: Associate degree
  • Number of Employees: 5,001-10,000
