We are seeking a Principal Software Engineer to lead the architecture and evolution of our data and machine learning infrastructure. This role will shape the foundation on which data-driven products, analytics, and AI applications are built. You will design systems that enable large-scale data processing, reliable pipelines, and efficient machine learning development, from feature engineering to real-time model serving.

As a principal engineer, you will partner with product, data science, and platform teams to set technical direction, drive adoption of reusable frameworks, and mentor engineers across the organization. You will ensure that both the data and ML platforms are scalable, reliable, cost-efficient, and compliant with privacy and governance standards.

The core of the Data Platform is a data lake on AWS S3 with Apache Iceberg as the table format to ensure reliability. Data ingestion is standardized through Confluent Kafka for real-time streaming and Fivetran for ingesting files and change data. The transformation layer is decoupled from storage, using Apache Flink for stream processing, AWS Glue (Spark) for core ETL, and dbt/Athena for building analytical data models. The platform serves data through fit-for-purpose data stores, including Amazon DynamoDB for low-latency applications and Google BigQuery as the primary engine for analytics and BI.

This is a hybrid role based in our New York City headquarters, reporting to the Sr. Director of Engineering. You can typically expect to come into the office 2+ days per week.
Job Type
Full-time
Career Level
Mid Level
Industry
Publishing Industries
Education Level
No Education Listed
Number of Employees
5,001-10,000 employees