Data & Systems Developer Intern

Kohn Pedersen Fox, New York, NY
$22 - $25

About The Position

KPF is a global architecture and urban design firm known for its innovative approach to the built environment. We are committed to integrating cutting-edge technology into our design process, and we are building an internal data and systems capability to enhance how we work, collaborate, and deliver world-class architecture.

Role Overview

We are looking for a curious and motivated Data & Systems Developer Intern to join our technology team for a summer or semester engagement. In this role, you will get hands-on experience building data pipelines, working with APIs, developing lightweight internal tools, and supporting the infrastructure that powers KPF's AI systems. You will work closely with our AI Systems Engineer and Junior Data & Systems Developer, contributing to real projects from day one. This is not a passive internship: you will write real code, work with real data, and build things that are actually used. Through direct collaboration with our team, you will also gain exposure to AI systems, including large language models, retrieval-augmented generation, vector databases, and agentic tools. This is a role where you'll build real, marketable AI and data engineering skills on the job.

Requirements

  • Python — basic to intermediate proficiency; comfortable writing scripts and small programs
  • SQL — ability to write queries; some experience with relational databases
  • REST APIs — basic understanding of how APIs work; some experience consuming them
  • Git — basic version control; comfortable committing and collaborating on a shared codebase
  • Currently enrolled in a Bachelor's or Master's program in Computer Science, Information Systems, Data Science, or a related field
  • Genuinely curious and self-motivated — you explore things on your own and enjoy figuring things out
  • Not afraid to ask questions — we'd rather you ask than guess
  • Comfortable in a small, collaborative team where everyone's contribution matters
  • Detail-oriented — you care about getting things right, not just getting them done
  • Able to communicate clearly with team members, including non-technical colleagues
  • A builder mindset — you get excited about creating things that solve real problems

Nice To Haves

  • Any exposure to Flask or FastAPI — even a course project counts
  • Basic familiarity with AWS or any cloud platform
  • Some experience with Docker or containerization concepts
  • Exposure to data visualization tools or libraries (D3, Chart.js, etc.)
  • Basic familiarity with LLMs and AI concepts (embeddings, vector search, RAG)
  • Experience with AI coding tools like Claude Code, Cursor, or GitHub Copilot
  • Interest or background in AEC (Architecture, Engineering & Construction) or creative industries

Responsibilities

  • Assist in building and maintaining ETL/ELT data pipelines to collect, transform, and load data from internal and external sources
  • Help clean, validate, and structure data to ensure quality and consistency across systems
  • Write SQL queries to extract, transform, and report on data from relational databases (MSSQL, PostgreSQL, MySQL, or similar)
  • Help document data flows, schemas, and pipeline logic
  • Connect to and consume REST APIs to pull and process data
  • Assist in building lightweight Python scripts and integrations to automate data collection workflows
  • Learn to handle authentication, pagination, rate limiting, and error handling in API integrations
  • Contribute to simple internal web applications and dashboards built with Flask or FastAPI
  • Help build basic frontend interfaces (HTML/CSS/JavaScript) to expose data and tools to internal users
  • Collaborate with team members to understand needs and iterate on practical solutions
  • Get hands-on experience deploying on AWS using core services such as EC2, S3, and RDS
  • Learn to containerize applications using Docker
  • Follow best practices for environment configuration and basic security
  • Assist in preparing and structuring data for ingestion into AI and LLM pipelines (e.g., document chunking, metadata tagging, embedding-ready formatting)
  • Help maintain vector store data ingestion workflows (uploading documents, refreshing indexes)
  • Support the team with data tasks related to RAG (Retrieval-Augmented Generation) systems
  • Gain exposure to knowledge bases and structured data sources that feed conversational AI tools
  • Assist with testing and validating data quality within AI-powered applications
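To give a flavor of the API integration work described above, here is a minimal sketch of consuming a paginated endpoint with retry-on-rate-limit logic. The page format, the `RateLimitError` exception, and the stubbed endpoint are all hypothetical; real APIs vary.

```python
import time

class RateLimitError(Exception):
    """Raised by a (hypothetical) API client when the server returns HTTP 429."""

def fetch_all_pages(get_page, max_retries=3, backoff=0.01):
    """Collect every record from a paginated endpoint.

    get_page(page) is assumed to return {"items": [...], "next_page": int or None}.
    A rate-limited page is retried after an exponentially growing pause.
    """
    records, page = [], 1
    while page is not None:
        for attempt in range(max_retries):
            try:
                data = get_page(page)
                break
            except RateLimitError:
                time.sleep(backoff * (2 ** attempt))  # exponential backoff
        else:
            raise RuntimeError(f"page {page} failed after {max_retries} retries")
        records.extend(data["items"])
        page = data.get("next_page")  # None signals the last page
    return records

# Stubbed "API" standing in for a real REST endpoint:
PAGES = {1: {"items": ["a", "b"], "next_page": 2},
         2: {"items": ["c"], "next_page": None}}

print(fetch_all_pages(lambda p: PAGES[p]))  # ['a', 'b', 'c']
```

In practice the same loop would wrap an HTTP client call, with authentication headers and timeouts handled in `get_page`.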

Benefits

  • Opportunities for training, development, and career progression
  • Collaborative, high-performance environment with a focus on innovation and excellence
  • Comprehensive health coverage, including medical, dental, and vision insurance; 401(k) with company matching contributions; paid time off; and other perks