Data Engineer 2

JLL · Boston, MA
Posted 21 days ago · Onsite

About The Position

JLL empowers you to shape a brighter way. Our people at JLL and JLL Technologies are shaping the future of real estate for a better world by combining world-class services, advisory and technology for our clients. We are committed to hiring the best, most talented people and empowering them to thrive, grow meaningful careers and find a place where they belong. Whether you've got deep experience in commercial real estate, skilled trades or technology, or you're looking to apply your relevant experience to a new industry, join our team as we help shape a brighter way forward.

About the Role

JLL is seeking a Data Engineer to join our Central Data Platform Team, supporting enterprise-wide data initiatives and foundational platform solutions. You will play a key role in designing, building, and maintaining scalable data pipelines and indexing infrastructure; Databricks experience is strongly preferred. You will collaborate with engineering peers to ensure the reliability, performance, and operational excellence of JLL's enterprise data systems, leveraging modern cloud and AI technologies. This position offers substantial ownership of technical projects and opportunities to influence code uniformity, performance optimization, and enterprise data best practices. You'll work with diverse data sources, contribute to both internal and stakeholder-facing solutions, and participate in the transformation of JLL's data ecosystem at a global scale.

Requirements

  • Minimum 4 years of professional experience in data engineering or a related field.
  • Proficient in PySpark and Databricks for enterprise-scale ETL, data processing, and pipeline optimization (a brief sketch of this kind of pipeline work follows this list).
  • Hands-on experience with DocumentDB, caching databases, advanced indexing, and core database concepts.
  • Exposure to cloud platforms (Azure, AWS, GCP) and strong understanding of cloud architecture and networking components.
  • Familiarity with data lake architectures, medallion patterns, and data governance frameworks.
  • Experience with API development and event-driven data pipelines.
  • Competency using GitHub, GitHub Actions, and CI/CD practices in data engineering environments.
  • Experience or interest in applying AI/ML and AI tooling to data workflows.
  • Builds rapport with diverse stakeholders and adapts communication style to the audience's needs.
  • Interprets information effectively and resolves moderately complex to advanced problems within established guidelines.
  • Reviews data to identify and resolve missing, inconsistent, or incomplete information.
  • Makes sound decisions within established guidelines, escalating complex issues as needed.
  • May provide guidance or direction to lower-level staff, contributing to their development.
  • Supports and contributes to a collaborative team, monitoring the timeliness and accuracy of operations.
  • Demonstrates flexibility and accountability while managing multiple priorities in fast-paced environments.
  • Initiates and maintains professional relationships to support team and business objectives.
  • Bachelor’s degree in Computer Science, Engineering, Information Technology, or a related field.
  • Minimum 3 years of hands-on experience in data engineering, including cloud and big data platforms.
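
For context on the PySpark and Databricks proficiency called for above, here is a minimal, hedged sketch of one common pipeline-optimization step: pruning columns early, broadcasting a small lookup table into a join to avoid a shuffle, and writing date-partitioned output. The paths, table names, and columns are illustrative assumptions, not JLL's actual schema.

    # Minimal sketch of a pipeline-optimization step; paths and columns are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import broadcast

    spark = SparkSession.builder.appName("etl-optimization-sketch").getOrCreate()

    # Read only the columns the downstream job needs (column pruning).
    facts = (spark.read.parquet("/data/raw/lease_events")
                  .select("event_id", "property_id", "event_date", "amount"))

    # Small dimension table: broadcasting it avoids a shuffle-heavy sort-merge join.
    properties = spark.read.parquet("/data/raw/properties")
    enriched = facts.join(broadcast(properties), on="property_id", how="left")

    # Partition output by date so downstream readers can skip irrelevant files.
    (enriched
        .repartition("event_date")
        .write.mode("overwrite")
        .partitionBy("event_date")
        .parquet("/data/curated/lease_events_enriched"))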

Responsibilities

  • Design, implement, and optimize scalable ETL jobs using PySpark within platforms such as Databricks.
  • Build and maintain high-performance data ingestion, transformation, and indexing pipelines that support reliable search and retrieval across both structured and unstructured data sources (e.g., documents, PDFs) within enterprise data lakes.
  • Architect and support medallion architecture patterns, ensuring data quality, lineage, and governance across all central data platform layers (see the medallion sketch after this list).
  • Develop and maintain APIs for data integration, supporting business and analytical needs.
  • Leverage DocumentDB, cache DBs, and indexing solutions for efficient data storage and access.
  • Collaborate with infrastructure, software, and AI teams to support the development of foundational data systems, integrating cloud (Azure, AWS, GCP), networking components, and security best practices.
  • Manage code repositories using GitHub and automate deployment processes with GitHub Actions and CI/CD pipelines.
  • Explore, experiment with, and apply AI tools to enhance data platform capabilities and intelligent search outcomes.
  • Contribute to documentation, knowledge sharing, and adoption of best practices in a collaborative, agile engineering environment.
  • Optimize cluster operations and Spark jobs for performance and cost efficiency.
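
As a rough illustration of the medallion-pattern responsibilities above, the sketch below lands raw JSON in a bronze Delta table and promotes cleaned, deduplicated records to silver. The paths, schema, and quality rules are hypothetical placeholders; an actual implementation would follow the platform's own layering and governance standards.

    # Hedged sketch of a bronze -> silver medallion step on Databricks / Delta Lake.
    # All paths and columns are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

    # Bronze: land raw source data as-is, stamped with ingestion metadata.
    raw = spark.read.json("/mnt/landing/leases/*.json")
    (raw.withColumn("_ingested_at", F.current_timestamp())
        .write.format("delta").mode("append")
        .save("/mnt/bronze/leases"))

    # Silver: enforce basic quality rules and deduplicate on the business key.
    bronze = spark.read.format("delta").load("/mnt/bronze/leases")
    silver = (bronze
        .filter(F.col("lease_id").isNotNull())   # drop records missing the key
        .withColumn("amount", F.col("amount").cast("double"))
        .dropDuplicates(["lease_id"]))           # keep one row per lease

    (silver.write.format("delta").mode("overwrite")
        .save("/mnt/silver/leases"))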

Benefits

  • 401(k) plan with matching company contributions
  • Comprehensive Medical, Dental & Vision Care
  • Paid parental leave at 100% of salary
  • Paid Time Off and Company Holidays
  • Early access to earned wages through Daily Pay