Metropolitan Transportation Authority • Posted 4 months ago
$110,000 - $130,511/Yr
Full-time • Mid Level
New York, NY
5,001-10,000 employees
Executive, Legislative, and Other General Government Support

The Metropolitan Transportation Authority is North America's largest transportation network, serving a population of 15.3 million people across a 5,000-square-mile travel area surrounding New York City, Long Island, southeastern New York State, and Connecticut. The MTA network comprises the nation's largest bus fleet and more subway and commuter rail cars than all other U.S. transit systems combined. MTA strives to provide a safe and reliable commute, excellent customer service, and rewarding opportunities.

The incumbent will help lead the team that designs, builds, tests, and delivers end-to-end, automated data pipelines over complex on-premises and off-premises platforms. They will extract data from multiple source systems containing structured, semi-structured, and unstructured data and make it consistent, reliable, available, and usable to colleagues across the MTA and, in support of the agency's and New York State's Open Data goals, to external stakeholders and the general public. They will use languages such as SQL, Python, and R and relational database tools such as Oracle, Postgres, and SQL Server to analyze large datasets, build new ones, and design overall data architectures. They will carefully document all work and collaborate closely with colleagues to define needs, solve problems, support the overall team agenda, and build relationships at all levels of the agency. They must be able to quickly learn the unique features, data constraints, and business needs of any part of the MTA. In addition, unlike other data engineering roles, they will support the entire downstream pipeline process and, occasionally, end users of the data products. More generally, they will support the MTA's strategic goal of building data systems and processes that are well-structured and sustainable.

  • Lead projects to develop data pipelines, data warehouses, data marts, multi-dimensional cubes, and data lakes to collect, structure, and integrate a wide range of data sources.
  • Create data system assets that are efficient, timely, reliable, accurate, robust, and scalable.
  • Make information available to staff and decision-makers for analysis and consumption.
  • Identify, propose, support, and carry out the development of new data sets, data access, extraction methodologies, algorithms, and other tools.
  • Test data sets and pipelines, conducting root cause analyses to resolve issues.
  • Incorporate quality assurance functionality in work products.
  • Evaluate existing legacy data, algorithmic, and process solutions; redesign and implement modern data infrastructure.
  • Play a lead role in developing standards for data architecture diagrams, system documentation, data models, and other design-related artifacts.
  • Provide input into data governance initiatives and influence the development of sustainable data management and governance practices.
  • Research and recommend the best pipeline technologies, toolsets, and applications to support the data science and reporting teams.
  • Lead project planning and support the manager in designing workstreams.
  • Enhance team performance by supporting the recruitment of new teammates and mentoring less-experienced members of the team.
  • Perform other duties as assigned.
  • Strong skills in programming, database design, and data lake architectures for data engineering.
  • Ability to play a lead role on the Data & Analytics team to set priorities and overall program planning.
  • Ability to work with data of different types: structured, semi-structured, and unstructured.
  • Knowledge of transit/transportation systems and excellent judgment on how to manage or overcome technical, organizational, or governance constraints.
  • Familiarity with common algorithms used to calculate KPIs and extensive experience with data visualization and business intelligence tools such as Power BI.
  • Experience with data engineering orchestration tools (e.g., Airflow, Dagster) and DevOps tools (e.g., ADO, Git, Jenkins, Docker).
  • Extensive experience designing and implementing quality assurance and automated testing systems.
  • Exceptional ability to read code and understand technical issues.
  • Ability to collaborate and provide support to all levels of MTA, both technical and non-technical.
  • Ability to deconstruct difficult problems into smaller and simpler pieces.
  • Extensive experience in project design and strong project management skills.
  • Proactive interest in identifying strategic, policy, and business issues in all proposed and ongoing projects.
  • Strong written communication skills for both non-technical and technical documents.
  • Ability to present and engage on complex work products with executive audiences.
  • Master's degree in Computer Science, Information Technology, Engineering, Mathematics, or a related field.
  • Teleworking eligibility (currently one day per week).
  • Opportunity to work in a diverse and inclusive environment.