Principal Data Engineer

Leidos, Bethesda, MD
About The Position

Leidos has an exciting opportunity for a Data Engineer in our Intel Sector's Analysis Solutions Business Area. Our talented team is at the forefront of Security Engineering, Computer Network Operations (CNO), Mission Software, Analytical Methods and Modeling, Signals Intelligence (SIGINT), and Cryptographic Key Management. At Leidos, we offer competitive benefits, including Paid Time Off, 11 paid Holidays, 401K with a 6% company match and immediate vesting, Flexible Schedules, Discounted Stock Purchase Plans, Technical Upskilling, Education and Training Support, Parental Paid Leave, and much more. Join us and make a difference in National Security!

Job Summary

We are seeking an experienced Data Engineer to execute the design, development, and administration of an enterprise-scale Next Generation Correlation/Entity Resolution platform. This position supports a mission-critical system that serves as a foundational data exploitation capability servicing multiple applications and use cases. The ideal candidate will possess deep expertise in master data management, probabilistic matching algorithms, and entity resolution, with the technical architecture skills required to optimize match performance at scale.

The role supports technical planning, design, development, integration, and verification and validation. It refines customer roadmaps, enterprise epics, and strategic requirements into detailed requirements and actionable user stories, and coordinates with team leadership to prioritize the user stories that realize customer requirements.

Requirements

  • Hands-on experience with probabilistic matching and entity resolution solutions, including translating business requirements into technical configurations
  • Strong knowledge of entity resolution concepts, data linkage theory, and matching algorithms (Fellegi-Sunter, distance metrics, phonetic approaches)
  • Expertise in data quality dimensions, tokenization, standardization, and normalization techniques to improve matching effectiveness
  • Advanced analytical and problem-solving skills with proven experience in data analysis and pattern recognition
  • Proficiency in SQL and enterprise relational databases (Oracle) within large-scale environments, including performance tuning for high-volume transactional systems
  • Experience with enterprise ETL platforms, data integration tools, and maintaining large-scale, cloud-based Linux information systems
  • Familiarity with Agile development methodologies and collaborative software delivery practices
  • BS degree with 12 or more years of relevant experience, or Master's degree with 10 or more years of relevant experience. Will consider additional relevant work experience in lieu of a degree.
  • Must have an active TS/SCI with polygraph security clearance

Nice To Haves

  • Background in statistics, data science, or computational linguistics with hands-on experience with enterprise Master Data Management (MDM) platforms (e.g., IBM InfoSphere MDM)
  • Experience with IBM InfoSphere MDM administrative tools, workbench, and configuration utilities
  • Knowledge of probabilistic matching engines, including standardization algorithms, bucketing strategies, comparison functions, and scoring models; familiarity with identity resolution in multi-domain environments and with machine learning approaches to entity resolution and record linkage
  • Experience in healthcare, financial services, or other industries with complex entity matching requirements
  • Professional certification in Master Data Management, Data Quality or Database Administration
  • Experience with development in Commercial Cloud Platforms (e.g., AWS, Oracle, Azure) and leveraging cloud data services (e.g., S3, RDS, SQS)
  • Familiarity with Java or similar programming languages for custom extensions and matching algorithm development

Responsibilities

  • Platform Development & Administration: Design, develop, and maintain probabilistic matching configurations, algorithms, and scoring models to ensure accurate entity resolution across multi-domain data sources.
  • Entity Relationship Management: Analyze and connect records to determine relationships across datasets, creating accurate master data views that improve data quality and compliance readiness.
  • Full Lifecycle Management: Oversee data model design, implementation, testing, deployment, and environment administration to ensure high availability and performance.
  • Matching & Resolution Expertise: Configure and optimize standardization, bucketing, and comparison algorithms; design custom rules, weights, and thresholds; and apply advanced techniques to consolidate records across disparate sources.
  • Quality & Effectiveness Monitoring: Perform match tuning, false positive/negative analysis, and threshold optimization, and maintain data quality scorecards and effectiveness metrics.
  • Performance & Scalability: Optimize database performance through indexing, partitioning, and query tuning; monitor systems; and conduct capacity planning and performance testing.
  • Collaboration & Leadership: Partner with stakeholders to translate requirements into solutions, document architecture and procedures, and participate in incident response and root cause analysis.

Benefits

  • Paid Time Off
  • 11 paid Holidays
  • 401K with a 6% company match and immediate vesting
  • Flexible Schedules
  • Discounted Stock Purchase Plans
  • Technical Upskilling
  • Education and Training Support
  • Parental Paid Leave