Hitachi Digital Services • Posted 7 days ago
Full-time • Mid Level
Dallas, TX

We’re Hitachi Digital Services, a global digital solutions and transformation business with a bold vision of our world’s potential. We’re people-centric and here to power good. Every day, we future-proof urban spaces, conserve natural resources, protect rainforests, and save lives. This is a world where innovation, technology, and deep expertise come together to take our company and customers from what’s now to what’s next. We make it happen through the power of acceleration.

Imagine the sheer breadth of talent it takes to bring a better tomorrow closer to today. We don’t expect you to ‘fit’ every requirement; your life experience, character, perspective, and passion for achieving great things in the world are equally important to us.

The team

Join a horizontal engineering team supporting 600+ application teams on a mission to raise engineering maturity by driving standards, guidelines, platform capabilities, and large-scale technical debt remediation. You will build advanced agentic AI workflows to automatically analyze codebases, detect tech debt, and generate high-quality fixes, from vulnerability patches to dependency and language upgrades. This is a hands-on, high-impact role shaping the future of automated software modernization.

The role

Key Responsibilities:
  • Design, develop, and maintain LLM-powered multi-agent workflows for code analysis, remediation proposals, and safe patch generation.
  • Implement agentic patterns including planning/execution loops, dynamic tool orchestration, sandboxing, guardrails, and failure recovery.
  • Build scalable automation systems for technical debt remediation: language/runtime upgrades, vulnerability patching, dependency modernization, and config drift correction.
  • Partner with Dev Experience and Platform teams to define engineering guidelines and reusable standards across the organization.
  • Architect and optimize Retrieval-Augmented Generation (RAG) pipelines, managing chunking, embeddings, hybrid search, reranking, and retrieval policies.
  • Develop robust evaluation frameworks for LLMs, RAG, and agent workflows, including offline datasets, validation metrics, statistical testing, and A/B tests.
  • Contribute to backend systems using Python, distributed systems, microservices, PostgreSQL, DBT, vector databases, caching, streaming, and queueing.
  • Build CI/CD pipelines and observability dashboards, and analyze performance across the model, retrieval, and network layers.
  • Collaborate cross-functionally with product, platform, and security to move prototypes to production-grade services.
  • Communicate clearly with stakeholders, write technical documentation, and mentor junior engineers.

What you’ll bring

Must-Have Qualifications:
  • 5+ years of experience building production-grade systems with end-to-end ownership.
  • Expertise in Python programming, software engineering best practices, testing strategies, CI/CD, and system design.
  • Hands-on experience shipping LLM-powered features such as autonomous workflows or function calling with measurable impact on reliability or latency.
  • Deep knowledge of multi-agent architectures including planners, executors, and tool routing.
  • Strong understanding of RAG systems: chunking, embeddings, vector/hybrid search, and retrieval policies.
  • Experience evaluating LLMs and agent workflows, incorporating statistical reasoning and validation.
  • Proficiency with AWS (Lambda, ECS/EKS, S3, API Gateway, EC2, IAM) and Infrastructure-as-Code for cloud resource automation and deployment.
  • Experience with observability tools (Datadog, logging, tracing, metrics).
  • Familiarity with PostgreSQL, DBT, data modeling, schema evolution, and performance tuning.
  • Knowledge of vector databases like Pinecone or pgvector.
  • Experience building or optimizing CI/CD pipelines (GitHub Actions or similar).
  • Proven track record in application modernization, dependency management, and technical debt reduction.
  • Ability to rapidly prototype, validate, and transition solutions to production systems.

Preferred Skills:
  • Experience designing agent infrastructure with sandboxing, tool isolation, and fail-safe execution.
  • Background in large-scale platform engineering or developer experience tooling.
  • Understanding of security, compliance, and privacy for enterprise AI systems.
  • Strong architectural communication ability, including RFC writing and diagramming.

Attributes:
  • Adaptable and proactive problem solver.
  • Strong ownership mindset with excellent collaboration and communication skills.
  • Comfortable in ambiguous, fast-paced R&D environments.
  • Passionate about building high-leverage platform capabilities impacting hundreds of engineering teams.

This role offers the opportunity to lead in cutting-edge automated software modernization driven by GenAI and platform engineering standards.

About us

We’re a global team of innovators. Together, we harness engineering excellence and passion to co-create meaningful solutions to complex challenges. We turn organizations into data-driven leaders that can make a positive impact on their industries and society. If you believe that innovation can bring a better tomorrow closer to today, this is the place for you.
