Senior Principal Software Engineer - R&D Tech

GSK

14h•Hybrid

About The Position

GSK remains committed to achieving bold commercial ambitions for the future. By 2031, we aim to deliver £40 billion in annual sales, leveraging our existing strong performance momentum to significantly increase our positive impact on the health of billions of patients globally. Our Ahead Together strategy is centered on early intervention to prevent and alter the course of disease, thereby protecting people and supporting healthcare systems. Our diverse portfolio consists of vaccines, specialty medicines, and general medicines. Through continuous innovation and a dedicated focus on scientific and technical excellence, we strive to develop and launch new, groundbreaking treatments that address critical health challenges.

About R&D Technology

R&D at GSK is highly data-driven, and we're applying AI/ML and data engineering to generate new insights, enable analytics, and drive efficiency and automation. This role is in R&D Technology, where you'll architect and build production-grade applications and data platforms. You'll work on diverse projects spanning regulatory, clinical, legal and HR domains. Versatility is key: you will need to quickly understand domain data and requirements and translate them into robust technical solutions. You will interact with architects, software and data engineers, AI/ML modelers, and product owners, as well as other team members across R&D. You will actively participate in creating technical solutions, designs, and implementations, and in the relentless improvement of R&D Tech systems in alignment with agile and DevOps principles.

We're seeking a Senior Principal Software Engineer with broad expertise across software development, data engineering, cloud architecture, and AI/ML technologies. This is a hands-on technical role where you'll spend the majority of your time writing code, building data pipelines, architecting cloud-native solutions, and integrating AI/ML capabilities into production applications. You'll be a versatile engineer who can work across the full stack, understand data flows, leverage cloud services effectively, and apply AI/ML techniques to solve real-world problems.

In this role you will have the opportunity to work on a mixture of the following:

Software Engineering & Application Development
Cloud Architecture & Services
Data Engineering
AI/ML & GenAI Integration
Database & Data Management
DevOps & Infrastructure
Cross-team Collaboration

Requirements

Bachelor's degree in Computer Science or equivalent relevant industry experience
Significant hands-on software development experience with demonstrated progression in technical complexity
Expert-level Python programming with extensive production application development experience
Strong full-stack development experience with modern frameworks: Python on the backend (FastAPI, Flask, Django); React, Next.js, TypeScript, or similar modern frameworks on the frontend
Cloud services experience, preferably Azure (App Services, Functions, Storage, or equivalent cloud services)
Strong SQL skills: Writing complex queries, data modeling, and optimization
Data engineering fundamentals: Building data pipelines and working with large datasets
Understanding of AI/ML concepts and practical experience: familiarity with LLMs and GenAI applications; basic understanding of how to integrate AI/ML APIs into applications; knowledge of prompt engineering basics; understanding of RAG architectures or willingness to learn quickly
Experience building production-grade applications: Scalable, maintainable, well-tested code
Understanding of software architecture: Design patterns, microservices, distributed systems, cloud-native architectures
Version control with Git and collaborative development workflows
DevOps practices: CI/CD pipelines, containerization basics
Agile development practices and iterative development
Excellent problem-solving and debugging skills
Strong communication and collaboration skills
Ability to quickly learn and adapt to new technologies

Nice To Haves

Azure cloud platform expertise: Deep knowledge of Azure services (App Services, Azure Functions, AKS, Storage Accounts, Azure Data Factory, Cosmos DB, Azure SQL, Key Vault, Application Insights)
Cloud architecture and design: Designing scalable, secure, and cost-effective cloud solutions
Databricks and Apache Spark for large-scale data processing
Hands-on experience with GenAI platforms: OpenAI, Azure OpenAI, LangChain, or similar frameworks
Experience building RAG applications with chunking, vectorization, retrieval strategies
Vector databases: pgvector, Pinecone, Weaviate, or similar
DevOps maturity: Infrastructure as Code (Terraform, Bicep, ARM templates), advanced CI/CD
Containerization and orchestration: Docker and Kubernetes (AKS)
Database expertise: PostgreSQL, SQL Server, Azure SQL with performance tuning
Cloud security: Identity management, RBAC, network security, encryption
Azure DevOps or GitHub Actions for CI/CD pipelines
Experience with REST API design and microservices patterns
Azure certifications (Azure Solutions Architect, Azure Developer, Azure Data Engineer)
Advanced AI/ML knowledge: experience with ML frameworks (TensorFlow, PyTorch, Hugging Face); understanding of model training and evaluation; knowledge of NLP techniques beyond basic text processing
Experience with multi-agent systems or advanced RAG patterns
MLOps knowledge: Model deployment, versioning, monitoring, A/B testing
Azure AI services: Document Intelligence, Cognitive Search, Azure AI Studio, Azure Machine Learning
Search technologies: Azure Search, Sinequa, Elasticsearch, Lucene-based systems
Advanced Spark optimization and performance tuning
Real-time data processing and streaming architectures (Kafka, Azure Event Hubs)
Pharmaceutical, healthcare, or regulated industry experience
Experience with compliance requirements: HIPAA, GxP, 21 CFR Part 11
Experience with data visualization libraries (D3.js, Plotly, Chart.js)
Software security best practices and secure coding
FinOps practices: Cloud cost optimization and management
Experience mentoring junior engineers

Responsibilities

Write production-grade code for full-stack applications using Python and modern frontend frameworks
Build and maintain scalable REST APIs and microservices architectures
Design application architectures and implement technical solutions
Develop user interfaces and data visualization components
Write comprehensive tests and ensure code quality
Debug and optimize application performance
Design and architect cloud-native applications and solutions on Azure
Leverage Azure services including App Services, Azure Functions, AKS, Storage, Data Factory, Cosmos DB
Implement scalable, resilient, and cost-effective cloud architectures
Optimize cloud resource utilization and performance
Design for high availability, disaster recovery, and security
Implement cloud security best practices and governance
Build and maintain data pipelines for large-scale data processing
Implement ETL/ELT processes for diverse data sources
Optimize data workflows and processing performance
Design and implement data models and schemas
Work with structured and unstructured data at scale
Integrate AI/ML models and APIs into production applications
Build GenAI applications using LLMs and frameworks like LangChain
Implement RAG (Retrieval Augmented Generation) architectures
Work with vector databases for semantic search capabilities
Apply prompt engineering techniques for optimal LLM performance
Understand and implement basic NLP tasks (text classification, entity extraction, embeddings)
Collaborate with data scientists to productionize ML models
Evaluate and integrate new AI/ML technologies
Write SQL queries for data analysis and application needs
Design and optimize database schemas for both relational and NoSQL databases
Tune query performance and implement indexing strategies
Implement data access patterns and ORM frameworks
Implement Infrastructure as Code and CI/CD pipelines
Containerize applications and orchestrate deployments with Docker and Kubernetes
Implement monitoring, logging, and alerting solutions
Automate deployment and operational processes
Ensure application scalability and reliability
Work closely with data scientists, engineers, and product owners across R&D
Participate in code reviews and knowledge sharing
Contribute to technical discussions and solution designs
Identify innovations and architect solutions
Evaluate and integrate new technologies