Python Developer - NLP, ML, Gen AI - Assistant Vice President

Citi • Mississauga, ON

3d•Onsite

About The Position

We are looking for a mid-level Python Developer (NLP, ML, Gen AI) with combined experience in data engineering and AI/NLP engineering. The candidate will build NLP pipelines using tools such as Flair, BERT-based models, and LLM frameworks, and will also work on large-scale data processing using PySpark, Pandas, and related data tools. The role includes developing APIs, integrating with platform services, and supporting CI/CD deployments using GitHub and LightSpeed Enterprise.

Requirements

3–5 years of hands-on Python programming experience.
Strong fundamentals in Python, OOP, and design patterns.
Experience with NLP libraries and models such as Flair, BERT, Hugging Face Transformers, or similar.
Solid experience with PySpark, Pandas, PyArrow, and distributed data pipelines.
Proficient in working with Parquet using fastparquet or pyarrow.parquet.
Familiarity with JSON parsing libraries: the standard json module and faster alternatives such as ujson and orjson.
Experience building APIs using Flask (FastAPI is a plus).
Experience with MLflow for model tracking and deployment.
Good understanding of CI/CD practices and Git workflows.
Experience working with Redis or similar in-memory stores.
Experience with Autosys JILs for job scheduling.
Comfortable with Linux command line and shell scripting.
Strong debugging, problem-solving, and teamwork skills.
Exposure to cloud services; AWS boto3 experience is an asset.

Nice To Haves

Experience with Polars or Dask for high-performance data processing.
Experience with PyTorch or TensorFlow for model training.
Experience with Docker, Kubernetes, or containerized deployments.
Experience with monitoring tools such as ITRS Geneos.
Experience with FastAPI, Airflow, or Prefect.

Responsibilities

Develop and optimize ETL/data processing jobs using PySpark, Pandas, PyArrow, and related libraries.
Work with Parquet files using fastparquet or pyarrow.parquet for efficient data processing.
Implement data parsing and serialization using json, or ujson or orjson where high-performance JSON handling is needed.
Build and maintain NLP pipelines using Flair, BERT, and LLM-based models.
Develop scalable ingestion and data transformation pipelines for AI and analytics use cases.
Build and maintain Flask-based APIs for model inference and service integrations.
Use regular expressions for text cleaning, parsing, and NLP preprocessing.
Integrate caching and fast lookups using Redis.
Manage and deploy ML models using MLflow for tracking and versioning.
Support CI/CD workflows using GitHub, LightSpeed Enterprise, and deployment pipelines.
Create and maintain Autosys JILs for job scheduling and automation.
Use basic Linux commands for troubleshooting, operations, and deployment tasks.
Monitor application and system health using ITRS Geneos.
Write unit tests and improve automated test coverage (pytest/unittest).
Work with REST APIs, microservices, and basic shell scripting.
Work with cloud services (ECS), including through boto3.