Yash Gupta

Hi, I'm Yash 👋

AI Engineer specializing in LLM systems, multi-agent architectures, and production ML pipelines.

Based in Calgary, Canada

About

I'm Yash Gupta, an AI Engineer with 4+ years of experience building production AI systems in healthcare and enterprise domains. I specialize in LLM applications, multi-agent orchestration, and real-time voice AI, with a track record of deploying scalable solutions on AWS and GCP. I hold an M.Sc. in Artificial Intelligence and have published research in federated learning and edge AI.

Work Experience

Monark

May 2025 - February 2026

AI Engineer

Built a real-time voice AI roleplay bot using WebRTC, Python, Gemini Live, and Deepgram, improving transcript retention by 50% and reducing token usage by 25%.
Containerized and deployed the system with CI/CD and AWS ECR for scalable, low-latency production use.
Built a serverless leadership assessment pipeline with AWS Lambda and Step Functions to evaluate multiple traits in parallel with sub-60-second latency.
Designed research-backed prompts with I/O psychology experts and shipped the system with automated CI/CD using AWS SAM and GitHub Actions.
Implemented an LLM-as-a-judge evaluation pipeline using LangChain to enforce strict adherence to system prompt rules.
Deployed the service on AWS Lambda using a GPT model for automated quality assurance and bias-aware scoring.
Built a serverless PII redaction pipeline integrating Google Cloud NLP, LangExtract, and Gemini, achieving 90% F1 on redaction quality.
Ensured consistent redaction across transcripts, summaries, and insights for reliable downstream analytics.
Developed a speaker identification pipeline using HuggingFace and pyannote.audio for accurate multi-speaker transcription at scale.
Migrated model serving from SageMaker to Google Cloud Run, reducing infrastructure costs by 60%.
Built a multi-agent chatbot for conversational access to meeting data using LangChain and FastAPI with RAG over Vertex AI Vector Search.
Integrated the system with a Next.js frontend to deliver a production-ready, full-stack AI feature.

Aurora Constellations

May 2023 - September 2024

AI Researcher

Built a custom VSCode fork and Langium-based DSL for clinical pathway modeling with conditional logic for oncology workflows.
Implemented a decision-tree graph generator in Scala to navigate patient-specific pathways and reduce clinician cognitive load.
Built a RAG pipeline using GPT-3, LangChain, Pinecone, and FastAPI to generate structured patient treatment plans from unstructured clinical guidance.
Integrated the service with a Scala backend to iteratively generate and validate DSL-based plans.
Integrated with the OpenEpic platform to ingest EHR data and map it into the company’s DSL for downstream clinical tooling.
Implemented secure OAuth 2.0 flows with a FHIR-compliant server and built connectors to normalize patient data from Epic.
Modeled patient plans from MIMIC-III/IV as heterogeneous graphs and trained GNNs for ICU outcome prediction.
Achieved AUROC ~0.74 for mortality prediction and ~1.2 days RMSE for length-of-stay, matching strong clinical baselines.
Built a Neo4j-based clinical knowledge graph from MIMIC data to represent treatments, entities, and outcomes.
Implemented a Graph-RAG pipeline combining graph retrieval with GPT-3 to generate structured patient plans.

Aurora Constellations

May 2021 - May 2023

Software Engineer

Implemented structured logging for a Scala Play Server to improve observability, enable reliable log parsing, and support production analytics and debugging.
Built and maintained CI/CD pipelines for Play Server deployments, added secure deployment webhooks, and wrote Bash automation to generate and manage self-signed JWT certificates, improving release reliability and developer onboarding.
Developed a wake-word voice assistant using Picovoice for on-device trigger detection and Google Speech-to-Text for intent classification, enabling hands-free actions such as importing treatment plans and dictating patient notes.
Integrated voice intent routing into automated clinical workflows and optimized the system for low-latency, robust performance across diverse acoustic environments.

Lakehead University & Lockheed Martin

July 2021 - July 2022

Assistant Researcher

Designed and implemented a smart airflow regulation system for remote community homes using Arduino sensors, ESP32, and Raspberry Pi to monitor wall-siding temperatures and dynamically control airflow direction and fan speeds.
Engineered embedded hardware integrations with environmental sensors and actuators to enable efficient, automated temperature regulation.
Built a remote data logging and monitoring system using Raspberry Pi and Firebase, reducing the need for manual on-site data collection by ~95% and enabling real-time environmental insights.
Improved comfort and energy efficiency in off-grid housing through automated control logic and a connected monitoring infrastructure.

Lakehead University & Synergy North

June 2020 - August 2020

Assistant Researcher

Conducted research on high-frequency utility data (5-minute resolution) to build power consumption forecasting models using feed-forward networks (FFNs) and LSTMs, learning Python, TensorFlow, time series analysis, and hyperparameter tuning along the way.
Engineered models to capture temporal patterns and seasonal effects using cyclical time features and weather inputs, improving performance over baseline methods.
Benchmarked model performance against industry research, achieving RMSE ≈ 0.1–0.17 on short-term electricity consumption forecasting tasks, consistent with published LSTM results.
Tuned network depth, training window size, and feature encoding to reduce prediction error and improve generalization across summer and winter test periods.
Demonstrated practical, end-to-end implementation of real-world time series forecasting using LSTM architectures aligned with deep learning benchmarks for load prediction.
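As a minimal illustration of the cyclical time features mentioned above (a sketch, not the project code): a periodic value such as hour of day can be mapped to a sine/cosine pair so that 23:00 and 01:00 end up equally close to midnight. The function names and the choice of hour-of-day and day-of-year inputs are assumptions made for this example.

```python
import math


def cyclical_encode(value: float, period: float) -> tuple[float, float]:
    """Map a periodic value (e.g. hour of day) to a point on the unit circle."""
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def time_features(hour: float, day_of_year: int) -> list[float]:
    """Build a small feature vector from daily and yearly cycles (hypothetical helper)."""
    hour_sin, hour_cos = cyclical_encode(hour, 24.0)
    doy_sin, doy_cos = cyclical_encode(day_of_year, 365.25)
    return [hour_sin, hour_cos, doy_sin, doy_cos]


if __name__ == "__main__":
    # Late evening in early January; values would be fed to the model
    # alongside weather inputs and lagged consumption.
    print(time_features(23.5, 10))
```

Encoding each cycle as two numbers avoids the artificial jump a raw hour value would have between 23 and 0.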

Education

Lakehead University

Master of Science in Artificial Intelligence

Lakehead University

Bachelor of Software Engineering

Skills

Languages

Python

TypeScript

JavaScript

Scala

Frontend Engineering

Next.js

React

Tailwind CSS

Shadcn UI

Backend Engineering

FastAPI

Play Framework (Scala)

Databases & Vector Stores

PostgreSQL

Supabase

Pinecone

OpenSearch

Vertex AI Vector Search

Core Machine Learning

PyTorch

TensorFlow

scikit-learn

pandas

Agentic & LLM Systems

OpenAI

Deepgram

Google Gemini

Anthropic Claude

LangChain

LangGraph

RAG (Retrieval-Augmented Generation)

MCP (Model Context Protocol)

Multi-agent System Design

Google Agent Development Kit (ADK)

Pipecat (Daily)

Evals

Cloud & Infrastructure

GitHub Actions

AWS Lambda

AWS Step Functions

AWS SAM

AWS ECS

AWS S3

AWS SageMaker

GCP Cloud Run

GCP Vertex AI (Models, Endpoints, Pipelines)

GCP Vertex AI Vector Search

GCP Cloud Storage

GCP Cloud Functions

My Projects

Check out my latest work

Research projects and applied ML work spanning federated learning, computer vision, and signal processing.

Federated Learning Framework via Distributed Mutual Learning

December 2022

Developed a privacy-preserving federated learning framework that replaces weight-sharing with loss-based mutual learning, reducing bandwidth usage and the risk of model inversion attacks. Using knowledge distillation and deep mutual learning, clients learn from one another without exposing sensitive data, which improves model generalization. The framework was evaluated on a face mask detection case study, where it outperformed traditional synchronous and asynchronous federated learning methods.
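The loss-based idea can be sketched roughly as follows. This is a simplified illustration that assumes the standard deep mutual learning objective (a client's own cross-entropy plus the KL divergence from a peer's predictions to its own); the framework's exact loss may differ, and all function names here are hypothetical.

```python
import math

EPS = 1e-12  # guards against log(0) or division by zero on extreme logits


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(max(probs[label], EPS))


def kl_divergence(p, q):
    """D_KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / max(qi, EPS)) for pi, qi in zip(p, q) if pi > 0.0)


def mutual_learning_loss(own_logits, peer_logits, label):
    """Supervised loss plus a term pulling this client toward its peer's predictions."""
    own = softmax(own_logits)
    peer = softmax(peer_logits)
    return cross_entropy(own, label) + kl_divergence(peer, own)


if __name__ == "__main__":
    # Only the peer's predictions are exchanged here, never model weights.
    print(mutual_learning_loss([2.0, 0.5], [0.5, 2.0], label=0))
```

When the peer agrees with the client, the KL term vanishes and the loss reduces to ordinary cross-entropy.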

Federated Learning

Deep Learning

Knowledge Distillation

Mutual Learning

Privacy-Preserving Machine Learning

Computer Vision

Convolutional Neural Networks (CNN)

KL Divergence Optimization

Python

TensorFlow

arXiv

GitHub

Image Compression Using Fast Fourier Transform and JPEG Compression

April 2020

Developed an image compression tool in MATLAB using the DFT, FFT, and DCT, with the algorithms implemented from scratch. The project optimized Fourier-based compression, benchmarked it against JPEG, and integrated a GUI for user-controlled compression with real-time visualization. Key concepts include frequency-domain compression with the Fourier transform, matrix transformations and quantization for data reduction, and benchmarking of compression efficiency across techniques.
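To illustrate the frequency-domain idea, here is a small sketch in Python rather than MATLAB, working on a 1-D signal instead of an image. It is an assumption-laden simplification, not the project's implementation: it uses a naive orthonormal DCT and keeps only the largest coefficients, with hypothetical function names.

```python
import math


def dct(signal):
    """Orthonormal DCT-II of a 1-D signal (naive O(n^2) version)."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(signal)))
    return out


def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


def compress(signal, keep):
    """Zero all but the `keep` largest-magnitude DCT coefficients."""
    coeffs = dct(signal)
    ranked = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    kept = set(ranked[:keep])
    return [c if k in kept else 0.0 for k, c in enumerate(coeffs)]


if __name__ == "__main__":
    row = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
    approx = idct(compress(row, keep=3))
    print([round(v, 1) for v in approx])
```

Keeping fewer coefficients trades reconstruction quality for size, which is the same trade-off that JPEG makes through quantization.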

MATLAB

Signal Processing

Matrix Algebra & Linear Algebra

Fast Fourier Transform (FFT)

Discrete Cosine Transform (DCT)

Quantization & Data Reduction

Benchmarking & Performance Analysis

Graphical User Interface (GUI)

Project Report

Research

Research Publications

Peer-reviewed publications in federated learning, IoT, and healthcare AI.

November 2022

Toward Asynchronously Weight Updating Federated Learning for AI-on-Edge IoT Systems

Yash Gupta, Zubair Md Fadlullah, Mostafa M. Fouda

Conference Paper

2022 IEEE International Conference on Internet of Things and Intelligence Systems (IoTaIS)

DOI

Designed an asynchronously weight updating federated learning algorithm for AI-on-Edge IoT systems, enhancing data privacy by eliminating the need for centralized data sharing. Applied the approach to face mask detection, traditionally a centralized computer vision task, by distributing learning tasks across users. Investigated performance trade-offs between synchronous and asynchronous weight updates, introducing a penalization mechanism to optimize model aggregation. Experimental results demonstrated comparable accuracy to centralized training while significantly reducing transmission time overhead.
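One way to picture asynchronous aggregation with a penalty is the sketch below. The staleness-based penalty, its decay form, and the parameter names are assumptions chosen for illustration; the paper's actual penalization mechanism is not described here and may work differently.

```python
def staleness_penalty(staleness: int, decay: float = 0.5) -> float:
    """Shrink the influence of updates computed against an older global model.

    Assumed form for illustration: 1 / (1 + decay * staleness).
    """
    return 1.0 / (1.0 + decay * staleness)


def async_update(global_weights, client_weights, staleness, base_rate=0.5):
    """Blend one client's weights into the global model as soon as they arrive."""
    alpha = base_rate * staleness_penalty(staleness)
    return [(1.0 - alpha) * g + alpha * c
            for g, c in zip(global_weights, client_weights)]


if __name__ == "__main__":
    global_model = [0.0, 0.0]
    # A fresh update (staleness 0) moves the global model more than a stale one.
    print(async_update(global_model, [1.0, 1.0], staleness=0))
    print(async_update(global_model, [1.0, 1.0], staleness=2))
```

Unlike synchronous averaging, the server does not wait for every client before updating, which is where the reduction in transmission overhead would come from.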

May 2022

Intelligent Real-Time Face-Mask Detection System with Hardware Acceleration for COVID-19 Mitigation

Peter Sertic, Ayman Alahmar, Thangarajah Akilan, Yash Gupta, Marko Javorac

Journal Article

Healthcare 2022

DOI

Developed and implemented a hardware-accelerated real-time face-mask detection system using deep learning (DL), optimized for embedded platforms including Raspberry Pi 4B (Google Coral TPU, Intel NCS2 VPU) and NVIDIA Jetson Nano. Designed a custom face-mask detection model (MaskDetect), independently quantized and optimized for each hardware platform. Conducted an ablation study comparing MaskDetect to transfer-learning models (VGG16, ResNet-50V2, InceptionV3), achieving 94%+ accuracy on most platforms. Results demonstrated that Jetson Nano offers the best trade-off in accuracy (94.2%), inference speed, and cost, making it ideal for real-time deployment.

December 2021

HELIUS: A Blockchain Based Renewable Energy Trading System

Yash Gupta, Marko Javorac, Shaun Cyr, Abdulsalam Yassine

Conference Paper

2021 4th International Seminar on Research of Information Technology and Intelligent Systems (ISRITI)

DOI

Developed a peer-to-peer (P2P) sustainable energy exchange system using Blockchain and Deep Learning to optimize energy trading during peak demand. Designed a novel framework for power system operations, enabling users to trade energy efficiently while simulating sustainable energy production based on location, time, and weather. Integrated a blind bidding mechanism and a web application to demonstrate real-world feasibility.

Contact

Get in Touch

Send me an email or connect with me on LinkedIn.

Send Email