Based in Calgary, Canada
About
I'm Yash Gupta, an AI Engineer with 4+ years of experience building production AI systems in healthcare and enterprise domains. I specialize in LLM applications, multi-agent orchestration, and real-time voice AI, with a track record of deploying scalable solutions on AWS and GCP. I hold an M.Sc. in Artificial Intelligence and have published research in federated learning and edge AI.
Work Experience
Monark
- Built a real-time voice AI roleplay bot using WebRTC, Python, Gemini Live, and Deepgram, improving transcript retention by 50% and reducing token usage by 25%.
- Containerized and deployed the system with CI/CD and AWS ECR for scalable, low-latency production use.
- Built a serverless leadership assessment pipeline with AWS Lambda and Step Functions to evaluate multiple traits in parallel with sub-60-second latency.
- Designed research-backed prompts with I/O psychology experts and shipped the system with automated CI/CD using AWS SAM and GitHub Actions.
- Implemented an LLM-as-a-judge evaluation pipeline using LangChain to enforce strict adherence to system prompt rules.
- Deployed the service on AWS Lambda using a GPT model for automated quality assurance and bias-aware scoring.
- Built a serverless PII redaction pipeline integrating Google Cloud NLP, LangExtract, and Gemini, achieving 90% F1 on redaction quality.
- Ensured consistent redaction across transcripts, summaries, and insights for reliable downstream analytics.
- Developed a speaker identification pipeline using HuggingFace and pyannote.audio for accurate multi-speaker transcription at scale.
- Migrated model serving from SageMaker to Google Cloud Run, reducing infrastructure costs by 60%.
- Built a multi-agent chatbot for conversational access to meeting data using LangChain and FastAPI with RAG over Vertex AI Vector Search.
- Integrated the system with a Next.js frontend to deliver a production-ready, full-stack AI feature.
Aurora Constellations
- Built a custom VSCode fork and Langium-based DSL for clinical pathway modeling with conditional logic for oncology workflows.
- Implemented a decision-tree graph generator in Scala to navigate patient-specific pathways and reduce clinician cognitive load.
- Built a RAG pipeline using GPT-3, LangChain, Pinecone, and FastAPI to generate structured patient treatment plans from unstructured clinical guidance.
- Integrated the service with a Scala backend to iteratively generate and validate DSL-based plans.
- Integrated with the OpenEpic platform to ingest EHR data and map it into the company’s DSL for downstream clinical tooling.
- Implemented secure OAuth 2.0 flows with a FHIR-compliant server and built connectors to normalize patient data from Epic.
- Modeled patient plans from MIMIC-III/IV as heterogeneous graphs and trained GNNs for ICU outcome prediction.
- Achieved AUROC ~0.74 for mortality prediction and ~1.2 days RMSE for length-of-stay, matching strong clinical baselines.
- Built a Neo4j-based clinical knowledge graph from MIMIC data to represent treatments, entities, and outcomes.
- Implemented a Graph-RAG pipeline combining graph retrieval with GPT-3 to generate structured patient plans.
Aurora Constellations
- Implemented structured logging for a Scala Play Server to improve observability, enable reliable log parsing, and support production analytics and debugging.
- Built and maintained CI/CD pipelines for Play Server deployments, added secure deployment webhooks, and wrote Bash automation to generate and manage self-signed JWT certificates, improving release reliability and developer onboarding.
- Developed a wake-word voice assistant using Picovoice for on-device trigger detection and Google Speech-to-Text for intent classification, enabling hands-free actions such as importing treatment plans and dictating patient notes.
- Integrated voice intent routing into automated clinical workflows and optimized the system for low-latency, robust performance across diverse acoustic environments.
Lakehead University & Lockheed Martin
- Designed and implemented a smart airflow regulation system for remote community homes using Arduino sensors, ESP32, and Raspberry Pi to monitor wall-siding temperatures and dynamically control airflow direction and fan speeds.
- Engineered embedded hardware integrations with environmental sensors and actuators to enable efficient, automated temperature regulation.
- Built a remote data logging and monitoring system using Raspberry Pi and Firebase, reducing the need for manual on-site data collection by ~95% and enabling real-time environmental insights.
- Improved comfort and energy efficiency in off-grid housing through automated control logic and a connected monitoring infrastructure.
Lakehead University & Synergy North
- Conducted research on high-frequency utility data (5-minute resolution) to build power consumption forecasting models using FFNs and LSTMs, teaching myself Python, TensorFlow, time series analysis, and hyperparameter tuning.
- Engineered models to capture temporal patterns and seasonal effects using cyclical time features and weather inputs, improving performance over baseline methods.
- Benchmarked model performance against industry research, achieving RMSE ≈ 0.1–0.17 on short-term electricity consumption forecasting tasks, consistent with published LSTM results.
- Tuned network depth, training window size, and feature encoding to reduce prediction error and improve generalization across summer and winter test periods.
- Delivered an end-to-end LSTM forecasting implementation on real utility data, with results in line with published deep learning benchmarks for load prediction.
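The cyclical time features mentioned above can be illustrated with a minimal sketch. It assumes a sine/cosine encoding of hour of day and day of year; the function names and the choice of periods are illustrative and not taken from the original project code.

```python
import math


def cyclical_encode(value: float, period: float) -> tuple[float, float]:
    """Map a periodic value (e.g. hour of day) onto the unit circle.

    Values one step apart across the wrap-around (23:00 and 00:00) end up
    close together, which a raw integer feature does not guarantee.
    """
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def time_features(hour: float, day_of_year: float) -> list[float]:
    """Hypothetical feature vector: daily cycle plus yearly (seasonal) cycle."""
    hour_sin, hour_cos = cyclical_encode(hour, 24.0)
    doy_sin, doy_cos = cyclical_encode(day_of_year, 365.25)
    return [hour_sin, hour_cos, doy_sin, doy_cos]
```

Calling `time_features(13.5, 200)` returns four values in [-1, 1] that could be concatenated with weather inputs before being fed to a forecasting model.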
Skills
Languages
Frontend Engineering
Backend Engineering
Databases & Vector Stores
Core Machine Learning
Agentic & LLM Systems
Cloud & Infrastructure
Check out my latest work
Research projects and applied ML work spanning federated learning, computer vision, and signal processing.

Federated Learning Framework via Distributed Mutual Learning
Developed a privacy-preserving federated learning framework that replaces weight-sharing with loss-based mutual learning, reducing bandwidth usage and model inversion attack risks. By leveraging knowledge distillation and deep mutual learning, clients share insights without exposing sensitive data, improving model generalization. The framework was evaluated on a face mask detection case study, demonstrating superior performance compared to traditional synchronous and asynchronous federated learning methods.
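As a rough illustration of loss-based mutual learning, the sketch below follows the standard deep mutual learning objective: each client minimizes its own cross-entropy plus the average KL divergence from its peers' predicted distributions on a shared batch. The exact loss used in the framework is not spelled out here, so the weighting, the function names, and the idea of exchanging only class probabilities are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """D_KL(p || q) for discrete distributions; q must be strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def mutual_learning_loss(own_logits, label, peer_probs):
    """Supervised loss plus the mean KL term pulling this client toward its peers.

    peer_probs holds the class distributions other clients predicted for the
    same sample (assumed to be the only thing exchanged, instead of weights).
    """
    own = softmax(own_logits)
    supervised = cross_entropy(own, label)
    if not peer_probs:
        return supervised
    mimicry = sum(kl_divergence(peer, own) for peer in peer_probs) / len(peer_probs)
    return supervised + mimicry
```

When a peer agrees exactly with the client, the KL term is zero and the loss reduces to plain cross-entropy; disagreement adds a penalty that nudges the client's predictions toward the group.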

Image Compression Using Fast Fourier Transform and JPEG Compression
Developed an image compression tool in MATLAB, implementing the DFT, FFT, and DCT from scratch. The project optimized Fourier-based compression, benchmarked it against JPEG, and added a GUI for real-time visualization and user-controlled compression. It covers frequency-domain compression with the Fourier transform, matrix transformations and quantization for data reduction, and compression-efficiency comparisons across techniques.
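The transform-and-quantize idea behind this project can be sketched in one dimension. The original tool was written in MATLAB and worked on images; the Python below is a simplified stand-in that assumes an orthonormal DCT-II and a uniform quantizer with a hypothetical `step` parameter.

```python
import math


def dct(signal):
    """Orthonormal DCT-II of a 1D sequence, computed directly in O(n^2)."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        total = sum(
            x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, x in enumerate(signal)
        )
        coeffs.append(scale * total)
    return coeffs


def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    signal = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            total += scale * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        signal.append(total)
    return signal


def quantize(coeffs, step):
    """Uniform quantizer: larger steps zero out more small coefficients."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Map integer quantization levels back to approximate coefficients."""
    return [q * step for q in levels]
```

A row of pixel values can be compressed with `quantize(dct(row), step)` and approximately restored with `idct(dequantize(levels, step))`. Smooth rows concentrate their energy in the first few coefficients, so most quantized levels become zero and are cheap to store.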
Research Publications
Peer-reviewed publications in federated learning, IoT, and healthcare AI.
Toward Asynchronously Weight Updating Federated Learning for AI-on-Edge IoT Systems
Yash Gupta, Zubair Md Fadlullah, Mostafa M. Fouda · Journal Article · 2022 IEEE International Conference on Internet of Things and Intelligence Systems (IoTaIS)
Designed an asynchronously weight updating federated learning algorithm for AI-on-Edge IoT systems, enhancing data privacy by eliminating the need for centralized data sharing. Applied the approach to face mask detection, traditionally a centralized computer vision task, by distributing learning tasks across users. Investigated performance trade-offs between synchronous and asynchronous weight updates, introducing a penalization mechanism to optimize model aggregation. Experimental results demonstrated comparable accuracy to centralized training while significantly reducing transmission time overhead.
Intelligent Real-Time Face-Mask Detection System with Hardware Acceleration for COVID-19 Mitigation
Peter Sertic, Ayman Alahmar, Thangarajah Akilan, Yash Gupta, Marko Javorac · Journal Article · Healthcare 2022
Developed and implemented a hardware-accelerated real-time face-mask detection system using deep learning (DL), optimized for embedded platforms including Raspberry Pi 4B (Google Coral TPU, Intel NCS2 VPU) and NVIDIA Jetson Nano. Designed a custom face-mask detection model (MaskDetect), independently quantized and optimized for each hardware platform. Conducted an ablation study comparing MaskDetect to transfer-learning models (VGG16, ResNet-50V2, InceptionV3), achieving 94%+ accuracy on most platforms. Results demonstrated that Jetson Nano offers the best trade-off in accuracy (94.2%), inference speed, and cost, making it ideal for real-time deployment.
HELIUS: A Blockchain Based Renewable Energy Trading System
Yash Gupta, Marko Javorac, Shaun Cyr, Abdulsalam Yassine · Journal Article · 2021 4th International Seminar on Research of Information Technology and Intelligent Systems (ISRITI)
Developed a peer-to-peer (P2P) sustainable energy exchange system using Blockchain and Deep Learning to optimize energy trading during peak demand. Designed a novel framework for power system operations, enabling users to trade energy efficiently while simulating sustainable energy production based on location, time, and weather. Integrated a blind bidding mechanism and a web application to demonstrate real-world feasibility.