Google | Gemini Agentic Intelligence
Gemini and CoreML
Advancing the capabilities of Gemini through core agentic infrastructure, cognitive architecture, and rigorous evaluation methodologies.
- Agentic Orchestration & Memory: Architecting computational harnesses for complex reasoning, tool-use, and stateful memory systems that allow agents to retain context and self-improve over time.
- Auto-Prompt Optimization: Optimizing long-running agentic workflows for efficiency and accuracy.
- Dense Rewards & Evaluation: Developing dense reward models for step-by-step fine-tuning and building comprehensive benchmarking environments to assess agent autonomy, safety, and reasoning.
Meta | Superintelligence Lab
GenAI & LLM Evaluation and Reward Modeling
Applied research focused on developing evaluation frameworks and reward models for Meta's foundational models, specifically Llama 3 and Llama 4.
- Llama 3 & 4 Evaluation: Led evaluation strategies and developed AI judges/reward models for safety, reasoning, and multimodal capabilities, serving as critical signals for post-training.
- Alignment Research: Utilized SFT and GRPO techniques to align AI judges with human ground truth, creating "universal judges" that strictly adhere to complex rubrics.
Agentic Systems & Automation
Focused on architecting agentic systems and automated toolsets to enhance the efficiency of the research lifecycle.
- AI Evaluator System: Developed an agentic "AI Evaluator" system (Auto-Rubrics, APO) that significantly reduced cycle times and enabled automated hypothesis validation.
- Hybrid Human-AI Protocol: Designed a routing system that sends low-confidence samples to human annotators, optimizing costs while effectively detecting model drift.
Evaluation Platform Architecture & Governance
Conceptualized and delivered a centralized evaluation hub, consolidating fragmented pipelines into a unified, company-wide infrastructure for GenAI assessment.
- Standardized Infrastructure: Established a company-wide standard for evaluation pipelines, eliminating redundancy and enforcing rigorous governance.
- Bias Mitigation: Developed methodologies to quantify and mitigate bias in AI judges, ensuring fair and safe benchmarking against industry competitors.
Past Research
On-Device AI & Wearables
Research focused on deploying and optimizing machine learning models for resource-constrained wearable devices and mobile platforms.
- On-device ML: Led research on deploying ML models on VR/AR headsets and Ray-Ban Stories, focusing on privacy-preserving personalization and efficient context-aware models.
- Samsung AI: Developed tools for quantization, pruning, and neural architecture search (NAS) to accelerate inference. Applied meta-learning and self-supervision to reduce bias and memory usage in "over-provisioned" models (e.g., automatic speech recognition) for embedded devices.
Distributed & Federated Learning
Focused on architecting and deploying distributed and federated learning systems at scale in heterogeneous client environments.
- Federated Learning: Designed systems for large-scale FL with non-IID data, focusing on model convergence, bias mitigation, and privacy for personalized speech and vision models.
- Heterogeneous Compute Scheduling: Developed AI workload schedulers to manage concurrent execution across diverse mobile processors (CPU/GPU/NPU), optimizing for throughput and energy efficiency.
Network Intelligence & Mobile Systems
Applying AI to country-wide cellular networks for automation, optimization, and enhanced operational efficiency.
- Network Automation: Applied deep learning for zero-configuration automation, real-time anomaly detection, and self-healing in cellular networks (SDN/NFV).
- User & App Modeling: Predicted customer satisfaction and mobility patterns from large-scale data, and automated Android app testing using imitation learning.