Google | Gemini Agentic Intelligence

Gemini and CoreML

Advancing the capabilities of Gemini through core agentic infrastructure, cognitive architecture, and rigorous evaluation methodologies.

  • Agentic Orchestration & Memory: Architecting computational harnesses for complex reasoning, tool-use, and stateful memory systems that allow agents to retain context and self-improve over time.
  • Auto-Prompt Optimization: Optimizing prompts for long-running agentic workflows to improve efficiency and accuracy.
  • Dense Rewards & Evaluation: Developing dense reward models for step-by-step fine-tuning and building benchmarking environments that assess agent autonomy, safety, and reasoning.

Meta | Superintelligence Lab

GenAI & LLM Evaluation and Reward Modeling

Applied research focused on developing evaluation frameworks and reward models for Meta's foundational models, specifically Llama 3 and Llama 4.

  • Llama 3 & 4 Evaluation: Led evaluation strategies and developed AI judges/reward models for safety, reasoning, and multimodal capabilities, serving as critical signals for post-training.
  • Alignment Research: Used supervised fine-tuning (SFT) and Group Relative Policy Optimization (GRPO) to align AI judges with human ground truth, creating "universal judges" that strictly adhere to complex rubrics.

Agentic Systems & Automation

Focused on architecting agentic systems and automated toolsets to enhance the efficiency of the research lifecycle.

  • AI Evaluator System: Developed an agentic "AI Evaluator" system (Auto-Rubrics, APO) that significantly reduced cycle times and enabled automated hypothesis validation.
  • Hybrid Human-AI Protocol: Designed a routing system that sends low-confidence samples to human annotators, controlling annotation costs while still detecting model drift.

Evaluation Platform Architecture & Governance

Conceptualized and delivered a centralized evaluation hub, consolidating fragmented pipelines into a unified, company-wide infrastructure for GenAI assessment.

  • Standardized Infrastructure: Established a company-wide evaluation standard with rigorous governance, eliminating redundant pipelines.
  • Bias Mitigation: Developed methodologies to quantify and mitigate bias in AI judges, ensuring fair and safe benchmarking against industry competitors.

Past Research

On-Device AI & Wearables

Research focused on deploying and optimizing machine learning models for resource-constrained wearable devices and mobile platforms.

  • On-device ML: Led research on deploying ML models on VR/AR headsets and Ray-Ban Stories, focusing on privacy-preserving personalization and efficient context-aware models.
  • Samsung AI: Developed tools for quantization, pruning, and neural architecture search (NAS) to accelerate inference. Applied meta-learning and self-supervision to reduce bias and memory usage in "over-provisioned" models (e.g., automatic speech recognition) for embedded devices.

Distributed & Federated Learning

Focused on architecting and deploying distributed and federated learning systems at scale in heterogeneous client environments.

  • Federated Learning: Designed systems for large-scale federated learning (FL) on non-IID data, focusing on model convergence, bias mitigation, and privacy for personalized speech and vision models.
  • Heterogeneous Compute Scheduling: Developed AI workload schedulers to manage concurrent execution across diverse mobile processors (CPU/GPU/NPU), optimizing for throughput and energy efficiency.

Network Intelligence & Mobile Systems

Applied AI to nationwide cellular networks for automation, optimization, and improved operational efficiency.

  • Network Automation: Applied deep learning for zero-configuration automation, real-time anomaly detection, and self-healing in cellular networks (SDN/NFV).
  • User & App Modeling: Predicted customer satisfaction and mobility patterns from large-scale data, and automated Android app testing using imitation learning.