AI Agents & Autonomous Systems Portfolio
Context: Delivered while working full-time (ARGO DATA) and completing an MS in CS (UT Austin)
This portfolio showcases my expertise in building intelligent, autonomous AI agents that can reason, plan, and execute complex tasks across various domains.
🎙️ 2025: Gemini Live Voice Chat with Function Calling
Signal: Conversational AI / Real-time Processing / Tool Integration
🎯 Motivation & Technical Challenge
Building a real-time voice AI system that can execute functions, search the web, and run code during natural conversation requires solving three problems: low-latency audio processing, dynamic tool orchestration, and maintaining conversation context across modalities. The goal was to create an interactive AI assistant that feels natural and responsive.
🔧 Real-time Audio Processing Pipeline
Audio Engineering
- Speech-to-Text: Streaming ASR with 200ms latency using Whisper-turbo
- Voice Activity Detection: Real-time VAD to detect speech boundaries
- Audio Preprocessing: Noise reduction and normalization pipeline
- Text-to-Speech: Neural TTS with voice cloning capabilities
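Speech-boundary detection can be illustrated with a short-term energy threshold. The sketch below is a minimal stand-in, not the detector used in this project; the function names, the RMS threshold, and the hangover length are all assumptions chosen for illustration.

```python
import math
from typing import Iterable, List, Tuple


def rms(frame: List[int]) -> float:
    """Root-mean-square amplitude of one frame of 16-bit PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def speech_segments(
    frames: Iterable[List[int]],
    threshold: float = 500.0,  # assumed RMS level separating speech from silence
    hangover: int = 2,         # assumed number of quiet frames tolerated inside speech
) -> List[Tuple[int, int]]:
    """Return (start_frame, end_frame_exclusive) pairs for detected speech."""
    segments: List[Tuple[int, int]] = []
    start = None
    quiet = 0
    index = -1
    for index, frame in enumerate(frames):
        if rms(frame) >= threshold:
            if start is None:
                start = index
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet > hangover:
                # Speech ended at the first quiet frame of this run.
                segments.append((start, index - quiet + 1))
                start, quiet = None, 0
    if start is not None:
        # Stream ended while still inside a segment; trim trailing quiet frames.
        segments.append((start, index + 1 - quiet))
    return segments
```

The hangover keeps short pauses inside a sentence from splitting one utterance into several segments. A production system would more likely use a trained VAD model than a fixed energy threshold.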
Streaming Architecture
- WebSocket Connections: Bidirectional real-time communication
- Audio Chunking: 100ms audio segments for low-latency processing
- Buffer Management: Circular buffers for continuous audio streaming
- Async Processing: Non-blocking audio pipeline with queue management
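The chunking and circular-buffer items above can be sketched as follows. This is an illustrative, assumption-laden version: the `RingBuffer` class is hypothetical, audio is assumed to be mono, and the 16kHz / 16-bit format is taken from the Performance Metrics section.

```python
SAMPLE_RATE_HZ = 16_000   # from the metrics section
BYTES_PER_SAMPLE = 2      # 16-bit PCM
CHUNK_MS = 100            # chunk length named in the list above

# Bytes in one 100ms chunk of mono audio (mono is an assumption).
CHUNK_BYTES = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHUNK_MS // 1000


class RingBuffer:
    """Fixed-capacity byte ring buffer; the oldest audio is overwritten when full."""

    def __init__(self, capacity: int) -> None:
        self._buf = bytearray(capacity)
        self._capacity = capacity
        self._start = 0  # index of the oldest valid byte
        self._size = 0   # number of valid bytes

    def write(self, data: bytes) -> None:
        for byte in data:
            end = (self._start + self._size) % self._capacity
            self._buf[end] = byte
            if self._size < self._capacity:
                self._size += 1
            else:
                # Buffer full: the slot just written replaced the oldest byte.
                self._start = (self._start + 1) % self._capacity

    def read(self, n: int) -> bytes:
        """Remove and return up to n of the oldest bytes."""
        n = min(n, self._size)
        out = bytes(self._buf[(self._start + i) % self._capacity] for i in range(n))
        self._start = (self._start + n) % self._capacity
        self._size -= n
        return out

    def __len__(self) -> int:
        return self._size
```

A consumer would call `read(CHUNK_BYTES)` whenever `len(buffer) >= CHUNK_BYTES`. The byte-at-a-time loop favors clarity over speed; slice copies would be the obvious optimization.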
⚡ Dynamic Function Calling System
- Tool Registry:
  - Web search integration with real-time result parsing
  - Code execution sandbox with security constraints
  - File system operations with permission management
  - API integrations with rate limiting and error handling
- Execution Engine:
  - Parallel Execution: Multiple tools can run simultaneously
  - Safety Constraints: Sandboxed execution environment
  - Result Streaming: Real-time function output integration
  - Error Recovery: Graceful handling of tool failures
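A tool registry with parallel execution and error recovery might look like the sketch below. It is not the project's actual engine: `ToolRegistry`, its result dictionary shape, the default timeout, and the two demo tools are hypothetical names introduced for illustration. Sandboxing is out of scope here.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Tuple

ToolFn = Callable[..., Awaitable[Any]]


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, ToolFn] = {}

    def register(self, name: str, fn: ToolFn) -> None:
        self._tools[name] = fn

    async def call(self, name: str, timeout: float = 2.0, **kwargs: Any) -> Dict[str, Any]:
        """Run one tool; failures become structured results instead of exceptions."""
        fn = self._tools.get(name)
        if fn is None:
            return {"tool": name, "ok": False, "error": "unknown tool"}
        try:
            result = await asyncio.wait_for(fn(**kwargs), timeout=timeout)
            return {"tool": name, "ok": True, "result": result}
        except asyncio.TimeoutError:
            return {"tool": name, "ok": False, "error": "timeout"}
        except Exception as exc:  # keep one failing tool from breaking the turn
            return {"tool": name, "ok": False, "error": str(exc)}

    async def call_many(self, calls: List[Tuple[str, Dict[str, Any]]]) -> List[Dict[str, Any]]:
        """Run several tool calls concurrently; results keep the input order."""
        return await asyncio.gather(*(self.call(name, **kw) for name, kw in calls))


# Hypothetical demo tools.
async def add(a: int, b: int) -> int:
    return a + b


async def fail() -> None:
    raise ValueError("boom")


registry = ToolRegistry()
registry.register("add", add)
registry.register("fail", fail)
```

Usage: `asyncio.run(registry.call_many([("add", {"a": 1, "b": 2}), ("fail", {})]))` returns one result per call, so the model can be told which tool failed and why while the successful output is still used.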
🧠 Conversation Context Management
Multi-Modal Context Fusion
The system maintains coherent conversation state across voice, text, code execution results, and web search data, enabling natural follow-up questions and context-aware responses.
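One way to hold state across modalities is a single bounded event log tagged by turn and modality. The sketch below is an assumption about how this could be structured, not a description of the actual system; `ConversationContext`, `ContextEvent`, and the modality labels are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque

MODALITIES = {"voice", "text", "code_result", "web_search"}


@dataclass(frozen=True)
class ContextEvent:
    turn: int
    modality: str
    content: str


class ConversationContext:
    """Bounded log of events from every modality, ordered by arrival."""

    def __init__(self, max_events: int = 50) -> None:
        self._events: Deque[ContextEvent] = deque(maxlen=max_events)
        self._turn = 0

    def start_turn(self) -> int:
        self._turn += 1
        return self._turn

    def add(self, modality: str, content: str) -> None:
        if modality not in MODALITIES:
            raise ValueError(f"unknown modality: {modality}")
        self._events.append(ContextEvent(self._turn, modality, content))

    def render(self, last_turns: int = 10) -> str:
        """Format the most recent turns as text suitable for a model prompt."""
        cutoff = self._turn - last_turns
        return "\n".join(
            f"[turn {e.turn}][{e.modality}] {e.content}"
            for e in self._events
            if e.turn > cutoff
        )
```

Because tool output and speech transcripts share one log, a follow-up such as "and tomorrow?" can be resolved against the earlier search result in the same rendered window.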
📊 Performance Metrics
Latency Breakdown
- Speech Recognition: 200ms average
- LLM Processing: 300ms for function calling
- Tool Execution: Variable (50ms-2s)
- Speech Synthesis: 150ms average
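Summing the stages above gives a rough end-to-end budget. The calculation assumes the stages run strictly one after another with no overlap, which is a simplification: streaming between stages would lower the perceived latency.

```python
# Average per-stage latencies from the breakdown above, in milliseconds.
STAGE_MS = {
    "speech_recognition": 200,
    "llm_processing": 300,
    "speech_synthesis": 150,
}
TOOL_MS_RANGE = (50, 2000)  # tool execution is variable: 50ms to 2s

fixed_ms = sum(STAGE_MS.values())       # 650
best_ms = fixed_ms + TOOL_MS_RANGE[0]   # 700
worst_ms = fixed_ms + TOOL_MS_RANGE[1]  # 2650

print(f"fixed stages: {fixed_ms} ms")
print(f"with tool call: {best_ms} ms to {worst_ms} ms")
```

Under that assumption a turn without tool use costs about 650ms, and tool execution dominates the budget whenever it approaches its upper bound.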
System Performance
- Concurrent Users: 50+ simultaneous sessions
- Audio Quality: 16kHz, 16-bit PCM
- Function Success Rate: 94.2% successful executions
- Context Retention: 10+ turn conversations
🤖 2025: CV MCP (Model Context Protocol)
Signal: Agent Tooling / Resume Generation / Automation
🔗 MCP Tool
🎯 Motivation & MCP Integration Challenge
The Model Context Protocol (MCP) represents a standardized approach to agent tooling, but implementing production-ready MCP tools requires deep understanding of the protocol specifications, efficient document generation pipelines, and ATS (Applicant Tracking System) optimization. The challenge was creating a tool that seamlessly integrates with AI agent workflows while producing high-quality, ATS-friendly resumes.
🔧 MCP Protocol Implementation
Protocol Compliance
- MCP Specification: Compliance with the published MCP specification
- Tool Registration: Dynamic tool discovery and capability advertisement
- Message Handling: Async request/response processing
- Error Handling: Standardized error codes and recovery mechanisms
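The request handling above can be sketched with the standard library alone. MCP messages are JSON-RPC 2.0; the `tools/list` and `tools/call` shapes below follow my reading of the public specification and should be checked against it. This is not the project's implementation: `register_tool`, `handle_request`, and the `echo` tool are hypothetical, and the handler is synchronous for brevity where the list above describes async processing.

```python
import json
from typing import Any, Callable, Dict

TOOLS: Dict[str, Dict[str, Any]] = {}


def register_tool(name: str, description: str, input_schema: Dict[str, Any],
                  handler: Callable[..., str]) -> None:
    TOOLS[name] = {"description": description, "inputSchema": input_schema, "handler": handler}


def handle_request(raw: str) -> str:
    """Dispatch one JSON-RPC 2.0 request and return the serialized response."""
    request = json.loads(raw)
    req_id = request.get("id")
    method = request.get("method")
    params = request.get("params") or {}

    def ok(result: Dict[str, Any]) -> str:
        return json.dumps({"jsonrpc": "2.0", "id": req_id, "result": result})

    def error(code: int, message: str) -> str:
        return json.dumps({"jsonrpc": "2.0", "id": req_id,
                           "error": {"code": code, "message": message}})

    if method == "tools/list":
        return ok({"tools": [
            {"name": n, "description": t["description"], "inputSchema": t["inputSchema"]}
            for n, t in TOOLS.items()
        ]})
    if method == "tools/call":
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return error(-32602, f"Unknown tool: {params.get('name')}")  # invalid params
        try:
            text = tool["handler"](**params.get("arguments", {}))
            return ok({"content": [{"type": "text", "text": text}], "isError": False})
        except Exception as exc:
            # Tool failures are reported inside the result, not as protocol errors.
            return ok({"content": [{"type": "text", "text": str(exc)}], "isError": True})
    return error(-32601, f"Method not found: {method}")  # standard JSON-RPC code


# Hypothetical demo tool.
register_tool(
    "echo",
    "Return the given text unchanged.",
    {"type": "object", "properties": {"text": {"type": "string"}}, "required": ["text"]},
    lambda text: text,
)
```

Separating protocol errors (unknown method, unknown tool) from tool execution errors lets the calling agent distinguish a malformed request from a tool that ran and failed.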
Agent Integration
- Seamless Workflow: Zero-friction integration with agent pipelines
- Context Preservation: Maintains conversation state across tool calls
- Batch Processing: Handles multiple resume generation requests
- Quality Validation: Automated output quality checks
⚡ ATS Optimization Engine
- Keyword Analysis:
  - Industry-specific keyword extraction and optimization
  - Semantic matching with job descriptions
  - Keyword density optimization for ATS parsing
  - Skills taxonomy mapping and standardization
- Format Optimization:
  - ATS-friendly formatting (no complex layouts or graphics)
  - Standardized section headers and bullet points
  - Consistent date formatting and contact information
  - PDF generation with proper text extraction capabilities
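Keyword matching against a job description can be illustrated with a simple coverage score. This sketch uses exact token overlap, which is a deliberately simpler technique than the semantic matching named above; the function names and the stopword list are assumptions.

```python
import re
from typing import Any, Dict, Set

# Small illustrative stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "or", "the", "to", "with"}


def keywords(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens (keeping + and # for terms like c++ or c#)."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower())) - STOPWORDS


def keyword_coverage(resume: str, job_description: str) -> Dict[str, Any]:
    """Report which job-description keywords the resume already contains."""
    wanted = keywords(job_description)
    matched = wanted & keywords(resume)
    return {
        "matched": sorted(matched),
        "missing": sorted(wanted - matched),
        "coverage": len(matched) / len(wanted) if wanted else 0.0,
    }
```

For example, scoring "Built LangChain agents in Python" against "Python and Kubernetes experience with LangChain" matches `langchain` and `python`, reports `experience` and `kubernetes` as missing, and yields a coverage of 0.5.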
🔬 Technical Implementation
Document Generation Pipeline
- Template Engine: Jinja2-based flexible resume templates
- Content Validation: Schema validation for resume data integrity
- ATS Testing: Automated testing against popular ATS systems
- Performance: <2s generation time for complete resume
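The validate-then-render flow can be sketched as below. To stay dependency-free, the rendering step uses plain string formatting as a stand-in for the Jinja2 templates named above, and the field names in `REQUIRED_FIELDS` are assumptions rather than the project's actual schema.

```python
from typing import Any, Dict, List

# Hypothetical minimal schema: field name -> expected Python type.
REQUIRED_FIELDS = {"name": str, "email": str, "experience": list}
JOB_KEYS = ("title", "company", "start", "end")


def validate_resume(data: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: List[str] = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            errors.append(f"{field} must be {expected.__name__}")
    experience = data.get("experience")
    if isinstance(experience, list):
        for i, job in enumerate(experience):
            if not isinstance(job, dict):
                errors.append(f"experience[{i}] must be dict")
                continue
            for key in JOB_KEYS:
                if key not in job:
                    errors.append(f"experience[{i}] missing {key}")
    return errors


def render_resume(data: Dict[str, Any]) -> str:
    """Render a plain, single-column layout with standard section headers."""
    errors = validate_resume(data)
    if errors:
        raise ValueError("; ".join(errors))
    lines = [data["name"], data["email"], "", "EXPERIENCE"]
    for job in data["experience"]:
        lines.append(f"{job['title']}, {job['company']} ({job['start']} - {job['end']})")
        lines.extend(f"- {bullet}" for bullet in job.get("bullets", []))
    return "\n".join(lines)
```

Validating before rendering means a malformed agent request fails with a specific message instead of producing a resume with silently missing sections.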
📄 2025: LaTeX to PDF MCP
Signal: Document Processing / Agent Tooling / Automation
🔗 MCP Tool
Challenges Solved
Developed an MCP tool for rendering PDFs from LaTeX, enabling agents to generate professional documents autonomously.
Technical Depth
- LaTeX Processing: Full LaTeX compilation pipeline with error handling
- PDF Generation: High-quality document rendering with proper formatting
- Agent Integration: MCP-compatible for seamless agent workflow integration
- Error Handling: Robust compilation error detection and reporting
- Template Support: Support for various document types and styles
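Compilation with error detection can be sketched as a thin wrapper around `pdflatex`. This is an illustration under assumptions, not the tool's actual code: it assumes `pdflatex` is on the PATH, and the helper names are hypothetical. It relies on the TeX convention that error lines in the output start with `! `.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(tex_file: Path, out_dir: Path) -> List[str]:
    """pdflatex invocation that never prompts and stops at the first error."""
    return [
        "pdflatex",
        "-interaction=nonstopmode",
        "-halt-on-error",
        f"-output-directory={out_dir}",
        str(tex_file),
    ]


def extract_errors(log_text: str) -> List[str]:
    """Collect TeX error messages, which are reported on lines starting with '! '."""
    return [line[2:].strip() for line in log_text.splitlines() if line.startswith("! ")]


def compile_latex(tex_file: Path, out_dir: Path, timeout: float = 60.0) -> Path:
    """Compile tex_file and return the PDF path, or raise with the TeX errors."""
    out_dir.mkdir(parents=True, exist_ok=True)
    proc = subprocess.run(
        build_command(tex_file, out_dir),
        capture_output=True,
        text=True,
        errors="replace",  # TeX output is not guaranteed to be valid UTF-8
        timeout=timeout,
    )
    if proc.returncode != 0:
        errors = extract_errors(proc.stdout) or ["pdflatex failed without a '!' error line"]
        raise RuntimeError("; ".join(errors))
    return out_dir / (tex_file.stem + ".pdf")
```

Returning the extracted `!` messages rather than the full log gives a calling agent something short enough to act on, for example by correcting an undefined control sequence and retrying.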
🎬 2025: VLA Data Generator
Signal: Data Generation / Video Processing / Training Pipeline
Challenges Solved
Built an AI system to generate Vision-Language-Action training data from videos, automating the creation of robotics training datasets.
Technical Depth
- Video Analysis: Automated extraction of action sequences from video content
- Multi-modal Processing: Vision, language, and action annotation pipeline
- Data Quality: Intelligent filtering and validation of generated training data
- Scalability: Batch processing capabilities for large video datasets
- Format Standardization: Consistent output format for VLA model training
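A standardized record plus a validity filter could look like the sketch below. The `VLASample` fields, the 7-dimensional action vector, and the JSON Lines output are assumptions made for illustration; the actual output format is not specified here.

```python
import json
import math
from dataclasses import asdict, dataclass
from typing import Iterable, List


@dataclass
class VLASample:
    video_id: str
    frame_index: int
    instruction: str     # language annotation for this step
    action: List[float]  # action vector; its meaning is dataset-specific


def is_valid(sample: VLASample, action_dim: int = 7) -> bool:
    """Reject samples with empty text, bad indices, or malformed actions."""
    return (
        sample.frame_index >= 0
        and bool(sample.instruction.strip())
        and len(sample.action) == action_dim
        and all(math.isfinite(a) for a in sample.action)
    )


def to_jsonl(samples: Iterable[VLASample], action_dim: int = 7) -> str:
    """Serialize valid samples as JSON Lines, one record per line."""
    return "\n".join(
        json.dumps(asdict(s)) for s in samples if is_valid(s, action_dim)
    )
```

Filtering at serialization time keeps non-finite actions or empty instructions out of the training set, where they would be much harder to trace.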
🎯 2024: ReAct Agent in Simulated Environments
Signal: Embodied AI / Agent Reasoning / Real-time Decision Making
Challenges Solved
Implemented a ReAct (Reasoning + Acting) agent that achieves over 85% task completion in complex 3D Unity/Gym environments.
Technical Depth
- Reasoning Architecture: Custom ReAct implementation with step-by-step reasoning
- Performance: Sub-200ms P50 decision latency in real-time environments
- Environment Integration: Seamless interaction with Unity ML-Agents and Gym
- Task Completion: High success rate across diverse simulated scenarios
- LangChain Integration: Optimized agent workflow orchestration
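The ReAct loop alternates a reasoning step, an action, and an observation until the task ends. The sketch below shows only that control flow: `LineWorld` is a hypothetical toy environment with a simplified interface (not the Unity ML-Agents or Gym API), and `scripted_policy` stands in for the language model that would normally produce the thought and action.

```python
from typing import Callable, List, Tuple


class LineWorld:
    """Toy 1-D environment: start at position 0 and reach `goal`."""

    def __init__(self, goal: int = 3) -> None:
        self.goal = goal
        self.pos = 0

    def reset(self) -> int:
        self.pos = 0
        return self.pos

    def step(self, action: str) -> Tuple[int, bool]:
        """Apply 'left' or 'right'; return (observation, done)."""
        self.pos += 1 if action == "right" else -1
        return self.pos, self.pos == self.goal


def scripted_policy(observation: int, goal: int) -> Tuple[str, str]:
    """Stand-in for the LLM call: returns (thought, action)."""
    if observation < goal:
        return f"I am at {observation} and the goal is {goal}, so I move right.", "right"
    return f"I am at {observation}, past the goal {goal}, so I move left.", "left"


def run_react(
    env: LineWorld,
    policy: Callable[[int, int], Tuple[str, str]],
    max_steps: int = 10,
) -> Tuple[bool, List[str]]:
    """Run the Thought -> Action -> Observation loop; return (success, trace)."""
    trace: List[str] = []
    obs = env.reset()
    for _ in range(max_steps):
        thought, action = policy(obs, env.goal)
        obs, done = env.step(action)
        trace += [f"Thought: {thought}", f"Action: {action}", f"Observation: {obs}"]
        if done:
            return True, trace
    return False, trace
```

In a real agent the trace is fed back into the model prompt at every step, so each new thought is conditioned on all earlier observations. The `max_steps` cap bounds both latency and cost when the agent fails to converge.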
🏭 2023-Present: Self-Healing RAG Agent (ARGO DATA)
Signal: Production Agents / Autonomous Systems / Real-time Adaptation
🎯 Motivation & Autonomous Systems Challenge
Traditional RAG systems require manual intervention when content changes, leading to stale information and degraded performance. The challenge was designing an autonomous agent that could monitor content changes, decide when and how to update indices, and maintain system performance without human oversight: in effect, a "self-healing" knowledge system.
🔧 Autonomous Monitoring Architecture
Intelligent Change Detection
- Multi-Level Monitoring: File system, content hash, and semantic change detection
- Change Significance Scoring: ML-based assessment of update importance
- Dependency Mapping: Automatic detection of related content that needs updating
- Batch Optimization: Intelligent grouping of related changes for efficient processing
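The content-hash level of change detection can be sketched as follows. This covers only hash comparison, not the semantic or ML-scored levels listed above, and the function names and the `doc_id -> text` input shape are assumptions.

```python
import hashlib
from typing import Dict, List


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_changes(
    previous_hashes: Dict[str, str],
    current_docs: Dict[str, str],
) -> Dict[str, List[str]]:
    """Compare stored hashes with current documents, keyed by document id."""
    current = {doc_id: content_hash(text) for doc_id, text in current_docs.items()}
    shared = current.keys() & previous_hashes.keys()
    return {
        "added": sorted(current.keys() - previous_hashes.keys()),
        "removed": sorted(previous_hashes.keys() - current.keys()),
        "modified": sorted(d for d in shared if current[d] != previous_hashes[d]),
    }
```

Only documents listed under `added` or `modified` need re-embedding, and `removed` ids can be deleted from the index, which is what makes incremental re-indexing possible.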
Decision Engine
- Update Prioritization: Critical vs. non-critical change classification
- Resource Management: Load-aware scheduling of re-indexing operations
- Quality Assurance: Automated validation of updated indices
- Rollback Logic: Automatic reversion on quality degradation
⚡ Real-time Adaptation Mechanisms
- Zero-Downtime Updates:
  - Blue-green indexing strategy for seamless transitions
  - Gradual traffic shifting during index updates
  - Atomic operations to prevent partial state corruption
  - Health checks and automatic rollback on failures
- Performance Optimization:
  - Incremental Processing: Only re-index changed content sections
  - Parallel Execution: Multi-threaded embedding generation
  - Cache Invalidation: Smart cache management for updated content
  - Load Balancing: Distribute processing across available resources
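The blue-green strategy with a health check can be sketched as below. This is a simplified in-memory illustration, not the production system: `BlueGreenIndex` is a hypothetical class, the index is a plain dictionary, and gradual traffic shifting is not modeled (the switch is all-at-once).

```python
import threading
from typing import Callable, Dict, Optional

Index = Dict[str, str]


class BlueGreenIndex:
    """Serve queries from the live slot while the standby slot is rebuilt."""

    def __init__(self, initial: Index) -> None:
        self._slots: Dict[str, Index] = {"blue": dict(initial), "green": {}}
        self._live = "blue"
        self._lock = threading.Lock()

    def search(self, key: str) -> Optional[str]:
        with self._lock:
            live = self._slots[self._live]
        return live.get(key)

    def update(self, new_index: Index, health_check: Callable[[Index], bool]) -> bool:
        """Build the standby slot, validate it, then switch; return True on success."""
        standby = "green" if self._live == "blue" else "blue"
        self._slots[standby] = dict(new_index)
        if not health_check(self._slots[standby]):
            # Failed validation: the live slot was never touched, so nothing to undo.
            return False
        with self._lock:
            self._live = standby  # single pointer swap, so readers never see a partial index
        return True
```

Because a failed build never becomes live, "rollback" in this sketch amounts to simply not switching. Rolling back after a switch would mean pointing `_live` at the previous slot again, which stays intact until the next update.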
🔬 Production Intelligence Features
Self-Healing Capabilities
- Anomaly Detection: Statistical analysis of query performance degradation
- Auto-Recovery: Automatic re-indexing when quality metrics drop
- Predictive Maintenance: Proactive updates based on content staleness
- Resource Scaling: Dynamic resource allocation based on workload
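Statistical detection of query degradation can be illustrated with a z-score over recent latencies. The sketch is an assumption about one reasonable approach, not the production detector; `is_degraded` and the threshold of 3 standard deviations are hypothetical choices.

```python
import statistics
from typing import Sequence


def is_degraded(history_ms: Sequence[float], latest_ms: float,
                z_threshold: float = 3.0) -> bool:
    """Flag the latest latency if it is unusually high relative to recent history."""
    if len(history_ms) < 2:
        return False  # not enough data to estimate spread
    mean = statistics.fmean(history_ms)
    stdev = statistics.stdev(history_ms)
    if stdev == 0:
        return latest_ms > mean
    return (latest_ms - mean) / stdev > z_threshold
```

With a history of 90, 95, 100, 105, and 110 ms (mean 100, sample standard deviation about 7.9), a 130ms query scores roughly 3.8 and is flagged, while a 105ms query scores about 0.6 and is not. A flagged result would then trigger the auto-recovery path, such as re-indexing.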
📊 Autonomous System Metrics
Operational Performance
- Uptime: 99.8% availability with autonomous recovery
- Update Latency: <5 minutes from content change to index update
- False Positive Rate: <2% unnecessary re-indexing operations
- Resource Efficiency: 40% reduction in manual intervention
Quality Metrics
- Content Freshness: 95% of queries use latest content
- Index Consistency: 99.9% accuracy in change detection
- Performance Stability: Maintained sub-100ms latency during updates
- User Satisfaction: No reported issues with stale content
🔬 Technical Innovations
- Autonomous Decision Making: ML-based change significance assessment
- Self-Healing Architecture: Automatic recovery from system degradation
- Zero-Downtime Operations: Seamless updates without service interruption
- Intelligent Resource Management: Dynamic scaling based on workload patterns
📊 Agent Architecture Patterns
Core Agent Capabilities
- Reasoning: ReAct patterns, chain-of-thought, multi-step planning
- Tool Integration: MCP compatibility, function calling, external API integration
- Real-time Processing: Sub-200ms decision latency, streaming responses
- Autonomous Operation: Self-monitoring, self-healing, adaptive behavior
Agent Communication Patterns
- Multi-modal: Voice, text, vision, and action coordination
- Tool Orchestration: Dynamic tool selection and execution
- Context Management: Long-term memory and conversation state
- Error Recovery: Robust handling of failures and edge cases
Production Agent Systems
- Scalability: Concurrent user handling and resource optimization
- Monitoring: Comprehensive observability and performance tracking
- Reliability: High availability and fault tolerance
- Security: Safe tool execution and input validation
🛠️ Technology Stack
Agent Frameworks
- LangChain: Agent orchestration and workflow management
- Model Context Protocol (MCP): Standardized agent tool integration
- Custom ReAct: Reasoning and acting pattern implementation
- Function Calling: Dynamic tool execution and API integration
Infrastructure
- Real-time Systems: WebSocket connections, streaming responses
- Cloud Deployment: Hugging Face Spaces, scalable inference
- Document Processing: LaTeX, PDF generation, template engines
- Multi-modal Processing: Audio, video, text, and vision pipelines
Integration Patterns
- API Design: RESTful and streaming API endpoints
- Tool Ecosystem: MCP-compatible tool development
- Workflow Orchestration: Complex multi-step agent processes
- Quality Assurance: Automated testing and validation pipelines
🎯 Impact & Innovation
Production Impact
- User Scale: Systems serving 200+ concurrent users
- Performance: Sub-100ms query latency maintained during index updates (Self-Healing RAG Agent)
- Reliability: Self-healing systems with minimal downtime
- Cost Efficiency: Automated processes reducing manual intervention
Technical Innovation
- MCP Tooling: Early adoption of, and tooling contributions to, the Model Context Protocol
- Self-Healing Systems: Autonomous monitoring and adaptation
- Multi-modal Agents: Seamless integration across different modalities
- Real-time Processing: Optimized latency for interactive applications
This portfolio demonstrates comprehensive expertise in building production-ready AI agents that can reason, plan, and execute complex tasks autonomously across diverse domains.