Real-Time Object Recognition and Task Execution Robot
This project enables a robot to perform complex tasks in real time by combining Large Language Models with computer vision.
Key Achievements
- Reduced task failure rate by 20% by offloading compute-intensive tasks to a dedicated multi-node computing setup.
- Improved task sequencing efficiency by 15% by integrating a DINO image grounding model with LangChain tools, allowing a ReAct agent to dynamically plan and execute tasks based on visual input.
- Integrated LangChain ReAct agents for advanced reasoning and decision-making, enabling the robot to autonomously operate based on real-time data.
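The interplay between visual grounding and the agent loop can be illustrated with a minimal sketch. This is not the project's code and does not use LangChain or a real DINO model: `ground_objects`, `locate_tool`, `pick_tool`, `scripted_policy`, and `run_agent` are hypothetical names, the detections are canned, and a scripted function stands in for the LLM. It only shows the ReAct pattern of alternating actions and observations, where each observation from a vision tool informs the next step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Detection:
    label: str
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), normalized
    score: float


def ground_objects(prompt: str) -> List[Detection]:
    """Hypothetical stand-in for a grounding model: returns canned detections."""
    scene = [
        Detection("cup", (0.40, 0.50, 0.52, 0.70), 0.91),
        Detection("table", (0.05, 0.60, 0.95, 0.98), 0.88),
    ]
    # Keep only detections whose label is mentioned in the text prompt.
    return [d for d in scene if d.label in prompt.lower()]


def locate_tool(query: str) -> str:
    """Tool: report the center of the best-scoring detection for the query."""
    hits = ground_objects(query)
    if not hits:
        return f"no object matching '{query}'"
    best = max(hits, key=lambda d: d.score)
    cx = (best.box[0] + best.box[2]) / 2
    cy = (best.box[1] + best.box[3]) / 2
    return f"{best.label} at ({cx:.2f}, {cy:.2f})"


def pick_tool(target: str) -> str:
    """Tool: placeholder for a manipulation command sent to the robot."""
    return f"picked {target}"


TOOLS: Dict[str, Callable[[str], str]] = {"locate": locate_tool, "pick": pick_tool}

Step = Tuple[str, str, str]  # (action, argument, observation)


def scripted_policy(goal: str, history: List[Step]) -> Tuple[str, str]:
    """Stand-in for the LLM: chooses the next action from past observations."""
    if not history:
        return ("locate", "cup")
    last_action, _, last_obs = history[-1]
    if last_action == "locate" and "at (" in last_obs:
        return ("pick", "cup")
    return ("finish", last_obs)


def run_agent(goal: str, policy: Callable[[str, List[Step]], Tuple[str, str]],
              max_steps: int = 5) -> Tuple[str, List[Step]]:
    """Alternate between choosing an action and observing its result."""
    history: List[Step] = []
    for _ in range(max_steps):
        action, arg = policy(goal, history)
        if action == "finish":
            return arg, history
        observation = TOOLS[action](arg)
        history.append((action, arg, observation))
    return "step limit reached", history


if __name__ == "__main__":
    result, steps = run_agent("pick up the cup", scripted_policy)
    for action, arg, obs in steps:
        print(f"{action}({arg}) -> {obs}")
    print("result:", result)
```

In a real system the scripted policy would be replaced by an LLM call and the canned detections by model inference, possibly running on a separate compute node; the loop structure stays the same.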
Technologies Used: LangChain, ReAct Agents, DINO, ROS, C++, Vision-Language Models, Multi-node Processing.