9. P-RAG: Progressive Retrieval Augmented Generation For Planning on Embodied Everyday Task
AI Paper By Hand
AGI is predicted to be the next frontier in AI, and RAG has become increasingly instrumental in that effort. I came across a new approach called P-RAG (Progressive Retrieval Augmented Generation), which leverages the powerful language processing capabilities of LLMs while also accumulating task-specific knowledge.
Traditional RAG methods retrieve relevant information from a database in a one-shot manner to assist generation. P-RAG instead introduces an iterative approach that updates the database. In each iteration, P-RAG retrieves from the latest database and obtains information from the previous interaction, which it uses as an experiential reference for the current interaction.
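To make the contrast with one-shot retrieval concrete, here is a minimal sketch of the progressive loop. It is not the paper's implementation: the `Experience` record, the `run_task` callback and the choice to append new experiences (rather than replace old ones) are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical record: (goal instruction, trajectory summary, success flag).
Experience = Tuple[str, str, bool]


def progressive_rounds(
    tasks: List[str],
    run_task: Callable[[str, List[Experience]], Experience],
    num_rounds: int,
) -> List[Experience]:
    """Run every task for several rounds, growing the database between rounds."""
    database: List[Experience] = []  # empty before the first round
    for _ in range(num_rounds):
        new_experiences: List[Experience] = []
        for goal in tasks:
            # The agent only sees experiences gathered in earlier rounds.
            new_experiences.append(run_task(goal, database))
        # Assumption: the update happens once per multi-task round and
        # appends to the existing database instead of overwriting it.
        database = database + new_experiences
    return database
```

A one-shot RAG setup corresponds to a fixed database that never changes; here the database available in round n is built from the interactions of rounds 1 to n-1.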
The main steps in the process are as follows:
1. The agent receives four inputs: the goal instruction, the observation, the action space and the retrieval result.
2. Next, an LLM plans actions based on this information.
3. The 'environment' receives those actions and returns observations.
4. Based on these interactions, the database is updated after each multi-task iteration.
5. During each update, the database stores the embedding vector of the goal instruction and the scene graph obtained through observation.
6. The goal instruction and observation are used as the query in the retrieval-augmented process.
7. Finally, the similarity between the query and each database item is computed, and the top K most relevant database items are returned to the agent.
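The retrieval in steps 5 to 7 can be sketched roughly as follows. This is only an illustration: the summary above does not say which similarity measure or embedding model is used, so cosine similarity, the `Item` layout and the function names here are assumptions, and the toy vectors stand in for real embeddings.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical database item: (embedding vector, stored payload such as a
# serialized scene graph or a past trajectory).
Item = Tuple[Sequence[float], str]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(query: Sequence[float], database: List[Item], k: int) -> List[str]:
    """Score every item against the query and return the k best payloads."""
    ranked = sorted(
        database,
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,
    )
    return [payload for _, payload in ranked[:k]]


# Toy usage: 2-D vectors instead of real goal/observation embeddings.
toy_database: List[Item] = [
    ([1.0, 0.0], "experience A"),
    ([0.0, 1.0], "experience B"),
    ([1.0, 1.0], "experience C"),
]
top_items = retrieve_top_k([1.0, 0.1], toy_database, k=2)
```

In a real system the linear scan would likely be replaced by a vector index, but the ranking logic stays the same: embed the query, score every stored item, keep the K best.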
P-RAG demonstrates significant improvements of 1.7% and 2.5% over baselines that use only GPT-4 and GPT-3.5, respectively. The idea of using historical information to teach the agent to perform more efficiently is quite interesting - a mixed bag of RAG and VLA ideas! 💡
Paper: https://arxiv.org/abs/2409.11279