Using LLamaSharp with RAG.

The data appears to take 6x-7x more space when archived into the SQLite memory storage.

Clearing history with a ChatSession does not appear to work correctly, so I swapped over to a stateless executor, which helped as long as the RAG lookup only needs to be called once per question. If additional questions need to be asked on the same data, caching can be added to speed things up.
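One way that caching could work is to memoize the retrieval step per data set, so follow-up questions over the same data skip the expensive lookup. A minimal sketch in Python (the retriever and cache key are hypothetical stand-ins, not LLamaSharp APIs; in the real app the lookup would hit the SQLite-backed memory storage):

```python
from functools import lru_cache

def retrieve_chunks(data_key: str) -> tuple:
    # Hypothetical stand-in for the embedding search against the
    # SQLite memory store; returns the context chunks for a data set.
    return ("chunk about topic A", "chunk about topic B")

@lru_cache(maxsize=128)
def cached_retrieve(data_key: str) -> tuple:
    # Follow-up questions with the same key reuse the stored result
    # instead of re-running the RAG lookup.
    return retrieve_chunks(data_key)

def build_prompt(question: str, data_key: str) -> str:
    # Prepend the (possibly cached) context to the question.
    context = "\n".join(cached_retrieve(data_key))
    return f"{context}\n\n{question}"
```

The first question on a data set pays for retrieval; every later question on the same key is a dictionary hit.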

I will note that the prompt format I'm using to search with produced much better results than the format documented on Hugging Face, oddly enough: "Assistant:" {Query} "User:" instead of "instruct", "query".
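For concreteness, the template that worked better looks roughly like this (sketched in Python; the exact spacing is my paraphrase of the format above, not a documented template):

```python
def search_prompt(query: str) -> str:
    # Wrap the query in role labels rather than the documented
    # "instruct"/"query" keywords from the model card.
    return f"Assistant: {query} User:"
```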

I also used a function to strip filler words from queries, which sped things up and made keyword extraction more reliable, though the LLM itself wasn't too bad at it.
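A sketch of the kind of filler-word filter I mean (the stop-word list here is illustrative, not the one actually used):

```python
import string

# Small illustrative stop-word list; a production list would be larger.
FILLER_WORDS = {"the", "a", "an", "of", "to", "is", "in", "and", "what", "how", "does"}

def extract_keywords(question: str) -> list:
    # Lowercase, strip punctuation, and drop filler words so only
    # content-bearing terms remain for the keyword search.
    words = (w.strip(string.punctuation) for w in question.lower().split())
    return [w for w in words if w and w not in FILLER_WORDS]

# e.g. extract_keywords("What is the capital of France?")
# -> ["capital", "france"]
```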

Ultimately, I'll have to stick with the StatelessExecutor until I can validate that ChatSession operates correctly. Once I call LLMSession.load(CleanSession), re-adding the prompts does not appear to work: during debugging, after the second add, the count of the ChatSession remained zero.
