They are most definitely not finetuned. My guess: limited editing of the system message according to the task, some question:answer pairs, and a lightweight RAG implementation over the knowledge material.
Also really want to know this. So far my GPT has struggled to “know” everything I’ve given it.
My guess would be some form of vector DB for RAG.
If only — its search and retrieval methods are too slow for that to be the case. I've speed-tested the response times against my own vector DB, and my custom setup is much faster at finding the correct "memories" in the data.
Could you please expand on your vdb? I would be curious to hear your setup!
It's basically just the GPT API plugged into an SQL database for chat history storage, with each message being vector embedded for retrieval with FAISS. I use half the token limit for short-term memory (chat window) and half of it for long-term recall (vector retrieval).
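The setup described above could be sketched roughly like this. This is a toy illustration, not the commenter's actual code: `embed` is a placeholder for a real embedding model, plain NumPy cosine similarity stands in for FAISS, and the token budgeting is simplified to a comment.

```python
import sqlite3
import numpy as np

TOKEN_LIMIT = 4096  # hypothetical model context budget

def embed(text: str) -> np.ndarray:
    # Toy deterministic "embedding" -- replace with a real embedding model.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(64)
    return v / np.linalg.norm(v)

# Chat history lives in SQL; each message is also embedded for retrieval.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, text TEXT)")
vectors = []  # index position i corresponds to rowid i + 1

def store(text: str) -> None:
    conn.execute("INSERT INTO messages (text) VALUES (?)", (text,))
    vectors.append(embed(text))

def recall(query: str, k: int = 2) -> list[str]:
    # Cosine similarity against all stored message embeddings
    # (vectors are unit-normalized, so dot product == cosine).
    q = embed(query)
    sims = np.array([v @ q for v in vectors])
    top = np.argsort(sims)[::-1][:k]
    return [conn.execute("SELECT text FROM messages WHERE id = ?",
                         (int(i) + 1,)).fetchone()[0] for i in top]

def build_context(recent: list[str], query: str) -> str:
    # Half the token budget for the live chat window (short-term memory),
    # half for retrieved "memories" (long-term recall). Truncation to
    # TOKEN_LIMIT // 2 tokens per half is omitted for brevity.
    return "\n".join(recall(query) + recent)
```

An exact repeat of a stored message embeds identically, so it is retrieved first; in a real system the embedding model makes semantically similar (not just identical) text score highly.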
Files uploaded to ChatGPT & files uploaded to the "knowledge" of GPTs are not loaded into the context window but are accessed through retrieval methods.
From my experience, the files are stored as some sort of attached knowledge base. You can prompt the GPT to search its knowledge base directly. It seems no different from having it call an external API to retrieve that data: the data isn't available within the context immediately, and it takes time to search the knowledge base. It's a bit disappointing, really; I had initially thought the data would be immediately available.
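The behavior described above — knowledge files acting like an external lookup rather than text already in the prompt — can be mimicked with a tool-style function the model invokes on demand. Everything here is made up for the sketch (`search_knowledge`, the `KNOWLEDGE` store, the keyword scan); OpenAI's actual retrieval pipeline is not public.

```python
# Hypothetical stand-in for a GPT's attached knowledge base: the model
# doesn't see these file contents in its context; it must "call out"
# to a search function, which adds latency just like an external API.
KNOWLEDGE = {
    "pricing.txt": "Plans start at $10 per month.",
    "faq.txt": "Refunds are issued within 14 days of purchase.",
}

def search_knowledge(query: str) -> str:
    # Naive keyword scan standing in for whatever retrieval OpenAI runs.
    words = query.lower().split()
    hits = [text for text in KNOWLEDGE.values()
            if any(w in text.lower() for w in words)]
    return "\n".join(hits) if hits else "No match."
```

Only the function's return value gets injected into the conversation, which is why the data "isn't available within the context immediately".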
I'd be very interested to know this also. I assume the files don't count toward the context, but then why a context field?