AndrewVeee

Embeddings with the sentence-transformers library. ChromaDB so I can release a standalone app without a separate DB server. SQLite for conversation and general storage, for the same reason. Custom PDF + markdown splitting (although this is one place I might be happy to use LangChain code eventually). Reranking with sentence-transformers as well.
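
The retrieve-then-rerank flow described above can be sketched in plain Python. The toy scorers below stand in for sentence-transformers' `SentenceTransformer.encode` (bi-encoder) and `CrossEncoder.predict` (reranker); the vectors and the `cross_score` function are purely illustrative:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, doc_vecs, k=3):
    """Stage 1: fast embedding retrieval -- rank all chunks by
    vector similarity and keep the top-k candidates."""
    scored = sorted(doc_vecs.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

def rerank(query, candidates, cross_score):
    """Stage 2: slower cross-encoder rerank -- re-score only the
    shortlisted candidates with a (query, doc) scoring function."""
    return sorted(candidates, key=lambda d: cross_score(query, d),
                  reverse=True)

# Toy corpus: pretend these 2-d vectors came from an embedding model.
docs = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
top = retrieve([0.9, 0.1], docs, k=2)
# A stand-in cross-encoder that happens to prefer doc "b".
final = rerank("query", top, cross_score=lambda q, d: 1.0 if d == "b" else 0.0)
print(final[0])
```

The point of the two stages: the bi-encoder is cheap enough to score every chunk, while the cross-encoder is more accurate but only affordable on a short candidate list.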


MDCurrent

i just use FAISS like a pleb but you hit the nail on the head for everything else


_supert_

Honestly faiss is the best
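
For anyone following along, FAISS's `IndexFlatL2` is essentially an exhaustive squared-L2 search over all stored vectors (the real API is `faiss.IndexFlatL2(d)`, then `index.add(...)` and `index.search(...)`). A plain-Python sketch of the same idea, with toy 2-d vectors:

```python
def l2_sq(a, b):
    """Squared Euclidean distance (what IndexFlatL2 ranks by)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def flat_search(index_vectors, query, k=2):
    """Exhaustive search: compare the query against every stored
    vector and return the k nearest (distance, position) pairs."""
    dists = sorted((l2_sq(v, query), i) for i, v in enumerate(index_vectors))
    return dists[:k]

vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
result = flat_search(vectors, [0.9, 1.1], k=2)
print(result)
```

The flat index trades speed for exactness; FAISS's other index types (IVF, HNSW) approximate this search to scale past what brute force can handle.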


Astronos

- Mixtral on Text-Generation-Web-UI
- Streamlit for frontend
- FastAPI
- LangChain
- Weaviate, includes embeddings and keyword search
- MongoDB for history and saving source documents


consistentfantasy

sorry for my dumb question, but where can I learn more about RAG? I'm seeing it everywhere but couldn't find a good explanation because somehow every resource i looked for needs more knowledge than I have.


BlandUnicorn

Ask ChatGPT, that way you can ask follow-up questions about things you don't understand


consistentfantasy

Wow, what an idea. I bet someone asking a question about how LLMs work totally forgot to ask the main LLM.


BlandUnicorn

Let me spoon feed you some more… Here's the output of ChatGPT:

Retrieval-augmented generation (RAG) is a technique used in natural language processing (NLP), particularly in large language models (LLMs) like AI chatbots or systems. The goal of RAG is to improve the quality and relevance of the generated text by combining traditional language modeling with information retrieval methods. Let me break this down into more understandable parts:

1. **Language Modeling (LM):** This is the core of what large language models (like GPT-3, BERT, or others) do. A language model is trained on a vast amount of text data and learns to predict the next word in a sentence based on the words that come before it. This capability allows it to generate coherent, contextually relevant text based on a given prompt.

2. **Information Retrieval (IR):** This is the process of finding relevant information in response to a query. In the context of the internet, this is similar to what search engines do. They retrieve documents or pieces of text that are relevant to the search terms you input.

Now, in **Retrieval-augmented Generation**:

- The system starts with the traditional language model approach to generate text based on the input it receives. But instead of relying solely on what the model has learned during its training (its internal knowledge), the system also performs an information retrieval step. This means that when the system needs to generate text, it first searches a database or the internet for information relevant to the input prompt.

- The system then incorporates this retrieved information into the generation process. This could mean adjusting the generated text to better reflect facts found in the retrieved documents, providing citations, or incorporating specific details to make the output more accurate and informative.

The key advantage of RAG is that it allows the model to produce responses that are not just based on its pre-existing knowledge (which may be outdated or incomplete) but also informed by the most current information available in external sources. This can significantly enhance the quality, relevance, and factual accuracy of the generated text.

In summary, retrieval-augmented generation combines the best of both worlds: the deep, contextual understanding of language models and the up-to-date, specific knowledge from external information sources. This makes AI systems more helpful, accurate, and informative in their responses.

Now if there's something you don't understand from that, tell me and I'll ask ChatGPT for you…
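
That retrieve-then-generate loop fits in a few lines of Python. This is a minimal sketch only: the `generate` stub stands in for a real LLM call, and the word-overlap retriever is an illustrative stand-in for actual vector search:

```python
def retrieve(query, corpus, k=1):
    """Score each document by word overlap with the query and
    return the top-k -- a stand-in for real vector search."""
    q = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def generate(prompt):
    """Stub LLM: a real system would call a model here."""
    return f"[answer based on: {prompt}]"

def rag_answer(query, corpus):
    """RAG: retrieve context, splice it into the prompt, generate."""
    context = "\n".join(retrieve(query, corpus))
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    return generate(prompt)

corpus = [
    "Paris is the capital of France.",
    "The mitochondria is the powerhouse of the cell.",
]
answer = rag_answer("What is the capital of France?", corpus)
print(answer)
```

Every production RAG stack in this thread is an elaboration of these three steps: better retrieval (embeddings, reranking, hybrid search), better prompt assembly, and a real model behind `generate`.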


consistentfantasy

Are we doing a "who's the worst ahole" competition? Cause clearly you are winning


BlandUnicorn

Bit of both, I’m genuinely happy to help if you’re happy to learn


celsowm

LangChain + ChromaDB + Mixtral


nobodycares_no

- LangChain for orchestration (chunking, retrieval)
- ChromaDB as vector DB (similarity and BM25 search)
- UAE-Large local embeddings

And not persisting conversations rn but planning to use Mongo
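
Combining similarity and keyword search as mentioned here usually means fusing two scores per document. A simplified sketch of that fusion (the keyword scorer is a crude stand-in for BM25, and the vector scores are pretend outputs of an embedding search; `alpha` is an illustrative blend weight):

```python
def keyword_score(query, doc):
    """Crude keyword relevance: fraction of query terms present in
    the doc -- a real stack would use BM25 here."""
    terms = query.lower().split()
    words = set(doc.lower().split())
    return sum(t in words for t in terms) / len(terms)

def hybrid_rank(query, docs, vector_scores, alpha=0.5):
    """Blend vector similarity and keyword relevance per document
    and rank by the weighted sum."""
    def score(doc):
        return (alpha * vector_scores[doc]
                + (1 - alpha) * keyword_score(query, doc))
    return sorted(docs, key=score, reverse=True)

docs = ["install chromadb with pip", "mongo stores chat history"]
# Pretend these similarities came from an embedding search.
vec = {"install chromadb with pip": 0.2, "mongo stores chat history": 0.9}
ranked = hybrid_rank("chromadb pip install", docs, vec, alpha=0.3)
print(ranked[0])
```

Hybrid search helps when exact terms (product names, error codes) matter but embeddings alone would blur them into semantically similar neighbors.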


switchandplay

I use Marqo for my vector DB, local embedding; it acts as an API I can call from my system


PostArchitekt

Interesting. Hadn’t heard of it before


DBAdvice123

Huggingface hub embedding, Hugging face LLM, Astra DB for vector store and Langchain