Stream Langchain qa chain answer #224

Open
thomassrour opened this issue Aug 13, 2024 · 1 comment

thomassrour commented Aug 13, 2024

Hello,

I would like to stream the answer from my LangChain QA chain. Here is how I'm trying to do it in the pipe method (almost as in #141):

    # Imports, added for completeness (LangChain as of this issue's date):
    from langchain.chains import RetrievalQA
    from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
    from langchain_community.chat_models import ChatOllama

    self.local_llm = ChatOllama(
        model="llama3:70b",
        # format="json",
        temperature=0,
        base_url="http://...:11434",
        streaming=True,
        keep_alive=-1,
        callbacks=[StreamingStdOutCallbackHandler()],
    )

    chain = RetrievalQA.from_chain_type(
        llm=self.local_llm,
        retriever=retriever,
        return_source_documents=False,
        chain_type_kwargs={"prompt": prompt},
        verbose=True,
        input_key="question",
    )

    for chunk in chain.run(user_message):
        yield chunk

However, the words don't appear one by one as they should; instead, I get large chunks of about 50 words at once. Any help would be much appreciated, thank you.
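
One note on why the loop above can't stream: RetrievalQA.run() is synchronous, so it returns the complete answer string only after the chain has finished; the for loop then iterates over an already-finished string, and the chunking observed likely comes from downstream buffering rather than from the model. Below is a minimal sketch of one way to get incremental output, using LangChain's LCEL stream() interface in place of RetrievalQA. It reuses the retriever, prompt, and user_message names from the snippet above and assumes the prompt template takes "context" and "question" input variables; the model name and base_url are placeholders, not a confirmed fix for this issue.

    # Sketch only: an LCEL pipeline whose stream() yields chunks
    # as the model produces them.
    from langchain_community.chat_models import ChatOllama
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.runnables import RunnablePassthrough

    llm = ChatOllama(
        model="llama3:70b",           # placeholder model
        temperature=0,
        base_url="http://...:11434",  # placeholder Ollama server
        keep_alive=-1,
    )

    def format_docs(docs):
        # Join the retrieved documents into a single context string.
        return "\n\n".join(doc.page_content for doc in docs)

    rag_chain = (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )

    # stream() yields incremental string chunks (roughly token-sized),
    # so each one can be forwarded as soon as it arrives.
    for chunk in rag_chain.stream(user_message):
        yield chunk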

@InquestGeronimo

@thomassrour did you ever get your script to stream?
