Upgrade Opengpts #361
Seeing two issues:
- (MAJOR) w/ migration
- (minor) Not 100% sure, but the UI doesn't seem to load the full screen for creating a new bot; it requires clicking through on a tab. This could be related to data returned to the UI through one of the endpoints.
@@ -265,7 +264,7 @@ class ConfigurableRetrieval(RunnableBinding):
     llm_type: LLMType
     system_message: str = DEFAULT_SYSTEM_MESSAGE
     assistant_id: Optional[str] = None
-    thread_id: Optional[str] = None
+    thread_id: Optional[str] = ""
Why is this not a None default?
I spent a lot of time debugging this part. The error indicates a conflict in the configuration specifications for thread_id.
The error occurs during validation in the following code:
    @router.get("/config_schema")
    async def config_schema() -> dict:
        """Return the config schema of the runnable."""
        return agent.config_schema().model_json_schema()
The issue seems to arise because there are two conflicting ConfigurableFieldSpec definitions for thread_id:
1. Definition 1: ConfigurableFieldSpec with annotation=typing.Optional[str] and default=None.
2. Definition 2: ConfigurableFieldSpec with annotation=<class 'str'> and default=''.
So, I decided to set the default to '', and it works. However, I would prefer to keep it as None. Do you know what might be causing the problem? The assistant_id field is similar, but I don’t encounter this issue with it.
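To make the conflict described above concrete, here is a minimal sketch of how two definitions of the same field id can collide when specs are merged. `FieldSpec` and `merge_specs` are hypothetical stand-ins written for illustration; this is not LangChain's implementation, and which component contributes the second definition is not stated in this thread.

```python
from typing import Any, NamedTuple, Optional


class FieldSpec(NamedTuple):
    """Hypothetical stand-in for a configurable field spec (id, type, default)."""
    id: str
    annotation: Any
    default: Any


def merge_specs(specs):
    """Deduplicate specs by id; refuse two different definitions of one id."""
    merged = {}
    for spec in specs:
        existing = merged.setdefault(spec.id, spec)
        if existing != spec:
            raise ValueError(
                f"Conflicting specs for {spec.id!r}: {existing} vs {spec}"
            )
    return list(merged.values())


# Definition 1 and Definition 2 from the comment above.
definition_1 = FieldSpec("thread_id", Optional[str], None)
definition_2 = FieldSpec("thread_id", str, "")

try:
    merge_specs([definition_1, definition_2])
    conflict = None
except ValueError as exc:
    conflict = str(exc)  # the two definitions disagree on type and default
```

Under this reading, the two definitions only merge cleanly when they agree exactly. An id that is declared in a single place has nothing to conflict with, which may be why assistant_id is unaffected.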
@@ -135,7 +132,7 @@ class ConfigurableAgent(RunnableBinding):
     retrieval_description: str = RETRIEVAL_DESCRIPTION
     interrupt_before_action: bool = False
     assistant_id: Optional[str] = None
-    thread_id: Optional[str] = None
+    thread_id: Optional[str] = ""
Why is this not a None?
ALTER TABLE checkpoints
    ADD COLUMN IF NOT EXISTS thread_ts TIMESTAMPTZ,
    ADD COLUMN IF NOT EXISTS parent_ts TIMESTAMPTZ;
-- Drop existing checkpoints-related tables if they exist
I believe this migration fails for anyone who has already run migrations 1 through 4, because the migration state is kept in the database.
Should this run as step 5 so that it runs at the end?
So if you run opengpts with the previous version, then apply this PR on top and run things, the migrations will not work.
With the current approach it seems like any old threads are no longer usable from the app. (I'm assuming they're not easy to recover because of the pickle serde that was used.)
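The point about migration state can be illustrated with a small sketch. It uses a generic version-table runner on an in-memory SQLite database; the `schema_version` table, the `migrate` helper, and the SQL statements are assumptions for illustration, not the migration tool or schema OpenGPTs actually uses.

```python
import sqlite3


def migrate(conn, migrations):
    """Apply only the migrations newer than the version stored in the database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    current = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM schema_version"
    ).fetchone()[0]
    applied = []
    for version in sorted(migrations):
        if version > current:
            conn.execute(migrations[version])
            conn.execute("INSERT INTO schema_version VALUES (?)", (version,))
            applied.append(version)
    conn.commit()
    return applied


conn = sqlite3.connect(":memory:")

# Previous release: the database ends up at version 4 (earlier steps omitted).
old = {4: "CREATE TABLE checkpoints (thread_id TEXT)"}
first_run = migrate(conn, old)

# Editing step 4 in place is invisible to this database; a new step 5 is not.
edited = {
    4: "CREATE TABLE checkpoints (thread_id TEXT, thread_ts TEXT)",
    5: "ALTER TABLE checkpoints ADD COLUMN thread_ts TEXT",
}
second_run = migrate(conn, edited)
```

Here the second run skips the edited step 4 and applies only step 5, which is why a schema change for existing installs has to be appended as a new step.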
You’re right! I think the same thing happens in LangGraph when people decide to use the new checkpointer, right?
The new checkpointers can be versioned as far as I understand
So at least going forward there's a way to carry out schema migrations automatically.
But yeah going from the pickle checkpointer -> new checkpointer was a breaking change. I'm OK if we don't worry about this, don't think this affects that many users.
I'd just prefer if we didn't wipe out any potential sql tables that users may want to recover data from
Hey Eugene,
I've implemented the changes we discussed previously:
- Migration Changes
  - Up:
    - Updated migration 5 to rename the existing checkpoints table (preserving old data) instead of modifying it.
    - The checkpoint_blobs and checkpoint_writes tables will now be created at runtime.
  - Down:
    - The migration properly reverts these changes by restoring the original table name.
  Note: This is a breaking change: old checkpoints won’t be accessible in the new system.
- Runtime Setup
  - Added an ensure_setup() call in the lifespan event, which will call the async_postgres_saver.setup() method.
  - This ensures that the new checkpoint tables (checkpoints, writes, and blobs) are properly created during application startup.
I’ve tested the migration path from the old version to the new one, and it works as expected. The old data is preserved in the renamed table, while the new checkpoint system operates seamlessly with its updated table structure.
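As a rough illustration of the rename-then-recreate flow described above, here is a sketch on an in-memory SQLite database. The name `old_checkpoints`, the column layouts, and the `setup_new_tables` helper are assumptions for illustration; the thread does not state the renamed table's name, and the real tables are created by the saver's own setup() call.

```python
import sqlite3

# Hypothetical target name: the thread does not say what the table is renamed to.
UP = "ALTER TABLE checkpoints RENAME TO old_checkpoints"


def table_names(conn):
    """List user tables in name order."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def setup_new_tables(conn):
    """Stand-in for the runtime setup step: create the new tables if missing."""
    for name in ("checkpoints", "checkpoint_blobs", "checkpoint_writes"):
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {name} (thread_id TEXT, payload BLOB)"
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checkpoints (thread_id TEXT, checkpoint BLOB)")
conn.execute("INSERT INTO checkpoints VALUES ('t1', x'00')")

conn.execute(UP)        # migration 5, up: old rows are moved aside untouched
setup_new_tables(conn)  # application startup: fresh tables for the new saver

preserved = conn.execute("SELECT COUNT(*) FROM old_checkpoints").fetchone()[0]
fresh = conn.execute("SELECT COUNT(*) FROM checkpoints").fetchone()[0]
```

In this sketch the old row stays readable in the renamed table while the new checkpoints table starts empty, which matches the breaking-change note: the data is preserved but not visible to the new system.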
        if isinstance(value, list) and all(isinstance(v, BaseMessage) for v in value):
            loaded["channel_values"][key] = [v.__class__(**v.__dict__) for v in value]
    return loaded
class AsyncPostgresCheckpoint(BasePostgresSaver):
Have you considered initializing on app startup and calling .setup() to set up the migration, and then avoiding the wrapping of the checkpointer?
It'll help keep the checkpoints in sync and remove some extra code here
Hey Eugene,
Checkpoint:
I’ve run into a few challenges while implementing langgraph’s checkpoint:
- Global Checkpointer Initialization:
  I defined the checkpointer in the lifespan and declared it as a global in agent.py. However, I encountered the error “Checkpointer not initialized” because the global instance wasn’t initialized before being accessed.
  This happens because the checkpointer depends on the application startup completing before it can be used.
- Singleton Pattern Issues:
  I tried using a singleton pattern for AsyncPostgresSaver to manage the global instance, initializing it during the lifespan. However, initializing AsyncPostgresSaver requires an async event loop, which isn’t always available (for example, during testing), resulting in the error “no running event loop.”
- Current Implementation:
  I implemented a solution inspired by the current approach in OpenGPTs, adapted to use the new checkpointer in LangGraph:
  - Singleton with Lazy Initialization: Created a BasePostgresSaver class with a singleton pattern that assigns the instance before initialization.
  - Async Setup Method: Moved the connection pool creation to an async setup() method, ensuring it initializes during the lifespan of the application when an asynchronous loop is available.
I’m open to trying a different approach if you have any suggestions or recommendations!
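For reference, the lazy-initialization idea above can be sketched roughly as follows. `FakePool` and `LazySaver` are hypothetical stand-ins, not the classes in this PR or in LangGraph; the sketch only assumes that creating the pool needs a running event loop, as described in the comment.

```python
import asyncio
from typing import Optional


class FakePool:
    """Hypothetical stand-in for an async connection pool."""

    def __init__(self) -> None:
        # Mimic a pool that can only be created while an event loop is running.
        asyncio.get_running_loop()
        self.opened = False

    async def open(self) -> None:
        self.opened = True


class LazySaver:
    """Singleton: the instance exists early, the pool is created in setup()."""

    _instance: Optional["LazySaver"] = None
    pool: Optional[FakePool] = None

    def __new__(cls) -> "LazySaver":
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    async def setup(self) -> None:
        # Safe to call more than once; the pool is created only the first time.
        if self.pool is None:
            pool = FakePool()
            await pool.open()
            self.pool = pool

    def require_pool(self) -> FakePool:
        if self.pool is None:
            raise RuntimeError("Checkpointer not initialized")
        return self.pool


saver = LazySaver()  # fine at import time: no event loop is needed yet


async def lifespan_startup() -> None:
    # What a lifespan hook would do before serving requests.
    await LazySaver().setup()  # same object as `saver`


asyncio.run(lifespan_startup())
```

Because construction does no async work, the module-level `saver` can be imported in tests without a running loop, and calling require_pool() before setup() reproduces the “Checkpointer not initialized” error instead of failing at import.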
Migration:
Finally, I decided to run the migrations before using the app to stay consistent with OpenGPTs and ensure the queries are ready. I am using the same .sql.
What do you think?
Fix state handling inconsistency between different agent types: The issue:
Changes:
@eyurtsev I've added the changes we discussed.
Great! @lgesuellip changes look good
Probably need to bump the poetry version used on CI. I'm taking a look.
Removed the
Hi Team,
As a member of the Pampa Team, I’ve been working on this PR to upgrade OpenGPTs to the latest version of Langchain dependencies. This update ensures compatibility with Pydantic 2 and resolves issues with related packages.
Code changes
Migration to Pydantic 2
Langchain Dependency Upgrades
- langchain dependencies to their latest versions (like langchain, langchain-core, langgraph, etc.).
- langchain-robocorp, as it is currently incompatible with Pydantic 2.
- unstructured dependency to resolve issues related to nltk and its associated packages, such as punkt.
Code Adaptations
- AsyncPostgresSaver from the langgraph implementation for improved compatibility and performance.
- Updated schemas to work with Pydantic 2's BaseModel.
- Fixed bugs using GPT-4o.
Testing
Related issues
#352
Looking forward to your feedback
Thank you Team!