-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Independent operator level cache + doc update #74
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, left some minor comments
docs/configurations.rst
Outdated
- cache_type: Type of caching (SQLITE or In_MEMORY) | ||
- max_size: maximum size of cache | ||
- cache_dir: Directory for where DB file is stored. Default: "~/.lotus/cache" | ||
* Note: It is recommended to enable operator level caching |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: It is recommended to enable operator level caching
-> Note: It is recommended to enable caching
?
lotus/cache.py
Outdated
@@ -33,21 +33,30 @@ def operator_cache(func: Callable) -> Callable: | |||
@wraps(func) | |||
def wrapper(self, *args, **kwargs): | |||
model = lotus.settings.lm | |||
use_operator_cache = lotus.settings.enable_operator_cache | |||
use_operator_cache = lotus.settings.enable_cache | |||
|
|||
if use_operator_cache and model.cache: | |||
|
|||
def serialize(value): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the serialize
function can be cleaner. Consider something like what ChatGPT spits out
def serialize(value: Any) -> Any:
"""
Serialize a value into a JSON-serializable format.
Supports basic types, pandas DataFrames, and objects with a `dict` or `__dict__` method.
"""
if value is None or isinstance(value, (str, int, float, bool)):
# Basic types
return value
elif isinstance(value, pd.DataFrame):
# Serialize DataFrame to JSON string in 'split' format
return value.to_json(orient="split")
elif isinstance(value, (list, tuple)):
# Recursively serialize lists and tuples
return [serialize(item) for item in value]
elif isinstance(value, dict):
# Recursively serialize dictionary keys and values
return {key: serialize(val) for key, val in value.items()}
elif hasattr(value, "dict"):
# Serialize objects with a `dict` method (e.g., Pydantic models)
return value.dict()
elif hasattr(value, "__dict__"):
# Serialize objects with a `__dict__` attribute
return {
key: serialize(val) # Recursively serialize attributes
for key, val in vars(value).items()
if not key.startswith("_") # Optionally ignore private attributes
}
else:
# For unsupported types, convert to string (last resort)
lotus.logger.warning(f"Unsupported type {type(value)} for serialization. Converting to string.")
return str(value)
lotus/models/lm.py
Outdated
if lotus.settings.enable_cache: | ||
# Add new responses to cache | ||
for resp, (_, hash) in zip(uncached_responses, uncached_data): | ||
if hash: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is if hash:
needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea not needed
Independent operator level caching + doc updates
Independent operator level caching + doc updates