Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom ontology #262

Open
wants to merge 10 commits into
base: main
Choose a base branch
from
Open

Custom ontology #262

wants to merge 10 commits into from

Conversation

prasmussen15
Copy link
Collaborator

@prasmussen15 prasmussen15 commented Feb 7, 2025

Important

This pull request adds custom ontology support, enabling entity classification and attribute extraction, with updates to graphiti.py, node_db_queries.py, and extract_nodes.py.

  • Behavior:
    • Adds support for custom ontology by allowing entity classification and attribute extraction in graphiti.py and node_operations.py.
    • Introduces entity_types parameter in add_episode() in graphiti.py to specify custom entity models.
    • Updates extract_nodes() in node_operations.py to classify nodes using custom entity types.
  • Database:
    • Modifies ENTITY_NODE_SAVE and ENTITY_NODE_SAVE_BULK in node_db_queries.py to handle dynamic labels and attributes.
    • Updates get_by_uuid() and similar functions in nodes.py to retrieve node attributes and labels.
  • Prompts:
    • Adds classify_nodes prompt in extract_nodes.py for entity classification.
    • Updates summarize_context in summarize_nodes.py to extract entity properties.
  • Misc:
    • Deletes msc_eval.py, msc_runner.py, and parse_msc_messages.py from multi_session_conversation_memory examples.

This description was created by Ellipsis for 46b279b. It will automatically update as commits are pushed.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Reviewed everything up to 7ee2755 in 2 minutes and 22 seconds

More details
  • Looked at 871 lines of code in 12 files
  • Skipped 0 files when reviewing.
  • Skipped posting 8 drafted comments based on config settings.
1. graphiti_core/utils/maintenance/node_operations.py:166
  • Draft comment:
    Using ast.literal_eval on LLM responses can be unsafe if the response is malformed. Consider using a safer JSON parsing approach if possible.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 80% vs. threshold = 80%
  1. The comment raises a valid security concern since ast.literal_eval() could potentially execute malicious code if the input is crafted carefully. 2. However, the code is using a pydantic response_model (EntityClassification) to validate the LLM response first. 3. The string being parsed is accessed via .get() with a default of '{}', providing some safety. 4. Still, using ast.literal_eval() on any external input is generally risky, and JSON parsing would be safer.
    The comment might be overstating the risk since there are some safeguards in place with the pydantic model. Also, ast.literal_eval() is actually designed to be a safe subset of eval() that only handles literals.
    While there are safeguards, using json.loads() would be strictly safer and more appropriate for parsing what is expected to be a JSON structure. The risk, while mitigated, is still present.
    Keep the comment as it suggests a concrete, actionable improvement to security by using a more appropriate parsing method for the expected JSON data.
2. graphiti_core/utils/maintenance/node_operations.py:320
  • Draft comment:
    The creation of the dynamic Pydantic model (entity_attributes_model) by merging entity type classes and Summary is complex. Adding an inline comment explaining the rationale would improve maintainability.
  • Reason this comment was not posted:
    Confidence changes required: 80% <= threshold 80%
    None
3. graphiti_core/utils/maintenance/node_operations.py:127
  • Draft comment:
    The iterative reflexion loop in the extract_nodes function is central for refining extracted entities. Consider adding more inline comments to clarify the iterative refinement process.
  • Reason this comment was not posted:
    Confidence changes required: 70% <= threshold 80%
    None
4. graphiti_core/utils/maintenance/node_operations.py:280
  • Draft comment:
    Context preparation for LLM in node extraction is repeated across functions (extract_message_nodes, extract_text_nodes, etc.). Consider refactoring these into a helper function to follow DRY principles.
  • Reason this comment was not posted:
    Confidence changes required: 80% <= threshold 80%
    None
5. graphiti_core/utils/maintenance/node_operations.py:152
  • Draft comment:
    Potential bug: 'entity_types.keys()' is called without checking if entity_types is None. Consider using 'entity_types.keys() if entity_types else []' to avoid an AttributeError.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
6. graphiti_core/utils/maintenance/node_operations.py:332
  • Draft comment:
    Danger: 'extracted_node.attributes.update(node_attributes_response)' may raise an error if attributes is None. Initialize attributes (e.g. to {}) before calling update.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 80%
  1. EntityNode is imported but we can't see its definition. 2. The code is using pydantic models and type hints throughout. 3. This is in an async function that's part of a larger node processing system. 4. Given the professional nature of the codebase and use of pydantic, it's likely EntityNode is properly initialized. 5. Without seeing EntityNode's definition, we can't be 100% certain.
    I can't see the EntityNode class definition, so I don't know for certain if attributes is guaranteed to be initialized. The codebase seems well-structured but we should be cautious about assumptions.
    While true that we can't see the EntityNode definition, the extensive use of pydantic and type hints throughout the codebase suggests this is a well-structured system where basic initialization would be handled properly.
    Without seeing the EntityNode definition, we can't be completely certain about the attributes initialization. The comment raises a valid concern but lacks strong evidence.
7. graphiti_core/utils/maintenance/node_operations.py:159
  • Draft comment:
    Consider adding error handling for ast.literal_eval in parsing the LLM response, in case the output format is unexpected.
  • Reason this comment was not posted:
    Confidence changes required: 80% <= threshold 80%
    None
8. graphiti_core/utils/maintenance/node_operations.py:148
  • Draft comment:
    Minor style: When building 'custom_prompt' with missed entities, the concatenation adds a trailing comma. Consider formatting the list of missed entities more cleanly.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 80%
    None

Workflow ID: wflow_7R2PM5DPT5kOQfM1


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

graphiti_core/utils/maintenance/node_operations.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@paul-paliychuk paul-paliychuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀🚀🚀🚀🚀🚀🚀

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 406b889 in 1 minute and 57 seconds

More details
  • Looked at 68 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 11 drafted comments based on config settings.
1. examples/podcast/podcast_runner.py:66
  • Draft comment:
    Uncommenting clear_data call will clear the DB; ensure this is intended for production.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 80%
    The comment is asking the author to ensure that uncommenting a line of code is intended for production. This is a form of asking for confirmation, which is against the rules. The comment does not provide a specific code suggestion or ask for a test to be written. Therefore, it should be removed.
2. graphiti_core/nodes.py:255
  • Draft comment:
    Changing the default of 'attributes' to {} avoids None checks; ensure this doesn’t mask missing data.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
3. graphiti_core/nodes.py:277
  • Draft comment:
    Using 'self.attributes or {}' is a good safeguard, ensuring update does not fail if attributes is None.
  • Reason this comment was not posted:
    Marked as duplicate.
4. graphiti_core/utils/maintenance/node_operations.py:156
  • Draft comment:
    Safely handling entity_types with a conditional ensures keys() isn’t called on None.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 80%
    This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply states what the code is doing without offering any critique or suggestion for change.
5. graphiti_core/utils/maintenance/node_operations.py:174
  • Draft comment:
    Using node_classifications.get(name) without an explicit default is clean since it returns None by default.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 80%
    This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply states that the current implementation is clean, which does not align with the rules provided.
6. graphiti_core/utils/maintenance/node_operations.py:247
  • Draft comment:
    Type hint changed for entity_types (dict[str, BaseModel]) clarifies expectations; verify that downstream code accepts BaseModel values.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 80%
    This comment is asking the PR author to verify something, which is against the rules. It doesn't provide a specific suggestion or point out a specific issue with the code. Therefore, it should be removed.
7. examples/podcast/podcast_runner.py:66
  • Draft comment:
    Ensure that calling clear_data(client.driver) is intended in this example since it deletes existing graph data.
  • Reason this comment was not posted:
    Marked as duplicate.
8. graphiti_core/nodes.py:255
  • Draft comment:
    Defaulting 'attributes' to {} makes it non-nullable; consider removing the redundant 'or {}' in the update call.
  • Reason this comment was not posted:
    Marked as duplicate.
9. graphiti_core/utils/maintenance/node_operations.py:156
  • Draft comment:
    Using 'entity_types.keys() if entity_types is not None else []' safely handles the case when entity_types is None.
  • Reason this comment was not posted:
    Marked as duplicate.
10. graphiti_core/utils/maintenance/node_operations.py:171
  • Draft comment:
    Removing the explicit default None in 'node_classifications.get(name)' is cleaner, as dict.get returns None by default.
  • Reason this comment was not posted:
    Marked as duplicate.
11. graphiti_core/utils/maintenance/node_operations.py:243
  • Draft comment:
    The type for 'entity_types' parameter has been updated to use BaseModel; ensure that all provided models are BaseModel-derived and that this change is backward compatible.
  • Reason this comment was not posted:
    Marked as duplicate.

Workflow ID: wflow_KPRFEnanaBNomjwC


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on 25a43f8 in 2 minutes and 42 seconds

More details
  • Looked at 28 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. graphiti_core/utils/maintenance/node_operations.py:316
  • Draft comment:
    Using BaseModel() as the fallback creates an instance, but later code expects a model class (to access fields). Consider using BaseModel (the class) instead.
  • Reason this comment was not posted:
    Marked as duplicate.
2. graphiti_core/utils/maintenance/node_operations.py:314
  • Draft comment:
    Providing a default value (BaseModel()) guarantees non-None items, making the filter(lambda x: x is not None, ...) redundant.
  • Reason this comment was not posted:
    Marked as duplicate.

Workflow ID: wflow_GEa7gDg1VNjys4Xp


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

graphiti_core/utils/maintenance/node_operations.py Outdated Show resolved Hide resolved
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on 8d2e7c6 in 2 minutes and 20 seconds

More details
  • Looked at 28 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. graphiti_core/utils/maintenance/node_operations.py:321
  • Draft comment:
    Prefer using direct key access on 'summary_context' instead of 'get' since 'attributes' is initialized.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 80%
    None
2. graphiti_core/utils/maintenance/node_operations.py:312
  • Draft comment:
    Result of tuple addition isn’t assigned. This line should update entity_type_classes (e.g. using '+=' or assignment) so that the filtered tuple is actually used.
  • Reason this comment was not posted:
    Marked as duplicate.
3. graphiti_core/utils/maintenance/node_operations.py:321
  • Draft comment:
    Switching from fields to model_fields is acceptable in pydantic v2, but using summary_context.get('attributes', []) is unnecessary if the key is already set. Consider reverting to summary_context['attributes'].append(field_name) for clarity.
  • Reason this comment was not posted:
    Confidence changes required: 66% <= threshold 80%
    None

Workflow ID: wflow_h3TA5qSSsmUxyFQX


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

graphiti_core/utils/maintenance/node_operations.py Outdated Show resolved Hide resolved
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 46b279b in 45 seconds

More details
  • Looked at 13 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. graphiti_core/utils/maintenance/node_operations.py:311
  • Draft comment:
    Good update: using tuple[BaseModel, ...] correctly indicates a tuple of arbitrary length. This improves type safety.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 80%
    None
2. graphiti_core/utils/maintenance/node_operations.py:311
  • Draft comment:
    Good fix: Changing the tuple type hint to tuple[BaseModel, ...] correctly indicates a variadic tuple of BaseModel instances.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 80%
    None

Workflow ID: wflow_0Y9Nz70jKSQPbqgg


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants