forked from probsys/hierarchical-irm
-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add sample_new parameter to gendb::incorporate #215
Merged
Merged
Changes from all commits
Commits
Show all changes
8 commits
Select commit
Hold shift + click to select a range
85a19dc
Add sample_new option to incorporate methods
ThomasColthurst 10e88b9
Debug printfs
ThomasColthurst 82bc4cd
Comment out brittle tests
ThomasColthurst 9e12a66
Fix build warning
ThomasColthurst 4e536ef
Add test and fix test
ThomasColthurst 73fb3d5
Rename sample_new
ThomasColthurst d6ba9ad
Add basic test of new_entities_have_new_parts
ThomasColthurst 8f10e0f
Respond to reviewer comments
ThomasColthurst File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of the machinery in sample_class_ancestors etc is for the purpose of walking the DAG of references to make sure the reference values are coherent (i.e. if Physician 5 went to School 3, that's correct in all samples). Am I correct that when sample_new is false, we never see the same Physician e.g. more than once, so we never actually have to walk the DAG? If so, then I think we should implement this by simply iterating over a relation's domains and pulling sequential values from the domain CRPs rather than adding complexity to sample_class_ancestors and the other methods that are primarily for inference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added TODO. I don't think the sample_new = false case is as simple as you make out, if only because it's what pclean_lib::translate_observations did before and that was far from trivial (and it relied on the annotated_domains_for_relations datastructure, which is being eliminated).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking again at translate_observations, it looks like it used annotated_domains_for_relations to get unique string names for all of the entities, which we don't need to do here since we're working with entity IDs directly, so I think it is pretty simple (and I'd prefer to do it in this PR though a TODO is ok too).