Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added django simple history models, and added modified_since param #147

Merged
merged 1 commit into from
Oct 27, 2023

Conversation

kjlippold
Copy link
Contributor

I've added a Django extension called django_simple_history to help track DB changes. This library adds a changelog table for each core model. The changelog table records create, update, and delete actions on those tables, including which fields were changed, by who, and when.

I've also added an optional query parameter called "modified_since" to the data management API's GET things and datastreams endpoints. If left null, the request should function normally. If a valid datetime is entered, HydroServer will return only records that have been modified since the given datetime based on what's been entered into the changelog table.

There is some code I added and then commented out related to handling HEAD requests and "Last-Modified" and "Is-Modified-Since" headers. We may want to add it back in for additional caching functionality in the future, but at the moment, Django Ninja isn't compatible with several important Django decorators such as "conditional". Apparently, Django Ninja v1.0 will support these decorators, so we may want to revisit that later.

I've removed query count checks in create, update, and delete tests. These were failing frequently due to varying savepoint queries. We really only care about query counts on GET requests.

The django_simple_history models should provide us a framework for adding "modified_since" query parameters to other endpoints later on if we need them. We can also use these tables for an auditing feature. We should consider adding some kind of automated cleanup script to these tables that drops old records. For caching, it's unlikely we'll ever care about data in these tables that's more than 24 hours old. For auditing functionality, we can archive older data to an external location such as S3.

@kjlippold kjlippold self-assigned this Oct 26, 2023
@kjlippold
Copy link
Contributor Author

A couple other things to note, django_simple_history does pick up changes made to the datastream table when observations are created, but it doesn't capture the user in that case. So if the frontend is querying for recently updated datastreams, it should capture when new observations have been created without needing to explicitly check the end date on the datastream.

For this version, it's not checking related models, so if a datastream's thing properties change, that would not be captured in a datastream query. If we decide to include nested related objects in certain responses, I'll update the modified_since param to check all objects included in the response.

@daniel-slaugh
Copy link
Contributor

daniel-slaugh commented Oct 27, 2023

Looks good. I verified by updating a thing lat and lon then using postman to send a GET request with the modified_since 1 minute before the patch request. It returns just the one thing as expected. I then requested the modified_since things after the date and it returned an empty [] as expected as long as I accounted for time zone differences. The rest of the website looks like it performs the same as before without the query parameter

@kjlippold kjlippold merged commit 60210cc into dev Oct 27, 2023
2 checks passed
@daniel-slaugh daniel-slaugh deleted the 33-cache-improvements branch December 22, 2023 18:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants