Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update TADA_CheckRequiredFields and TADA_AutoClean #557
update TADA_CheckRequiredFields and TADA_AutoClean #557
Changes from 1 commit
a6da94c
e86259e
16ee669
0c5ff57
883f23a
2494bbd
7880454
878d562
0a27060
96b6626
f776a29
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to add ActivityStartDateTime and convert times zones in our autoclean function here in the same format/way as the DR functions do. This is needed in cases where in the TADAShiny app, a user brings their own data file that was not downloaded using either EPATADA or dataRetrieval, but has all other required cols. It should replicate any additional things dataRetrieval readWQPdata does to a df.
Data retrieval produces "2022-06-08 16:00:00" while my code below produces "2023-05-11 11:45:00 UTC". DR also converts times and time zones to UTC, but that's not included in the value here. The time zone conversion and addition of ActivityStartDateTime occurs even when attributes are ignored (which is fine, we need that col):
I'd like to understand this piece of dataretreival better. If there is a stand along function that can change all date/time fields to UTC and also creates this ActivityStartDate col (and does anything else that is happening behind the scenes), we could leverage that here instead.
@ldecicco-USGS do you have any advice on this topic? Can you please point us to the location in your code where this occurs? I did a quick search of your repo but couldn't find it. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The internal code is here;
https://github.com/DOI-USGS/dataRetrieval/blob/main/R/importWQP.R#L223
(you can call it, you'd just need to do a triple colon:
dataRetrieval:::create_dateTime
)offsetLibrary is a dataframe saved in sysdata.rda
You can see where and how it gets called here:
https://github.com/DOI-USGS/dataRetrieval/blob/main/R/importWQP.R#L160
Let me know if there's something unclear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! This really helped me understand how to interpret both the ActivityStartDateTime and offset cols (these starting showing up in TADA df's for the first time last month: ActivityStartTime.TimeZoneCode_offset and ActivityEndTime.TimeZoneCode_offset) that are returned from dataRetrieval. For dataRetrieval users, I think it might be more user friendly to include a column titled ActivityStartDateTime.TimeZoneCode (UTC in this case) instead of the ActivityStartTime.TimeZoneCode_offset (which includes number of hours). As is, the target time zone for ActivityStartDateTime (a function input here) is not documented anywhere in the returned df (see review_TADAProfile1 below). Alternatively, UTC could potentially be included in ActivityStartDateTime but that might break people workflows (e.g. "2023-05-11 11:45:00 UTC"). I am going to create a separate issue in TADA repo to document this issue and discuss how to address it. Is this something you are potentially interested in updating in dataRetrieval?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
created a separate issue here to continue the conversation: #558
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just making sure we're both on the same page: In dataRetrieval, the default timezone is UTC set here:
https://github.com/DOI-USGS/dataRetrieval/blob/main/R/readWQPdata.R#L200
You can read about changing timezones here:
https://doi-usgs.github.io/dataRetrieval/reference/readWQPdata.html#arg-tz
This sets the time zone attribute of the POSIX object.
Like this:
So what you are asking for is a column that converts the offset number of hours to the timezone it was converted to?
Note there's also the link in the help to the OlsonNames() base R function which talks about how R handles timezones. The issue is that different operating systems and depending on where in the world the computer things you are will want different abbreviations for timezones (that's why using the OlsonNames is what has been working best for dataRetrieval).
https://rdrr.io/r/base/timezones.html
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.