how to deal with large (many observations) datasets #44
Comments
Hmmm, good question. I have no idea. However, if a report takes that long to complete, there should definitely be something to let the user know the site is processing the request, and if possible, the app could give an estimated time frame for completion.
I'm moving the conversation from #11 here.
I suggest this be an optional argument rather than an arbitrary limit on the number of rows that can be read in. If performance and wait times are a concern for users, we could address this by supplying the user with a status bar (which has proven difficult to do) or by informing the user of limitations through an expectation matrix (suggested here) in the package documentation.
@atn38, since the UI team has figured out how to return messages from a function to the GUI, you could add messages to each static report function to inform the user of status. Alternatively, as @CoastalPlainSoils suggests, you may be able to create a progress bar using the
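The message-passing idea above can be sketched with a status callback. This is a minimal, hypothetical illustration in Python (the project's actual GUI wiring is not shown in this thread); `summarize_columns` and `report_status` are invented names standing in for a per-step static report function and whatever hook returns messages to the front end.

```python
def summarize_columns(data, report_status=print):
    """Toy static-report step that emits a status message per column.

    `report_status` is a hypothetical hook: on the command line it can
    just be `print`; in a GUI it could be wired to the mechanism that
    returns messages to the front end.
    """
    results = {}
    n = len(data)
    for i, (name, values) in enumerate(data.items(), start=1):
        report_status(f"Summarizing column {i}/{n}: {name}")
        results[name] = {"min": min(values), "max": max(values)}
    return results

# Capture messages instead of printing, to show the hook in action.
messages = []
summary = summarize_columns(
    {"a": [1, 2, 3], "b": [10, 20]},
    report_status=messages.append,
)
```

Because each column reports `i/n`, the user sees both activity and rough progress, which addresses the "is the site still processing?" concern even without a true progress bar.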
I kind of like the idea of being able to randomly sample a large dataset. In that context, some useful options would be:
Add printing to the console for report status on the data summary tab. @wetlandscapes will give this a go!
Can I also add that we might want to limit the size of the download to someone's computer too? For example, warn them (and maybe stop the download) if they're about to download a huge .shp file.
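The download guard suggested above could be sketched as a pre-download check. Everything here is an assumption for illustration: the 100 MB threshold, the `check_download_size` name, and the warn-versus-block behavior are not project policy.

```python
import os

SIZE_WARN_BYTES = 100 * 1024 * 1024  # hypothetical 100 MB threshold

def check_download_size(path, limit=SIZE_WARN_BYTES):
    """Return (ok, message) before serving a file for download.

    Sketch of the proposed guard: flag the download when a file
    (e.g. a large .shp) exceeds `limit`, so the GUI can warn the
    user or stop the download.
    """
    size = os.path.getsize(path)
    if size > limit:
        return False, f"File is {size / 1e6:.0f} MB; download may be slow."
    return True, "OK"
```

A GUI could call this before offering the file and show the returned message; whether an oversized file is merely warned about or actually blocked is a design choice left open in the thread.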
I suggest the random sampling and warnings become enhancements to be implemented after the production release. Until then, file size issues can be communicated in the GUI messages and project docs. Note: a user will have to find a data package to use with
With a data table of ~400k observations and 60 variables, the complete static report takes upwards of 10 minutes. Does the dynamic plotting functionality have the same challenges? Could we do something with large datasets to reduce load, e.g., randomly sample the dataset and then generate the report from that sample?