Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have an option for a metric to declare that it can populate retrospectively to backfill data #67

Open
brendanheywood opened this issue May 5, 2022 · 6 comments
Assignees
Labels
enhancement New feature or request

Comments

@brendanheywood
Copy link
Collaborator

If this is true for a metric then we probably want the interface for this to be quite different where we pass into the metric the time resolution (eg 5 mins) and the time period start and end date to backfill OR a max number of samples we want (or both), and it does all the logic and passes back the metrics which then the manager sends to the collectors.

Reason for this is it is going to be way more efficient this way.

Then wrap this in a new setting so we can also force set it, which is how back back in tie we should back fill. The manager is probably going to need to manage state on behalf of all the metrics, or each metric needs to be able to say what state it already has.

Need to then make some business decisions around which collectors support this, I would strongly assume that only the internal db collector will at first.

@brendanheywood brendanheywood added the enhancement New feature or request label May 5, 2022
@brendanheywood
Copy link
Collaborator Author

Actually in hindsite it is probably better to simply pass in the minimum time period to the metric, and then each internally handles all it's own logic so say if it needs to backfill or not. This will be idempotent and future runs will do nothing. This backfill can be run very infrequently like nightly.

@marcghaly marcghaly self-assigned this May 5, 2022
@marcghaly
Copy link
Contributor

Hi @brendanheywood,

I was wondering if there was a reason usort was used instead of uasort here :

https://github.com/catalyst/moodle-tool_cloudmetrics/blob/MOODLE_35_STABLE/classes/metric/manager.php#L117

This is provoking a moodle exception to be thrown.

Cheers,

Marc

@brendanheywood
Copy link
Collaborator Author

No reason, but it isn't failing for us so far. In any case we are expecting rough edges as this is all hot off the press and not battle tested so just make issues and fix them as you see fit

@jaypha
Copy link
Contributor

jaypha commented May 5, 2022

Actually, this is what caused issue #71.

@marcghaly
Copy link
Contributor

marcghaly commented May 9, 2022

Hi @brendanheywood,

Concerning the display of those informations, should it be accessible from this panel as well ?

Screenshot from 2022-05-05 17-26-46

Should we use a form? Is there some specific concerning retrieving data from logstore_standard_log table?

Cheers,

Marc

@brendanheywood
Copy link
Collaborator Author

@marcghaly a rough outline of how I see this going:

  1. a metric should declare that it can be populated retrospectively. Not all metrics can be as they only make sense in the 'now' and we won't have a specific log table or any other table

  2. even if a metric can be back filled retrospectively doesn't mean we want it to be. That would be an admin option to specify how far back it should go

  3. if a metric does declare itself to support back filled metrics then it needs to internally manage state around what data is has sent to the collector manager. The collectors so far cannot be queried to see what data they have, eg how would you query 'syslog' or something where it is just a black hole that you ping with data? So it must clearly store what data it has sent, ie the minimum and maximum timestamp range.

  4. there will be a new metric interface which a metric optionally implements when it does support backfilling. This is basically saying 'go backfill' with no instructions on how to do that. Each metric needs to use its own state and config to figure out the data and then it will send it to the collector to store, and it will update its state as it goes. This backfill method should be idempotent, ie if you call it then it backfills everything it needs to, and then it stops. If you call it again the second time it does nothing. It could also be incremental so that it limits how much data it sends in one go, so as you call it multiple times it gradually back fills the data.

  5. How much it back fills I think should be up to the metrics config. Ideally this is config which is just a date. I am leaning towards all metrics which support backfilling should have the admin UI for setting this date managed for them consistently by the parent plugin not by the metric itself. This means you can see and edit all the dates in one place in the admin screen, and also saves duplication. Importantly it also allows us to force set the backfill date in config.php so we can manage it across a fleet in one place.

  6. I also think that each collector should also declare whether it can handle backfilled data. Some like cloudwatch can be given an metric from the past. But some assume that the metric is for right now. The manage needs to make sure that metrics for old data are only passed to collectors that can handle it.

I would say to start with for this issue that we only support backfilled data for the database collector and not for cloudwatch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants