Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #RHIROS-1401 - Dropping csv records with missing resource usage… #144

Merged
merged 3 commits into from
Nov 15, 2023

Conversation

patilsuraj767
Copy link
Collaborator

… metrics

@@ -40,6 +42,33 @@ func Aggregate_data(df dataframe.DataFrame) dataframe.DataFrame {

df = df.Mutate(s.Col("X0")).Rename("k8s_object_type", "X0")
df = df.Mutate(s.Col("X1")).Rename("k8s_object_name", "X1")

// filter out only valid workload type
df = df.Filter(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to get this out as separate function? I understand its just a call to Filter however the Aggregate_data is already lengthy. With that can we have tests then? So looking at those one would understand what is invalid?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added test cases

// Validation to check if metrics for cpuUsage, memoryUsage and memoryRSS are missing
df, no_of_dropped_records := filter_valid_csv_records(df)
if no_of_dropped_records != 0 {
invalidDataPoints.Add(float64(no_of_dropped_records))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand then we are never going to return float from filter_valid_csv_records, is it prometheus which expects float value?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, prometheus Add method requires the float value.

Comment on lines 10 to 11
Name: "rosocp_invalid_datapoints_total",
Help: "The total number of invalid datapoints(rows) found in CSVs recevied",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
Name: "rosocp_invalid_datapoints_total",
Help: "The total number of invalid datapoints(rows) found in CSVs recevied",
Name: "rosocp_total_invalid_datapoints",
Help: "The total number of invalid datapoints(rows) found in received CSVs",

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to prometheus metric naming convention - https://prometheus.io/docs/practices/naming/ suffix should describe the unit.

internal/utils/aggregator.go Show resolved Hide resolved
df, no_of_dropped_records := filter_valid_csv_records(df)
if no_of_dropped_records != 0 {
invalidDataPoints.Add(float64(no_of_dropped_records))
log.Infof("Invalid records in CSV - %v", no_of_dropped_records)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be at Info level? and do you think we should also print more about owner_name and workload?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good to have at info level so that using request_id we can check in kibana how many rows where dropped for particular request/CSV. Our logging system by default logs all the request related info - https://github.com/RedHatInsights/ros-ocp-backend/blob/main/internal/logging/logging.go#L66-L71

Copy link
Contributor

@saltgen saltgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@patilsuraj767 The Aggregate_data function seems quite long.
Suggestion: Could you split the function into 2, one could just filter the records and another aggregates?

internal/utils/metrics.go Outdated Show resolved Hide resolved
internal/utils/aggregator.go Outdated Show resolved Hide resolved
internal/utils/aggregator.go Show resolved Hide resolved
@patilsuraj767
Copy link
Collaborator Author

/retest

1 similar comment
@anurag03
Copy link

/retest

Copy link
Contributor

@saltgen saltgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@patilsuraj767 LGTM 👍 Thank you for adding the tests 👌

@patilsuraj767
Copy link
Collaborator Author

/retest

@patilsuraj767 patilsuraj767 merged commit 85ac7b5 into RedHatInsights:main Nov 15, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants