From 23a65461ccb5cf7a6e01c345037f225e66b54c08 Mon Sep 17 00:00:00 2001
From: Shreya
Date: Mon, 25 Nov 2024 17:21:04 +0530
Subject: [PATCH 01/85] Add test plan & results template for kruize rel 0.2

---
 tests/test_plans/test_plan_rel_0.2.md | 144 ++++++++++++++++++++++++++
 1 file changed, 144 insertions(+)
 create mode 100644 tests/test_plans/test_plan_rel_0.2.md

diff --git a/tests/test_plans/test_plan_rel_0.2.md b/tests/test_plans/test_plan_rel_0.2.md
new file mode 100644
index 000000000..420b2066d
--- /dev/null
+++ b/tests/test_plans/test_plan_rel_0.2.md
@@ -0,0 +1,144 @@
+# KRUIZE TEST PLAN RELEASE 0.2
+
+- [INTRODUCTION](#introduction)
+- [FEATURES TO BE TESTED](#features-to-be-tested)
+- [BUG FIXES TO BE TESTED](#bug-fixes-to-be-tested)
+- [TEST ENVIRONMENT](#test-environment)
+- [TEST DELIVERABLES](#test-deliverables)
+  - [New Test Cases Developed](#new-test-cases-developed)
+  - [Regression Testing](#regression-testing)
+- [SCALABILITY TESTING](#scalability-testing)
+- [RELEASE TESTING](#release-testing)
+- [TEST METRICS](#test-metrics)
+- [RISKS AND CONTINGENCIES](#risks-and-contingencies)
+- [APPROVALS](#approvals)
+
+-----
+
+## INTRODUCTION
+
+This document describes the test plan for Kruize remote monitoring release 0.2
+
+----
+
+## FEATURES TO BE TESTED
+
+* Webhook implementation
+* Add time range filter for bulk API
+* Bulk API error handling with New JSON
+* Database updates for authentication
+* Add test script for authentication
+* Bulk test cases
+* Datasource Exception handler
+
+
+------
+
+## BUG FIXES TO BE TESTED
+
+* Time_range fix
+* Fix datasource missing issue on pod restart
+* Environment variable error found in the log
+* Removed the autotune job from the PR check workflow
+
+---
+
+## TEST ENVIRONMENT
+
+* Minikube Cluster
+* Openshift Cluster
+
+---
+
+## TEST DELIVERABLES
+
+### New Test Cases Developed
+
+| # | ISSUE (NEW FEATURE) | TEST DESCRIPTION | TEST DELIVERABLES | RESULTS | COMMENTS | 
+|---|---------------------------------------|----------------------------------------------------------------------|-------------------|---------|----------|
+| 1 | Webhook implementation | | | | |
+| 2 | Add time range filter for bulk API | | | | |
+| 3 | Bulk API error handling with New JSON | | | |
+| 4 | Database updates for authentication | New tests added [1307](https://github.com/kruize/autotune/pull/1307) | | | |
+| 5 | Datasource Exception handler | | | | |
+
+
+### Regression Testing
+
+| # | ISSUE (BUG/NEW FEATURE) | TEST CASE | RESULTS | COMMENTS |
+|---|-----------------------------------------------------|-----------|---------|----------|
+| 1 | Fix datasource missing issue on pod restart | | | |
+| 2 | Time_range fix | | | |
+| 3 | Environment variable error found in the log | | | |
+| 4 | Removed the autotune job from the PR check workflow | | | |
+
+---
+
+## SCALABILITY TESTING
+
+Evaluate Kruize scalability on OCP with 5k experiments by uploading 15 days of resource usage data and updating recommendations.
+Changes do not have scalability implications. 
A short scalability test will be run as part of the release testing.
+
+Short Scalability run
+- 5K exps / 15 days of results / 2 containers per exp
+- Kruize replicas - 10
+- OCP - Scalelab cluster
+
+| Kruize Release | Exps / Results / Recos | Execution time | Latency (Max/ Avg) in seconds | | | Postgres DB size(MB) | Kruize Max CPU | Kruize Max Memory (GB) |
+|----------------|------------------------|----------------|-------------------------------|---------------|----------------------|----------------------|----------------|------------------------|
+| | | | UpdateRecommendations | UpdateResults | LoadResultsByExpName | | | |
+| 0.1 | 5K / 72L / 3L | 5h 02 mins | 0.97 / 0.55 | 0.16 / 0.14 | 0.52 / 0.36 | 21757 | 7.3 | 33.67 |
+| 0.2 | 5K / 72L / 3L | | | | | | | |
+
+----
+## RELEASE TESTING
+
+As part of the release testing, the following tests will be executed:
+- [Kruize Remote monitoring Functional tests](/tests/scripts/remote_monitoring_tests/Remote_monitoring_tests.md)
+- [Fault tolerant test](/tests/scripts/remote_monitoring_tests/fault_tolerant_tests.md)
+- [Stress test](/tests/scripts/remote_monitoring_tests/README.md)
+- [DB Migration test](/tests/scripts/remote_monitoring_tests/db_migration_test.md)
+- [Recommendation and box plot values validation test](https://github.com/kruize/kruize-demos/blob/main/monitoring/remote_monitoring_demo/recommendations_infra_demo/README.md)
+- [Scalability test (On openshift)](/tests/scripts/remote_monitoring_tests/scalability_test.md) - scalability test with 5000 exps / 15 days usage data
+- [Kruize remote monitoring demo (On minikube)](https://github.com/kruize/kruize-demos/blob/main/monitoring/remote_monitoring_demo/README.md)
+- [Kruize local monitoring demo (On openshift)](https://github.com/kruize/kruize-demos/blob/main/monitoring/local_monitoring_demo)
+- [Kruize local monitoring Functional tests](/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md)
+
+
+| # | TEST SUITE | EXPECTED RESULTS | ACTUAL 
RESULTS | COMMENTS | +|---|------------------------------------------------|-----------------------------------------|-----------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| 1 | Kruize Remote monitoring Functional testsuite | TOTAL - 359, PASSED - 316 / FAILED - 43 | TOTAL - 359, PASSED - 316 / FAILED - 43 | Intermittent issue seen [1281](https://github.com/kruize/autotune/issues/1281), existing issues - [559](https://github.com/kruize/autotune/issues/559), [610](https://github.com/kruize/autotune/issues/610) | +| 2 | Fault tolerant test | PASSED | | | +| 3 | Stress test | PASSED | | | +| 4 | Scalability test (short run) | | | | +| 5 | DB Migration test | PASSED | | | +| 6 | Recommendation and box plot values validations | PASSED | | | +| 7 | Kruize remote monitoring demo | PASSED | | | +| 8 | Kruize Local monitoring demo | PASSED | | | +| 9 | Kruize Local Functional tests | TOTAL - 78, PASSED - 75 / FAILED - 3 | | | + +--- + +## TEST METRICS + +### Test Completion Criteria + +* All must_fix defects identified for the release are fixed +* New features work as expected and tests have been added to validate these +* No new regressions in the functional tests +* All non-functional tests work as expected without major issues +* Documentation updates have been completed + +---- + +## RISKS AND CONTINGENCIES + +* None + +---- +## APPROVALS + +Sign-off + +---- + From 8bf74c84036b123d3e86f7dde39da902f8544baa Mon Sep 17 00:00:00 2001 From: Saad Khan Date: Tue, 26 Nov 2024 02:10:13 +0530 Subject: [PATCH 02/85] add filtration changes in the bulk API Signed-off-by: Saad Khan --- .../analyzer/services/DSMetadataService.java | 2 +- .../analyzer/workerimpl/BulkJobManager.java | 34 +++++++-- .../dataSourceQueries/DataSourceQueries.java | 8 +- .../common/datasource/DataSourceManager.java 
| 21 ++++-- .../DataSourceMetadataOperator.java | 75 ++++++++++++++++--- 5 files changed, 113 insertions(+), 27 deletions(-) diff --git a/src/main/java/com/autotune/analyzer/services/DSMetadataService.java b/src/main/java/com/autotune/analyzer/services/DSMetadataService.java index 4f786b419..f24bf190a 100644 --- a/src/main/java/com/autotune/analyzer/services/DSMetadataService.java +++ b/src/main/java/com/autotune/analyzer/services/DSMetadataService.java @@ -133,7 +133,7 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response) return; } - DataSourceMetadataInfo metadataInfo = dataSourceManager.importMetadataFromDataSource(datasource,"",0,0,0); + DataSourceMetadataInfo metadataInfo = dataSourceManager.importMetadataFromDataSource(datasource,"",0,0,0, null, null); // Validate imported metadataInfo object DataSourceMetadataValidation validationObject = new DataSourceMetadataValidation(); diff --git a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java index c4eb77237..ebf2a6aa4 100644 --- a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java +++ b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java @@ -50,6 +50,7 @@ import java.util.concurrent.Executors; import java.util.regex.Matcher; import java.util.regex.Pattern; +import java.util.stream.Collectors; import static com.autotune.operator.KruizeDeploymentInfo.bulk_thread_pool_size; import static com.autotune.utils.KruizeConstants.KRUIZE_BULK_API.*; @@ -121,8 +122,15 @@ public void run() { DataSourceMetadataInfo metadataInfo = null; DataSourceManager dataSourceManager = new DataSourceManager(); DataSourceInfo datasource = null; + String labelString = null; + Map includeResourcesMap = null; + Map excludeResourcesMap = null; try { - String labelString = getLabels(this.bulkInput.getFilter()); + if (this.bulkInput.getFilter() != null) { + labelString = 
getLabels(this.bulkInput.getFilter()); + includeResourcesMap = buildRegexFilters(this.bulkInput.getFilter().getInclude()); + excludeResourcesMap = buildRegexFilters(this.bulkInput.getFilter().getExclude()); + } if (null == this.bulkInput.getDatasource()) { this.bulkInput.setDatasource(CREATE_EXPERIMENT_CONFIG_BEAN.getDatasourceName()); } @@ -137,10 +145,13 @@ public void run() { } if (null != datasource) { JSONObject daterange = processDateRange(this.bulkInput.getTime_range()); - if (null != daterange) - metadataInfo = dataSourceManager.importMetadataFromDataSource(datasource, labelString, (Long) daterange.get(START_TIME), (Long) daterange.get(END_TIME), (Integer) daterange.get(STEPS)); + if (null != daterange) { + metadataInfo = dataSourceManager.importMetadataFromDataSource(datasource, labelString, (Long) daterange.get(START_TIME), + (Long) daterange.get(END_TIME), (Integer) daterange.get(STEPS), includeResourcesMap, excludeResourcesMap); + } else { - metadataInfo = dataSourceManager.importMetadataFromDataSource(datasource, labelString, 0, 0, 0); + metadataInfo = dataSourceManager.importMetadataFromDataSource(datasource, labelString, 0, 0, + 0, includeResourcesMap, excludeResourcesMap); } if (null == metadataInfo) { setFinalJobStatus(COMPLETED,String.valueOf(HttpURLConnection.HTTP_OK),NOTHING_INFO,datasource); @@ -314,7 +325,7 @@ private String getLabels(BulkInput.FilterWrapper filter) { String uniqueKey = null; try { // Process labels in the 'include' section - if (filter != null && filter.getInclude() != null) { + if (filter.getInclude() != null) { // Initialize StringBuilder for uniqueKey StringBuilder includeLabelsBuilder = new StringBuilder(); Map includeLabels = filter.getInclude().getLabels(); @@ -337,6 +348,19 @@ private String getLabels(BulkInput.FilterWrapper filter) { return uniqueKey; } + private Map buildRegexFilters(BulkInput.Filter filter) { + Map resourceFilters = new HashMap<>(); + if (filter != null) { + resourceFilters.put("namespaceRegex", 
filter.getNamespace() != null ? + filter.getNamespace().stream().map(String::trim).collect(Collectors.joining("|")) : ""); + resourceFilters.put("workloadRegex", filter.getWorkload() != null ? + filter.getWorkload().stream().map(String::trim).collect(Collectors.joining("|")) : ""); + resourceFilters.put("containerRegex", filter.getContainers() != null ? + filter.getContainers().stream().map(String::trim).collect(Collectors.joining("|")) : ""); + } + return resourceFilters; + } + private JSONObject processDateRange(BulkInput.TimeRange timeRange) { //TODO: add validations for the time range JSONObject dateRange = null; diff --git a/src/main/java/com/autotune/common/data/dataSourceQueries/DataSourceQueries.java b/src/main/java/com/autotune/common/data/dataSourceQueries/DataSourceQueries.java index a06d016aa..a889566c4 100644 --- a/src/main/java/com/autotune/common/data/dataSourceQueries/DataSourceQueries.java +++ b/src/main/java/com/autotune/common/data/dataSourceQueries/DataSourceQueries.java @@ -7,12 +7,12 @@ */ public class DataSourceQueries { public enum PromQLQuery { - NAMESPACE_QUERY("sum by (namespace) ( avg_over_time(kube_namespace_status_phase{namespace!=\"\" ADDITIONAL_LABEL}[15d]))"), - WORKLOAD_INFO_QUERY("sum by (namespace, workload, workload_type) ( avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!=\"\" ADDITIONAL_LABEL}[15d]))"), + NAMESPACE_QUERY("sum by (namespace) ( avg_over_time(kube_namespace_status_phase{%s ADDITIONAL_LABEL}[15d]))"), + WORKLOAD_INFO_QUERY("sum by (namespace, workload, workload_type) ( avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{%s ADDITIONAL_LABEL}[15d]))"), CONTAINER_INFO_QUERY("sum by (container, image, workload, workload_type, namespace) (" + - " avg_over_time(kube_pod_container_info{container!=\"\" ADDITIONAL_LABEL}[15d]) *" + + " avg_over_time(kube_pod_container_info{%s ADDITIONAL_LABEL}[15d]) *" + " on (pod, namespace,prometheus_replica) group_left(workload, workload_type)" + - " 
avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!=\"\" ADDITIONAL_LABEL}[15d])" + + " avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!~\"\" ADDITIONAL_LABEL}[15d])" + ")"); private final String query; diff --git a/src/main/java/com/autotune/common/datasource/DataSourceManager.java b/src/main/java/com/autotune/common/datasource/DataSourceManager.java index a8401970c..142f97c4b 100644 --- a/src/main/java/com/autotune/common/datasource/DataSourceManager.java +++ b/src/main/java/com/autotune/common/datasource/DataSourceManager.java @@ -62,15 +62,20 @@ public DataSourceManager() { * @param startTime Get metadata from starttime to endtime * @param endTime Get metadata from starttime to endtime * @param steps the interval between data points in a range query + * @param includeResources + * @param excludeResources * @return */ - public DataSourceMetadataInfo importMetadataFromDataSource(DataSourceInfo dataSourceInfo, String uniqueKey, long startTime, long endTime, int steps) throws DataSourceDoesNotExist, IOException, NoSuchAlgorithmException, KeyStoreException, KeyManagementException { + public DataSourceMetadataInfo importMetadataFromDataSource(DataSourceInfo dataSourceInfo, String uniqueKey, long startTime, + long endTime, int steps, Map includeResources, + Map excludeResources) throws DataSourceDoesNotExist, IOException, NoSuchAlgorithmException, KeyStoreException, KeyManagementException { if (null == dataSourceInfo) { throw new DataSourceDoesNotExist(KruizeConstants.DataSourceConstants.DataSourceErrorMsgs.MISSING_DATASOURCE_INFO); } - DataSourceMetadataInfo dataSourceMetadataInfo = dataSourceMetadataOperator.createDataSourceMetadata(dataSourceInfo, uniqueKey, startTime, endTime, steps); + DataSourceMetadataInfo dataSourceMetadataInfo = dataSourceMetadataOperator.createDataSourceMetadata(dataSourceInfo, + uniqueKey, startTime, endTime, steps, includeResources, excludeResources); if (null == dataSourceMetadataInfo) { - 
LOGGER.error(KruizeConstants.DataSourceConstants.DataSourceMetadataErrorMsgs.DATASOURCE_METADATA_INFO_NOT_AVAILABLE, "for datasource {}" + dataSourceInfo.getName()); + LOGGER.error(DATASOURCE_METADATA_INFO_NOT_AVAILABLE, "for datasource {}" + dataSourceInfo.getName()); return null; } return dataSourceMetadataInfo; @@ -91,7 +96,7 @@ public DataSourceMetadataInfo getMetadataFromDataSource(DataSourceInfo dataSourc String dataSourceName = dataSource.getName(); DataSourceMetadataInfo dataSourceMetadataInfo = dataSourceMetadataOperator.getDataSourceMetadataInfo(dataSource); if (null == dataSourceMetadataInfo) { - LOGGER.error(KruizeConstants.DataSourceConstants.DataSourceMetadataErrorMsgs.DATASOURCE_METADATA_INFO_NOT_AVAILABLE, "for datasource {}" + dataSourceName); + LOGGER.error(DATASOURCE_METADATA_INFO_NOT_AVAILABLE, "for datasource {}" + dataSourceName); return null; } return dataSourceMetadataInfo; @@ -116,9 +121,9 @@ public void updateMetadataFromDataSource(DataSourceInfo dataSource, DataSourceMe throw new DataSourceDoesNotExist(KruizeConstants.DataSourceConstants.DataSourceErrorMsgs.MISSING_DATASOURCE_INFO); } if (null == dataSourceMetadataInfo) { - throw new DataSourceDoesNotExist(KruizeConstants.DataSourceConstants.DataSourceMetadataErrorMsgs.DATASOURCE_METADATA_INFO_NOT_AVAILABLE); + throw new DataSourceDoesNotExist(DATASOURCE_METADATA_INFO_NOT_AVAILABLE); } - dataSourceMetadataOperator.updateDataSourceMetadata(dataSource, "", 0, 0, 0); + dataSourceMetadataOperator.updateDataSourceMetadata(dataSource, "", 0, 0, 0, null, null); } catch (Exception e) { LOGGER.error(e.getMessage()); } @@ -236,7 +241,7 @@ public DataSourceInfo fetchDataSourceFromDBByName(String dataSourceName) { DataSourceInfo datasource = new ExperimentDBService().loadDataSourceFromDBByName(dataSourceName); return datasource; } catch (Exception e) { - LOGGER.error(String.format(KruizeConstants.DataSourceConstants.DataSourceMetadataErrorMsgs.LOAD_DATASOURCE_FROM_DB_ERROR, dataSourceName, 
e.getMessage())); + LOGGER.error(String.format(LOAD_DATASOURCE_FROM_DB_ERROR, dataSourceName, e.getMessage())); } return null; } @@ -256,7 +261,7 @@ public DataSourceMetadataInfo fetchDataSourceMetadataFromDBByName(String dataSou DataSourceMetadataInfo metadataInfo = new ExperimentDBService().loadMetadataFromDBByName(dataSourceName, verbose); return metadataInfo; } catch (Exception e) { - LOGGER.error(String.format(KruizeConstants.DataSourceConstants.DataSourceMetadataErrorMsgs.LOAD_DATASOURCE_METADATA_FROM_DB_ERROR, dataSourceName, e.getMessage())); + LOGGER.error(String.format(LOAD_DATASOURCE_METADATA_FROM_DB_ERROR, dataSourceName, e.getMessage())); } return null; } diff --git a/src/main/java/com/autotune/common/datasource/DataSourceMetadataOperator.java b/src/main/java/com/autotune/common/datasource/DataSourceMetadataOperator.java index ff50be82d..b74b4c63e 100644 --- a/src/main/java/com/autotune/common/datasource/DataSourceMetadataOperator.java +++ b/src/main/java/com/autotune/common/datasource/DataSourceMetadataOperator.java @@ -31,7 +31,10 @@ import java.security.KeyManagementException; import java.security.KeyStoreException; import java.security.NoSuchAlgorithmException; +import java.util.Arrays; import java.util.HashMap; +import java.util.List; +import java.util.Map; import static com.autotune.analyzer.utils.AnalyzerConstants.ServiceConstants.CHARACTER_ENCODING; @@ -66,10 +69,14 @@ public static DataSourceMetadataOperator getInstance() { * @param startTime Get metadata from starttime to endtime * @param endTime Get metadata from starttime to endtime * @param steps the interval between data points in a range query - * TODO - support multiple data sources + * TODO - support multiple data sources + * @param includeResources + * @param excludeResources */ - public DataSourceMetadataInfo createDataSourceMetadata(DataSourceInfo dataSourceInfo, String uniqueKey, long startTime, long endTime, int steps) throws IOException, NoSuchAlgorithmException, 
KeyStoreException, KeyManagementException { - return processQueriesAndPopulateDataSourceMetadataInfo(dataSourceInfo, uniqueKey, startTime, endTime, steps); + public DataSourceMetadataInfo createDataSourceMetadata(DataSourceInfo dataSourceInfo, String uniqueKey, long startTime, + long endTime, int steps, Map includeResources, + Map excludeResources) throws IOException, NoSuchAlgorithmException, KeyStoreException, KeyManagementException { + return processQueriesAndPopulateDataSourceMetadataInfo(dataSourceInfo, uniqueKey, startTime, endTime, steps, includeResources, excludeResources); } /** @@ -111,8 +118,10 @@ public DataSourceMetadataInfo getDataSourceMetadataInfo(DataSourceInfo dataSourc * TODO - Currently Create and Update functions have identical functionalities, based on UI workflow and requirements * need to further enhance updateDataSourceMetadata() to support namespace, workload level granular updates */ - public DataSourceMetadataInfo updateDataSourceMetadata(DataSourceInfo dataSourceInfo, String uniqueKey, long startTime, long endTime, int steps) throws Exception { - return processQueriesAndPopulateDataSourceMetadataInfo(dataSourceInfo, uniqueKey, startTime, endTime, steps); + public DataSourceMetadataInfo updateDataSourceMetadata(DataSourceInfo dataSourceInfo, String uniqueKey, long startTime, + long endTime, int steps, Map includeResources, + Map excludeResources) throws Exception { + return processQueriesAndPopulateDataSourceMetadataInfo(dataSourceInfo, uniqueKey, startTime, endTime, steps, includeResources, excludeResources); } /** @@ -149,10 +158,15 @@ public void deleteDataSourceMetadata(DataSourceInfo dataSourceInfo) { * @param startTime Get metadata from starttime to endtime * @param endTime Get metadata from starttime to endtime * @param steps the interval between data points in a range query + * @param includeResources + * @param excludeResources * @return DataSourceMetadataInfo object with populated metadata fields * todo rename 
processQueriesAndFetchClusterMetadataInfo
      */
-    public DataSourceMetadataInfo processQueriesAndPopulateDataSourceMetadataInfo(DataSourceInfo dataSourceInfo, String uniqueKey, long startTime, long endTime, int steps) throws IOException, NoSuchAlgorithmException, KeyStoreException, KeyManagementException {
+    public DataSourceMetadataInfo processQueriesAndPopulateDataSourceMetadataInfo(DataSourceInfo dataSourceInfo, String uniqueKey,
+                                                                                  long startTime, long endTime, int steps,
+                                                                                  Map<String, String> includeResources,
+                                                                                  Map<String, String> excludeResources) throws IOException, NoSuchAlgorithmException, KeyStoreException, KeyManagementException {
         DataSourceMetadataHelper dataSourceDetailsHelper = new DataSourceMetadataHelper();
         /**
          * Get DataSourceOperatorImpl instance on runtime based on dataSource provider
          */
@@ -168,11 +182,26 @@ public DataSourceMetadataInfo processQueriesAndPopulateDataSourceMetadataInfo(Da
          * creating a comprehensive DataSourceMetadataInfo object that is then added to a list.
          * TODO - Process cluster metadata using a custom query
          */
+        // Keys for the map
+        List<String> fields = Arrays.asList("namespace", "workload", "container");
+        // Map for storing queries
+        Map<String, String> queries = new HashMap<>();
+
+        // Populate filters for each field; the include/exclude maps may be null
+        fields.forEach(field -> {
+            String includeRegex = includeResources == null ? "" : includeResources.getOrDefault(field + "Regex", "");
+            String excludeRegex = excludeResources == null ? "" : excludeResources.getOrDefault(field + "Regex", "");
+            String filter = constructDynamicFilter(field, includeRegex, excludeRegex);
+            String queryTemplate = getQueryTemplate(field); // Helper to map fields to PromQL queries
+            queries.put(field, String.format(queryTemplate, filter));
+        });
+
+        // Construct queries
+        String namespaceQuery = queries.get("namespace");
+        String workloadQuery = queries.get("workload");
+        String containerQuery = queries.get("container");
         String dataSourceName = dataSourceInfo.getName();
-        String namespaceQuery = PromQLDataSourceQueries.NAMESPACE_QUERY;
-        String workloadQuery = PromQLDataSourceQueries.WORKLOAD_QUERY;
-        
String containerQuery = PromQLDataSourceQueries.CONTAINER_QUERY; if (null != uniqueKey && !uniqueKey.isEmpty()) { LOGGER.debug("uniquekey: {}", uniqueKey); namespaceQuery = namespaceQuery.replace(KruizeConstants.KRUIZE_BULK_API.ADDITIONAL_LABEL, "," + uniqueKey); @@ -244,6 +273,34 @@ public DataSourceMetadataInfo processQueriesAndPopulateDataSourceMetadataInfo(Da } + // Helper function to map fields to query templates + private String getQueryTemplate(String field) { + return switch (field) { + case "namespace" -> PromQLDataSourceQueries.NAMESPACE_QUERY; + case "workload" -> PromQLDataSourceQueries.WORKLOAD_QUERY; + case "container" -> PromQLDataSourceQueries.CONTAINER_QUERY; + default -> throw new IllegalArgumentException("Unknown field: " + field); + }; + } + + String constructDynamicFilter(String field, String includeRegex, String excludeRegex) { + StringBuilder filterBuilder = new StringBuilder(); + if (includeRegex.isEmpty() && excludeRegex.isEmpty()) { + filterBuilder.append(String.format("%s!=''", field)); + } + if (!includeRegex.isEmpty()) { + filterBuilder.append(String.format("%s=~\"%s\"", field, includeRegex)); + } + if (!excludeRegex.isEmpty()) { + if (!filterBuilder.isEmpty()) { + filterBuilder.append(","); + } + filterBuilder.append(String.format("%s!~\"%s\"", field, excludeRegex)); + } + LOGGER.info("filterBuilder: {}", filterBuilder); + return filterBuilder.toString(); + } + private JsonArray fetchQueryResults(DataSourceInfo dataSourceInfo, String query, long startTime, long endTime, int steps) throws IOException, NoSuchAlgorithmException, KeyStoreException, KeyManagementException { GenericRestApiClient client = new GenericRestApiClient(dataSourceInfo); String metricsUrl; From 2593473d642343bc69e395af95197d1fa910397e Mon Sep 17 00:00:00 2001 From: Shreya Date: Wed, 27 Nov 2024 19:24:33 +0530 Subject: [PATCH 03/85] Update test description and results Signed-off-by: Shreya --- tests/test_plans/test_plan_rel_0.2.md | 54 ++++++++++++++------------- 1 
file changed, 28 insertions(+), 26 deletions(-) diff --git a/tests/test_plans/test_plan_rel_0.2.md b/tests/test_plans/test_plan_rel_0.2.md index 420b2066d..ea3b969cd 100644 --- a/tests/test_plans/test_plan_rel_0.2.md +++ b/tests/test_plans/test_plan_rel_0.2.md @@ -40,6 +40,7 @@ This document describes the test plan for Kruize remote monitoring release 0.2 * Fix datasource missing issue on pod restart * Environment variable error found in the log * Removed the autotune job from the PR check workflow +* Ensure access to the variable is synchronized --- @@ -54,23 +55,24 @@ This document describes the test plan for Kruize remote monitoring release 0.2 ### New Test Cases Developed -| # | ISSUE (NEW FEATURE) | TEST DESCRIPTION | TEST DELIVERABLES | RESULTS | COMMENTS | -|---|---------------------------------------|----------------------------------------------------------------------|-------------------|---------|----------| -| 1 | Webhook implementation | | | | | -| 2 | Add time range filter for bulk API | | | | | -| 3 | Bulk API error handling with New JSON | | | | -| 4 | Database updates for authentication | New tests added [1307](https://github.com/kruize/autotune/pull/1307) | | | | -| 5 | Datasource Exception handler | | | | | - +| # | ISSUE (NEW FEATURE) | TEST DESCRIPTION | TEST DELIVERABLES | RESULTS | COMMENTS | +|---|---------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------|---------|-------------------------------------------------------------------| +| 1 | Webhook implementation | Tests will be added later, tested using the Bulk demo | | PASSED | | +| 2 | Add time range filter for bulk API | Tests will be added later, tested using the Bulk demo | | PASSED | | +| 3 | Bulk API error handling with New JSON | Tests will be added later, tested using the Bulk demo | | PASSED | 
+| 4 | Database updates for authentication | [New tests added](https://github.com/kruize/autotune/blob/mvp_demo/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md#authentication-test) | [1307](https://github.com/kruize/autotune/pull/1307) | PASSED | | +| 5 | Datasource Exception handler | Regression testing | | | Issue seen [1395](https://github.com/kruize/autotune/issues/1395) | +| 6 | Bulk test cases | [New tests added](https://github.com/kruize/autotune/blob/master/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md#bulk-api-tests) | [1372](https://github.com/kruize/autotune/pull/1372) | PASSED | | ### Regression Testing -| # | ISSUE (BUG/NEW FEATURE) | TEST CASE | RESULTS | COMMENTS | -|---|-----------------------------------------------------|-----------|---------|----------| -| 1 | Fix datasource missing issue on pod restart | | | | -| 2 | Time_range fix | | | | -| 3 | Environment variable error found in the log | | | | -| 4 | Removed the autotune job from the PR check workflow | | | | +| # | ISSUE (BUG/NEW FEATURE) | TEST CASE | RESULTS | COMMENTS | +|---|-----------------------------------------------------|----------------------------------------------------------|---------|----------| +| 1 | Fix datasource missing issue on pod restart | Kruize local monitoring Bulk demo with thanos datasource | PASSED | | +| 2 | Time_range fix | Kruize local monitoring bulk service demo | PASSED | | +| 3 | Environment variable error found in the log | Kruize local monitoring functional tests | PASSED | | +| 4 | Removed the autotune job from the PR check workflow | Autotune PR check is removed from github workflows | PASSED | | +| 5 | Ensure access to the variable is synchronized | Kruize local monitoring bulk service demo | PASSED | | --- @@ -88,7 +90,7 @@ Short Scalability run 
|----------------|------------------------|----------------|-------------------------------|---------------|----------------------|----------------------|----------------|------------------------| | | | | UpdateRecommendations | UpdateResults | LoadResultsByExpName | | | | | 0.1 | 5K / 72L / 3L | 5h 02 mins | 0.97 / 0.55 | 0.16 / 0.14 | 0.52 / 0.36 | 21757 | 7.3 | 33.67 | -| 0.2 | 5K / 72L / 3L | | | | | | | | +| 0.2 | 5K / 72L / 3L | 4h 08 mins | 0.81 / 0.48 | 0.14 / 0.12 | 0.55 / 0.38 | 21749 | 4.78 | 25.31 GB | ---- ## RELEASE TESTING @@ -105,17 +107,17 @@ As part of the release testing, following tests will be executed: - [Kruize local monitoring Functional tests](/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md) -| # | TEST SUITE | EXPECTED RESULTS | ACTUAL RESULTS | COMMENTS | -|---|------------------------------------------------|-----------------------------------------|-----------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| 1 | Kruize Remote monitoring Functional testsuite | TOTAL - 359, PASSED - 316 / FAILED - 43 | TOTAL - 359, PASSED - 316 / FAILED - 43 | Intermittent issue seen [1281](https://github.com/kruize/autotune/issues/1281), existing issues - [559](https://github.com/kruize/autotune/issues/559), [610](https://github.com/kruize/autotune/issues/610) | -| 2 | Fault tolerant test | PASSED | | | -| 3 | Stress test | PASSED | | | -| 4 | Scalability test (short run) | | | | -| 5 | DB Migration test | PASSED | | | -| 6 | Recommendation and box plot values validations | PASSED | | | -| 7 | Kruize remote monitoring demo | PASSED | | | -| 8 | Kruize Local monitoring demo | PASSED | | | -| 9 | Kruize Local Functional tests | TOTAL - 78, PASSED - 75 / FAILED - 3 | | | +| # | TEST SUITE | EXPECTED RESULTS | ACTUAL RESULTS | COMMENTS | 
+|---|------------------------------------------------|-----------------------------------------|-----------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| 1 | Kruize Remote monitoring Functional testsuite | TOTAL - 359, PASSED - 316 / FAILED - 43 | TOTAL - 359, PASSED - 316 / FAILED - 43 | Intermittent issue seen [1281](https://github.com/kruize/autotune/issues/1281), [1393](https://github.com/kruize/autotune/issues/1393), existing issues - [559](https://github.com/kruize/autotune/issues/559), [610](https://github.com/kruize/autotune/issues/610) | +| 2 | Fault tolerant test | PASSED | PASSED | | +| 3 | Stress test | PASSED | | | +| 4 | Scalability test (short run) | PASSED | PASSED | Exps - 5000, Results - 72000, execution time - 4 hours, 8 mins | +| 5 | DB Migration test | PASSED | PASSED | | +| 6 | Recommendation and box plot values validations | PASSED | | | +| 7 | Kruize remote monitoring demo | PASSED | PASSED | | +| 8 | Kruize Local monitoring demo | PASSED | PASSED | | +| 9 | Kruize Local Functional tests | TOTAL - 81, PASSED - 78 / FAILED - 3 | TOTAL - 81, PASSED - 61 / FAILED - 20 | Intermittent issue seen [1395](https://github.com/kruize/autotune/issues/1395) | --- From ba8d7d8cfe36a1471bdda73393b763fc8c93732c Mon Sep 17 00:00:00 2001 From: Shreya Date: Thu, 28 Nov 2024 12:17:20 +0530 Subject: [PATCH 04/85] Update stress and validation test results --- tests/test_plans/test_plan_rel_0.2.md | 38 ++++++++++++++------------- 1 file changed, 20 insertions(+), 18 deletions(-) diff --git a/tests/test_plans/test_plan_rel_0.2.md b/tests/test_plans/test_plan_rel_0.2.md index ea3b969cd..5c5ced240 100644 --- a/tests/test_plans/test_plan_rel_0.2.md +++ b/tests/test_plans/test_plan_rel_0.2.md @@ -30,6 +30,7 @@ This document describes the test plan for Kruize remote monitoring 
release 0.2 * Add test script for authentication * Bulk test cases * Datasource Exception handler +* Update kruize default cpu/memory resources for openshift ------ @@ -55,14 +56,15 @@ This document describes the test plan for Kruize remote monitoring release 0.2 ### New Test Cases Developed -| # | ISSUE (NEW FEATURE) | TEST DESCRIPTION | TEST DELIVERABLES | RESULTS | COMMENTS | -|---|---------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------|---------|-------------------------------------------------------------------| -| 1 | Webhook implementation | Tests will be added later, tested using the Bulk demo | | PASSED | | -| 2 | Add time range filter for bulk API | Tests will be added later, tested using the Bulk demo | | PASSED | | -| 3 | Bulk API error handling with New JSON | Tests will be added later, tested using the Bulk demo | | PASSED | -| 4 | Database updates for authentication | [New tests added](https://github.com/kruize/autotune/blob/mvp_demo/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md#authentication-test) | [1307](https://github.com/kruize/autotune/pull/1307) | PASSED | | -| 5 | Datasource Exception handler | Regression testing | | | Issue seen [1395](https://github.com/kruize/autotune/issues/1395) | -| 6 | Bulk test cases | [New tests added](https://github.com/kruize/autotune/blob/master/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md#bulk-api-tests) | [1372](https://github.com/kruize/autotune/pull/1372) | PASSED | | +| # | ISSUE (NEW FEATURE) | TEST DESCRIPTION | TEST DELIVERABLES | RESULTS | COMMENTS | 
+|---|----------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------|---------|--------------------------------------------------------------------------| +| 1 | Webhook implementation | Tests will be added later, tested using the Bulk demo | | PASSED | | +| 2 | Add time range filter for bulk API | Tests will be added later, tested using the Bulk demo | | PASSED | | +| 3 | Bulk API error handling with New JSON | Tests will be added later, tested using the Bulk demo | | PASSED | +| 4 | Database updates for authentication | [New tests added](https://github.com/kruize/autotune/blob/mvp_demo/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md#authentication-test) | [1307](https://github.com/kruize/autotune/pull/1307) | PASSED | | +| 5 | Datasource Exception handler | Regression testing | | | Issue seen [1395](https://github.com/kruize/autotune/issues/1395) | +| 6 | Bulk test cases | [New tests added](https://github.com/kruize/autotune/blob/master/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md#bulk-api-tests) | [1372](https://github.com/kruize/autotune/pull/1372) | PASSED | | +| 7 | Update kruize default cpu/memory resources for openshift | Tested with existing remote monitoring tests | | FAILED | DB Shutdown issue [1393](https://github.com/kruize/autotune/issues/1393) | ### Regression Testing @@ -107,17 +109,17 @@ As part of the release testing, following tests will be executed: - [Kruize local monitoring Functional tests](/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md) -| # | TEST SUITE | EXPECTED RESULTS | ACTUAL RESULTS | COMMENTS | 
-|---|------------------------------------------------|-----------------------------------------|-----------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| # | TEST SUITE | EXPECTED RESULTS | ACTUAL RESULTS | COMMENTS | +|---|------------------------------------------------|-----------------------------------------|-----------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | 1 | Kruize Remote monitoring Functional testsuite | TOTAL - 359, PASSED - 316 / FAILED - 43 | TOTAL - 359, PASSED - 316 / FAILED - 43 | Intermittent issue seen [1281](https://github.com/kruize/autotune/issues/1281), [1393](https://github.com/kruize/autotune/issues/1393), existing issues - [559](https://github.com/kruize/autotune/issues/559), [610](https://github.com/kruize/autotune/issues/610) | -| 2 | Fault tolerant test | PASSED | PASSED | | -| 3 | Stress test | PASSED | | | -| 4 | Scalability test (short run) | PASSED | PASSED | Exps - 5000, Results - 72000, execution time - 4 hours, 8 mins | -| 5 | DB Migration test | PASSED | PASSED | | -| 6 | Recommendation and box plot values validations | PASSED | | | -| 7 | Kruize remote monitoring demo | PASSED | PASSED | | -| 8 | Kruize Local monitoring demo | PASSED | PASSED | | -| 9 | Kruize Local Functional tests | TOTAL - 81, PASSED - 78 / FAILED - 3 | TOTAL - 81, PASSED - 61 / FAILED - 20 | Intermittent issue seen [1395](https://github.com/kruize/autotune/issues/1395) | +| 2 | Fault tolerant test | PASSED | PASSED | | +| 3 | Stress test | PASSED | FAILED | [Intermittent 
failure](https://github.com/kruize/autotune/issues/1106) | +| 4 | Scalability test (short run) | PASSED | PASSED | Exps - 5000, Results - 72000, execution time - 4 hours, 8 mins | +| 5 | DB Migration test | PASSED | PASSED | Tested on openshift | +| 6 | Recommendation and box plot values validations | PASSED | PASSED | Tested on minikube | +| 7 | Kruize remote monitoring demo | PASSED | PASSED | Tested manually | +| 8 | Kruize Local monitoring demo | PASSED | PASSED | | +| 9 | Kruize Local Functional tests | TOTAL - 81, PASSED - 78 / FAILED - 3 | TOTAL - 81, PASSED - 61 / FAILED - 20 | Intermittent issue seen [1395](https://github.com/kruize/autotune/issues/1395) | --- From f841236efb949dff3290e545a061211205dfc0b6 Mon Sep 17 00:00:00 2001 From: Shreya Date: Thu, 28 Nov 2024 13:46:56 +0530 Subject: [PATCH 05/85] Update results --- tests/test_plans/test_plan_rel_0.2.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/tests/test_plans/test_plan_rel_0.2.md b/tests/test_plans/test_plan_rel_0.2.md index 5c5ced240..499155ccb 100644 --- a/tests/test_plans/test_plan_rel_0.2.md +++ b/tests/test_plans/test_plan_rel_0.2.md @@ -56,15 +56,15 @@ This document describes the test plan for Kruize remote monitoring release 0.2 ### New Test Cases Developed -| # | ISSUE (NEW FEATURE) | TEST DESCRIPTION | TEST DELIVERABLES | RESULTS | COMMENTS | -|---|----------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------|---------|--------------------------------------------------------------------------| -| 1 | Webhook implementation | Tests will be added later, tested using the Bulk demo | | PASSED | | -| 2 | Add time range filter for bulk API | Tests will be added later, tested using the Bulk demo | | PASSED | | +| # | ISSUE (NEW FEATURE) | TEST 
DESCRIPTION | TEST DELIVERABLES | RESULTS | COMMENTS | +|---|----------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------|---------|-----------------------------------------------------------------------------------------------------------------------------------------------| +| 1 | Webhook implementation | Tests will be added later, tested using the Bulk demo | | PASSED | | +| 2 | Add time range filter for bulk API | Tests will be added later, tested using the Bulk demo | | PASSED | | | 3 | Bulk API error handling with New JSON | Tests will be added later, tested using the Bulk demo | | PASSED | -| 4 | Database updates for authentication | [New tests added](https://github.com/kruize/autotune/blob/mvp_demo/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md#authentication-test) | [1307](https://github.com/kruize/autotune/pull/1307) | PASSED | | -| 5 | Datasource Exception handler | Regression testing | | | Issue seen [1395](https://github.com/kruize/autotune/issues/1395) | -| 6 | Bulk test cases | [New tests added](https://github.com/kruize/autotune/blob/master/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md#bulk-api-tests) | [1372](https://github.com/kruize/autotune/pull/1372) | PASSED | | -| 7 | Update kruize default cpu/memory resources for openshift | Tested with existing remote monitoring tests | | FAILED | DB Shutdown issue [1393](https://github.com/kruize/autotune/issues/1393) | +| 4 | Database updates for authentication | [New tests added](https://github.com/kruize/autotune/blob/mvp_demo/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md#authentication-test) | [1307](https://github.com/kruize/autotune/pull/1307) | PASSED | | +| 5 | Datasource Exception handler | Regression testing | | | Issue seen 
[1395](https://github.com/kruize/autotune/issues/1395) | +| 6 | Bulk test cases | [New tests added](https://github.com/kruize/autotune/blob/master/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md#bulk-api-tests) | [1372](https://github.com/kruize/autotune/pull/1372) | PASSED | | +| 7 | Update kruize default cpu/memory resources for openshift | Tested with existing remote monitoring tests | | PASSED | [DB Shutdown issue](https://github.com/kruize/autotune/issues/1393), tests passed after restoring resource config for remote monitoring tests | ### Regression Testing @@ -119,7 +119,7 @@ As part of the release testing, following tests will be executed: | 6 | Recommendation and box plot values validations | PASSED | PASSED | Tested on minikube | | 7 | Kruize remote monitoring demo | PASSED | PASSED | Tested manually | | 8 | Kruize Local monitoring demo | PASSED | PASSED | | -| 9 | Kruize Local Functional tests | TOTAL - 81, PASSED - 78 / FAILED - 3 | TOTAL - 81, PASSED - 61 / FAILED - 20 | Intermittent issue seen [1395](https://github.com/kruize/autotune/issues/1395) | +| 9 | Kruize Local Functional tests | TOTAL - 81, PASSED - 78 / FAILED - 3 | TOTAL - 81, PASSED - 61 / FAILED - 20 | [Issue 1395](https://github.com/kruize/autotune/issues/1395), [Issue 1217](https://github.com/kruize/autotune/issues/1217), [Issue 1273](https://github.com/kruize/autotune/issues/1273) GPU accelerator test failed, failure can be ignored for now | --- From 93bdb9642648a07ef44f99617556474b85d7dbb5 Mon Sep 17 00:00:00 2001 From: Saad Khan Date: Fri, 29 Nov 2024 12:12:37 +0530 Subject: [PATCH 06/85] fix issue with duplicate experiment_id while running parallel experiments with time-range Signed-off-by: Saad Khan --- src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java index 
ac37fc50e..6a1215216 100644 --- a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java +++ b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java @@ -58,7 +58,7 @@ public class ExperimentDAOImpl implements ExperimentDAO { private static final Logger LOGGER = LoggerFactory.getLogger(ExperimentDAOImpl.class); @Override - public ValidationOutputData addExperimentToDB(KruizeExperimentEntry kruizeExperimentEntry) { + public synchronized ValidationOutputData addExperimentToDB(KruizeExperimentEntry kruizeExperimentEntry) { ValidationOutputData validationOutputData = new ValidationOutputData(false, null, null); Transaction tx = null; String statusValue = "failure"; From c0e0d2c774a03ff18fe31feb43aaf40b52550374 Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Fri, 29 Nov 2024 12:49:45 +0530 Subject: [PATCH 07/85] updating protocol to TLSv1.2 Signed-off-by: Shekhar Saxena --- src/main/java/com/autotune/utils/HttpUtils.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/main/java/com/autotune/utils/HttpUtils.java b/src/main/java/com/autotune/utils/HttpUtils.java index 16858ff2a..ef92a089b 100644 --- a/src/main/java/com/autotune/utils/HttpUtils.java +++ b/src/main/java/com/autotune/utils/HttpUtils.java @@ -106,7 +106,7 @@ public void checkServerTrusted(X509Certificate[] certs, String authType) { } SSLContext sslContext = null; try { - sslContext = SSLContext.getInstance("SSL"); + sslContext = SSLContext.getInstance("TLSv1.2"); sslContext.init(null, dummyTrustManager, new java.security.SecureRandom()); } catch (NoSuchAlgorithmException | KeyManagementException e) { e.printStackTrace(); From 5a8b77ae51800c86be2e2ca73cd87d46a2cf0671 Mon Sep 17 00:00:00 2001 From: Saad Khan Date: Mon, 2 Dec 2024 20:24:15 +0530 Subject: [PATCH 08/85] upgrade jetty-http dependency version Signed-off-by: Saad Khan --- pom.xml | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/pom.xml b/pom.xml index b76cb9812..d1e8d0311 100644 --- 
a/pom.xml
+++ b/pom.xml
@@ -11,6 +11,7 @@
         <fabric8-version>4.13.2</fabric8-version>
         20240303
         <jetty-version>10.0.24</jetty-version>
+        <jetty-http-version>12.0.12</jetty-http-version>
         2.17.1
         17
         0.14.1
@@ -84,6 +85,12 @@
         <dependency>
             <groupId>org.eclipse.jetty</groupId>
             <artifactId>jetty-server</artifactId>
             <version>${jetty-version}</version>
+            <exclusions>
+                <exclusion>
+                    <groupId>org.eclipse.jetty</groupId>
+                    <artifactId>jetty-http</artifactId>
+                </exclusion>
+            </exclusions>
         </dependency>
@@ -91,6 +98,19 @@
         <dependency>
             <groupId>org.eclipse.jetty</groupId>
             <artifactId>jetty-servlets</artifactId>
             <version>${jetty-version}</version>
+            <exclusions>
+                <exclusion>
+                    <groupId>org.eclipse.jetty</groupId>
+                    <artifactId>jetty-http</artifactId>
+                </exclusion>
+            </exclusions>
         </dependency>
+
+        <dependency>
+            <groupId>org.eclipse.jetty</groupId>
+            <artifactId>jetty-http</artifactId>
+            <version>${jetty-http-version}</version>
+        </dependency>

From 8b4ba32e49bd935fffdb215d2ff9dd31181dca56 Mon Sep 17 00:00:00 2001
From: Saad Khan
Date: Tue, 3 Dec 2024 22:57:15 +0530
Subject: [PATCH 09/85] Update dependency version to fix the test failures

Signed-off-by: Saad Khan
---
 pom.xml                                       | 30 ++++---------------
 src/main/java/com/autotune/Autotune.java      |  4 +--
 .../java/com/autotune/analyzer/Analyzer.java  |  2 +-
 .../exceptions/KruizeErrorHandler.java        |  4 +--
 .../core/ExperimentManager.java               |  2 +-
 .../utils/filter/KruizeCORSFilter.java        |  4 +--
 6 files changed, 13 insertions(+), 33 deletions(-)

diff --git a/pom.xml b/pom.xml
index d1e8d0311..446f185dc 100644
--- a/pom.xml
+++ b/pom.xml
@@ -10,8 +10,7 @@
         <fabric8-version>4.13.2</fabric8-version>
         20240303
-        <jetty-version>10.0.24</jetty-version>
-        <jetty-http-version>12.0.12</jetty-http-version>
+        <jetty-version>12.0.12</jetty-version>
         2.17.1
         17
         0.14.1
@@ -85,38 +84,19 @@
             <groupId>org.eclipse.jetty</groupId>
             <artifactId>jetty-server</artifactId>
             <version>${jetty-version}</version>
-            <exclusions>
-                <exclusion>
-                    <groupId>org.eclipse.jetty</groupId>
-                    <artifactId>jetty-http</artifactId>
-                </exclusion>
-            </exclusions>
         </dependency>

         <dependency>
-            <groupId>org.eclipse.jetty</groupId>
-            <artifactId>jetty-servlets</artifactId>
+            <groupId>org.eclipse.jetty.ee8</groupId>
+            <artifactId>jetty-ee8-servlets</artifactId>
             <version>${jetty-version}</version>
-            <exclusions>
-                <exclusion>
-                    <groupId>org.eclipse.jetty</groupId>
-                    <artifactId>jetty-http</artifactId>
-                </exclusion>
-            </exclusions>
-        </dependency>
-
-        <dependency>
-            <groupId>org.eclipse.jetty</groupId>
-            <artifactId>jetty-http</artifactId>
-            <version>${jetty-http-version}</version>
-        </dependency>

         <dependency>
-            <groupId>org.eclipse.jetty</groupId>
-            <artifactId>jetty-servlet</artifactId>
+            <groupId>org.eclipse.jetty.ee8</groupId>
+            <artifactId>jetty-ee8-servlet</artifactId>
             <version>${jetty-version}</version>

diff --git a/src/main/java/com/autotune/Autotune.java b/src/main/java/com/autotune/Autotune.java
index cd5725b81..32af70950 100644
--- a/src/main/java/com/autotune/Autotune.java
+++ b/src/main/java/com/autotune/Autotune.java
@@ -45,8 +45,8 @@
 import org.apache.logging.log4j.core.config.Configurator;
 import org.eclipse.jetty.server.Server;
 import org.eclipse.jetty.server.ServerConnector;
-import org.eclipse.jetty.servlet.ServletContextHandler;
-import
org.eclipse.jetty.servlet.ServletHolder; +import org.eclipse.jetty.ee8.servlet.ServletContextHandler; +import org.eclipse.jetty.ee8.servlet.ServletHolder; import org.eclipse.jetty.util.thread.QueuedThreadPool; import org.hibernate.Session; import org.hibernate.SessionFactory; diff --git a/src/main/java/com/autotune/analyzer/Analyzer.java b/src/main/java/com/autotune/analyzer/Analyzer.java index 0c2cea55b..12a0ab262 100644 --- a/src/main/java/com/autotune/analyzer/Analyzer.java +++ b/src/main/java/com/autotune/analyzer/Analyzer.java @@ -21,7 +21,7 @@ import com.autotune.operator.KruizeDeploymentInfo; import com.autotune.operator.KruizeOperator; import com.autotune.utils.ServerContext; -import org.eclipse.jetty.servlet.ServletContextHandler; +import org.eclipse.jetty.ee8.servlet.ServletContextHandler; public class Analyzer { public static void start(ServletContextHandler contextHandler) { diff --git a/src/main/java/com/autotune/analyzer/exceptions/KruizeErrorHandler.java b/src/main/java/com/autotune/analyzer/exceptions/KruizeErrorHandler.java index 1de629485..042c727cb 100644 --- a/src/main/java/com/autotune/analyzer/exceptions/KruizeErrorHandler.java +++ b/src/main/java/com/autotune/analyzer/exceptions/KruizeErrorHandler.java @@ -21,8 +21,8 @@ import com.autotune.analyzer.utils.GsonUTCDateAdapter; import com.google.gson.Gson; import com.google.gson.GsonBuilder; -import org.eclipse.jetty.server.Request; -import org.eclipse.jetty.servlet.ErrorPageErrorHandler; +import org.eclipse.jetty.ee8.nested.Request; +import org.eclipse.jetty.ee8.servlet.ErrorPageErrorHandler; import org.slf4j.Logger; import org.slf4j.LoggerFactory; diff --git a/src/main/java/com/autotune/experimentManager/core/ExperimentManager.java b/src/main/java/com/autotune/experimentManager/core/ExperimentManager.java index bde457660..f1210bda4 100644 --- a/src/main/java/com/autotune/experimentManager/core/ExperimentManager.java +++ b/src/main/java/com/autotune/experimentManager/core/ExperimentManager.java 
@@ -9,7 +9,7 @@ import com.autotune.utils.ServerContext; -import org.eclipse.jetty.servlet.ServletContextHandler; +import org.eclipse.jetty.ee8.servlet.ServletContextHandler; import org.slf4j.Logger; import org.slf4j.LoggerFactory; diff --git a/src/main/java/com/autotune/utils/filter/KruizeCORSFilter.java b/src/main/java/com/autotune/utils/filter/KruizeCORSFilter.java index d415e312e..264ef9182 100644 --- a/src/main/java/com/autotune/utils/filter/KruizeCORSFilter.java +++ b/src/main/java/com/autotune/utils/filter/KruizeCORSFilter.java @@ -17,8 +17,8 @@ package com.autotune.utils.filter; import com.autotune.utils.KruizeConstants; -import org.eclipse.jetty.servlet.FilterHolder; -import org.eclipse.jetty.servlets.CrossOriginFilter; +import org.eclipse.jetty.ee8.servlet.FilterHolder; +import org.eclipse.jetty.ee8.servlets.CrossOriginFilter; import java.util.HashMap; import java.util.Map; From 3aeb4d40523976cf7e142c58194267fee4028879 Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Thu, 5 Dec 2024 22:32:37 +0530 Subject: [PATCH 10/85] updating manifest with vpa rolebindings Signed-off-by: Shekhar Saxena --- .../minikube/kruize-crc-minikube.yaml | 52 +++++++++++++++++++ .../openshift/kruize-crc-openshift.yaml | 52 +++++++++++++++++++ 2 files changed, 104 insertions(+) diff --git a/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml b/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml index 66d0b8733..a0996e497 100644 --- a/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml +++ b/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml @@ -1,3 +1,55 @@ +kind: ClusterRole +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: kruize-recommendation-updater +rules: + - apiGroups: + - "" + resources: + - pods + - customresourcedefinitions + verbs: + - '*' + - apiGroups: + - apiextensions.k8s.io + resources: + - customresourcedefinitions + verbs: + - '*' + 
- apiGroups: + - autoscaling.k8s.io + resources: + - verticalpodautoscalers + - verticalpodautoscalers/status + - verticalpodautoscalercheckpoints + verbs: + - '*' + - apiGroups: + - rbac.authorization.k8s.io + resources: + - clusterrolebindings + verbs: + - '*' + - apiGroups: + - apps + resources: + - deployments + verbs: + - "*" +--- +kind: ClusterRoleBinding +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: kruize-recommendation-updater-crb +subjects: + - kind: ServiceAccount + name: default + namespace: monitoring +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: kruize-recommendation-updater +--- apiVersion: v1 kind: PersistentVolume metadata: diff --git a/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml b/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml index 2deb3b954..c07afaae8 100644 --- a/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml +++ b/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml @@ -9,6 +9,58 @@ metadata: name: kruize-sa namespace: openshift-tuning --- +kind: ClusterRole +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: kruize-recommendation-updater +rules: + - apiGroups: + - "" + resources: + - pods + - customresourcedefinitions + verbs: + - '*' + - apiGroups: + - apiextensions.k8s.io + resources: + - customresourcedefinitions + verbs: + - '*' + - apiGroups: + - autoscaling.k8s.io + resources: + - verticalpodautoscalers + - verticalpodautoscalers/status + - verticalpodautoscalercheckpoints + verbs: + - '*' + - apiGroups: + - rbac.authorization.k8s.io + resources: + - clusterrolebindings + verbs: + - '*' + - apiGroups: + - apps + resources: + - deployments + verbs: + - "*" +--- +kind: ClusterRoleBinding +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: kruize-recommendation-updater-crb +subjects: + - kind: ServiceAccount + name: kruize-sa + namespace: 
openshift-tuning +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: kruize-recommendation-updater +--- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: From 5c2972d606c98d09d18bec6187a2e56959bfccdd Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Thu, 5 Dec 2024 23:52:13 +0530 Subject: [PATCH 11/85] updating fabric8 version Signed-off-by: Shekhar Saxena --- pom.xml | 8 +++- .../PerformanceProfilesDeployment.java | 3 +- .../service/impl/KubernetesServicesImpl.java | 37 ++++++++++--------- .../transitions/TransitionToCreateConfig.java | 14 ++++++- .../com/autotune/operator/KruizeOperator.java | 5 ++- 5 files changed, 45 insertions(+), 22 deletions(-) diff --git a/pom.xml b/pom.xml index 446f185dc..62eb4bc65 100644 --- a/pom.xml +++ b/pom.xml @@ -8,7 +8,7 @@ autotune 0.2 - 4.13.2 + 6.13.4 20240303 12.0.12 2.17.1 @@ -72,6 +72,12 @@ ${fabric8-version} + + io.fabric8 + verticalpodautoscaler-client + ${fabric8-version} + + org.json diff --git a/src/main/java/com/autotune/analyzer/performanceProfiles/PerformanceProfilesDeployment.java b/src/main/java/com/autotune/analyzer/performanceProfiles/PerformanceProfilesDeployment.java index a8aebcb2c..50ce90d25 100644 --- a/src/main/java/com/autotune/analyzer/performanceProfiles/PerformanceProfilesDeployment.java +++ b/src/main/java/com/autotune/analyzer/performanceProfiles/PerformanceProfilesDeployment.java @@ -11,6 +11,7 @@ import com.google.gson.Gson; import io.fabric8.kubernetes.client.KubernetesClientException; import io.fabric8.kubernetes.client.Watcher; +import io.fabric8.kubernetes.client.WatcherException; import org.json.JSONException; import org.json.JSONObject; import org.slf4j.Logger; @@ -65,7 +66,7 @@ public void eventReceived(Action action, String resource) { } @Override - public void onClose(KubernetesClientException e) { } + public void onClose(WatcherException e) { } }; KubernetesServices kubernetesServices = new KubernetesServicesImpl(); diff --git 
a/src/main/java/com/autotune/common/target/kubernetes/service/impl/KubernetesServicesImpl.java b/src/main/java/com/autotune/common/target/kubernetes/service/impl/KubernetesServicesImpl.java index 1a6724790..e5fb626e8 100644 --- a/src/main/java/com/autotune/common/target/kubernetes/service/impl/KubernetesServicesImpl.java +++ b/src/main/java/com/autotune/common/target/kubernetes/service/impl/KubernetesServicesImpl.java @@ -26,12 +26,8 @@ import io.fabric8.kubernetes.api.model.apps.DeploymentSpec; import io.fabric8.kubernetes.api.model.apps.DeploymentStatus; import io.fabric8.kubernetes.api.model.apps.ReplicaSet; -import io.fabric8.kubernetes.client.DefaultKubernetesClient; -import io.fabric8.kubernetes.client.KubernetesClient; -import io.fabric8.kubernetes.client.KubernetesClientException; -import io.fabric8.kubernetes.client.Watcher; +import io.fabric8.kubernetes.client.*; import io.fabric8.kubernetes.client.dsl.base.CustomResourceDefinitionContext; -import io.fabric8.kubernetes.client.dsl.internal.RawCustomResourceOperationsImpl; import org.slf4j.Logger; import org.slf4j.LoggerFactory; @@ -224,9 +220,16 @@ public List getReplicasBy(String namespace, String labelKey, String */ @Override public Map getCRDEnvMap(CustomResourceDefinitionContext crd, String namespace, String kubernetesType) { + GenericKubernetesResource kubernetesResource = null; Map envMap = null; try { - envMap = kubernetesClient.customResource(crd).get(namespace, kubernetesType); + kubernetesResource = kubernetesClient.genericKubernetesResources(crd) + .inNamespace(namespace) + .withName(crd.getName()) + .get(); + if (kubernetesResource != null) { + envMap = kubernetesResource.getAdditionalProperties(); + } } catch (Exception e) { new TargetHandlerException(e, "getCRDEnvMap failed!"); } @@ -418,9 +421,9 @@ public Deployment amendDeployment(String namespace, String deploymentName, Deplo @Override public void addWatcher(CustomResourceDefinitionContext crd, Watcher watcher) { try { - 
RawCustomResourceOperationsImpl rawCustomResourceOperations = kubernetesClient.customResource(crd); - if (null != rawCustomResourceOperations.list()) - rawCustomResourceOperations.watch(watcher); + kubernetesClient.genericKubernetesResources(crd) + .inAnyNamespace() + .watch(watcher); } catch (Exception e) { LOGGER.warn("Watcher not added! Only REST API access is enabled."); } @@ -437,7 +440,7 @@ public void addWatcher(CustomResourceDefinitionContext crd, Watcher watcher) { public Event getEvent(String namespace, String eventName) { Event event = null; try { - event = kubernetesClient.events().inNamespace(namespace).withName(eventName).get(); + event = kubernetesClient.v1().events().inNamespace(namespace).withName(eventName).get(); } catch (Exception e) { new TargetHandlerException(e, "getEvent failed! Event : " + event + " not found!"); } @@ -456,7 +459,7 @@ public Event getEvent(String namespace, String eventName) { public boolean replaceEvent(String namespace, String eventName, Event newEvent) { boolean replaced = false; try { - kubernetesClient.events().inNamespace(namespace).withName(eventName).replace(newEvent); + kubernetesClient.v1().events().inNamespace(namespace).withName(eventName).replace(newEvent); replaced = true; } catch (Exception e) { new TargetHandlerException(e, "replaceEvent for the eventName " + eventName + " failed!"); @@ -476,7 +479,7 @@ public boolean replaceEvent(String namespace, String eventName, Event newEvent) public boolean createEvent(String namespace, String eventName, Event newEvent) { boolean created = false; try { - kubernetesClient.events().inNamespace(namespace).withName(eventName).create(newEvent); + kubernetesClient.v1().events().inNamespace(namespace).withName(eventName).create(newEvent); created = true; } catch (Exception e) { new TargetHandlerException(e, "createEvent for the eventName " + eventName + " failed!"); @@ -491,19 +494,19 @@ public boolean createEvent(String namespace, String eventName, Event newEvent) { */ 
@Override public void watchEndpoints(CustomResourceDefinitionContext crd) { - Watcher autotuneObjectWatcher = new Watcher<>() { + Watcher autotuneObjectWatcher = new Watcher<>() { @Override - public void eventReceived(Action action, String resource) { + public void eventReceived(Action action, GenericKubernetesResource genericKubernetesResource) { } @Override - public void onClose(KubernetesClientException cause) { + public void onClose(WatcherException e) { } }; try { - kubernetesClient.customResource(crd).watch(autotuneObjectWatcher); - } catch (IOException e) { + kubernetesClient.genericKubernetesResources(crd).watch(autotuneObjectWatcher); + } catch (Exception e) { e.printStackTrace(); } } diff --git a/src/main/java/com/autotune/experimentManager/transitions/TransitionToCreateConfig.java b/src/main/java/com/autotune/experimentManager/transitions/TransitionToCreateConfig.java index c833a8d30..83ea18534 100644 --- a/src/main/java/com/autotune/experimentManager/transitions/TransitionToCreateConfig.java +++ b/src/main/java/com/autotune/experimentManager/transitions/TransitionToCreateConfig.java @@ -5,6 +5,8 @@ import com.autotune.experimentManager.transitions.util.TransistionHelper; import io.fabric8.kubernetes.api.model.*; import io.fabric8.kubernetes.api.model.apps.Deployment; +import io.fabric8.kubernetes.api.model.apps.DeploymentSpec; +import io.fabric8.kubernetes.api.model.apps.DeploymentStrategy; import io.fabric8.kubernetes.api.model.apps.RollingUpdateDeployment; import io.fabric8.kubernetes.client.DefaultKubernetesClient; import io.fabric8.kubernetes.client.KubernetesClient; @@ -32,7 +34,17 @@ public void transit(String runId) { IntOrString maxUnavailable = new IntOrString(0); rud.setMaxSurge(maxSurge); rud.setMaxUnavailable(maxUnavailable); - 
client.apps().deployments().inNamespace(trialData.getConfig().getDeploymentNamespace()).withName(trialData.getConfig().getDeploymentName()).edit().editSpec().editOrNewStrategy().withRollingUpdate(rud).endStrategy().endSpec().done(); + client.apps().deployments().inNamespace(trialData.getConfig().getDeploymentNamespace()).withName(trialData.getConfig().getDeploymentName()).edit(deployment -> { + DeploymentSpec spec = deployment.getSpec(); + if (spec != null) { + if (spec.getStrategy() == null) { + spec.setStrategy(new DeploymentStrategy()); + } + spec.getStrategy().setRollingUpdate(rud); + spec.getStrategy().setType("RollingUpdate"); + } + return deployment; + }); Deployment defaultDeployment = client.apps().deployments().inNamespace(trialData.getConfig().getDeploymentNamespace()).withName(trialData.getConfig().getDeploymentName()).get(); try { ByteArrayOutputStream bos = new ByteArrayOutputStream(); diff --git a/src/main/java/com/autotune/operator/KruizeOperator.java b/src/main/java/com/autotune/operator/KruizeOperator.java index 9baeb14a9..aacd0b3f7 100644 --- a/src/main/java/com/autotune/operator/KruizeOperator.java +++ b/src/main/java/com/autotune/operator/KruizeOperator.java @@ -51,6 +51,7 @@ import io.fabric8.kubernetes.api.model.apps.ReplicaSet; import io.fabric8.kubernetes.client.KubernetesClientException; import io.fabric8.kubernetes.client.Watcher; +import io.fabric8.kubernetes.client.WatcherException; import io.fabric8.kubernetes.client.dsl.base.CustomResourceDefinitionContext; import org.json.JSONArray; import org.json.JSONException; @@ -119,7 +120,7 @@ public void eventReceived(Action action, String resource) { @Override - public void onClose(KubernetesClientException e) { + public void onClose(WatcherException e) { } }; @@ -154,7 +155,7 @@ public void eventReceived(Action action, String resource) { } @Override - public void onClose(KubernetesClientException e) { + public void onClose(WatcherException e) { } }; From 
55b2613367f35fde944146976837568c6191e7a1 Mon Sep 17 00:00:00 2001
From: Shekhar Saxena
Date: Fri, 6 Dec 2024 14:11:10 +0530
Subject: [PATCH 12/85] updating fabric8 to 7.0.0

Signed-off-by: Shekhar Saxena
---
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pom.xml b/pom.xml
index 62eb4bc65..7d9f8655a 100644
--- a/pom.xml
+++ b/pom.xml
@@ -8,7 +8,7 @@
     <artifactId>autotune</artifactId>
     <version>0.2</version>

-        <fabric8-version>6.13.4</fabric8-version>
+        <fabric8-version>7.0.0</fabric8-version>
         20240303
         <jetty-version>12.0.12</jetty-version>
         2.17.1

From 9e1d5e0e3be5a43aa57e5cdaa4ee3229d1207fad Mon Sep 17 00:00:00 2001
From: Shekhar Saxena
Date: Fri, 6 Dec 2024 02:15:28 +0530
Subject: [PATCH 13/85] adding basic abstraction for updaters

Signed-off-by: Shekhar Saxena
---
 .../exceptions/ApplyRecommendationsError.java |  26 +++++
 .../InvalidRecommendationUpdaterType.java     |  26 +++++
 .../updater/RecommendationUpdater.java        |  66 +++++++++++
 .../updater/RecommendationUpdaterImpl.java    | 108 ++++++++++++++++++
 .../updater/vpa/VpaUpdaterImpl.java           |  67 +++++++++++
 .../analyzer/utils/AnalyzerConstants.java     |  33 ++++++
 .../utils/AnalyzerErrorConstants.java         |  10 ++
 7 files changed, 336 insertions(+)
 create mode 100644 src/main/java/com/autotune/analyzer/exceptions/ApplyRecommendationsError.java
 create mode 100644 src/main/java/com/autotune/analyzer/exceptions/InvalidRecommendationUpdaterType.java
 create mode 100644 src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdater.java
 create mode 100644 src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterImpl.java
 create mode 100644 src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java

diff --git a/src/main/java/com/autotune/analyzer/exceptions/ApplyRecommendationsError.java b/src/main/java/com/autotune/analyzer/exceptions/ApplyRecommendationsError.java
new file mode 100644
index 000000000..45fa5df64
--- /dev/null
+++ b/src/main/java/com/autotune/analyzer/exceptions/ApplyRecommendationsError.java
@@ -0,0 +1,26 @@
+/******************************************************************************* + * Copyright (c) 2024 Red Hat, IBM Corporation and others. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + *******************************************************************************/ + +package com.autotune.analyzer.exceptions; + +public class ApplyRecommendationsError extends Exception { + public ApplyRecommendationsError() { + } + + public ApplyRecommendationsError(String message) { + super(message); + } +} diff --git a/src/main/java/com/autotune/analyzer/exceptions/InvalidRecommendationUpdaterType.java b/src/main/java/com/autotune/analyzer/exceptions/InvalidRecommendationUpdaterType.java new file mode 100644 index 000000000..67ed3ed78 --- /dev/null +++ b/src/main/java/com/autotune/analyzer/exceptions/InvalidRecommendationUpdaterType.java @@ -0,0 +1,26 @@ +/******************************************************************************* + * Copyright (c) 2024 Red Hat, IBM Corporation and others. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + *******************************************************************************/ + +package com.autotune.analyzer.exceptions; + +public class InvalidRecommendationUpdaterType extends Exception { + public InvalidRecommendationUpdaterType() { + } + + public InvalidRecommendationUpdaterType(String message) { + super(message); + } +} diff --git a/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdater.java b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdater.java new file mode 100644 index 000000000..2ddd75cee --- /dev/null +++ b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdater.java @@ -0,0 +1,66 @@ +/******************************************************************************* + * Copyright (c) 2024 Red Hat, IBM Corporation and others. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + *******************************************************************************/ + +package com.autotune.analyzer.recommendations.updater; + +import com.autotune.analyzer.exceptions.ApplyRecommendationsError; +import com.autotune.analyzer.exceptions.InvalidRecommendationUpdaterType; +import com.autotune.analyzer.kruizeObject.KruizeObject; + +/** + * This interface defines the abstraction for updating resource recommendations in a system. 
+ * Implementing classes will provide the logic to update resources with recommendations for specific resources, + * such as CPU, memory, or any other resource that requires periodic or dynamic adjustments. + * + * The RecommendationUpdater interface is designed to be extended by different updater classes. + * For example, VpaUpdaterImpl for updating resources with recommendations related to CPU and memory resources. + */ + +public interface RecommendationUpdater { + /** + * Retrieves an instance of a specific updater implementation based on the provided updater type + * + * @param updaterType String the type of updater to retrieve + * @return RecommendationUpdaterImpl An instance of the provided updater type class + * @throws InvalidRecommendationUpdaterType If the provided updater type doesn't match any valid type of updater. + */ + RecommendationUpdaterImpl getUpdaterInstance(String updaterType) throws InvalidRecommendationUpdaterType; + + /** + * Checks whether the necessary updater dependencies are installed or available in the system. + * + * @return boolean true if the required updaters are installed, false otherwise. + */ + boolean isUpdaterInstalled(); + + /** + * Generates resource recommendations for a specific experiment based on the experiment's name. + * + * @param experimentName String The name of the experiment for which the resource recommendations should be generated. + * @return KruizeObject containing recommendations + */ + KruizeObject generateResourceRecommendationsForExperiment(String experimentName); + + /** + * Applies the resource recommendations contained within the provided KruizeObject. + * This method will take the KruizeObject, which contains the resource recommendations, + * and apply them to the desired resources. + * + * @param kruizeObject KruizeObject containing the resource recommendations to be applied. + * @throws ApplyRecommendationsError in case of any error. 
+ */ + void applyResourceRecommendationsForExperiment(KruizeObject kruizeObject) throws ApplyRecommendationsError; +} diff --git a/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterImpl.java b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterImpl.java new file mode 100644 index 000000000..a3cfcc379 --- /dev/null +++ b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterImpl.java @@ -0,0 +1,108 @@ +/******************************************************************************* + * Copyright (c) 2024 Red Hat, IBM Corporation and others. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ *******************************************************************************/ + +package com.autotune.analyzer.recommendations.updater; + +import com.autotune.analyzer.exceptions.ApplyRecommendationsError; +import com.autotune.analyzer.exceptions.FetchMetricsError; +import com.autotune.analyzer.exceptions.InvalidRecommendationUpdaterType; +import com.autotune.analyzer.kruizeObject.KruizeObject; +import com.autotune.analyzer.recommendations.engine.RecommendationEngine; +import com.autotune.analyzer.recommendations.updater.vpa.VpaUpdaterImpl; +import com.autotune.analyzer.utils.AnalyzerConstants; +import com.autotune.analyzer.utils.AnalyzerErrorConstants; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +public class RecommendationUpdaterImpl implements RecommendationUpdater { + + private static final Logger LOGGER = LoggerFactory.getLogger(RecommendationUpdaterImpl.class); + + /** + * Retrieves an instance of a specific updater implementation based on the provided updater type + * + * @param updaterType String the type of updater to retrieve + * @return RecommendationUpdaterImpl An instance of provided updater type class + * @throws InvalidRecommendationUpdaterType If the provided updater type doesn't match any valid type of updater. + */ + @Override + public RecommendationUpdaterImpl getUpdaterInstance(String updaterType) throws InvalidRecommendationUpdaterType { + if (AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA.equalsIgnoreCase(updaterType)) { + return VpaUpdaterImpl.getInstance(); + } else { + throw new InvalidRecommendationUpdaterType(String.format(AnalyzerErrorConstants.RecommendationUpdaterErrors.UNSUPPORTED_UPDATER_TYPE, updaterType)); + } + } + + /** + * Checks whether the necessary updater dependencies are installed or available in the system. + * @return boolean true if the required updaters are installed, false otherwise. 
+ */ + @Override + public boolean isUpdaterInstalled() { + /* + * This function will be implemented by specific updater type child classes + */ + return false; + } + + /** + * Generates resource recommendations for a specific experiment based on the experiment's name. + * + * @param experimentName String The name of the experiment for which the resource recommendations should be generated. + * @return KruizeObject containing recommendations + */ + @Override + public KruizeObject generateResourceRecommendationsForExperiment(String experimentName) { + try { + LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.GENERATING_RECOMMENDATIONS, experimentName); + // generating latest recommendations for experiment + RecommendationEngine recommendationEngine = new RecommendationEngine(experimentName, null, null); + int calCount = 0; + String validationMessage = recommendationEngine.validate_local(); + if (validationMessage.isEmpty()) { + KruizeObject kruizeObject = recommendationEngine.prepareRecommendations(calCount); + if (kruizeObject.getValidation_data().isSuccess()) { + LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.GENERATED_RECOMMENDATIONS, experimentName); + return kruizeObject; + } else { + throw new Exception(kruizeObject.getValidation_data().getMessage()); + } + } else { + throw new Exception(validationMessage); + } + } catch (Exception | FetchMetricsError e) { + LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.GENERATE_RECOMMNEDATION_FAILED, experimentName); + LOGGER.debug(e.getMessage()); + return null; + } + } + + /** + * Applies the resource recommendations contained within the provided KruizeObject + * This method will take the KruizeObject, which contains the resource recommendations, + * and apply them to the desired resources. + * + * @param kruizeObject KruizeObject containing the resource recommendations to be applied. + * @throws ApplyRecommendationsError in case of any error. 
+ */ + @Override + public void applyResourceRecommendationsForExperiment(KruizeObject kruizeObject) throws ApplyRecommendationsError { + /* + * This function will be implemented by specific updater type child classes + */ + } +} diff --git a/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java b/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java new file mode 100644 index 000000000..66c2b80f9 --- /dev/null +++ b/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java @@ -0,0 +1,67 @@ +/******************************************************************************* + * Copyright (c) 2024 Red Hat, IBM Corporation and others. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ *******************************************************************************/ + +package com.autotune.analyzer.recommendations.updater.vpa; + +import com.autotune.analyzer.recommendations.updater.RecommendationUpdaterImpl; +import com.autotune.analyzer.utils.AnalyzerConstants; +import com.autotune.analyzer.utils.AnalyzerErrorConstants; +import io.fabric8.kubernetes.api.model.apiextensions.v1.CustomResourceDefinitionList; +import io.fabric8.kubernetes.client.DefaultKubernetesClient; +import io.fabric8.kubernetes.client.KubernetesClient; +import io.fabric8.kubernetes.client.dsl.ApiextensionsAPIGroupDSL; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +public class VpaUpdaterImpl extends RecommendationUpdaterImpl { + private static final Logger LOGGER = LoggerFactory.getLogger(VpaUpdaterImpl.class); + private static VpaUpdaterImpl vpaUpdater = new VpaUpdaterImpl(); + + private KubernetesClient kubernetesClient; + private ApiextensionsAPIGroupDSL apiextensionsClient; + + + private VpaUpdaterImpl() { + this.kubernetesClient = new DefaultKubernetesClient(); + this.apiextensionsClient = kubernetesClient.apiextensions(); + } + + public static VpaUpdaterImpl getInstance() { + if (vpaUpdater == null) { + vpaUpdater = new VpaUpdaterImpl(); + } + return vpaUpdater; + } + + /** + * Checks whether the necessary updater dependencies are installed or available in the system. + * @return boolean true if the required updaters are installed, false otherwise. 
+ */ + @Override + public boolean isUpdaterInstalled() { + LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_IF_UPDATER_INSTALLED, + AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA); + // checking if VPA CRD is present or not + CustomResourceDefinitionList crdList = apiextensionsClient.v1().customResourceDefinitions().list(); + boolean isVpaInstalled = crdList.getItems().stream().anyMatch(crd -> AnalyzerConstants.RecommendationUpdaterConstants.VPA.VPA_PLURAL.equalsIgnoreCase(crd.getSpec().getNames().getKind())); + if (isVpaInstalled) { + LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.FOUND_UPDATER_INSTALLED, AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA); + } else { + LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.UPDATER_NOT_INSTALLED, AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA); + } + return isVpaInstalled; + } +} diff --git a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java index f7ae69c9f..c0dbbe783 100644 --- a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java +++ b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java @@ -666,4 +666,37 @@ private APIVersionConstants() { } } } + + public static final class RecommendationUpdaterConstants { + private RecommendationUpdaterConstants() { + + } + + public static final class SupportedUpdaters { + public static final String VPA = "vpa"; + + private SupportedUpdaters() { + + } + } + + public static final class VPA { + public static final String VPA_PLURAL = "VerticalPodAutoscaler"; + + private VPA() { + + } + } + + public static final class InfoMsgs { + public static final String GENERATING_RECOMMENDATIONS = "Generating recommendations for experiment: {}"; + public static final String GENERATED_RECOMMENDATIONS = "Generated recommendations for experiment: {}"; + public 
static final String CHECKING_IF_UPDATER_INSTALLED = "Verifying if the updater is installed: {}"; + public static final String FOUND_UPDATER_INSTALLED = "Found updater is installed: {}"; + + private InfoMsgs() { + + } + } + } } diff --git a/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java b/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java index a279ea77a..148494d38 100644 --- a/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java +++ b/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java @@ -286,4 +286,14 @@ private KruizeRecommendationError() { } } } + + public static final class RecommendationUpdaterErrors { + private RecommendationUpdaterErrors() { + + } + + public static final String UNSUPPORTED_UPDATER_TYPE = "Updater type %s is not supported."; + public static final String GENERATE_RECOMMNEDATION_FAILED = "Failed to generate recommendations for experiment: {}"; + public static final String UPDATER_NOT_INSTALLED = "Updater is not installed: {}"; + } } From 252320d63213842de84ebd63b19c8b5a02db5cb1 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Mon, 2 Dec 2024 15:41:50 +0530 Subject: [PATCH 14/85] adding ROS enable flag Signed-off-by: msvinaykumar --- src/main/java/com/autotune/Autotune.java | 1 + src/main/java/com/autotune/operator/KruizeDeploymentInfo.java | 1 + src/main/java/com/autotune/utils/KruizeConstants.java | 1 + 3 files changed, 3 insertions(+) diff --git a/src/main/java/com/autotune/Autotune.java b/src/main/java/com/autotune/Autotune.java index 32af70950..e18652ed4 100644 --- a/src/main/java/com/autotune/Autotune.java +++ b/src/main/java/com/autotune/Autotune.java @@ -115,6 +115,7 @@ public static void main(String[] args) { InitializeDeployment.setup_deployment_info(); // Configure AWS CloudWatch CloudWatchAppender.configureLoggerForCloudWatchLog(); + LOGGER.debug("ROS enabled : {}" ,KruizeDeploymentInfo.is_ros_enabled); // Read and execute the DDLs here 
executeDDLs(AnalyzerConstants.ROS_DDL_SQL); if (KruizeDeploymentInfo.local == true) { diff --git a/src/main/java/com/autotune/operator/KruizeDeploymentInfo.java b/src/main/java/com/autotune/operator/KruizeDeploymentInfo.java index 31715f6f0..d9c01fa7f 100644 --- a/src/main/java/com/autotune/operator/KruizeDeploymentInfo.java +++ b/src/main/java/com/autotune/operator/KruizeDeploymentInfo.java @@ -89,6 +89,7 @@ public class KruizeDeploymentInfo { private static Hashtable tunableLayerPair; //private static KubernetesClient kubernetesClient; private static KubeEventLogger kubeEventLogger; + public static Boolean is_ros_enabled = false; private KruizeDeploymentInfo() { diff --git a/src/main/java/com/autotune/utils/KruizeConstants.java b/src/main/java/com/autotune/utils/KruizeConstants.java index 34b90d0e5..ff0b52592 100644 --- a/src/main/java/com/autotune/utils/KruizeConstants.java +++ b/src/main/java/com/autotune/utils/KruizeConstants.java @@ -710,6 +710,7 @@ public static final class KRUIZE_CONFIG_ENV_NAME { public static final String BULK_API_LIMIT = "bulkapilimit"; public static final String BULK_THREAD_POOL_SIZE = "bulkThreadPoolSize"; public static final String EXPERIMENT_NAME_FORMAT = "experimentNameFormat"; + public static final String IS_ROS_ENABLED = "isROSEnabled"; } public static final class RecommendationEngineConstants { From af3c0bc526c866ef32c2c5fc8d6c0981905f86f0 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Mon, 2 Dec 2024 16:02:01 +0530 Subject: [PATCH 15/85] adding ROS enable flag Signed-off-by: msvinaykumar --- design/KruizeConfiguration.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/design/KruizeConfiguration.md b/design/KruizeConfiguration.md index 0b53f9008..fa204dfdb 100644 --- a/design/KruizeConfiguration.md +++ b/design/KruizeConfiguration.md @@ -139,3 +139,7 @@ The following environment variables are set using the `kubectl apply` command wi - Value: "true" - Details: This flag is added for getting the details of the inputs 
passed to the APIs and the corresponding response generated by it. This helps us in debugging the API easily in case of failures. +- **isROSEnabled** + - Description: if set to True, the ROS application will use the old architecture, where ROS handles experiment creation, result updates, and recommendation updates. If set to False, ROS will utilize the Bulk API, allowing Kruize to manage experiment creation and generate recommendations on its behalf. + Default value is false. + - value: "false" From 369e225d3671067c01d53c42f63fe30395119486 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Fri, 6 Dec 2024 14:45:27 +0530 Subject: [PATCH 16/85] testsuit update for new flag Signed-off-by: msvinaykumar --- tests/scripts/common/common_functions.sh | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tests/scripts/common/common_functions.sh b/tests/scripts/common/common_functions.sh index a3dd9bc5b..f4f9b2aa4 100755 --- a/tests/scripts/common/common_functions.sh +++ b/tests/scripts/common/common_functions.sh @@ -1919,6 +1919,8 @@ function kruize_remote_patch() { sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/\([[:space:]]*\)\(storage:\)[[:space:]]*[0-9]\+Mi/\1\2 1Gi/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/\([[:space:]]*\)\(memory:\)[[:space:]]*".*"/\1\2 "2Gi"/; s/\([[:space:]]*\)\(cpu:\)[[:space:]]*".*"/\1\2 "2"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} + sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} + sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} fi } From 0e6ad028c6db177240b254fbe72a970cd50d04d7 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Mon, 9 Dec 2024 07:35:51 +0530 Subject: [PATCH 17/85] review comment incorporated Signed-off-by: msvinaykumar --- design/KruizeConfiguration.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/design/KruizeConfiguration.md 
b/design/KruizeConfiguration.md index fa204dfdb..bd88b74b2 100644 --- a/design/KruizeConfiguration.md +++ b/design/KruizeConfiguration.md @@ -140,6 +140,8 @@ The following environment variables are set using the `kubectl apply` command wi - Details: This flag is added for getting the details of the inputs passed to the APIs and the corresponding response generated by it. This helps us in debugging the API easily in case of failures. - **isROSEnabled** - - Description: if set to True, the ROS application will use the old architecture, where ROS handles experiment creation, result updates, and recommendation updates. If set to False, ROS will utilize the Bulk API, allowing Kruize to manage experiment creation and generate recommendations on its behalf. - Default value is false. + - Description: When set to True, the ROS application operates using the legacy architecture, where it manages experiment creation, result updates, and recommendation updates. Setting it to False enables ROS to use the Bulk API, allowing Kruize to handle experiment creation and generate recommendations, while disabling legacy architecture features. - value: "false" + - Details: + - Default value: False. + Bulk API functionality is also supported when the value is set to True From b2383c7df563f57e33a16d80287649e73cf7ba31 Mon Sep 17 00:00:00 2001 From: Saad Khan Date: Mon, 9 Dec 2024 10:42:10 +0530 Subject: [PATCH 18/85] docs update and minor fix for filter Signed-off-by: Saad Khan --- design/BulkAPI.md | 29 ++++++++++++------- .../analyzer/services/DSMetadataService.java | 2 +- 2 files changed, 19 insertions(+), 12 deletions(-) diff --git a/design/BulkAPI.md b/design/BulkAPI.md index 2d73b8d0c..566dd5de1 100644 --- a/design/BulkAPI.md +++ b/design/BulkAPI.md @@ -28,18 +28,23 @@ progress of the job. 
{ "filter": { "exclude": { - "namespace": [], - "workload": [], - "containers": [], - "labels": {} + "namespace": ["cadvisor", "openshift-tuning", "openshift-monitoring", "thanos-bench"], + "workload": ["osd-rebalance-infra-nodes-28887030", "blackbox-exporter", "thanos-query"], + "containers": ["tfb-0", "alertmanager"], + "labels": { + "org_id": "ABCOrga", + "source_id": "ZZZ", + "cluster_id": "ABG" + } }, "include": { - "namespace": [], - "workload": [], - "containers": [], + "namespace": ["cadvisor", "openshift-tuning", "openshift-monitoring", "thanos-bench"], + "workload": ["osd-rebalance-infra-nodes-28887030", "blackbox-exporter", "thanos-query"], + "containers": ["tfb-0", "alertmanager"], "labels": { - "key1": "value1", - "key2": "value2" + "org_id": "ABCOrga", + "source_id": "ZZZ", + "cluster_id": "ABG" } } }, @@ -105,10 +110,12 @@ The specified time range determines the period over which the data is analyzed t - The `start` timestamp precedes the `end` timestamp. #### 2. **Request Payload with `exclude` filter specified:** -TBA + +- **`exclude`** filters out namespaces like `"cadvisor"` and workloads like `"blackbox-exporter"`, along with containers and labels that match the specified values. So, we'll generate create experiments and generate recommendations for every namespace, workload and containers except those. #### 3. **Request Payload with `include` filter specified:** -TBA + +- **`include`** explicitly selects the namespaces, workloads, containers, and labels to be queried. So, for only those we'll create experiments and get the recommendations. 
### GET Request: diff --git a/src/main/java/com/autotune/analyzer/services/DSMetadataService.java b/src/main/java/com/autotune/analyzer/services/DSMetadataService.java index f24bf190a..762a99d7a 100644 --- a/src/main/java/com/autotune/analyzer/services/DSMetadataService.java +++ b/src/main/java/com/autotune/analyzer/services/DSMetadataService.java @@ -133,7 +133,7 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response) return; } - DataSourceMetadataInfo metadataInfo = dataSourceManager.importMetadataFromDataSource(datasource,"",0,0,0, null, null); + DataSourceMetadataInfo metadataInfo = dataSourceManager.importMetadataFromDataSource(datasource,"",0,0,0, new HashMap<>(), new HashMap<>()); // Validate imported metadataInfo object DataSourceMetadataValidation validationObject = new DataSourceMetadataValidation(); From 165d34575a2ed76a2af3c2199a6f2e5c6c6c967b Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Mon, 9 Dec 2024 13:35:53 +0530 Subject: [PATCH 19/85] incorporated review comments Signed-off-by: msvinaykumar --- design/KruizeConfiguration.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/design/KruizeConfiguration.md b/design/KruizeConfiguration.md index bd88b74b2..b80116fd0 100644 --- a/design/KruizeConfiguration.md +++ b/design/KruizeConfiguration.md @@ -140,8 +140,8 @@ The following environment variables are set using the `kubectl apply` command wi - Details: This flag is added for getting the details of the inputs passed to the APIs and the corresponding response generated by it. This helps us in debugging the API easily in case of failures. - **isROSEnabled** - - Description: When set to True, the ROS application operates using the legacy architecture, where it manages experiment creation, result updates, and recommendation updates. Setting it to False enables ROS to use the Bulk API, allowing Kruize to handle experiment creation and generate recommendations, while disabling legacy architecture features. 
+ - Description: This flag enables the remote APIs such as updateResults and the corresponding DB tables. If set to false, the corresponding APIs and the DB tables are not supported. - value: "false" - Details: - Default value: False. - Bulk API functionality is supported when the value is set to either True or False. From accc6dcfd9dbf7403346c3ff8cb9fb35b314c1f4 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Wed, 4 Dec 2024 12:58:43 +0530 Subject: [PATCH 20/85] added new tables for create experiment Signed-off-by: msvinaykumar --- migrations/kruize_local_ddl.sql | 4 +- .../experiment/ExperimentValidation.java | 3 +- .../analyzer/workerimpl/BulkJobManager.java | 42 +-- .../autotune/database/dao/ExperimentDAO.java | 3 + .../database/dao/ExperimentDAOImpl.java | 60 +++- .../autotune/database/helper/DBHelpers.java | 298 +++++++++--------- .../database/init/KruizeHibernateUtil.java | 2 + .../database/service/ExperimentDBService.java | 13 +- .../database/table/KruizeExperimentEntry.java | 19 +- .../table/lm/KruizeLMExperimentEntry.java | 168 ++++++++++ 10 files changed, 441 insertions(+), 171 deletions(-) create mode 100644 src/main/java/com/autotune/database/table/lm/KruizeLMExperimentEntry.java diff --git a/migrations/kruize_local_ddl.sql b/migrations/kruize_local_ddl.sql index 0c25a38a5..b5844d850 100644 --- a/migrations/kruize_local_ddl.sql +++ b/migrations/kruize_local_ddl.sql @@ -1,6 +1,8 @@ +create table IF NOT EXISTS kruize_lm_experiments (experiment_id varchar(255) not null, cluster_name varchar(255), datasource jsonb, experiment_name varchar(255), extended_data jsonb, meta_data jsonb, mode varchar(255), performance_profile varchar(255), status varchar(255), target_cluster varchar(255), version varchar(255), primary key (experiment_id)); create table IF NOT EXISTS kruize_authentication (id serial, authentication_type varchar(255), credentials jsonb, service_type varchar(255), primary key
(id));
 create table IF NOT EXISTS kruize_datasources (version varchar(255), name varchar(255), provider varchar(255), serviceName varchar(255), namespace varchar(255), url varchar(255), authentication_id serial, FOREIGN KEY (authentication_id) REFERENCES kruize_authentication(id), primary key (name));
 create table IF NOT EXISTS kruize_dsmetadata (id serial, version varchar(255), datasource_name varchar(255), cluster_name varchar(255), namespace varchar(255), workload_type varchar(255), workload_name varchar(255), container_name varchar(255), container_image_name varchar(255), primary key (id));
-alter table kruize_experiments add column experiment_type varchar(255), add column metadata_id bigint references kruize_dsmetadata(id), alter column datasource type varchar(255);
+alter table kruize_lm_experiments add column experiment_type varchar(255), add column metadata_id bigint references kruize_dsmetadata(id), alter column datasource type varchar(255);
+alter table if exists kruize_lm_experiments add constraint UK_lm_experiment_name unique (experiment_name);
 create table IF NOT EXISTS kruize_metric_profiles (api_version varchar(255), kind varchar(255), metadata jsonb, name varchar(255) not null, k8s_type varchar(255), profile_version float(53) not null, slo jsonb, primary key (name));
 alter table kruize_recommendations add column experiment_type varchar(255);
diff --git a/src/main/java/com/autotune/analyzer/experiment/ExperimentValidation.java b/src/main/java/com/autotune/analyzer/experiment/ExperimentValidation.java
index 246b153dd..243e652ab 100644
--- a/src/main/java/com/autotune/analyzer/experiment/ExperimentValidation.java
+++ b/src/main/java/com/autotune/analyzer/experiment/ExperimentValidation.java
@@ -17,7 +17,6 @@
 import com.autotune.analyzer.kruizeObject.KruizeObject;
 import com.autotune.analyzer.performanceProfiles.PerformanceProfile;
-import com.autotune.analyzer.performanceProfiles.PerformanceProfilesDeployment;
 import com.autotune.analyzer.recommendations.ContainerRecommendations;
 import com.autotune.analyzer.utils.AnalyzerConstants;
 import com.autotune.analyzer.utils.AnalyzerErrorConstants;
@@ -108,7 +107,7 @@ public void validate(List kruizeExptList) {
             } else {
                 // fetch the Performance / Metric Profile from the DB
                 try {
-                    if (!KruizeDeploymentInfo.local) {
+                    if (KruizeDeploymentInfo.is_ros_enabled && target_cluster.equalsIgnoreCase(AnalyzerConstants.REMOTE)) {
                         // todo call this in function and use across every where
                         new ExperimentDBService().loadPerformanceProfileFromDBByName(performanceProfilesMap, kruizeObject.getPerformanceProfile());
                     } else {
                         new ExperimentDBService().loadMetricProfileFromDBByName(performanceProfilesMap, kruizeObject.getPerformanceProfile());
diff --git a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java
index c4eb77237..5b9425000 100644
--- a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java
+++ b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java
@@ -133,7 +133,7 @@ public void run() {
                 e.printStackTrace();
                 BulkJobStatus.Notification notification = DATASOURCE_NOT_REG_INFO;
                 notification.setMessage(String.format(notification.getMessage(), e.getMessage()));
-                setFinalJobStatus(FAILED,String.valueOf(HttpURLConnection.HTTP_BAD_REQUEST),notification,datasource);
+                setFinalJobStatus(FAILED, String.valueOf(HttpURLConnection.HTTP_BAD_REQUEST), notification, datasource);
             }
             if (null != datasource) {
                 JSONObject daterange = processDateRange(this.bulkInput.getTime_range());
@@ -143,21 +143,21 @@ public void run() {
                     metadataInfo = dataSourceManager.importMetadataFromDataSource(datasource, labelString, 0, 0, 0);
                 }
                 if (null == metadataInfo) {
-                    setFinalJobStatus(COMPLETED,String.valueOf(HttpURLConnection.HTTP_OK),NOTHING_INFO,datasource);
+                    setFinalJobStatus(COMPLETED, String.valueOf(HttpURLConnection.HTTP_OK), NOTHING_INFO, datasource);
                 } else {
                     Map createExperimentAPIObjectMap = getExperimentMap(labelString, jobData, metadataInfo, datasource); //Todo Store this map in buffer and use it if BulkAPI pods restarts and support experiment_type
                     jobData.setTotal_experiments(createExperimentAPIObjectMap.size());
                     jobData.setProcessed_experiments(0);
                     if (jobData.getTotal_experiments() > KruizeDeploymentInfo.bulk_api_limit) {
-                        setFinalJobStatus(FAILED,String.valueOf(HttpURLConnection.HTTP_BAD_REQUEST),LIMIT_INFO,datasource);
+                        setFinalJobStatus(FAILED, String.valueOf(HttpURLConnection.HTTP_BAD_REQUEST), LIMIT_INFO, datasource);
                     } else {
                         ExecutorService createExecutor = Executors.newFixedThreadPool(bulk_thread_pool_size);
                         ExecutorService generateExecutor = Executors.newFixedThreadPool(bulk_thread_pool_size);
                         for (CreateExperimentAPIObject apiObject : createExperimentAPIObjectMap.values()) {
-                            String experiment_name = apiObject.getExperimentName();
-                            BulkJobStatus.Experiment experiment = jobData.addExperiment(experiment_name);
                             DataSourceInfo finalDatasource = datasource;
                             createExecutor.submit(() -> {
+                                String experiment_name = apiObject.getExperimentName();
+                                BulkJobStatus.Experiment experiment = jobData.addExperiment(experiment_name);
                                 try {
                                     // send request to createExperiment API for experiment creation
                                     GenericRestApiClient apiClient = new GenericRestApiClient(finalDatasource);
@@ -172,16 +172,20 @@ public void run() {
                                     } else if (responseCode.getStatusCode() == HttpURLConnection.HTTP_CONFLICT) {
                                         expriment_exists = true;
                                     } else {
-                                        jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1);
                                         experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, responseCode.getResponseBody().toString(), responseCode.getStatusCode()));
                                     }
                                 } catch (Exception e) {
                                     e.printStackTrace();
-                                    jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1);
                                     experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_BAD_REQUEST));
                                 } finally {
-                                    if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) {
-                                        setFinalJobStatus(COMPLETED,null,null,finalDatasource);
+                                    if (!expriment_exists) {
+                                        LOGGER.info("Processing experiment {}", jobData.getProcessed_experiments());
+                                        jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1);
+                                    }
+                                    synchronized (new Object()) {
+                                        if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) {
+                                            setFinalJobStatus(COMPLETED, null, null, finalDatasource);
+                                        }
                                     }
                                 }
@@ -208,8 +212,10 @@ public void run() {
                                         experiment.getRecommendations().setNotifications(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_INTERNAL_ERROR));
                                     } finally {
                                         jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1);
-                                        if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) {
-                                            setFinalJobStatus(COMPLETED,null,null,finalDatasource);
+                                        synchronized (new Object()) {
+                                            if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) {
+                                                setFinalJobStatus(COMPLETED, null, null, finalDatasource);
+                                            }
                                         }
                                     }
                                 });
@@ -219,7 +225,7 @@ public void run() {
                                     experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_INTERNAL_ERROR));
                                     jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1);
                                     if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) {
-                                        setFinalJobStatus(COMPLETED,null,null,finalDatasource);
+                                        setFinalJobStatus(COMPLETED, null, null, finalDatasource);
                                     }
                                 }
                             });
@@ -238,21 +244,21 @@ public void run() {
                 notification = DATASOURCE_DOWN_INFO;
             }
             notification.setMessage(String.format(notification.getMessage(), e.getMessage()));
-            setFinalJobStatus(FAILED,String.valueOf(HttpURLConnection.HTTP_UNAVAILABLE),notification,datasource);
+            setFinalJobStatus(FAILED, String.valueOf(HttpURLConnection.HTTP_UNAVAILABLE), notification, datasource);
         } catch (Exception e) {
             LOGGER.error(e.getMessage());
             e.printStackTrace();
-            setFinalJobStatus(FAILED,String.valueOf(HttpURLConnection.HTTP_INTERNAL_ERROR),new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_INTERNAL_ERROR),datasource);
+            setFinalJobStatus(FAILED, String.valueOf(HttpURLConnection.HTTP_INTERNAL_ERROR), new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_INTERNAL_ERROR), datasource);
         }
     }

-    public void setFinalJobStatus(String status,String notificationKey,BulkJobStatus.Notification notification,DataSourceInfo finalDatasource) {
+    public void setFinalJobStatus(String status, String notificationKey, BulkJobStatus.Notification notification, DataSourceInfo finalDatasource) {
         jobData.setStatus(status);
         jobData.setEndTime(Instant.now());
-        if(null!=notification)
-            jobData.setNotification(notificationKey,notification);
+        if (null != notification)
+            jobData.setNotification(notificationKey, notification);
         GenericRestApiClient apiClient = new GenericRestApiClient(finalDatasource);
-        if(null != bulkInput.getWebhook() && null != bulkInput.getWebhook().getUrl()) {
+        if (null != bulkInput.getWebhook() && null != bulkInput.getWebhook().getUrl()) {
             apiClient.setBaseURL(bulkInput.getWebhook().getUrl());
             GenericRestApiClient.HttpResponseWrapper responseCode;
             BulkJobStatus.Webhook webhook = new BulkJobStatus.Webhook(WebHookStatus.IN_PROGRESS);
diff --git a/src/main/java/com/autotune/database/dao/ExperimentDAO.java b/src/main/java/com/autotune/database/dao/ExperimentDAO.java
index a2fcab3f0..df72a14d7 100644
--- a/src/main/java/com/autotune/database/dao/ExperimentDAO.java
+++ b/src/main/java/com/autotune/database/dao/ExperimentDAO.java
@@ -5,6 +5,7 @@
 import com.autotune.analyzer.utils.AnalyzerConstants;
 import com.autotune.common.data.ValidationOutputData;
 import com.autotune.database.table.*;
+import com.autotune.database.table.lm.KruizeLMExperimentEntry;

 import java.sql.Timestamp;
 import java.util.List;
@@ -14,6 +15,8 @@ public interface ExperimentDAO {
     // Add New experiments from local storage to DB and set status to Inprogress
     public ValidationOutputData addExperimentToDB(KruizeExperimentEntry kruizeExperimentEntry);

+    public ValidationOutputData addExperimentToDB(KruizeLMExperimentEntry kruizeLMExperimentEntry);
+
     // Add experiment results from local storage to DB and set status to Inprogress
     public ValidationOutputData addResultsToDB(KruizeResultsEntry resultsEntry);
diff --git a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java
index ac37fc50e..472dddc74 100644
--- a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java
+++ b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java
@@ -24,6 +24,7 @@
 import com.autotune.database.helper.DBConstants;
 import com.autotune.database.init.KruizeHibernateUtil;
 import com.autotune.database.table.*;
+import com.autotune.database.table.lm.KruizeLMExperimentEntry;
 import com.autotune.utils.KruizeConstants;
 import com.autotune.utils.MetricsConfig;
 import io.micrometer.core.instrument.Timer;
@@ -70,7 +71,7 @@ public ValidationOutputData addExperimentToDB(KruizeExperi
                     session.persist(kruizeExperimentEntry);
                     tx.commit();
                     // TODO: remove native sql query and transient
-                    updateExperimentTypeInKruizeExperimentEntry(kruizeExperimentEntry);
+                    //updateExperimentTypeInKruizeExperimentEntry(kruizeExperimentEntry);  #Todo this function no more required and see if it can applied without using update sql
                     validationOutputData.setSuccess(true);
                     statusValue = "success";
                 } catch (HibernateException e) {
@@ -94,6 +95,44 @@ public ValidationOutputData addExperimentToDB(KruizeExperi
         return validationOutputData;
     }

+    @Override
+    public ValidationOutputData addExperimentToDB(KruizeLMExperimentEntry kruizeLMExperimentEntry) {
+        ValidationOutputData validationOutputData = new ValidationOutputData(false, null, null);
+        Transaction tx = null;
+        String statusValue = "failure";
+        Timer.Sample timerAddExpDB = Timer.start(MetricsConfig.meterRegistry());
+        try {
+            try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) {
+                try {
+                    tx = session.beginTransaction();
+                    session.persist(kruizeLMExperimentEntry);
+                    tx.commit();
+                    // TODO: remove native sql query and transient
+                    //updateExperimentTypeInKruizeExperimentEntry(kruizeLMExperimentEntry);
+                    validationOutputData.setSuccess(true);
+                    statusValue = "success";
+                } catch (HibernateException e) {
+                    LOGGER.error("Not able to save experiment due to {}", e.getMessage());
+                    if (tx != null) tx.rollback();
+                    e.printStackTrace();
+                    validationOutputData.setSuccess(false);
+                    validationOutputData.setMessage(e.getMessage());
+                    //TODO: save error to API_ERROR_LOG
+                }
+            }
+        } catch (Exception e) {
+            LOGGER.error("Not able to save experiment due to {}", e.getMessage());
+            validationOutputData.setMessage(e.getMessage());
+        } finally {
+            if (null != timerAddExpDB) {
+                MetricsConfig.timerAddExpDB = MetricsConfig.timerBAddExpDB.tag("status", statusValue).register(MetricsConfig.meterRegistry());
+                timerAddExpDB.stop(MetricsConfig.timerAddExpDB);
+            }
+        }
+        return validationOutputData;
+    }
+
+
     /**
      * Deletes database partitions based on a specified threshold day count.
      *
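Editor's note on the completion check above: the BulkJobManager hunks wrap "did the last experiment just finish?" in `synchronized (new Object())`. Because a fresh lock object is created on every entry, no two threads ever contend for the same monitor, so the block provides no mutual exclusion; correctness here actually rests on the atomicity of the counter updates. A minimal sketch of a lock-free version of the same pattern is below. It is illustrative only: `runBulk`, the plain `String` status, and the no-op task body are hypothetical stand-ins, not Kruize APIs.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

public class BulkCompletionSketch {

    // Submits `total` no-op "experiments" to a pool and reports the final status.
    // AtomicInteger.incrementAndGet is atomic, so exactly one worker observes the
    // counter reaching `total` and flips the job status exactly once -- no lock needed.
    static String runBulk(int total, int threads) {
        AtomicInteger processed = new AtomicInteger(0);
        AtomicReference<String> status = new AtomicReference<>("IN_PROGRESS");
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < total; i++) {
            pool.submit(() -> {
                // per-experiment work (e.g. the createExperiment call) would go here
                if (processed.incrementAndGet() == total) {
                    status.set("COMPLETED"); // reached by exactly one thread
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get() + " " + status.get();
    }

    public static void main(String[] args) {
        System.out.println(runBulk(100, 8)); // prints "100 COMPLETED"
    }
}
```

The same effect could also be had by synchronizing on one shared lock object (a field, not `new Object()`), but an atomic counter keeps the hot path contention-free.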
@@ -1144,6 +1183,23 @@ private void getExperimentTypeInKruizeExperimentEntry(List entries) throws Exception { for (KruizeRecommendationEntry recomEntry : entries) { diff --git a/src/main/java/com/autotune/database/helper/DBHelpers.java b/src/main/java/com/autotune/database/helper/DBHelpers.java index 0ebbd0acc..0be649b30 100644 --- a/src/main/java/com/autotune/database/helper/DBHelpers.java +++ b/src/main/java/com/autotune/database/helper/DBHelpers.java @@ -40,6 +40,7 @@ import com.autotune.common.datasource.DataSourceMetadataOperator; import com.autotune.common.k8sObjects.K8sObject; import com.autotune.database.table.*; +import com.autotune.database.table.lm.KruizeLMExperimentEntry; import com.autotune.utils.KruizeConstants; import com.autotune.utils.Utils; import com.fasterxml.jackson.core.JsonProcessingException; @@ -275,6 +276,130 @@ public static void setRecommendationsToKruizeObject(List()); + dataSourceMetadataInfo.getDataSourceHashMap().put(dataSourceName, dataSource); + + return dataSource; + } + + /** + * Retrieves an existing DataSourceCluster from the DB entry or creates a new one if not found. 
+ * + * @param kruizeMetadata KruizeDSMetadataEntry object + * @param dataSource DataSource object + * @return The DataSourceCluster instance associated with the DB entry + */ + private static DataSourceCluster getOrCreateDataSourceClusterFromDB(KruizeDSMetadataEntry kruizeMetadata, DataSource dataSource) { + String clusterName = kruizeMetadata.getClusterName(); + + // Check if the cluster already exists in the DataSource + if (dataSource.getDataSourceClusterHashMap().containsKey(clusterName)) { + return dataSource.getDataSourceClusterHashMap().get(clusterName); + } + + DataSourceCluster dataSourceCluster = new DataSourceCluster(clusterName, new HashMap<>()); + dataSource.getDataSourceClusterHashMap().put(clusterName, dataSourceCluster); + + return dataSourceCluster; + } + + /** + * Retrieves an existing DataSourceNamespace from the DB entry or creates a new one if not found. + * + * @param kruizeMetadata KruizeDSMetadataEntry object + * @param dataSourceCluster DataSourceCluster object + * @return The DataSourceNamespace instance associated with the DB entry + */ + private static DataSourceNamespace getOrCreateDataSourceNamespaceFromDB(KruizeDSMetadataEntry kruizeMetadata, DataSourceCluster dataSourceCluster) { + String namespaceName = kruizeMetadata.getNamespace(); + + // Check if the namespace already exists in the DataSourceCluster + if (dataSourceCluster.getDataSourceNamespaceHashMap().containsKey(namespaceName)) { + return dataSourceCluster.getDataSourceNamespaceHashMap().get(namespaceName); + } + + DataSourceNamespace dataSourceNamespace = new DataSourceNamespace(namespaceName, new HashMap<>()); + dataSourceCluster.getDataSourceNamespaceHashMap().put(namespaceName, dataSourceNamespace); + + return dataSourceNamespace; + } + + /** + * Retrieves an existing DataSourceWorkload from the DB entry or creates a new one if not found. 
+ * + * @param kruizeMetadata KruizeDSMetadataEntry object + * @param dataSourceNamespace DataSourceNamespace object + * @return The DataSourceWorkload instance associated with the DB entry + */ + private static DataSourceWorkload getOrCreateDataSourceWorkloadFromDB(KruizeDSMetadataEntry kruizeMetadata, DataSourceNamespace dataSourceNamespace) { + String workloadName = kruizeMetadata.getWorkloadName(); + + if (null == workloadName) { + return null; + } + + // Check if the workload already exists in the DataSourceNamespace + if (dataSourceNamespace.getDataSourceWorkloadHashMap().containsKey(workloadName)) { + return dataSourceNamespace.getDataSourceWorkloadHashMap().get(workloadName); + } + + DataSourceWorkload dataSourceWorkload = new DataSourceWorkload(workloadName, kruizeMetadata.getWorkloadType(), new HashMap<>()); + dataSourceNamespace.getDataSourceWorkloadHashMap().put(workloadName, dataSourceWorkload); + + return dataSourceWorkload; + } + + /** + * Retrieves an existing DataSourceContainer from the DB entry or creates a new one if not found. 
+ * + * @param kruizeMetadata KruizeDSMetadataEntry object + * @param dataSourceWorkload DataSourceWorkload object + * @return The DataSourceContainer instance associated with the DB entry + */ + private static DataSourceContainer getOrCreateDataSourceContainerFromDB(KruizeDSMetadataEntry kruizeMetadata, DataSourceWorkload dataSourceWorkload) { + String containerName = kruizeMetadata.getContainerName(); + + if (null == containerName) { + return null; + } + + // Check if the container already exists in the DataSourceWorkload + if (dataSourceWorkload.getDataSourceContainerHashMap().containsKey(containerName)) { + return dataSourceWorkload.getDataSourceContainerHashMap().get(containerName); + } + + DataSourceContainer dataSourceContainer = new DataSourceContainer(containerName, kruizeMetadata.getContainerImageName()); + dataSourceWorkload.getDataSourceContainerHashMap().put(containerName, dataSourceContainer); + + return dataSourceContainer; + } + + private static KruizeDSMetadataEntry getMetadata(String datasource) { + DataSourceMetadataOperator dataSourceMetadataOperator = DataSourceMetadataOperator.getInstance(); + HashMap dataSources = DataSourceCollection.getInstance().getDataSourcesCollection(); + DataSourceMetadataInfo dataSourceMetadataInfo = dataSourceMetadataOperator.getDataSourceMetadataInfo(dataSources.get(datasource)); + List kruizeMetadataList = Converters.KruizeObjectConverters.convertDataSourceMetadataToMetadataObj(dataSourceMetadataInfo); + if (kruizeMetadataList.isEmpty()) + return null; + else + return kruizeMetadataList.get(0); + } + public static class Converters { private Converters() { @@ -291,25 +416,25 @@ private KruizeObjectConverters() { * @return KruizeExperimentEntry * This methode facilitate to store data into db by accumulating required data from KruizeObject. 
*/ - public static KruizeExperimentEntry convertCreateAPIObjToExperimentDBObj(CreateExperimentAPIObject apiObject) { - KruizeExperimentEntry kruizeExperimentEntry = null; + public static KruizeLMExperimentEntry convertCreateAPIObjToExperimentDBObj(CreateExperimentAPIObject apiObject) { + KruizeLMExperimentEntry kruizeLMExperimentEntry = null; try { - kruizeExperimentEntry = new KruizeExperimentEntry(); - kruizeExperimentEntry.setExperiment_name(apiObject.getExperimentName()); - kruizeExperimentEntry.setExperiment_id(Utils.generateID(apiObject)); - kruizeExperimentEntry.setCluster_name(apiObject.getClusterName()); - kruizeExperimentEntry.setMode(apiObject.getMode()); - kruizeExperimentEntry.setPerformance_profile(apiObject.getPerformanceProfile()); - kruizeExperimentEntry.setVersion(apiObject.getApiVersion()); - kruizeExperimentEntry.setTarget_cluster(apiObject.getTargetCluster()); - kruizeExperimentEntry.setStatus(AnalyzerConstants.ExperimentStatus.IN_PROGRESS); - kruizeExperimentEntry.setMeta_data(null); - kruizeExperimentEntry.setDatasource(null); - kruizeExperimentEntry.setExperimentType(apiObject.getExperimentType()); + kruizeLMExperimentEntry = new KruizeLMExperimentEntry(); + kruizeLMExperimentEntry.setExperiment_name(apiObject.getExperimentName()); + kruizeLMExperimentEntry.setExperiment_id(Utils.generateID(apiObject)); + kruizeLMExperimentEntry.setCluster_name(apiObject.getClusterName()); + kruizeLMExperimentEntry.setMode(apiObject.getMode()); + kruizeLMExperimentEntry.setPerformance_profile(apiObject.getPerformanceProfile()); + kruizeLMExperimentEntry.setVersion(apiObject.getApiVersion()); + kruizeLMExperimentEntry.setTarget_cluster(apiObject.getTargetCluster()); + kruizeLMExperimentEntry.setStatus(AnalyzerConstants.ExperimentStatus.IN_PROGRESS); + kruizeLMExperimentEntry.setMeta_data(null); + kruizeLMExperimentEntry.setDatasource(null); + kruizeLMExperimentEntry.setExperimentType(apiObject.getExperimentType()); ObjectMapper objectMapper = new 
ObjectMapper(); try { - kruizeExperimentEntry.setExtended_data( + kruizeLMExperimentEntry.setExtended_data( objectMapper.readTree( new Gson().toJson(apiObject) ) @@ -318,11 +443,11 @@ public static KruizeExperimentEntry convertCreateAPIObjToExperimentDBObj(CreateE throw new Exception("Error while creating Extended data due to : " + e.getMessage()); } } catch (Exception e) { - kruizeExperimentEntry = null; + kruizeLMExperimentEntry = null; LOGGER.error("Error while converting Kruize Object to experimentDetailTable due to {}", e.getMessage()); e.printStackTrace(); } - return kruizeExperimentEntry; + return kruizeLMExperimentEntry; } /** @@ -745,6 +870,7 @@ public static List convertPerformanceProfileEntryToPerforman /** * converts MetricProfile object to KruizeMetricProfileEntry table object + * * @param metricProfile metricProfile object to be converted * @return KruizeMetricProfileEntry table object */ @@ -782,6 +908,7 @@ public static KruizeMetricProfileEntry convertMetricProfileObjToMetricProfileDBO /** * converts KruizeMetricProfileEntry table objects to MetricProfile objects + * * @param kruizeMetricProfileEntryList List of KruizeMetricProfileEntry table objects to be converted * @return List containing the MetricProfile objects * @throws Exception @@ -814,6 +941,7 @@ public static List convertMetricProfileEntryToMetricProfileO /** * converts KruizeDataSourceEntry table objects to DataSourceInfo objects + * * @param kruizeDataSourceList List containing the KruizeDataSourceEntry table objects * @return List containing the DataSourceInfo objects */ @@ -842,7 +970,7 @@ public static List convertKruizeDataSourceToDataSourceObject(Lis if (kruizeDataSource.getServiceName().isEmpty() && null != kruizeDataSource.getUrl()) { dataSourceInfo = new DataSourceInfo(kruizeDataSource.getName(), kruizeDataSource .getProvider(), null, null, new URL(kruizeDataSource.getUrl()), authConfig); - } else{ + } else { dataSourceInfo = new DataSourceInfo(kruizeDataSource.getName(), 
kruizeDataSource .getProvider(), kruizeDataSource.getServiceName(), kruizeDataSource.getNamespace(), null, authConfig); } @@ -861,6 +989,7 @@ public static List convertKruizeDataSourceToDataSourceObject(Lis /** * converts DataSourceInfo objects to KruizeDataSourceEntry table objects + * * @param dataSourceInfo DataSourceInfo objects * @return KruizeDataSourceEntry table object */ @@ -885,7 +1014,8 @@ public static KruizeDataSourceEntry convertDataSourceToDataSourceDBObj(DataSourc /** * converts DataSourceMetadataInfo objects to KruizeDSMetadataEntry table objects - * @param kruizeMetadataList List of KruizeDSMetadataEntry objects + * + * @param kruizeMetadataList List of KruizeDSMetadataEntry objects * @return DataSourceMetadataInfo object * @throws Exception */ @@ -933,7 +1063,8 @@ public static List convertKruizeMetadataToDataSourceMeta /** * Converts KruizeDSMetadataEntry table objects to DataSourceMetadataInfo with only cluster-level metadata - * @param kruizeMetadataList KruizeDSMetadataEntry objects + * + * @param kruizeMetadataList KruizeDSMetadataEntry objects * @return DataSourceMetadataInfo object with only cluster-level metadata * @throws Exception */ @@ -974,8 +1105,9 @@ public static List convertKruizeMetadataToClusterLevelDa /** * Converts KruizeDSMetadataEntry table objects to DataSourceMetadataInfo with only namespace-level metadata - * @param kruizeMetadataList List of KruizeDSMetadataEntry objects - * @return DataSourceMetadataInfo with only namespace-level metadata + * + * @param kruizeMetadataList List of KruizeDSMetadataEntry objects + * @return DataSourceMetadataInfo with only namespace-level metadata * @throws Exception */ public static List convertKruizeMetadataToNamespaceLevelDataSourceMetadata(List kruizeMetadataList) throws Exception { @@ -1019,6 +1151,7 @@ public static List convertKruizeMetadataToNamespaceLevel /** * Converts DataSourceMetadataInfo object to KruizeDSMetadataEntry objects + * * @param dataSourceMetadataInfo 
DataSourceMetadataInfo object * @return List of KruizeDSMetadataEntry objects */ @@ -1051,7 +1184,7 @@ public static List convertDataSourceMetadataToMetadataObj for (DataSourceWorkload dataSourceWorkload : dataSourceNamespace.getDataSourceWorkloadHashMap().values()) { // handles 'job' workload type with no containers - if(null == dataSourceWorkload.getDataSourceContainerHashMap()) { + if (null == dataSourceWorkload.getDataSourceContainerHashMap()) { KruizeDSMetadataEntry kruizeMetadata = new KruizeDSMetadataEntry(); kruizeMetadata.setVersion(KruizeConstants.DataSourceConstants.DataSourceMetadataInfoConstants.version); @@ -1120,123 +1253,4 @@ public static KruizeAuthenticationEntry convertAuthDetailsToAuthDetailsDBObj(Aut } } - - /** - * Retrieves an existing DataSource from the DB entry or creates a new one if not found. - * @param kruizeMetadata KruizeDSMetadataEntry object - * @param dataSourceMetadataInfo DataSourceMetadataInfo object - * @return The DataSource instance associated with the DB entry. - */ - private static DataSource getOrCreateDataSourceFromDB(KruizeDSMetadataEntry kruizeMetadata, DataSourceMetadataInfo dataSourceMetadataInfo) { - String dataSourceName = kruizeMetadata.getDataSourceName(); - - // Check if the data source already exists - if (dataSourceMetadataInfo.getDataSourceHashMap().containsKey(dataSourceName)) { - return dataSourceMetadataInfo.getDataSourceHashMap().get(dataSourceName); - } - - DataSource dataSource = new DataSource(dataSourceName, new HashMap<>()); - dataSourceMetadataInfo.getDataSourceHashMap().put(dataSourceName, dataSource); - - return dataSource; - } - - /** - * Retrieves an existing DataSourceCluster from the DB entry or creates a new one if not found. 
- * @param kruizeMetadata KruizeDSMetadataEntry object - * @param dataSource DataSource object - * @return The DataSourceCluster instance associated with the DB entry - */ - private static DataSourceCluster getOrCreateDataSourceClusterFromDB(KruizeDSMetadataEntry kruizeMetadata, DataSource dataSource) { - String clusterName = kruizeMetadata.getClusterName(); - - // Check if the cluster already exists in the DataSource - if (dataSource.getDataSourceClusterHashMap().containsKey(clusterName)) { - return dataSource.getDataSourceClusterHashMap().get(clusterName); - } - - DataSourceCluster dataSourceCluster = new DataSourceCluster(clusterName, new HashMap<>()); - dataSource.getDataSourceClusterHashMap().put(clusterName, dataSourceCluster); - - return dataSourceCluster; - } - - /** - * Retrieves an existing DataSourceNamespace from the DB entry or creates a new one if not found. - * @param kruizeMetadata KruizeDSMetadataEntry object - * @param dataSourceCluster DataSourceCluster object - * @return The DataSourceNamespace instance associated with the DB entry - */ - private static DataSourceNamespace getOrCreateDataSourceNamespaceFromDB(KruizeDSMetadataEntry kruizeMetadata, DataSourceCluster dataSourceCluster) { - String namespaceName = kruizeMetadata.getNamespace(); - - // Check if the namespace already exists in the DataSourceCluster - if (dataSourceCluster.getDataSourceNamespaceHashMap().containsKey(namespaceName)) { - return dataSourceCluster.getDataSourceNamespaceHashMap().get(namespaceName); - } - - DataSourceNamespace dataSourceNamespace = new DataSourceNamespace(namespaceName, new HashMap<>()); - dataSourceCluster.getDataSourceNamespaceHashMap().put(namespaceName, dataSourceNamespace); - - return dataSourceNamespace; - } - - /** - * Retrieves an existing DataSourceWorkload from the DB entry or creates a new one if not found. 
- * @param kruizeMetadata KruizeDSMetadataEntry object - * @param dataSourceNamespace DataSourceNamespace object - * @return The DataSourceWorkload instance associated with the DB entry - */ - private static DataSourceWorkload getOrCreateDataSourceWorkloadFromDB(KruizeDSMetadataEntry kruizeMetadata, DataSourceNamespace dataSourceNamespace) { - String workloadName = kruizeMetadata.getWorkloadName(); - - if (null == workloadName) { - return null; - } - - // Check if the workload already exists in the DataSourceNamespace - if (dataSourceNamespace.getDataSourceWorkloadHashMap().containsKey(workloadName)) { - return dataSourceNamespace.getDataSourceWorkloadHashMap().get(workloadName); - } - - DataSourceWorkload dataSourceWorkload = new DataSourceWorkload(workloadName, kruizeMetadata.getWorkloadType(), new HashMap<>()); - dataSourceNamespace.getDataSourceWorkloadHashMap().put(workloadName, dataSourceWorkload); - - return dataSourceWorkload; - } - - /** - * Retrieves an existing DataSourceContainer from the DB entry or creates a new one if not found. 
- * @param kruizeMetadata KruizeDSMetadataEntry object - * @param dataSourceWorkload DataSourceWorkload object - * @return The DataSourceContainer instance associated with the DB entry - */ - private static DataSourceContainer getOrCreateDataSourceContainerFromDB(KruizeDSMetadataEntry kruizeMetadata, DataSourceWorkload dataSourceWorkload) { - String containerName = kruizeMetadata.getContainerName(); - - if (null == containerName) { - return null; - } - - // Check if the container already exists in the DataSourceWorkload - if (dataSourceWorkload.getDataSourceContainerHashMap().containsKey(containerName)) { - return dataSourceWorkload.getDataSourceContainerHashMap().get(containerName); - } - - DataSourceContainer dataSourceContainer = new DataSourceContainer(containerName, kruizeMetadata.getContainerImageName()); - dataSourceWorkload.getDataSourceContainerHashMap().put(containerName, dataSourceContainer); - - return dataSourceContainer; - } - - private static KruizeDSMetadataEntry getMetadata(String datasource) { - DataSourceMetadataOperator dataSourceMetadataOperator = DataSourceMetadataOperator.getInstance(); - HashMap dataSources = DataSourceCollection.getInstance().getDataSourcesCollection(); - DataSourceMetadataInfo dataSourceMetadataInfo = dataSourceMetadataOperator.getDataSourceMetadataInfo(dataSources.get(datasource)); - List kruizeMetadataList = Converters.KruizeObjectConverters.convertDataSourceMetadataToMetadataObj(dataSourceMetadataInfo); - if (kruizeMetadataList.isEmpty()) - return null; - else - return kruizeMetadataList.get(0); - } } diff --git a/src/main/java/com/autotune/database/init/KruizeHibernateUtil.java b/src/main/java/com/autotune/database/init/KruizeHibernateUtil.java index 554cb051a..7d41041f1 100644 --- a/src/main/java/com/autotune/database/init/KruizeHibernateUtil.java +++ b/src/main/java/com/autotune/database/init/KruizeHibernateUtil.java @@ -17,6 +17,7 @@ import com.autotune.database.table.*; +import 
com.autotune.database.table.lm.KruizeLMExperimentEntry; import com.autotune.operator.KruizeDeploymentInfo; import org.hibernate.Session; import org.hibernate.SessionFactory; @@ -57,6 +58,7 @@ public static void buildSessionFactory() { configuration.addAnnotatedClass(KruizeRecommendationEntry.class); configuration.addAnnotatedClass(KruizePerformanceProfileEntry.class); if (KruizeDeploymentInfo.local) { + configuration.addAnnotatedClass(KruizeLMExperimentEntry.class); configuration.addAnnotatedClass(KruizeDataSourceEntry.class); configuration.addAnnotatedClass(KruizeDSMetadataEntry.class); configuration.addAnnotatedClass(KruizeMetricProfileEntry.class); diff --git a/src/main/java/com/autotune/database/service/ExperimentDBService.java b/src/main/java/com/autotune/database/service/ExperimentDBService.java index 765438b60..79a1d2eaa 100644 --- a/src/main/java/com/autotune/database/service/ExperimentDBService.java +++ b/src/main/java/com/autotune/database/service/ExperimentDBService.java @@ -33,6 +33,7 @@ import com.autotune.database.helper.DBConstants; import com.autotune.database.helper.DBHelpers; import com.autotune.database.table.*; +import com.autotune.database.table.lm.KruizeLMExperimentEntry; import com.autotune.operator.KruizeDeploymentInfo; import com.autotune.operator.KruizeOperator; import org.slf4j.Logger; @@ -204,8 +205,14 @@ public void loadRecommendationsFromDBByName(Map mainKruize public ValidationOutputData addExperimentToDB(CreateExperimentAPIObject createExperimentAPIObject) { ValidationOutputData validationOutputData = new ValidationOutputData(false, null, null); try { - KruizeExperimentEntry kruizeExperimentEntry = DBHelpers.Converters.KruizeObjectConverters.convertCreateAPIObjToExperimentDBObj(createExperimentAPIObject); - validationOutputData = this.experimentDAO.addExperimentToDB(kruizeExperimentEntry); + KruizeLMExperimentEntry kruizeLMExperimentEntry = 
DBHelpers.Converters.KruizeObjectConverters.convertCreateAPIObjToExperimentDBObj(createExperimentAPIObject); + LOGGER.debug("is_ros_enabled:{} , targetCluster:{} ", KruizeDeploymentInfo.is_ros_enabled, createExperimentAPIObject.getTargetCluster()); + if (KruizeDeploymentInfo.is_ros_enabled && createExperimentAPIObject.getTargetCluster().equalsIgnoreCase(AnalyzerConstants.REMOTE)) { + KruizeExperimentEntry oldKruizeExperimentEntry = new KruizeExperimentEntry(kruizeLMExperimentEntry); + validationOutputData = this.experimentDAO.addExperimentToDB(oldKruizeExperimentEntry); + } else { + validationOutputData = this.experimentDAO.addExperimentToDB(kruizeLMExperimentEntry); + } } catch (Exception e) { LOGGER.error("Not able to save experiment due to {}", e.getMessage()); } @@ -445,7 +452,7 @@ public List getExperimentResultData(String experiment_name /** * adds datasource to database table * - * @param dataSourceInfo DataSourceInfo object + * @param dataSourceInfo DataSourceInfo object * @param validationOutputData contains validation data * @return ValidationOutputData object */ diff --git a/src/main/java/com/autotune/database/table/KruizeExperimentEntry.java b/src/main/java/com/autotune/database/table/KruizeExperimentEntry.java index 01908cdcd..a794a4f84 100644 --- a/src/main/java/com/autotune/database/table/KruizeExperimentEntry.java +++ b/src/main/java/com/autotune/database/table/KruizeExperimentEntry.java @@ -17,13 +17,11 @@ import com.autotune.analyzer.utils.AnalyzerConstants; import com.autotune.database.helper.GenerateExperimentID; -import com.autotune.utils.KruizeConstants; +import com.autotune.database.table.lm.KruizeLMExperimentEntry; import com.fasterxml.jackson.databind.JsonNode; -import com.google.gson.annotations.SerializedName; import jakarta.persistence.*; import org.hibernate.annotations.JdbcTypeCode; import org.hibernate.type.SqlTypes; -import java.util.List; /** * This is a Java class named KruizeExperimentEntry annotated with JPA annotations. 
@@ -71,6 +69,21 @@ public class KruizeExperimentEntry { // TODO: update KruizeDSMetadataEntry + public KruizeExperimentEntry(KruizeLMExperimentEntry kruizeLMExperimentEntry) { + this.experiment_id = kruizeLMExperimentEntry.getExperiment_id(); + this.version = kruizeLMExperimentEntry.getVersion(); + this.experiment_name = kruizeLMExperimentEntry.getExperiment_name(); + this.cluster_name = kruizeLMExperimentEntry.getCluster_name(); + this.mode = kruizeLMExperimentEntry.getMode(); + this.target_cluster = kruizeLMExperimentEntry.getTarget_cluster(); + this.performance_profile = kruizeLMExperimentEntry.getPerformance_profile(); + this.experiment_type = kruizeLMExperimentEntry.getExperimentType(); + this.status = kruizeLMExperimentEntry.getStatus(); + this.datasource = kruizeLMExperimentEntry.getDatasource(); + this.extended_data = kruizeLMExperimentEntry.getExtended_data(); + this.meta_data = kruizeLMExperimentEntry.getMeta_data(); + } + public String getVersion() { return version; } diff --git a/src/main/java/com/autotune/database/table/lm/KruizeLMExperimentEntry.java b/src/main/java/com/autotune/database/table/lm/KruizeLMExperimentEntry.java new file mode 100644 index 000000000..30e25f026 --- /dev/null +++ b/src/main/java/com/autotune/database/table/lm/KruizeLMExperimentEntry.java @@ -0,0 +1,168 @@ +/******************************************************************************* + * Copyright (c) 2023 Red Hat, IBM Corporation and others. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *******************************************************************************/
+package com.autotune.database.table.lm;
+
+import com.autotune.analyzer.utils.AnalyzerConstants;
+import com.autotune.database.helper.GenerateExperimentID;
+import com.fasterxml.jackson.databind.JsonNode;
+import jakarta.persistence.*;
+import org.hibernate.annotations.JdbcTypeCode;
+import org.hibernate.type.SqlTypes;
+
+/**
+ * This is a Java class named KruizeLMExperimentEntry annotated with JPA annotations.
+ * It represents a table named kruize_lm_experiments in a relational database.
+ * <p>
+ * The class has the following fields:
+ * <p>
+ * id: A unique identifier for each experiment detail.
+ * version: A string representing the version of the experiment.
+ * experimentName: A string representing the name of the experiment.
+ * clusterName: A string representing the name of the cluster.
+ * mode: A string representing the mode of the experiment.
+ * targetCluster: A string representing the target cluster for the experiment.
+ * performance_profile: A string representing the performance profile for the experiment.
+ * status: An enum representing the status of the experiment, defined in AnalyzerConstants.ExperimentStatus.
+ * extended_data: A JSON object representing extended data for the experiment.
+ * meta_data: A string representing metadata for the experiment.
+ * The KruizeLMExperimentEntry class also has getters and setters for all its fields.
+ */
+@Entity
+@Table(name = "kruize_lm_experiments")
+@IdClass(GenerateExperimentID.class)
+public class KruizeLMExperimentEntry {
+    @Id
+    //@GeneratedValue(strategy = GenerationType.IDENTITY)
+    private String experiment_id;
+    private String version;
+    @Column(unique = true)
+    private String experiment_name;
+    private String cluster_name;
+    private String mode;
+    private String target_cluster;
+    private String performance_profile;
+    @Transient
+    private String experiment_type;
+    @Enumerated(EnumType.STRING)
+    private AnalyzerConstants.ExperimentStatus status;
+    @JdbcTypeCode(SqlTypes.JSON)
+    private JsonNode datasource;
+    @JdbcTypeCode(SqlTypes.JSON)
+    private JsonNode extended_data;
+    @JdbcTypeCode(SqlTypes.JSON)
+    private JsonNode meta_data;
+
+// TODO: update KruizeDSMetadataEntry
+
+
+    public String getVersion() {
+        return version;
+    }
+
+    public void setVersion(String version) {
+        this.version = version;
+    }
+
+    public String getExperiment_name() {
+        return experiment_name;
+    }
+
+    public void setExperiment_name(String experiment_name) {
+        this.experiment_name = experiment_name;
+    }
+
+    public String getCluster_name() {
+        return cluster_name;
+    }
+
+    public void setCluster_name(String cluster_name) {
+        this.cluster_name = cluster_name;
+    }
+
+    public String getMode() {
+        return mode;
+    }
+
+    public void setMode(String mode) {
+        this.mode = mode;
+    }
+
+    public String getTarget_cluster() {
+        return target_cluster;
+    }
+
+    public void setTarget_cluster(String target_cluster) {
+        this.target_cluster = target_cluster;
+    }
+
+    public String getPerformance_profile() {
+        return performance_profile;
+    }
+
+    public void setPerformance_profile(String performance_profile) {
+        this.performance_profile = performance_profile;
+    }
+
+    public JsonNode getExtended_data() {
+        return extended_data;
+    }
+
+    public void setExtended_data(JsonNode extended_data) {
+        this.extended_data = extended_data;
+    }
+
+    public JsonNode getMeta_data() {
+        return meta_data;
+    }
+
+    public void setMeta_data(JsonNode meta_data) {
+        this.meta_data = meta_data;
+    }
+
+    public AnalyzerConstants.ExperimentStatus getStatus() {
+        return status;
+    }
+
+    public void setStatus(AnalyzerConstants.ExperimentStatus status) {
+        this.status = status;
+    }
+
+    public String getExperiment_id() {
+        return experiment_id;
+    }
+
+    public void setExperiment_id(String experiment_id) {
+        this.experiment_id = experiment_id;
+    }
+
+    public JsonNode getDatasource() {
+        return datasource;
+    }
+
+    public void setDatasource(JsonNode datasource) {
+        this.datasource = datasource;
+    }
+
+    public String getExperimentType() {
+        return experiment_type;
+    }
+
+    public void setExperimentType(String experimentType) {
+        this.experiment_type = experimentType;
+    }
+
+
+}

From 51682bf4f54b6649ccaf3274f228b587a4a0a7b3 Mon Sep 17 00:00:00 2001
From: msvinaykumar
Date: Fri, 6 Dec 2024 12:34:59 +0530
Subject: [PATCH 21/85] 3-Migration CreateExperiment API rm and lm support contd..
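The patch below routes experiment reads and writes between the legacy `kruize_experiments` table and the new `kruize_lm_experiments` table, based on the `is_ros_enabled` flag and the experiment's `target_cluster`. As a minimal sketch of that routing rule (illustrative only — the helper `chooseTable` and this standalone class are not part of the patch; the actual condition lives inline in `ExperimentDBService`, `ExperimentValidation`, and `RecommendationEngine`):

```java
public class TableRouting {
    // Mirrors AnalyzerConstants.REMOTE as used with equalsIgnoreCase in the patch.
    static final String REMOTE = "remote";

    // Hypothetical helper: picks the backing table the same way the patched
    // code decides between the legacy entry and the new LM entry.
    static String chooseTable(boolean isRosEnabled, String targetCluster) {
        if (isRosEnabled && targetCluster != null && targetCluster.equalsIgnoreCase(REMOTE)) {
            return "kruize_experiments";     // legacy remote-monitoring table
        }
        return "kruize_lm_experiments";      // new local-monitoring table
    }

    public static void main(String[] args) {
        // Remote-monitoring experiments keep using the legacy table;
        // everything else lands in kruize_lm_experiments.
        System.out.println(chooseTable(true, "remote"));
        System.out.println(chooseTable(true, "local"));
        System.out.println(chooseTable(false, "remote"));
    }
}
```

Everything else in the patch (the `KruizeLMExperimentEntry` copy constructor, `loadLMExperimentByName`, the DDL change) supports this split.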
Signed-off-by: msvinaykumar
---
 migrations/kruize_local_ddl.sql               |  4 +-
 .../experiment/ExperimentValidation.java      |  6 ++-
 .../analyzer/kruizeObject/KruizeObject.java   | 30 +++++------
 .../engine/RecommendationEngine.java          | 16 ++++--
 .../analyzer/serviceObjects/Converters.java   |  9 ++--
 .../CreateExperimentAPIObject.java            | 27 +++++-----
 .../ListRecommendationsAPIObject.java         |  8 +--
 .../analyzer/services/CreateExperiment.java   | 21 ++++----
 .../services/GenerateRecommendations.java     | 22 +++-----
 .../services/UpdateRecommendations.java       |  3 +-
 .../analyzer/utils/AnalyzerConstants.java     |  9 ++++
 .../analyzer/utils/ExperimentTypeAware.java   |  4 +-
 .../analyzer/utils/ExperimentTypeUtil.java    |  8 +--
 .../analyzer/workerimpl/BulkJobManager.java   |  2 +-
 .../autotune/database/dao/ExperimentDAO.java  |  4 ++
 .../database/dao/ExperimentDAOImpl.java       | 36 +++++++++++--
 .../autotune/database/helper/DBConstants.java |  1 +
 .../autotune/database/helper/DBHelpers.java   | 29 ++++++++--
 .../database/service/ExperimentDBService.java | 26 +++++++++
 .../database/table/KruizeExperimentEntry.java | 14 ++---
 .../table/lm/KruizeLMExperimentEntry.java     | 54 +++++++++++++++++--
 21 files changed, 230 insertions(+), 103 deletions(-)

diff --git a/migrations/kruize_local_ddl.sql b/migrations/kruize_local_ddl.sql
index b5844d850..fc5474b26 100644
--- a/migrations/kruize_local_ddl.sql
+++ b/migrations/kruize_local_ddl.sql
@@ -1,8 +1,8 @@
-create table IF NOT EXISTS kruize_lm_experiments (experiment_id varchar(255) not null, cluster_name varchar(255), datasource jsonb, experiment_name varchar(255), extended_data jsonb, meta_data jsonb, mode varchar(255), performance_profile varchar(255), status varchar(255), target_cluster varchar(255), version varchar(255), primary key (experiment_id));
+create table IF NOT EXISTS kruize_lm_experiments (experiment_id varchar(255) not null, cluster_name varchar(255), experiment_name varchar(255), extended_data jsonb, meta_data jsonb, mode varchar(255), performance_profile varchar(255), status varchar(255), target_cluster varchar(255), version varchar(255),experiment_type varchar(255),datasource varchar(255),creation_date timestamp(6) ,updated_date timestamp(6) , primary key (experiment_id));
 create table IF NOT EXISTS kruize_authentication (id serial, authentication_type varchar(255), credentials jsonb, service_type varchar(255), primary key (id));
 create table IF NOT EXISTS kruize_datasources (version varchar(255), name varchar(255), provider varchar(255), serviceName varchar(255), namespace varchar(255), url varchar(255), authentication_id serial, FOREIGN KEY (authentication_id) REFERENCES kruize_authentication(id), primary key (name));
 create table IF NOT EXISTS kruize_dsmetadata (id serial, version varchar(255), datasource_name varchar(255), cluster_name varchar(255), namespace varchar(255), workload_type varchar(255), workload_name varchar(255), container_name varchar(255), container_image_name varchar(255), primary key (id));
-alter table kruize_lm_experiments add column experiment_type varchar(255), add column metadata_id bigint references kruize_dsmetadata(id), alter column datasource type varchar(255);
+alter table kruize_lm_experiments add column metadata_id bigint references kruize_dsmetadata(id);
 alter table if exists kruize_lm_experiments add constraint UK_lm_experiment_name unique (experiment_name);
 create table IF NOT EXISTS kruize_metric_profiles (api_version varchar(255), kind varchar(255), metadata jsonb, name varchar(255) not null, k8s_type varchar(255), profile_version float(53) not null, slo jsonb, primary key (name));
 alter table kruize_recommendations add column experiment_type varchar(255);
diff --git a/src/main/java/com/autotune/analyzer/experiment/ExperimentValidation.java b/src/main/java/com/autotune/analyzer/experiment/ExperimentValidation.java
index 243e652ab..2a2729593 100644
--- a/src/main/java/com/autotune/analyzer/experiment/ExperimentValidation.java
+++ b/src/main/java/com/autotune/analyzer/experiment/ExperimentValidation.java
@@ -89,7 +89,11 @@ public void validate(List kruizeExptList) {
             if (validationOutputData.isSuccess()) {
                 String expName = kruizeObject.getExperimentName();
                 try {
-                    new ExperimentDBService().loadExperimentFromDBByName(mainKruizeExperimentMAP, expName);
+                    if (KruizeDeploymentInfo.is_ros_enabled && kruizeObject.getTarget_cluster().equalsIgnoreCase(AnalyzerConstants.REMOTE)) { // todo call this in function and use across every where
+                        new ExperimentDBService().loadExperimentFromDBByName(mainKruizeExperimentMAP, expName);
+                    } else {
+                        new ExperimentDBService().loadLMExperimentFromDBByName(mainKruizeExperimentMAP, expName);
+                    }
                 } catch (Exception e) {
                     LOGGER.error("Loading saved experiment {} failed: {} ", expName, e.getMessage());
                 }
diff --git a/src/main/java/com/autotune/analyzer/kruizeObject/KruizeObject.java b/src/main/java/com/autotune/analyzer/kruizeObject/KruizeObject.java
index d86d399a0..28f18a3e4 100644
--- a/src/main/java/com/autotune/analyzer/kruizeObject/KruizeObject.java
+++ b/src/main/java/com/autotune/analyzer/kruizeObject/KruizeObject.java
@@ -50,7 +50,7 @@ public final class KruizeObject implements ExperimentTypeAware {
     @SerializedName("datasource")
     private String datasource;
     @SerializedName(KruizeConstants.JSONKeys.EXPERIMENT_TYPE) //TODO: to be used in future
-    private String experimentType;
+    private AnalyzerConstants.ExperimentType experimentType;
     private String namespace; // TODO: Currently adding it at this level with an assumption that there is only one entry in k8s object needs to be changed
     private String mode; //Todo convert into Enum
     @SerializedName("target_cluster")
@@ -122,10 +122,10 @@ public KruizeObject() {
      * Sets default terms for a KruizeObject.
      * This method initializes a map with predefined terms like "SHORT_TERM", "MEDIUM_TERM", and "LONG_TERM".
      * Each term is defined by a Terms object containing: Name of the term (e.g., "SHORT_TERM"), Duration (in days) to
-     be considered under that term, Threshold for the duration.
+     be considered under that term, Threshold for the duration.
      * Note: Currently, specific term names like "daily", "weekly", and "fortnightly" are not defined.
      * This method also requires implementing CustomResourceDefinition yaml for managing terms. This
-     functionality is not currently included.
+     functionality is not currently included.
      @param terms A map to store the default terms with term name as the key and Terms object as the value.
      @param kruizeObject The KruizeObject for which the default terms are being set.
      */
@@ -301,24 +301,14 @@ public void setDataSource(String datasource) {
         this.datasource = datasource;
     }
 
-    @Override
-    public String getExperimentType() {
+    public AnalyzerConstants.ExperimentType getExperimentType() {
         return experimentType;
     }
 
-    public void setExperimentType(String experimentType) {
+    public void setExperimentType(AnalyzerConstants.ExperimentType experimentType) {
         this.experimentType = experimentType;
     }
 
-    @Override
-    public boolean isNamespaceExperiment() {
-        return ExperimentTypeUtil.isNamespaceExperiment(experimentType);
-    }
-
-    @Override
-    public boolean isContainerExperiment() {
-        return ExperimentTypeUtil.isContainerExperiment(experimentType);
-    }
 
     @Override
     public String toString() {
@@ -347,4 +337,14 @@ public String toString() {
                 ", kubernetes_objects=" + kubernetes_objects +
                 '}';
     }
+
+    @Override
+    public boolean isNamespaceExperiment() {
+        return ExperimentTypeUtil.isNamespaceExperiment(experimentType);
+    }
+
+    @Override
+    public boolean isContainerExperiment() {
+        return ExperimentTypeUtil.isContainerExperiment(experimentType);
+    }
 }
diff --git a/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java b/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java
index a52bd1bc7..68465e4c6 100644
--- a/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java
+++ b/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java
@@ -192,11 +192,17 @@ public void setKruizeObject(KruizeObject kruizeObject) {
         this.kruizeObject = kruizeObject;
     }
 
-    private KruizeObject createKruizeObject() {
+    private KruizeObject createKruizeObject(String target_cluster) {
         Map mainKruizeExperimentMAP = new ConcurrentHashMap<>();
         KruizeObject kruizeObject = new KruizeObject();
         try {
-            new ExperimentDBService().loadExperimentFromDBByName(mainKruizeExperimentMAP, experimentName);
+
+            if (KruizeDeploymentInfo.is_ros_enabled && null != target_cluster && target_cluster.equalsIgnoreCase(AnalyzerConstants.REMOTE)) { // todo call this in function and use across every where
+                new ExperimentDBService().loadExperimentFromDBByName(mainKruizeExperimentMAP, experimentName);
+            } else {
+                new ExperimentDBService().loadLMExperimentFromDBByName(mainKruizeExperimentMAP, experimentName);
+            }
+
             if (null != mainKruizeExperimentMAP.get(experimentName)) {
                 kruizeObject = mainKruizeExperimentMAP.get(experimentName);
                 kruizeObject.setValidation_data(new ValidationOutputData(true, null, null));
@@ -259,7 +265,7 @@ public String validate_local() { //TODO Instead of relying on the 'lo
      * @param calCount The count of incoming requests.
      * @return The KruizeObject containing the prepared recommendations.
      */
-    public KruizeObject prepareRecommendations(int calCount) throws FetchMetricsError {
+    public KruizeObject prepareRecommendations(int calCount, String target_cluster) throws FetchMetricsError {
         Map mainKruizeExperimentMAP = new ConcurrentHashMap<>();
         Map terms = new HashMap<>();
         ValidationOutputData validationOutputData;
@@ -269,7 +275,7 @@ public KruizeObject prepareRecommendations(int calCount) throws FetchMetricsErro
                     intervalEndTimeStr);
             setInterval_end_time(interval_end_time);
         }
-        KruizeObject kruizeObject = createKruizeObject();
+        KruizeObject kruizeObject = createKruizeObject(target_cluster);
         if (!kruizeObject.getValidation_data().isSuccess())
             return kruizeObject;
         setKruizeObject(kruizeObject);
@@ -2388,7 +2394,7 @@ private void prepareIntervalResults(Map dataResultsM
      * @param maxDateQuery maxDateQuery metric to be filtered out
      * @param experimentType experiment type
      */
-    public List filterMetricsBasedOnExpTypeAndK8sObject(PerformanceProfile metricProfile, String maxDateQuery, String experimentType) {
+    public List filterMetricsBasedOnExpTypeAndK8sObject(PerformanceProfile metricProfile, String maxDateQuery, AnalyzerConstants.ExperimentType experimentType) {
         String namespace = KruizeConstants.JSONKeys.NAMESPACE;
         String container = KruizeConstants.JSONKeys.CONTAINER;
         return metricProfile.getSloInfo().getFunctionVariables().stream()
diff --git a/src/main/java/com/autotune/analyzer/serviceObjects/Converters.java b/src/main/java/com/autotune/analyzer/serviceObjects/Converters.java
index eab5ec99e..176d678d0 100644
--- a/src/main/java/com/autotune/analyzer/serviceObjects/Converters.java
+++ b/src/main/java/com/autotune/analyzer/serviceObjects/Converters.java
@@ -10,8 +10,6 @@
 import com.autotune.analyzer.recommendations.NamespaceRecommendations;
 import com.autotune.analyzer.recommendations.objects.MappedRecommendationForTimestamp;
 import com.autotune.analyzer.utils.AnalyzerConstants;
-import com.autotune.analyzer.utils.AnalyzerErrorConstants;
-import com.autotune.analyzer.utils.ExperimentTypeUtil;
 import com.autotune.common.data.ValidationOutputData;
 import com.autotune.common.data.metrics.AggregationFunctions;
 import com.autotune.common.data.metrics.Metric;
@@ -23,6 +21,8 @@
 import com.autotune.common.k8sObjects.K8sObject;
 import com.autotune.utils.KruizeConstants;
 import com.autotune.utils.Utils;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.databind.node.ObjectNode;
 import com.google.gson.Gson;
 import org.json.JSONArray;
 import org.json.JSONObject;
@@ -31,12 +31,9 @@
 
 import java.sql.Timestamp;
 import java.util.ArrayList;
-import java.util.Date;
 import java.util.HashMap;
 import java.util.List;
 import java.util.concurrent.ConcurrentHashMap;
-import com.fasterxml.jackson.databind.ObjectMapper;
-import com.fasterxml.jackson.databind.node.ObjectNode;
 
 public class Converters {
     private Converters() {
@@ -70,7 +67,7 @@ public static KruizeObject convertCreateExperimentAPIObjToKruizeObject(CreateExp
                 // namespace recommendations experiment type
                 k8sObject = createNamespaceExperiment(kubernetesAPIObject);
             }
-            LOGGER.debug("Experiment Type: " + createExperimentAPIObject.getExperimentType());
+            LOGGER.debug("Experiment Type: {}", createExperimentAPIObject.getExperimentType());
             k8sObjectList.add(k8sObject);
         }
         kruizeObject.setKubernetes_objects(k8sObjectList);
diff --git a/src/main/java/com/autotune/analyzer/serviceObjects/CreateExperimentAPIObject.java b/src/main/java/com/autotune/analyzer/serviceObjects/CreateExperimentAPIObject.java
index 6985beff2..76ae02f69 100644
--- a/src/main/java/com/autotune/analyzer/serviceObjects/CreateExperimentAPIObject.java
+++ b/src/main/java/com/autotune/analyzer/serviceObjects/CreateExperimentAPIObject.java
@@ -50,7 +50,7 @@ public class CreateExperimentAPIObject extends BaseSO implements ExperimentTypeA
     @SerializedName(KruizeConstants.JSONKeys.DATASOURCE) //TODO: to be used in future
     private String datasource;
     @SerializedName(KruizeConstants.JSONKeys.EXPERIMENT_TYPE) //TODO: to be used in future
-    private String experimentType;
+    private AnalyzerConstants.ExperimentType experimentType;
     private AnalyzerConstants.ExperimentStatus status;
     private String experiment_id; // this id is UUID and getting set at createExperiment API
     private ValidationOutputData validationData; // This object indicates if this API object is valid or invalid
@@ -151,25 +151,14 @@ public void setDatasource(String datasource) {
         this.datasource = datasource;
     }
 
-    @Override
-    public String getExperimentType() {
+    public AnalyzerConstants.ExperimentType getExperimentType() {
         return experimentType;
     }
 
-    public void setExperimentType(String experimentType) {
+    public void setExperimentType(AnalyzerConstants.ExperimentType experimentType) {
         this.experimentType = experimentType;
     }
 
-    @Override
-    public boolean isNamespaceExperiment() {
-        return ExperimentTypeUtil.isNamespaceExperiment(experimentType);
-    }
-
-    @Override
-    public boolean isContainerExperiment() {
-        return ExperimentTypeUtil.isContainerExperiment(experimentType);
-    }
-
     @Override
     public String toString() {
         return "CreateExperimentAPIObject{" +
@@ -186,5 +175,15 @@ public String toString() {
                 ", recommendationSettings=" + recommendationSettings +
                 '}';
     }
+
+    @Override
+    public boolean isNamespaceExperiment() {
+        return ExperimentTypeUtil.isNamespaceExperiment(experimentType);
+    }
+
+    @Override
+    public boolean isContainerExperiment() {
+        return ExperimentTypeUtil.isContainerExperiment(experimentType);
+    }
 }
 
diff --git a/src/main/java/com/autotune/analyzer/serviceObjects/ListRecommendationsAPIObject.java b/src/main/java/com/autotune/analyzer/serviceObjects/ListRecommendationsAPIObject.java
index 86d57abfd..ad53ce8a5 100644
--- a/src/main/java/com/autotune/analyzer/serviceObjects/ListRecommendationsAPIObject.java
+++ b/src/main/java/com/autotune/analyzer/serviceObjects/ListRecommendationsAPIObject.java
@@ -15,6 +15,7 @@
  *******************************************************************************/
 package com.autotune.analyzer.serviceObjects;
 
+import com.autotune.analyzer.utils.AnalyzerConstants;
 import com.autotune.analyzer.utils.ExperimentTypeAware;
 import com.autotune.analyzer.utils.ExperimentTypeUtil;
 import com.autotune.utils.KruizeConstants;
@@ -26,7 +27,7 @@ public class ListRecommendationsAPIObject extends BaseSO implements ExperimentTy
     @SerializedName(KruizeConstants.JSONKeys.CLUSTER_NAME)
     private String clusterName;
     @SerializedName(KruizeConstants.JSONKeys.EXPERIMENT_TYPE)
-    private String experimentType;
+    private AnalyzerConstants.ExperimentType experimentType;
     @SerializedName(KruizeConstants.JSONKeys.KUBERNETES_OBJECTS)
     private List kubernetesObjects;
 
@@ -47,12 +48,11 @@ public void setKubernetesObjects(List kubernetesObjects) {
         this.kubernetesObjects = kubernetesObjects;
     }
 
-    @Override
-    public String getExperimentType() {
+    public AnalyzerConstants.ExperimentType getExperimentType() {
         return experimentType;
     }
 
-    public void setExperimentType(String experimentType) {
+    public void setExperimentType(AnalyzerConstants.ExperimentType experimentType) {
         this.experimentType = experimentType;
     }
 
diff --git a/src/main/java/com/autotune/analyzer/services/CreateExperiment.java b/src/main/java/com/autotune/analyzer/services/CreateExperiment.java
index b66259f7d..f1ecba9ff 100644
--- a/src/main/java/com/autotune/analyzer/services/CreateExperiment.java
+++ b/src/main/java/com/autotune/analyzer/services/CreateExperiment.java
@@ -16,19 +16,14 @@
 
 package com.autotune.analyzer.services;
 
-import com.autotune.analyzer.exceptions.InvalidExperimentType;
 import com.autotune.analyzer.exceptions.KruizeResponse;
 import com.autotune.analyzer.experiment.ExperimentInitiator;
 import com.autotune.analyzer.kruizeObject.KruizeObject;
 import com.autotune.analyzer.serviceObjects.Converters;
 import com.autotune.analyzer.serviceObjects.CreateExperimentAPIObject;
-import com.autotune.analyzer.serviceObjects.KubernetesAPIObject;
 import com.autotune.analyzer.utils.AnalyzerConstants;
 import com.autotune.analyzer.utils.AnalyzerErrorConstants;
 import com.autotune.common.data.ValidationOutputData;
-import com.autotune.common.data.result.ContainerData;
-import com.autotune.common.data.result.NamespaceData;
-import com.autotune.common.k8sObjects.K8sObject;
 import com.autotune.database.dao.ExperimentDAO;
 import com.autotune.database.dao.ExperimentDAOImpl;
 import com.autotune.database.service.ExperimentDBService;
@@ -47,7 +42,10 @@
 import javax.servlet.http.HttpServletResponse;
 import java.io.IOException;
 import java.io.PrintWriter;
-import java.util.*;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.stream.Collectors;
 
@@ -101,9 +99,9 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response)
                 createExperimentAPIObject.setExperiment_id(Utils.generateID(createExperimentAPIObject.toString()));
                 createExperimentAPIObject.setStatus(AnalyzerConstants.ExperimentStatus.IN_PROGRESS);
                 // validating the kubernetes objects and experiment type
-                for (KubernetesAPIObject kubernetesAPIObject: createExperimentAPIObject.getKubernetesObjects()) {
+                /*for (KubernetesAPIObject kubernetesAPIObject : createExperimentAPIObject.getKubernetesObjects()) {
                     if (createExperimentAPIObject.isContainerExperiment()) {
-                        createExperimentAPIObject.setExperimentType(AnalyzerConstants.ExperimentTypes.CONTAINER_EXPERIMENT);
+                        createExperimentAPIObject.setExperimentType(AnalyzerConstants.ExperimentType.CONTAINER);
                         // check if namespace data is also set for container-type experiments
                         if (null != kubernetesAPIObject.getNamespaceAPIObjects()) {
                             throw new InvalidExperimentType(AnalyzerErrorConstants.APIErrors.CreateExperimentAPI.NAMESPACE_DATA_NOT_NULL_FOR_CONTAINER_EXP);
@@ -116,7 +114,7 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response)
                             throw new InvalidExperimentType(AnalyzerErrorConstants.APIErrors.CreateExperimentAPI.NAMESPACE_EXP_NOT_SUPPORTED_FOR_REMOTE);
                         }
                     }
-                }
+                }*/
                 KruizeObject kruizeObject = Converters.KruizeObjectConverters.convertCreateExperimentAPIObjToKruizeObject(createExperimentAPIObject);
                 if (null != kruizeObject)
                     kruizeExpList.add(kruizeObject);
@@ -149,9 +147,8 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response)
             e.printStackTrace();
             LOGGER.error("Unknown exception caught: " + e.getMessage());
             sendErrorResponse(inputData, response, e, HttpServletResponse.SC_INTERNAL_SERVER_ERROR, "Internal Server Error: " + e.getMessage());
-        } catch (InvalidExperimentType e) {
-            sendErrorResponse(inputData, response, null, HttpServletResponse.SC_BAD_REQUEST, e.getMessage());
-        } finally {
+        } //catch (InvalidExperimentType e) { sendErrorResponse(inputData, response, null, HttpServletResponse.SC_BAD_REQUEST, e.getMessage()); }
+        finally {
             if (null != timerCreateExp) {
                 MetricsConfig.timerCreateExp = MetricsConfig.timerBCreateExp.tag("status", statusValue).register(MetricsConfig.meterRegistry());
                 timerCreateExp.stop(MetricsConfig.timerCreateExp);
diff --git a/src/main/java/com/autotune/analyzer/services/GenerateRecommendations.java b/src/main/java/com/autotune/analyzer/services/GenerateRecommendations.java
index 8a2d5f22c..28b681af6 100644
--- a/src/main/java/com/autotune/analyzer/services/GenerateRecommendations.java
+++ b/src/main/java/com/autotune/analyzer/services/GenerateRecommendations.java
@@ -26,21 +26,16 @@
 import com.autotune.analyzer.utils.AnalyzerConstants;
 import com.autotune.analyzer.utils.AnalyzerErrorConstants;
 import com.autotune.analyzer.utils.GsonUTCDateAdapter;
-import com.autotune.common.data.dataSourceQueries.PromQLDataSourceQueries;
-import com.autotune.common.data.metrics.MetricAggregationInfoResults;
-import com.autotune.common.data.metrics.MetricResults;
 import com.autotune.common.data.result.ContainerData;
-import com.autotune.common.data.result.IntervalResults;
 import com.autotune.common.data.system.info.device.DeviceDetails;
-import com.autotune.common.datasource.DataSourceInfo;
-import com.autotune.common.k8sObjects.K8sObject;
-import com.autotune.utils.GenericRestApiClient;
 import com.autotune.utils.KruizeConstants;
 import com.autotune.utils.MetricsConfig;
 import com.autotune.utils.Utils;
-import com.google.gson.*;
+import com.google.gson.ExclusionStrategy;
+import com.google.gson.FieldAttributes;
+import com.google.gson.Gson;
+import com.google.gson.GsonBuilder;
 import io.micrometer.core.instrument.Timer;
-import org.json.JSONObject;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -51,11 +46,10 @@
 import javax.servlet.http.HttpServletRequest;
 import javax.servlet.http.HttpServletResponse;
 import java.io.IOException;
-import java.lang.reflect.Method;
-import java.net.URLEncoder;
 import java.sql.Timestamp;
-import java.text.SimpleDateFormat;
-import java.util.*;
+import java.util.ArrayList;
+import java.util.Date;
+import java.util.List;
 
 import static com.autotune.analyzer.utils.AnalyzerConstants.ServiceConstants.CHARACTER_ENCODING;
 import static com.autotune.analyzer.utils.AnalyzerConstants.ServiceConstants.JSON_CONTENT_TYPE;
@@ -108,7 +102,7 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response)
             // validate and create KruizeObject if successful
             String validationMessage = recommendationEngine.validate_local();
             if (validationMessage.isEmpty()) {
-                KruizeObject kruizeObject = recommendationEngine.prepareRecommendations(calCount);
+                KruizeObject kruizeObject = recommendationEngine.prepareRecommendations(calCount, null);
                 if (kruizeObject.getValidation_data().isSuccess()) {
                     LOGGER.debug("UpdateRecommendations API request count: {} success", calCount);
                     interval_end_time = Utils.DateUtils.getTimeStampFrom(KruizeConstants.DateFormats.STANDARD_JSON_DATE_FORMAT,
diff --git a/src/main/java/com/autotune/analyzer/services/UpdateRecommendations.java b/src/main/java/com/autotune/analyzer/services/UpdateRecommendations.java
index e558d1d37..65411b0f7 100644
--- a/src/main/java/com/autotune/analyzer/services/UpdateRecommendations.java
+++ b/src/main/java/com/autotune/analyzer/services/UpdateRecommendations.java
@@ -48,6 +48,7 @@
 import java.text.SimpleDateFormat;
 import java.util.*;
 
+import static com.autotune.analyzer.utils.AnalyzerConstants.REMOTE;
 import static com.autotune.analyzer.utils.AnalyzerConstants.ServiceConstants.CHARACTER_ENCODING;
 import static com.autotune.analyzer.utils.AnalyzerConstants.ServiceConstants.JSON_CONTENT_TYPE;
 
@@ -101,7 +102,7 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response)
             // validate and create KruizeObject if successful
             String validationMessage = recommendationEngine.validate();
             if (validationMessage.isEmpty()) {
-                KruizeObject kruizeObject = recommendationEngine.prepareRecommendations(calCount);
+                KruizeObject kruizeObject = recommendationEngine.prepareRecommendations(calCount, REMOTE);
                 if (kruizeObject.getValidation_data().isSuccess()) {
                     LOGGER.debug(String.format(AnalyzerErrorConstants.APIErrors.UpdateRecommendationsAPI.UPDATE_RECOMMENDATIONS_SUCCESS_COUNT, calCount));
                     interval_end_time = Utils.DateUtils.getTimeStampFrom(KruizeConstants.DateFormats.STANDARD_JSON_DATE_FORMAT,
diff --git a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java
index f7ae69c9f..391ddc617 100644
--- a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java
+++ b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java
@@ -236,6 +236,13 @@ public enum DeviceParameters {
         DEVICE_NAME
     }
 
+    public enum ExperimentType {
+        CONTAINER,   // For container-level experiments
+        NAMESPACE,   // For namespace-level experiments
+        CLUSTER,     // For cluster-wide experiments
+        APPLICATION  // For application-specific experiments
+    }
+
     public static final class AcceleratorConstants {
         private AcceleratorConstants() {
 
@@ -253,6 +260,7 @@ public static final class SupportedAccelerators {
             public static final String A100_80_GB = "A100-80GB";
             public static final String A100_40_GB = "A100-40GB";
             public static final String H100_80_GB = "H100-80GB";
+
             private SupportedAccelerators() {
 
             }
@@ -272,6 +280,7 @@ public static final class AcceleratorProfiles {
             public static final String PROFILE_3G_40GB = "3g.40gb";
             public static final String PROFILE_4G_40GB = "4g.40gb";
             public static final String PROFILE_7G_80GB = "7g.80gb";
+
             private AcceleratorProfiles() {
 
             }
diff --git a/src/main/java/com/autotune/analyzer/utils/ExperimentTypeAware.java b/src/main/java/com/autotune/analyzer/utils/ExperimentTypeAware.java
index c8fd45dec..7111092cf 100644
--- a/src/main/java/com/autotune/analyzer/utils/ExperimentTypeAware.java
+++ b/src/main/java/com/autotune/analyzer/utils/ExperimentTypeAware.java
@@ -20,9 +20,11 @@
  */
 public interface ExperimentTypeAware {
     // Retrieves the experiment type associated with the implementing class.
- String getExperimentType(); + AnalyzerConstants.ExperimentType getExperimentType(); + // checks if the experiment type is namespace boolean isNamespaceExperiment(); + // checks if the experiment type is container boolean isContainerExperiment(); } diff --git a/src/main/java/com/autotune/analyzer/utils/ExperimentTypeUtil.java b/src/main/java/com/autotune/analyzer/utils/ExperimentTypeUtil.java index 591ea4d14..51b567617 100644 --- a/src/main/java/com/autotune/analyzer/utils/ExperimentTypeUtil.java +++ b/src/main/java/com/autotune/analyzer/utils/ExperimentTypeUtil.java @@ -20,11 +20,11 @@ * This class contains utility functions to determine experiment type */ public class ExperimentTypeUtil { - public static boolean isContainerExperiment(String experimentType) { - return experimentType == null || experimentType.equalsIgnoreCase(AnalyzerConstants.ExperimentTypes.CONTAINER_EXPERIMENT); + public static boolean isContainerExperiment(AnalyzerConstants.ExperimentType experimentType) { + return experimentType == null || AnalyzerConstants.ExperimentType.CONTAINER.equals(experimentType); } - public static boolean isNamespaceExperiment(String experimentType) { - return experimentType != null && experimentType.equalsIgnoreCase(AnalyzerConstants.ExperimentTypes.NAMESPACE_EXPERIMENT); + public static boolean isNamespaceExperiment(AnalyzerConstants.ExperimentType experimentType) { + return experimentType != null && AnalyzerConstants.ExperimentType.NAMESPACE.equals(experimentType); } } diff --git a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java index 5b9425000..48cc0026b 100644 --- a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java +++ b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java @@ -405,7 +405,7 @@ private CreateExperimentAPIObject prepareCreateExperimentJSONInput(DataSourceCon 
createExperimentAPIObject.setExperiment_id(Utils.generateID(createExperimentAPIObject.toString())); createExperimentAPIObject.setStatus(AnalyzerConstants.ExperimentStatus.IN_PROGRESS); - createExperimentAPIObject.setExperimentType(AnalyzerConstants.ExperimentTypes.CONTAINER_EXPERIMENT); + createExperimentAPIObject.setExperimentType(AnalyzerConstants.ExperimentType.CONTAINER); createExperimentAPIObjects.add(createExperimentAPIObject); diff --git a/src/main/java/com/autotune/database/dao/ExperimentDAO.java b/src/main/java/com/autotune/database/dao/ExperimentDAO.java index df72a14d7..29d49083d 100644 --- a/src/main/java/com/autotune/database/dao/ExperimentDAO.java +++ b/src/main/java/com/autotune/database/dao/ExperimentDAO.java @@ -58,6 +58,10 @@ public interface ExperimentDAO { // Load a single experiment based on experimentName List loadExperimentByName(String experimentName) throws Exception; + // Load a single LM experiment based on experimentName + List loadLMExperimentByName(String experimentName) throws Exception; + + // Load a single data source based on name List loadDataSourceByName(String name) throws Exception; diff --git a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java index 472dddc74..a2a1de630 100644 --- a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java +++ b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java @@ -121,6 +121,7 @@ public ValidationOutputData addExperimentToDB(KruizeLMExperimentEntry kruizeLMEx } } } catch (Exception e) { + LOGGER.debug("kruizeLMExperimentEntry={}", kruizeLMExperimentEntry); LOGGER.error("Not able to save experiment due to {}", e.getMessage()); validationOutputData.setMessage(e.getMessage()); } finally { @@ -717,7 +718,7 @@ public List loadAllExperiments() throws Exception { try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) { entries = 
session.createQuery(DBConstants.SQLQUERY.SELECT_FROM_EXPERIMENTS, KruizeExperimentEntry.class).list(); // TODO: remove native sql query and transient - getExperimentTypeInKruizeExperimentEntry(entries); + //getExperimentTypeInKruizeExperimentEntry(entries); statusValue = "success"; } catch (Exception e) { LOGGER.error("Not able to load experiment due to {}", e.getMessage()); @@ -814,6 +815,31 @@ public List loadAllMetricProfiles() throws Exception { return entries; } + @Override + public List loadLMExperimentByName(String experimentName) throws Exception { + //todo load only experimentStatus=inprogress , playback may not require completed experiments + List entries = null; + String statusValue = "failure"; + Timer.Sample timerLoadExpName = Timer.start(MetricsConfig.meterRegistry()); + try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) { + entries = session.createQuery(DBConstants.SQLQUERY.SELECT_FROM_LM_EXPERIMENTS_BY_EXP_NAME, KruizeLMExperimentEntry.class) + .setParameter("experimentName", experimentName).list(); + // TODO: remove native sql query and transient + //getExperimentTypeInKruizeExperimentEntry(entries); + statusValue = "success"; + } catch (Exception e) { + LOGGER.error("Not able to load experiment {} due to {}", experimentName, e.getMessage()); + throw new Exception("Error while loading existing experiment from database due to : " + e.getMessage()); + } finally { + if (null != timerLoadExpName) { + MetricsConfig.timerLoadExpName = MetricsConfig.timerBLoadExpName.tag("status", statusValue).register(MetricsConfig.meterRegistry()); + timerLoadExpName.stop(MetricsConfig.timerLoadExpName); + } + + } + return entries; + } + @Override public List loadExperimentByName(String experimentName) throws Exception { //todo load only experimentStatus=inprogress , playback may not require completed experiments @@ -824,7 +850,7 @@ public List loadExperimentByName(String experimentName) t entries = 
session.createQuery(DBConstants.SQLQUERY.SELECT_FROM_EXPERIMENTS_BY_EXP_NAME, KruizeExperimentEntry.class) .setParameter("experimentName", experimentName).list(); // TODO: remove native sql query and transient - getExperimentTypeInKruizeExperimentEntry(entries); + //getExperimentTypeInKruizeExperimentEntry(entries); statusValue = "success"; } catch (Exception e) { LOGGER.error("Not able to load experiment {} due to {}", experimentName, e.getMessage()); @@ -1162,11 +1188,11 @@ public List loadAllDataSources() throws Exception { return entries; } - private void getExperimentTypeInKruizeExperimentEntry(List entries) throws Exception { + /* private void getExperimentTypeInKruizeExperimentEntry(List entries) throws Exception { try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) { for (KruizeExperimentEntry entry : entries) { if (isTargetCluserLocal(entry.getTarget_cluster())) { - if (null == entry.getExperimentType() || entry.getExperimentType().isEmpty()) { String sql = DBConstants.SQLQUERY.SELECT_EXPERIMENT_EXP_TYPE; Query query = session.createNativeQuery(sql); query.setParameter("experiment_id", entry.getExperiment_id()); @@ -1181,7 +1207,7 @@ private void getExperimentTypeInKruizeExperimentEntry(List convertLMExperimentEntryToCreateExperimentAPIObject(List entries) throws Exception { + List createExperimentAPIObjects = new ArrayList<>(); + int failureThreshHold = entries.size(); + int failureCount = 0; + for (KruizeLMExperimentEntry entry : entries) { + try { + JsonNode extended_data = entry.getExtended_data(); + String extended_data_rawJson = extended_data.toString(); + CreateExperimentAPIObject apiObj = new Gson().fromJson(extended_data_rawJson, CreateExperimentAPIObject.class); + apiObj.setExperiment_id(entry.getExperiment_id()); + apiObj.setStatus(entry.getStatus()); + createExperimentAPIObjects.add(apiObj); + } catch (Exception e) { + LOGGER.error("Error in converting 
to apiObj from db object due to : {}", e.getMessage()); + LOGGER.error(entry.toString()); + failureCount++; + } + } + if (failureThreshHold > 0 && failureCount == failureThreshHold) + throw new Exception("None of the experiments are able to load from DB."); + + return createExperimentAPIObjects; + } + public static List convertExperimentEntryToCreateExperimentAPIObject(List entries) throws Exception { List createExperimentAPIObjects = new ArrayList<>(); int failureThreshHold = entries.size(); @@ -671,7 +695,6 @@ public static List convertExperimentEntryToCreateExpe CreateExperimentAPIObject apiObj = new Gson().fromJson(extended_data_rawJson, CreateExperimentAPIObject.class); apiObj.setExperiment_id(entry.getExperiment_id()); apiObj.setStatus(entry.getStatus()); - apiObj.setExperimentType(entry.getExperimentType()); createExperimentAPIObjects.add(apiObj); } catch (Exception e) { LOGGER.error("Error in converting to apiObj from db object due to : {}", e.getMessage()); diff --git a/src/main/java/com/autotune/database/service/ExperimentDBService.java b/src/main/java/com/autotune/database/service/ExperimentDBService.java index 79a1d2eaa..f631d7507 100644 --- a/src/main/java/com/autotune/database/service/ExperimentDBService.java +++ b/src/main/java/com/autotune/database/service/ExperimentDBService.java @@ -323,6 +323,32 @@ public void loadAllExperimentsData() throws Exception { loadAllRecommendations(KruizeOperator.autotuneObjectMap); } + public void loadLMExperimentFromDBByName(Map mainKruizeExperimentMap, String experimentName) throws Exception { + ExperimentInterface experimentInterface = new ExperimentInterfaceImpl(); + List entries = experimentDAO.loadLMExperimentByName(experimentName); + if (null != entries && !entries.isEmpty()) { + List createExperimentAPIObjects = DBHelpers.Converters.KruizeObjectConverters.convertLMExperimentEntryToCreateExperimentAPIObject(entries); + if (null != createExperimentAPIObjects && !createExperimentAPIObjects.isEmpty()) { + List 
kruizeExpList = new ArrayList<>(); + + int failureThreshHold = createExperimentAPIObjects.size(); + int failureCount = 0; + for (CreateExperimentAPIObject createExperimentAPIObject : createExperimentAPIObjects) { + KruizeObject kruizeObject = Converters.KruizeObjectConverters.convertCreateExperimentAPIObjToKruizeObject(createExperimentAPIObject); + if (null != kruizeObject) { + kruizeExpList.add(kruizeObject); + } else { + failureCount++; + } + } + if (failureThreshHold > 0 && failureCount == failureThreshHold) { + throw new Exception("Experiment " + experimentName + " unable to load from DB."); + } + experimentInterface.addExperimentToLocalStorage(mainKruizeExperimentMap, kruizeExpList); + } + } + } + public void loadExperimentFromDBByName(Map mainKruizeExperimentMap, String experimentName) throws Exception { ExperimentInterface experimentInterface = new ExperimentInterfaceImpl(); List entries = experimentDAO.loadExperimentByName(experimentName); diff --git a/src/main/java/com/autotune/database/table/KruizeExperimentEntry.java b/src/main/java/com/autotune/database/table/KruizeExperimentEntry.java index a794a4f84..ed71dcb33 100644 --- a/src/main/java/com/autotune/database/table/KruizeExperimentEntry.java +++ b/src/main/java/com/autotune/database/table/KruizeExperimentEntry.java @@ -55,8 +55,6 @@ public class KruizeExperimentEntry { private String mode; private String target_cluster; private String performance_profile; - @Transient - private String experiment_type; @Enumerated(EnumType.STRING) private AnalyzerConstants.ExperimentStatus status; @JdbcTypeCode(SqlTypes.JSON) @@ -77,13 +75,16 @@ public KruizeExperimentEntry(KruizeLMExperimentEntry kruizeLMExperimentEntry) { this.mode = kruizeLMExperimentEntry.getMode(); this.target_cluster = kruizeLMExperimentEntry.getTarget_cluster(); this.performance_profile = kruizeLMExperimentEntry.getPerformance_profile(); - this.experiment_type = kruizeLMExperimentEntry.getExperimentType(); this.status = 
kruizeLMExperimentEntry.getStatus(); this.datasource = kruizeLMExperimentEntry.getDatasource(); this.extended_data = kruizeLMExperimentEntry.getExtended_data(); this.meta_data = kruizeLMExperimentEntry.getMeta_data(); } + public KruizeExperimentEntry() { + + } + public String getVersion() { return version; } @@ -172,12 +173,5 @@ public void setDatasource(JsonNode datasource) { this.datasource = datasource; } - public String getExperimentType() { - return experiment_type; - } - - public void setExperimentType(String experimentType) { - this.experiment_type = experimentType; - } } diff --git a/src/main/java/com/autotune/database/table/lm/KruizeLMExperimentEntry.java b/src/main/java/com/autotune/database/table/lm/KruizeLMExperimentEntry.java index 30e25f026..e28bce414 100644 --- a/src/main/java/com/autotune/database/table/lm/KruizeLMExperimentEntry.java +++ b/src/main/java/com/autotune/database/table/lm/KruizeLMExperimentEntry.java @@ -22,6 +22,10 @@ import org.hibernate.annotations.JdbcTypeCode; import org.hibernate.type.SqlTypes; +import java.sql.Timestamp; +import java.util.Date; + + /** * This is a Java class named KruizeExperimentEntry annotated with JPA annotations. * It represents a table named kruize_experiment in a relational database. 
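The hunks in this series migrate `experiment_type` from a raw `String` to the `AnalyzerConstants.ExperimentType` enum, with the convention (see `ExperimentTypeUtil`) that a missing type defaults to a container experiment. A standalone sketch of that null-safe pattern follows; the nested enum is a hypothetical stand-in for the project's constants class, not the real one:

```java
public class ExperimentTypeCheck {
    // Hypothetical stand-in for AnalyzerConstants.ExperimentType.
    enum ExperimentType { CONTAINER, NAMESPACE }

    // Mirrors ExperimentTypeUtil: a missing (null) type defaults to CONTAINER.
    static boolean isContainerExperiment(ExperimentType type) {
        return type == null || ExperimentType.CONTAINER.equals(type);
    }

    // A namespace experiment must be explicitly typed; null never matches.
    static boolean isNamespaceExperiment(ExperimentType type) {
        return type != null && ExperimentType.NAMESPACE.equals(type);
    }

    public static void main(String[] args) {
        System.out.println(isContainerExperiment(null));                     // true (default)
        System.out.println(isNamespaceExperiment(null));                     // false
        System.out.println(isNamespaceExperiment(ExperimentType.NAMESPACE)); // true
    }
}
```

Writing the comparison as `CONSTANT.equals(variable)` keeps both helpers safe against `NullPointerException` without an extra null branch.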
@@ -54,8 +58,6 @@ public class KruizeLMExperimentEntry { private String mode; private String target_cluster; private String performance_profile; - @Transient - private String experiment_type; @Enumerated(EnumType.STRING) private AnalyzerConstants.ExperimentStatus status; @JdbcTypeCode(SqlTypes.JSON) @@ -64,6 +66,14 @@ public class KruizeLMExperimentEntry { private JsonNode extended_data; @JdbcTypeCode(SqlTypes.JSON) private JsonNode meta_data; + @Enumerated(EnumType.STRING) + private AnalyzerConstants.ExperimentType experiment_type; + @Temporal(TemporalType.TIMESTAMP) + @Column(name = "creation_date", updatable = false) + private Timestamp creation_date; + @Temporal(TemporalType.TIMESTAMP) + @Column(name = "updated_date") + private Timestamp updated_date; // TODO: update KruizeDSMetadataEntry @@ -156,13 +166,47 @@ public void setDatasource(JsonNode datasource) { this.datasource = datasource; } - public String getExperimentType() { + public AnalyzerConstants.ExperimentType getExperiment_type() { return experiment_type; } - public void setExperimentType(String experimentType) { - this.experiment_type = experimentType; + public void setExperiment_type(AnalyzerConstants.ExperimentType experiment_type) { + this.experiment_type = experiment_type; } + public Date getCreation_date() { + return creation_date; + } + + public void setCreation_date(Timestamp creation_date) { + this.creation_date = creation_date; + } + public Date getUpdated_date() { + return updated_date; + } + + public void setUpdated_date(Timestamp updated_date) { + this.updated_date = updated_date; + } + + @Override + public String toString() { + return "KruizeLMExperimentEntry{" + + "experiment_id='" + experiment_id + '\'' + + ", version='" + version + '\'' + + ", experiment_name='" + experiment_name + '\'' + + ", cluster_name='" + cluster_name + '\'' + + ", mode='" + mode + '\'' + + ", target_cluster='" + target_cluster + '\'' + + ", performance_profile='" + performance_profile + '\'' + + ", status=" + 
status + + ", datasource=" + datasource + + ", extended_data=" + extended_data + + ", meta_data=" + meta_data + + ", experiment_type=" + experiment_type + + ", creation_date=" + creation_date + + ", updated_date=" + updated_date + + '}'; + } } From 3242337765cc94b07465d34fc0a608428dd07228 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Sat, 7 Dec 2024 06:23:00 +0530 Subject: [PATCH 22/85] testsuit update for new flag Signed-off-by: msvinaykumar --- tests/scripts/common/common_functions.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/scripts/common/common_functions.sh b/tests/scripts/common/common_functions.sh index f4f9b2aa4..418968c10 100755 --- a/tests/scripts/common/common_functions.sh +++ b/tests/scripts/common/common_functions.sh @@ -1915,11 +1915,11 @@ function kruize_remote_patch() { if [ ${cluster_type} == "minikube" ]; then sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} + sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} elif [ ${cluster_type} == "openshift" ]; then sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/\([[:space:]]*\)\(storage:\)[[:space:]]*[0-9]\+Mi/\1\2 1Gi/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/\([[:space:]]*\)\(memory:\)[[:space:]]*".*"/\1\2 "2Gi"/; s/\([[:space:]]*\)\(cpu:\)[[:space:]]*".*"/\1\2 "2"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} - sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} fi } From 1d855c0046c0158ecde8c896847f623958d48aaa Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Mon, 9 Dec 2024 14:08:53 +0530 Subject: [PATCH 23/85] incorporated review comments Signed-off-by: msvinaykumar --- .../autotune/analyzer/services/CreateExperiment.java | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) 
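The `kruize_remote_patch` change above flips `"local"` to `"false"` and then uses GNU sed's append (`a`) command to insert an `isROSEnabled` flag on the next line. A throwaway sketch of the same two-step edit against a sample file (the path and JSON content here are illustrative, not the real deploy manifest):

```shell
# Create a small sample standing in for the Kruize deploy manifest.
cat > /tmp/kruize-manifest-sample.json <<'EOF'
{
  "local": "true",
  "logging_level": "info"
}
EOF

# Step 1: switch local mode off, as the test script does for remote monitoring.
sed -i 's/"local": "true"/"local": "false"/' /tmp/kruize-manifest-sample.json

# Step 2: append the ROS flag right after the "local" line; each escaped
# space (\ ) preserves one leading space of indentation (GNU sed behavior).
sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' /tmp/kruize-manifest-sample.json

cat /tmp/kruize-manifest-sample.json
```

Note that `sed -i` with no suffix is a GNU extension; BSD/macOS sed would need `sed -i ''`.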
diff --git a/src/main/java/com/autotune/analyzer/services/CreateExperiment.java b/src/main/java/com/autotune/analyzer/services/CreateExperiment.java index f1ecba9ff..191c015f3 100644 --- a/src/main/java/com/autotune/analyzer/services/CreateExperiment.java +++ b/src/main/java/com/autotune/analyzer/services/CreateExperiment.java @@ -16,11 +16,13 @@ package com.autotune.analyzer.services; +import com.autotune.analyzer.exceptions.InvalidExperimentType; import com.autotune.analyzer.exceptions.KruizeResponse; import com.autotune.analyzer.experiment.ExperimentInitiator; import com.autotune.analyzer.kruizeObject.KruizeObject; import com.autotune.analyzer.serviceObjects.Converters; import com.autotune.analyzer.serviceObjects.CreateExperimentAPIObject; +import com.autotune.analyzer.serviceObjects.KubernetesAPIObject; import com.autotune.analyzer.utils.AnalyzerConstants; import com.autotune.analyzer.utils.AnalyzerErrorConstants; import com.autotune.common.data.ValidationOutputData; @@ -99,7 +101,7 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response) createExperimentAPIObject.setExperiment_id(Utils.generateID(createExperimentAPIObject.toString())); createExperimentAPIObject.setStatus(AnalyzerConstants.ExperimentStatus.IN_PROGRESS); // validating the kubernetes objects and experiment type - /*for (KubernetesAPIObject kubernetesAPIObject : createExperimentAPIObject.getKubernetesObjects()) { + for (KubernetesAPIObject kubernetesAPIObject : createExperimentAPIObject.getKubernetesObjects()) { if (createExperimentAPIObject.isContainerExperiment()) { createExperimentAPIObject.setExperimentType(AnalyzerConstants.ExperimentType.CONTAINER); // check if namespace data is also set for container-type experiments @@ -114,7 +116,7 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response) throw new InvalidExperimentType(AnalyzerErrorConstants.APIErrors.CreateExperimentAPI.NAMESPACE_EXP_NOT_SUPPORTED_FOR_REMOTE); } } - }*/ + } KruizeObject 
kruizeObject = Converters.KruizeObjectConverters.convertCreateExperimentAPIObjToKruizeObject(createExperimentAPIObject); if (null != kruizeObject) kruizeExpList.add(kruizeObject); @@ -147,8 +149,9 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response) e.printStackTrace(); LOGGER.error("Unknown exception caught: " + e.getMessage()); sendErrorResponse(inputData, response, e, HttpServletResponse.SC_INTERNAL_SERVER_ERROR, "Internal Server Error: " + e.getMessage()); - } //catch (InvalidExperimentType e) { sendErrorResponse(inputData, response, null, HttpServletResponse.SC_BAD_REQUEST, e.getMessage()); } - finally { + } catch (InvalidExperimentType e) { + sendErrorResponse(inputData, response, null, HttpServletResponse.SC_BAD_REQUEST, e.getMessage()); + } finally { if (null != timerCreateExp) { MetricsConfig.timerCreateExp = MetricsConfig.timerBCreateExp.tag("status", statusValue).register(MetricsConfig.meterRegistry()); timerCreateExp.stop(MetricsConfig.timerCreateExp); From 1a302d98f515f3543e7833ae902163dd9803fa6a Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Mon, 9 Dec 2024 07:57:20 +0530 Subject: [PATCH 24/85] incorporated review comments Signed-off-by: msvinaykumar --- .../autotune/database/table/lm/KruizeLMExperimentEntry.java | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/main/java/com/autotune/database/table/lm/KruizeLMExperimentEntry.java b/src/main/java/com/autotune/database/table/lm/KruizeLMExperimentEntry.java index e28bce414..3a641bb5c 100644 --- a/src/main/java/com/autotune/database/table/lm/KruizeLMExperimentEntry.java +++ b/src/main/java/com/autotune/database/table/lm/KruizeLMExperimentEntry.java @@ -27,8 +27,8 @@ /** - * This is a Java class named KruizeExperimentEntry annotated with JPA annotations. - * It represents a table named kruize_experiment in a relational database. + * This is a Java class named KruizeLMExperimentEntry annotated with JPA annotations. 
+ * It represents a table named kruize_lm_experiments in a relational database. *

* The class has the following fields: *

@@ -42,6 +42,7 @@ * status: An enum representing the status of the experiment, defined in AnalyzerConstants.ExperimentStatus. * extended_data: A JSON object representing extended data for the experiment. * meta_data: A string representing metadata for the experiment. + * experiment_type : Recommendation generation at container, namespace level etc. * The ExperimentDetail class also has getters and setters for all its fields. */ @Entity From 0bad108c01927b7a279d56f7c4c898200ed7bd2f Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Mon, 9 Dec 2024 17:48:30 +0530 Subject: [PATCH 25/85] incorporated review comments Signed-off-by: msvinaykumar --- src/main/java/com/autotune/Autotune.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/main/java/com/autotune/Autotune.java b/src/main/java/com/autotune/Autotune.java index e18652ed4..a76e3820a 100644 --- a/src/main/java/com/autotune/Autotune.java +++ b/src/main/java/com/autotune/Autotune.java @@ -115,7 +115,7 @@ public static void main(String[] args) { InitializeDeployment.setup_deployment_info(); // Configure AWS CloudWatch CloudWatchAppender.configureLoggerForCloudWatchLog(); - LOGGER.debug("ROS enabled : {}" ,KruizeDeploymentInfo.is_ros_enabled); + LOGGER.info("ROS enabled : {}" ,KruizeDeploymentInfo.is_ros_enabled); // Read and execute the DDLs here executeDDLs(AnalyzerConstants.ROS_DDL_SQL); if (KruizeDeploymentInfo.local == true) { From ce6b674603d9a2765488e7de81e0c67cb64cf812 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Mon, 9 Dec 2024 18:03:14 +0530 Subject: [PATCH 26/85] incorporated review comments Signed-off-by: msvinaykumar --- .../recommendations/engine/RecommendationEngine.java | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java b/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java index 68465e4c6..077998c03 100644 --- 
a/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java +++ b/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java @@ -197,9 +197,13 @@ private KruizeObject createKruizeObject(String target_cluster) { KruizeObject kruizeObject = new KruizeObject(); try { - if (KruizeDeploymentInfo.is_ros_enabled && null != target_cluster && target_cluster.equalsIgnoreCase(AnalyzerConstants.REMOTE)) { // todo call this in function and use across every where - new ExperimentDBService().loadExperimentFromDBByName(mainKruizeExperimentMAP, experimentName); - } else { + if (KruizeDeploymentInfo.is_ros_enabled){ + if(null == target_cluster || target_cluster.equalsIgnoreCase(AnalyzerConstants.REMOTE)){ + new ExperimentDBService().loadExperimentFromDBByName(mainKruizeExperimentMAP, experimentName); + }else{ + new ExperimentDBService().loadLMExperimentFromDBByName(mainKruizeExperimentMAP, experimentName); + } + }else{ new ExperimentDBService().loadLMExperimentFromDBByName(mainKruizeExperimentMAP, experimentName); } From d96fe64ac333fbb8c1f971799e12ad56a1e8096b Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Tue, 10 Dec 2024 01:10:59 +0530 Subject: [PATCH 27/85] adding VPA specific implementations Signed-off-by: Shekhar Saxena --- .../UnableToCreateVPAException.java | 27 ++ .../updater/RecommendationUpdaterImpl.java | 2 +- .../updater/vpa/VpaUpdaterImpl.java | 306 +++++++++++++++++- .../analyzer/utils/AnalyzerConstants.java | 15 +- .../utils/AnalyzerErrorConstants.java | 3 +- 5 files changed, 349 insertions(+), 4 deletions(-) create mode 100644 src/main/java/com/autotune/analyzer/exceptions/UnableToCreateVPAException.java diff --git a/src/main/java/com/autotune/analyzer/exceptions/UnableToCreateVPAException.java b/src/main/java/com/autotune/analyzer/exceptions/UnableToCreateVPAException.java new file mode 100644 index 000000000..3b5c3c494 --- /dev/null +++ 
b/src/main/java/com/autotune/analyzer/exceptions/UnableToCreateVPAException.java @@ -0,0 +1,27 @@ +/******************************************************************************* + * Copyright (c) 2024 Red Hat, IBM Corporation and others. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + *******************************************************************************/ + +package com.autotune.analyzer.exceptions; + +public class UnableToCreateVPAException extends Exception { + public UnableToCreateVPAException() { + } + + public UnableToCreateVPAException(String message) { + super(message); + } +} + diff --git a/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterImpl.java b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterImpl.java index a3cfcc379..76d1695c0 100644 --- a/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterImpl.java +++ b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterImpl.java @@ -74,7 +74,7 @@ public KruizeObject generateResourceRecommendationsForExperiment(String experime int calCount = 0; String validationMessage = recommendationEngine.validate_local(); if (validationMessage.isEmpty()) { - KruizeObject kruizeObject = recommendationEngine.prepareRecommendations(calCount); + KruizeObject kruizeObject = recommendationEngine.prepareRecommendations(calCount, null); if (kruizeObject.getValidation_data().isSuccess()) { 
LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.GENERATED_RECOMMENDATIONS, experimentName); return kruizeObject; diff --git a/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java b/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java index 66c2b80f9..7440e8655 100644 --- a/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java +++ b/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java @@ -16,16 +16,36 @@ package com.autotune.analyzer.recommendations.updater.vpa; +import com.autotune.analyzer.exceptions.ApplyRecommendationsError; +import com.autotune.analyzer.exceptions.UnableToCreateVPAException; +import com.autotune.analyzer.kruizeObject.KruizeObject; +import com.autotune.analyzer.recommendations.RecommendationConfigItem; import com.autotune.analyzer.recommendations.updater.RecommendationUpdaterImpl; import com.autotune.analyzer.utils.AnalyzerConstants; import com.autotune.analyzer.utils.AnalyzerErrorConstants; +import com.autotune.common.k8sObjects.K8sObject; +import com.autotune.common.data.result.ContainerData; +import com.autotune.analyzer.recommendations.objects.MappedRecommendationForTimestamp; +import com.autotune.analyzer.recommendations.objects.TermRecommendations; +import io.fabric8.autoscaling.api.model.v1.*; +import io.fabric8.kubernetes.api.model.ObjectMeta; +import io.fabric8.kubernetes.api.model.Quantity; import io.fabric8.kubernetes.api.model.apiextensions.v1.CustomResourceDefinitionList; +import io.fabric8.kubernetes.api.model.autoscaling.v1.CrossVersionObjectReferenceBuilder; import io.fabric8.kubernetes.client.DefaultKubernetesClient; import io.fabric8.kubernetes.client.KubernetesClient; import io.fabric8.kubernetes.client.dsl.ApiextensionsAPIGroupDSL; +import io.fabric8.verticalpodautoscaler.client.DefaultVerticalPodAutoscalerClient; +import 
io.fabric8.verticalpodautoscaler.client.NamespacedVerticalPodAutoscalerClient; import org.slf4j.Logger; import org.slf4j.LoggerFactory; +import java.sql.Timestamp; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + public class VpaUpdaterImpl extends RecommendationUpdaterImpl { private static final Logger LOGGER = LoggerFactory.getLogger(VpaUpdaterImpl.class); private static VpaUpdaterImpl vpaUpdater = new VpaUpdaterImpl(); @@ -60,8 +80,292 @@ public boolean isUpdaterInstalled() { if (isVpaInstalled) { LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.FOUND_UPDATER_INSTALLED, AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA); } else { - LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.UPDATER_NOT_INSTALLED, AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA); + LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.UPDATER_NOT_INSTALLED); } return isVpaInstalled; } + + /** + * Checks if a Vertical Pod Autoscaler (VPA) object with the specified name is present. 
+ * + * @param vpaName String containing the name of the VPA object to search for + * @return true if the VPA object with the specified name is present, false otherwise + */ + private boolean checkIfVpaIsPresent(String vpaName) { + try { + LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_IF_VPA_PRESENT, vpaName)); + NamespacedVerticalPodAutoscalerClient client = new DefaultVerticalPodAutoscalerClient(); + VerticalPodAutoscalerList vpas = client.v1().verticalpodautoscalers().inAnyNamespace().list(); + + // TODO: later we can also check here if the recommender is Kruize, to confirm + for (VerticalPodAutoscaler vpa : vpas.getItems()) { + if (vpaName.equals(vpa.getMetadata().getName())) { + LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_FOUND, vpaName)); + return true; + } + } + LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_NOT_FOUND, vpaName)); + return false; + } catch (Exception e) { + LOGGER.error("Error while checking VPA presence: " + e.getMessage(), e); + return false; + } + } + + + /** + * Returns the VPA object with the given name, if present + * + * @param vpaName String containing the name of the VPA object to search for + * @return VerticalPodAutoscaler if the VPA object with the specified name is present, null otherwise + */ + private VerticalPodAutoscaler getVpaIsPresent(String vpaName) { + try { + LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_IF_VPA_PRESENT, vpaName)); + NamespacedVerticalPodAutoscalerClient client = new DefaultVerticalPodAutoscalerClient(); + VerticalPodAutoscalerList vpas = client.v1().verticalpodautoscalers().inAnyNamespace().list(); + + for (VerticalPodAutoscaler vpa : vpas.getItems()) { + if (vpaName.equals(vpa.getMetadata().getName())) { + LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_FOUND, 
vpaName)); + return vpa; + } + } + LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_NOT_FOUND, vpaName)); + return null; + } catch (Exception e) { + LOGGER.error("Error while checking VPA presence: " + e.getMessage(), e); + return null; + } + } + + + /** + * Applies the resource recommendations contained within the provided KruizeObject + * This method will take the KruizeObject, which contains the resource recommendations, + * and apply them to the desired resources. + * + * @param kruizeObject KruizeObject containing the resource recommendations to be applied. + * @throws ApplyRecommendationsError in case of any error. + */ + @Override + public void applyResourceRecommendationsForExperiment(KruizeObject kruizeObject) throws ApplyRecommendationsError { + try { + // checking if VPA is installed or not + if (isUpdaterInstalled()) { + String expName = kruizeObject.getExperimentName(); + boolean vpaPresent = checkIfVpaIsPresent(expName); + + // create the VPA object if not present + if (!vpaPresent) { + createVpaObject(kruizeObject); + } + + for (K8sObject k8sObject: kruizeObject.getKubernetes_objects()) { + List containerRecommendations = convertRecommendationsToContainerPolicy(k8sObject.getContainerDataMap()); + if (containerRecommendations.isEmpty()){ + LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.RECOMMENDATION_DATA_NOT_PRESENT); + } else { + RecommendedPodResources recommendedPodResources = new RecommendedPodResources(); + recommendedPodResources.setContainerRecommendations(containerRecommendations); + VerticalPodAutoscalerStatus vpaObjectStatus = new VerticalPodAutoscalerStatusBuilder() + .withRecommendation(recommendedPodResources) + .build(); + + // patching existing VPA Object + if (vpaObjectStatus != null) { + VerticalPodAutoscaler vpaObject = getVpaIsPresent(expName); + vpaObject.setStatus(vpaObjectStatus); + + NamespacedVerticalPodAutoscalerClient client = new 
DefaultVerticalPodAutoscalerClient();
+                            client.v1().verticalpodautoscalers()
+                                    .inNamespace(vpaObject.getMetadata().getNamespace())
+                                    .withName(vpaObject.getMetadata().getName())
+                                    .patchStatus(vpaObject);
+
+                            LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_PATCHED,
+                                    vpaObject.getMetadata().getName()));
+                        }
+                    }
+                }
+
+            } else {
+                LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.UPDATER_NOT_INSTALLED);
+            }
+        } catch (Exception e) {
+            throw new ApplyRecommendationsError(e.getMessage());
+        }
+    }
+
+    /**
+     * Converts container recommendations to the VPA container recommendations object format.
+     */
+    private List<RecommendedContainerResources> convertRecommendationsToContainerPolicy(HashMap<String, ContainerData> containerDataMap) {
+        List<RecommendedContainerResources> containerRecommendations = new ArrayList<>();
+
+        for (Map.Entry<String, ContainerData> containerDataEntry : containerDataMap.entrySet()) {
+            // fetching container data
+            ContainerData containerData = containerDataEntry.getValue();
+            String containerName = containerData.getContainer_name();
+            HashMap<Timestamp, MappedRecommendationForTimestamp> recommendationData = containerData.getContainerRecommendations().getData();
+
+            // checking if recommendation data is present
+            if (recommendationData != null) {
+                for (MappedRecommendationForTimestamp value : recommendationData.values()) {
+                    /*
+                     * Fetching short-term cost recommendations by default.
+                     * TODO: implement functionality to choose the desired term and model
+                     */
+                    TermRecommendations termRecommendations = value.getShortTermRecommendations();
+                    HashMap<AnalyzerConstants.ResourceSetting, HashMap<AnalyzerConstants.RecommendationItem, RecommendationConfigItem>> recommendationsConfig = termRecommendations.getCostRecommendations().getConfig();
+
+                    Double cpuRecommendationValue = recommendationsConfig.get(AnalyzerConstants.ResourceSetting.requests).get(AnalyzerConstants.RecommendationItem.CPU).getAmount();
+                    Double memoryRecommendationValue = recommendationsConfig.get(AnalyzerConstants.ResourceSetting.requests).get(AnalyzerConstants.RecommendationItem.MEMORY).getAmount();
+
+                    LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.RECOMMENDATION_VALUE, "CPU", containerName, cpuRecommendationValue));
+                    LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.RECOMMENDATION_VALUE, "MEMORY", containerName, memoryRecommendationValue));
+
+                    String cpuRecommendationValueForVpa = resource2str("CPU", cpuRecommendationValue);
+                    String memoryRecommendationValueForVpa = resource2str("MEMORY", memoryRecommendationValue);
+
+                    // creating the container resource VPA object
+                    RecommendedContainerResources recommendedContainerResources = new RecommendedContainerResources();
+                    recommendedContainerResources.setContainerName(containerName);
+
+                    // setting target values
+                    Map<String, Quantity> target = new HashMap<>();
+                    target.put("cpu", new Quantity(cpuRecommendationValueForVpa));
+                    target.put("memory", new Quantity(memoryRecommendationValueForVpa));
+
+                    // setting lower bound values
+                    Map<String, Quantity> lowerBound = new HashMap<>();
+                    lowerBound.put("cpu", new Quantity(cpuRecommendationValueForVpa));
+                    lowerBound.put("memory", new Quantity(memoryRecommendationValueForVpa));
+
+                    // setting upper bound values
+                    Map<String, Quantity> upperBound = new HashMap<>();
+                    upperBound.put("cpu", new Quantity(cpuRecommendationValueForVpa));
+                    upperBound.put("memory", new Quantity(memoryRecommendationValueForVpa));
+
+                    recommendedContainerResources.setLowerBound(lowerBound);
+                    recommendedContainerResources.setTarget(target);
+                    recommendedContainerResources.setUpperBound(upperBound);
+
+                    containerRecommendations.add(recommendedContainerResources);
+                }
+            } else {
+                LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.RECOMMENDATION_DATA_NOT_PRESENT);
+            }
+        }
+        return containerRecommendations;
+    }
+
+    /**
+     * Converts the cpu and memory values to the format the VPA expects.
+     */
+    public static String resource2str(String resource, double value) {
+        if (resource.equalsIgnoreCase("CPU")) {
+            // cpu related conversions
+            if (value < 1) {
+                return (int)
(value * 1000) + "m";
+            } else {
+                return String.valueOf(value);
+            }
+        } else {
+            // memory related conversions; VPA expects Kubernetes Quantity suffixes (Ki/Mi/Gi)
+            if (value < 1024) {
+                return String.valueOf((int) value);
+            } else if (value < 1024 * 1024) {
+                return (int) (value / 1024) + "Ki";
+            } else if (value < 1024 * 1024 * 1024) {
+                return (int) (value / 1024 / 1024) + "Mi";
+            } else {
+                return (int) (value / 1024 / 1024 / 1024) + "Gi";
+            }
+        }
+    }
+
+    /*
+     * Creates a Vertical Pod Autoscaler (VPA) object in the specified namespace
+     * for the given deployment and containers.
+     */
+    public void createVpaObject(KruizeObject kruizeObject) throws UnableToCreateVPAException {
+        try {
+            // checks if the updater is installed or not
+            if (isUpdaterInstalled()) {
+                LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CREATEING_VPA, kruizeObject.getExperimentName()));
+
+                // updating the recommender to Kruize for the VPA object
+                Map<String, Object> additionalVpaObjectProps = getAdditionalVpaObjectProps();
+
+                // updating controlled resources
+                List<String> controlledResources = new ArrayList<>();
+                controlledResources.add("cpu");
+                controlledResources.add("memory");
+
+                // updating container policies
+                for (K8sObject k8sObject : kruizeObject.getKubernetes_objects()) {
+                    List<String> containers = new ArrayList<>(k8sObject.getContainerDataMap().keySet());
+                    List<ContainerResourcePolicy> containerPolicies = new ArrayList<>();
+                    for (String containerName : containers) {
+                        ContainerResourcePolicy policy = new ContainerResourcePolicyBuilder()
+                                .withContainerName(containerName)
+                                .withControlledResources(controlledResources)
+                                .build();
+                        containerPolicies.add(policy);
+                    }
+
+                    PodResourcePolicy podPolicy = new PodResourcePolicyBuilder()
+                            .withContainerPolicies(containerPolicies)
+                            .build();
+
+                    VerticalPodAutoscaler vpa = new VerticalPodAutoscalerBuilder()
+                            .withApiVersion(AnalyzerConstants.RecommendationUpdaterConstants.VPA.VPA_API_VERSION)
+                            .withKind(AnalyzerConstants.RecommendationUpdaterConstants.VPA.VPA_PLURAL)
+                            .withMetadata(new ObjectMeta() {{
+                                setName(kruizeObject.getExperimentName());
+                            }})
+                            .withSpec(new VerticalPodAutoscalerSpecBuilder()
+                                    .withTargetRef(new CrossVersionObjectReferenceBuilder()
+                                            .withApiVersion(AnalyzerConstants.RecommendationUpdaterConstants.VPA.VPA_TARGET_REF_API_VERSION)
+                                            .withKind(AnalyzerConstants.RecommendationUpdaterConstants.VPA.VPA_TARGET_REF_KIND)
+                                            .withName(k8sObject.getName())
+                                            .build())
+                                    .withResourcePolicy(podPolicy)
+                                    .withAdditionalProperties(additionalVpaObjectProps)
+                                    .build())
+                            .build();
+
+                    kubernetesClient.resource(vpa).inNamespace(k8sObject.getNamespace()).createOrReplace();
+                    LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CREATED_VPA, kruizeObject.getExperimentName()));
+                }
+
+            } else {
+                throw new UnableToCreateVPAException(AnalyzerErrorConstants.RecommendationUpdaterErrors.UPDATER_NOT_INSTALLED);
+            }
+        } catch (Exception e) {
+            throw new UnableToCreateVPAException(e.getMessage());
+        }
+    }
+
+    /*
+     * Prepare an object map with additional properties required for the VPA object,
+     * such as the recommender name.
+     */
+    private static Map<String, Object> getAdditionalVpaObjectProps() {
+        Map<String, Object> additionalVpaObjectProps = new HashMap<>();
+        List<Map<String, Object>> recommenders = new ArrayList<>();
+        Map<String, Object> recommender = new HashMap<>();
+        recommender.put(AnalyzerConstants.RecommendationUpdaterConstants.VPA.RECOMMENDER_KEY,
+                AnalyzerConstants.RecommendationUpdaterConstants.VPA.RECOMMENDER_NAME);
+        recommenders.add(recommender);
+        additionalVpaObjectProps.put(AnalyzerConstants.RecommendationUpdaterConstants.VPA.RECOMMENDERS, recommenders);
+        return additionalVpaObjectProps;
+    }
 }
diff --git a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java
index d92aa3c03..d9a4cc650 100644
--- a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java
+++ b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java
@@ -691,6 +691,13 @@ private SupportedUpdaters() {
 
     public static final class VPA
{ public static final String VPA_PLURAL = "VerticalPodAutoscaler"; + public static final String RECOMMENDERS = "recommenders"; + public static final String RECOMMENDER_KEY = "recommender"; + public static final String RECOMMENDER_NAME = "Kruize"; + public static final String VPA_API_VERSION = "autoscaling.k8s.io/v1"; + public static final String VPA_TARGET_REF_API_VERSION = "apps/v1"; + public static final String VPA_TARGET_REF_KIND = "Deployment"; + private VPA() { @@ -702,7 +709,13 @@ public static final class InfoMsgs { public static final String GENERATED_RECOMMENDATIONS = "Generated recommendations for experiment: {}"; public static final String CHECKING_IF_UPDATER_INSTALLED = "Verifying if the updater is installed: {}"; public static final String FOUND_UPDATER_INSTALLED = "Found updater is installed: {}"; - + public static final String CHECKING_IF_VPA_PRESENT = "Checking for the presence of VPA with name: %s"; + public static final String VPA_WITH_NAME_FOUND = "VPA with name %s found."; + public static final String VPA_WITH_NAME_NOT_FOUND = "VPA with name %s not found."; + public static final String RECOMMENDATION_VALUE = "%s request recommendations for container %s is %f"; + public static final String VPA_PATCHED = "VPA object with name %s is patched successfully with recommendations."; + public static final String CREATEING_VPA = "Creating VPA with name: %s"; + public static final String CREATED_VPA = "Created VPA with name: %s"; private InfoMsgs() { } diff --git a/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java b/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java index 148494d38..951e80987 100644 --- a/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java +++ b/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java @@ -294,6 +294,7 @@ private RecommendationUpdaterErrors() { public static final String UNSUPPORTED_UPDATER_TYPE = "Updater type %s is not supported."; public static final 
String GENERATE_RECOMMNEDATION_FAILED = "Failed to generate recommendations for experiment: {}"; - public static final String UPDATER_NOT_INSTALLED = "Updater is not installed: {}"; + public static final String UPDATER_NOT_INSTALLED = "Updater is not installed."; + public static final String RECOMMENDATION_DATA_NOT_PRESENT = "Recommendations are not present for the experiment."; } } From 2a99f8c4abc3468894a38883bfbdccf7bc26fceb Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Tue, 10 Dec 2024 01:17:37 +0530 Subject: [PATCH 28/85] adding vpa modes for kruize exps Signed-off-by: Shekhar Saxena --- .../autotune/analyzer/kruizeObject/ExperimentUseCaseType.java | 4 ++++ .../java/com/autotune/analyzer/utils/AnalyzerConstants.java | 3 +++ 2 files changed, 7 insertions(+) diff --git a/src/main/java/com/autotune/analyzer/kruizeObject/ExperimentUseCaseType.java b/src/main/java/com/autotune/analyzer/kruizeObject/ExperimentUseCaseType.java index b25eb74b4..01cae6a23 100644 --- a/src/main/java/com/autotune/analyzer/kruizeObject/ExperimentUseCaseType.java +++ b/src/main/java/com/autotune/analyzer/kruizeObject/ExperimentUseCaseType.java @@ -43,6 +43,10 @@ public ExperimentUseCaseType(KruizeObject kruizeObject) throws Exception { setLocal_monitoring(true); } else if (kruizeObject.getMode().equalsIgnoreCase(AnalyzerConstants.EXPERIMENT)) { setLocal_experiment(true); + } else if (kruizeObject.getMode().equalsIgnoreCase(AnalyzerConstants.RECREATE)) { + setLocal_monitoring(true); + } else if (kruizeObject.getMode().equalsIgnoreCase(AnalyzerConstants.AUTO)) { + setLocal_monitoring(true); } else { throw new Exception("Invalid Mode " + kruizeObject.getMode() + " for target cluster as Local."); } diff --git a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java index 391ddc617..d54c6a653 100644 --- a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java +++ 
b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java @@ -30,6 +30,9 @@ public class AnalyzerConstants { public static final String EXPERIMENT = "experiment"; public static final String LOCAL = "local"; public static final String REMOTE = "remote"; + public static final String AUTO = "auto"; + public static final String RECREATE = "recreate"; + // Used to parse autotune configmaps From b80ded464bb587d24ce9f6de0976969a14ea9c4c Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Sat, 7 Dec 2024 08:23:26 +0530 Subject: [PATCH 29/85] truning Local=true for RM in test script Signed-off-by: msvinaykumar --- .../minikube/kruize-crc-minikube.yaml | 1 + .../openshift/kruize-crc-openshift.yaml | 1 + tests/scripts/common/common_functions.sh | 12 +++++++----- .../remote_monitoring_tests.sh | 2 +- 4 files changed, 10 insertions(+), 6 deletions(-) diff --git a/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml b/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml index 66d0b8733..0e2a0c8ed 100644 --- a/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml +++ b/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml @@ -112,6 +112,7 @@ data: "dbdriver": "jdbc:postgresql://", "plots": "true", "local": "true", + "isROSEnabled": "false", "logAllHttpReqAndResp": "true", "recommendationsURL" : "http://kruize.monitoring.svc.cluster.local:8080/generateRecommendations?experiment_name=%s", "experimentsURL" : "http://kruize.monitoring.svc.cluster.local:8080/createExperiment", diff --git a/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml b/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml index 2deb3b954..a1da660e7 100644 --- a/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml +++ b/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml @@ 
-106,6 +106,7 @@ data: "dbdriver": "jdbc:postgresql://", "plots": "true", "local": "true", + "isROSEnabled": "false", "logAllHttpReqAndResp": "true", "recommendationsURL" : "http://kruize.openshift-tuning.svc.cluster.local:8080/generateRecommendations?experiment_name=%s", "experimentsURL" : "http://kruize.openshift-tuning.svc.cluster.local:8080/createExperiment", diff --git a/tests/scripts/common/common_functions.sh b/tests/scripts/common/common_functions.sh index 418968c10..bdbad96f6 100755 --- a/tests/scripts/common/common_functions.sh +++ b/tests/scripts/common/common_functions.sh @@ -1904,7 +1904,7 @@ function kruize_local_patch() { } # -# "local" flag is turned off for RM. +# "isROSEnabled" flag is turned on for RM. # Restores kruize default cpu/memory resources, PV storage for openshift # function kruize_remote_patch() { @@ -1914,13 +1914,15 @@ function kruize_remote_patch() { if [ ${cluster_type} == "minikube" ]; then - sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} - sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} + sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} + #sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} + #sed -i '/"local": "true"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} elif [ ${cluster_type} == "openshift" ]; then - sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} + sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} + #sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/\([[:space:]]*\)\(storage:\)[[:space:]]*[0-9]\+Mi/\1\2 1Gi/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/\([[:space:]]*\)\(memory:\)[[:space:]]*".*"/\1\2 "2Gi"/; s/\([[:space:]]*\)\(cpu:\)[[:space:]]*".*"/\1\2 "2"/' 
${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} - sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} + #sed -i '/"local": "true"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} fi } diff --git a/tests/scripts/remote_monitoring_tests/remote_monitoring_tests.sh b/tests/scripts/remote_monitoring_tests/remote_monitoring_tests.sh index 2b2964958..858bdec75 100755 --- a/tests/scripts/remote_monitoring_tests/remote_monitoring_tests.sh +++ b/tests/scripts/remote_monitoring_tests/remote_monitoring_tests.sh @@ -61,7 +61,7 @@ function remote_monitoring_tests() { if [ ${skip_setup} -eq 0 ]; then echo "Setting up kruize..." | tee -a ${LOG} echo "${KRUIZE_SETUP_LOG}" - echo "setting local=false" + echo "setting isROSEnabled=false" kruize_remote_patch setup "${KRUIZE_POD_LOG}" >> ${KRUIZE_SETUP_LOG} 2>&1 echo "Setting up kruize...Done" | tee -a ${LOG} From 911cd8cc8af964369e292e8eb3057ed09a749e07 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Sat, 7 Dec 2024 09:13:59 +0530 Subject: [PATCH 30/85] truning Local=true for RM in test script Signed-off-by: msvinaykumar --- tests/scripts/common/common_functions.sh | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/tests/scripts/common/common_functions.sh b/tests/scripts/common/common_functions.sh index bdbad96f6..7e9ed517d 100755 --- a/tests/scripts/common/common_functions.sh +++ b/tests/scripts/common/common_functions.sh @@ -1914,15 +1914,15 @@ function kruize_remote_patch() { if [ ${cluster_type} == "minikube" ]; then - sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} - #sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} - #sed -i '/"local": "true"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} + #sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} + sed -i 's/"local": "true"/"local": 
"false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} + sed -i '/"local": "true"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} elif [ ${cluster_type} == "openshift" ]; then - sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} - #sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} + #sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} + sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/\([[:space:]]*\)\(storage:\)[[:space:]]*[0-9]\+Mi/\1\2 1Gi/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/\([[:space:]]*\)\(memory:\)[[:space:]]*".*"/\1\2 "2Gi"/; s/\([[:space:]]*\)\(cpu:\)[[:space:]]*".*"/\1\2 "2"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} - #sed -i '/"local": "true"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} + sed -i '/"local": "true"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} fi } From 194b52a2b98944a48f27bdaf8dda38e58b164088 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Sat, 7 Dec 2024 12:42:00 +0530 Subject: [PATCH 31/85] truning Local=true for RM in test script Signed-off-by: msvinaykumar --- tests/scripts/common/common_functions.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tests/scripts/common/common_functions.sh b/tests/scripts/common/common_functions.sh index 7e9ed517d..b2db3782d 100755 --- a/tests/scripts/common/common_functions.sh +++ b/tests/scripts/common/common_functions.sh @@ -1916,13 +1916,13 @@ function kruize_remote_patch() { if [ ${cluster_type} == "minikube" ]; then #sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} - sed -i '/"local": "true"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} + sed -i 
'/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} elif [ ${cluster_type} == "openshift" ]; then #sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/\([[:space:]]*\)\(storage:\)[[:space:]]*[0-9]\+Mi/\1\2 1Gi/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/\([[:space:]]*\)\(memory:\)[[:space:]]*".*"/\1\2 "2Gi"/; s/\([[:space:]]*\)\(cpu:\)[[:space:]]*".*"/\1\2 "2"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} - sed -i '/"local": "true"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} + sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} fi } From cd3573fc590a51c83e9e225582fc3b35a6e65d22 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Sat, 7 Dec 2024 12:50:16 +0530 Subject: [PATCH 32/85] truning Local=true for RM in test script Signed-off-by: msvinaykumar --- tests/scripts/common/common_functions.sh | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/tests/scripts/common/common_functions.sh b/tests/scripts/common/common_functions.sh index b2db3782d..f3831bc9f 100755 --- a/tests/scripts/common/common_functions.sh +++ b/tests/scripts/common/common_functions.sh @@ -1916,13 +1916,15 @@ function kruize_remote_patch() { if [ ${cluster_type} == "minikube" ]; then #sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} - sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} + sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} + #sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} elif [ ${cluster_type} == "openshift" ]; then #sed -i 
's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} + sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/\([[:space:]]*\)\(storage:\)[[:space:]]*[0-9]\+Mi/\1\2 1Gi/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/\([[:space:]]*\)\(memory:\)[[:space:]]*".*"/\1\2 "2Gi"/; s/\([[:space:]]*\)\(cpu:\)[[:space:]]*".*"/\1\2 "2"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} - sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} + #sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} fi } From 7c02a51c681ec55ee4aa27d6df7bf188da0900ea Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Sat, 7 Dec 2024 12:57:01 +0530 Subject: [PATCH 33/85] truning Local=true for RM in test script Signed-off-by: msvinaykumar --- tests/scripts/common/common_functions.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tests/scripts/common/common_functions.sh b/tests/scripts/common/common_functions.sh index f3831bc9f..52df80464 100755 --- a/tests/scripts/common/common_functions.sh +++ b/tests/scripts/common/common_functions.sh @@ -1915,12 +1915,12 @@ function kruize_remote_patch() { if [ ${cluster_type} == "minikube" ]; then #sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} - sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} + #sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} #sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} elif [ ${cluster_type} == "openshift" ]; then #sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' 
${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} - sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} + #sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/\([[:space:]]*\)\(storage:\)[[:space:]]*[0-9]\+Mi/\1\2 1Gi/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/\([[:space:]]*\)\(memory:\)[[:space:]]*".*"/\1\2 "2Gi"/; s/\([[:space:]]*\)\(cpu:\)[[:space:]]*".*"/\1\2 "2"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} From e0866bd543c77acf18f86ee2332602bf1d03cdd6 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Sat, 7 Dec 2024 15:06:37 +0530 Subject: [PATCH 34/85] truning Local=true for RM in test script Signed-off-by: msvinaykumar --- tests/scripts/common/common_functions.sh | 7 ++++++- .../remote_monitoring_tests/remote_monitoring_tests.sh | 2 +- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/tests/scripts/common/common_functions.sh b/tests/scripts/common/common_functions.sh index 52df80464..37a1916a7 100755 --- a/tests/scripts/common/common_functions.sh +++ b/tests/scripts/common/common_functions.sh @@ -1916,7 +1916,12 @@ function kruize_remote_patch() { if [ ${cluster_type} == "minikube" ]; then #sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} #sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} - sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} + if grep -q '"isROSEnabled": "false"' kruize-crc-minikube.yaml; then + sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} + else + echo "Error: Match not found" >&2 + exit 1 + fi #sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} elif [ ${cluster_type} == "openshift" ]; then #sed -i 's/"isROSEnabled": 
"false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} diff --git a/tests/scripts/remote_monitoring_tests/remote_monitoring_tests.sh b/tests/scripts/remote_monitoring_tests/remote_monitoring_tests.sh index 858bdec75..06e8bb09a 100755 --- a/tests/scripts/remote_monitoring_tests/remote_monitoring_tests.sh +++ b/tests/scripts/remote_monitoring_tests/remote_monitoring_tests.sh @@ -61,7 +61,7 @@ function remote_monitoring_tests() { if [ ${skip_setup} -eq 0 ]; then echo "Setting up kruize..." | tee -a ${LOG} echo "${KRUIZE_SETUP_LOG}" - echo "setting isROSEnabled=false" + echo "setting isROSEnabled=true" kruize_remote_patch setup "${KRUIZE_POD_LOG}" >> ${KRUIZE_SETUP_LOG} 2>&1 echo "Setting up kruize...Done" | tee -a ${LOG} From b7fe8fb5b373c13f65f04a694047feaf6c22edea Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Sat, 7 Dec 2024 16:22:08 +0530 Subject: [PATCH 35/85] truning Local=true for RM in test script Signed-off-by: msvinaykumar --- tests/scripts/common/common_functions.sh | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/tests/scripts/common/common_functions.sh b/tests/scripts/common/common_functions.sh index 37a1916a7..977dde70f 100755 --- a/tests/scripts/common/common_functions.sh +++ b/tests/scripts/common/common_functions.sh @@ -1916,11 +1916,12 @@ function kruize_remote_patch() { if [ ${cluster_type} == "minikube" ]; then #sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} #sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} - if grep -q '"isROSEnabled": "false"' kruize-crc-minikube.yaml; then + if grep -q '"isROSEnabled": "false"' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE}; then + echo "match found" sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} else - echo "Error: Match not found" >&2 - exit 1 + echo "Error: Match not found" >&2 + exit 1 fi #sed -i '/"local": "false"/a \ \ \ \ 
"isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} elif [ ${cluster_type} == "openshift" ]; then From 749071f4a2328efe9090531b83e2e43a8024cc15 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Mon, 9 Dec 2024 15:40:03 +0530 Subject: [PATCH 36/85] incorporated review comments Signed-off-by: msvinaykumar --- .../minikube/kruize-crc-minikube.yaml | 1 - .../openshift/kruize-crc-openshift.yaml | 1 - .../aks/kruize-crc-aks.yaml | 1 - .../minikube/kruize-crc-minikube.yaml | 1 - .../openshift/kruize-crc-openshift.yaml | 1 - tests/scripts/common/common_functions.sh | 13 ------------- 6 files changed, 18 deletions(-) diff --git a/manifests/crc/BYODB-installation/minikube/kruize-crc-minikube.yaml b/manifests/crc/BYODB-installation/minikube/kruize-crc-minikube.yaml index d7da66483..3b7cce4f1 100644 --- a/manifests/crc/BYODB-installation/minikube/kruize-crc-minikube.yaml +++ b/manifests/crc/BYODB-installation/minikube/kruize-crc-minikube.yaml @@ -33,7 +33,6 @@ data: "savetodb": "true", "dbdriver": "jdbc:postgresql://", "plots": "true", - "local": "true", "logAllHttpReqAndResp": "true", "recommendationsURL" : "http://kruize.monitoring.svc.cluster.local:8080/generateRecommendations?experiment_name=%s", "experimentsURL" : "http://kruize.monitoring.svc.cluster.local:8080/createExperiment", diff --git a/manifests/crc/BYODB-installation/openshift/kruize-crc-openshift.yaml b/manifests/crc/BYODB-installation/openshift/kruize-crc-openshift.yaml index 81dcf9214..052d267df 100644 --- a/manifests/crc/BYODB-installation/openshift/kruize-crc-openshift.yaml +++ b/manifests/crc/BYODB-installation/openshift/kruize-crc-openshift.yaml @@ -46,7 +46,6 @@ data: "savetodb": "true", "dbdriver": "jdbc:postgresql://", "plots": "true", - "local": "true", "logAllHttpReqAndResp": "true", "recommendationsURL" : "http://kruize.openshift-tuning.svc.cluster.local:8080/generateRecommendations?experiment_name=%s", "experimentsURL" : 
"http://kruize.openshift-tuning.svc.cluster.local:8080/createExperiment", diff --git a/manifests/crc/default-db-included-installation/aks/kruize-crc-aks.yaml b/manifests/crc/default-db-included-installation/aks/kruize-crc-aks.yaml index 2541a16ca..3d21a4acc 100644 --- a/manifests/crc/default-db-included-installation/aks/kruize-crc-aks.yaml +++ b/manifests/crc/default-db-included-installation/aks/kruize-crc-aks.yaml @@ -97,7 +97,6 @@ data: "savetodb": "true", "dbdriver": "jdbc:postgresql://", "plots": "true", - "local": "true", "logAllHttpReqAndResp": "true", "recommendationsURL" : "http://kruize.monitoring.svc.cluster.local:8080/generateRecommendations?experiment_name=%s", "experimentsURL" : "http://kruize.monitoring.svc.cluster.local:8080/createExperiment", diff --git a/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml b/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml index 0e2a0c8ed..ff2557dc8 100644 --- a/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml +++ b/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml @@ -111,7 +111,6 @@ data: "savetodb": "true", "dbdriver": "jdbc:postgresql://", "plots": "true", - "local": "true", "isROSEnabled": "false", "logAllHttpReqAndResp": "true", "recommendationsURL" : "http://kruize.monitoring.svc.cluster.local:8080/generateRecommendations?experiment_name=%s", diff --git a/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml b/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml index a1da660e7..863d2a696 100644 --- a/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml +++ b/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml @@ -105,7 +105,6 @@ data: "savetodb": "true", "dbdriver": "jdbc:postgresql://", "plots": "true", - "local": "true", "isROSEnabled": "false", 
"logAllHttpReqAndResp": "true", "recommendationsURL" : "http://kruize.openshift-tuning.svc.cluster.local:8080/generateRecommendations?experiment_name=%s", diff --git a/tests/scripts/common/common_functions.sh b/tests/scripts/common/common_functions.sh index 977dde70f..cc0f329ce 100755 --- a/tests/scripts/common/common_functions.sh +++ b/tests/scripts/common/common_functions.sh @@ -1894,13 +1894,6 @@ function kruize_local_patch() { CRC_DIR="./manifests/crc/default-db-included-installation" KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT="${CRC_DIR}/openshift/kruize-crc-openshift.yaml" KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE="${CRC_DIR}/minikube/kruize-crc-minikube.yaml" - - - if [ ${cluster_type} == "minikube" ]; then - sed -i 's/"local": "false"/"local": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} - elif [ ${cluster_type} == "openshift" ]; then - sed -i 's/"local": "false"/"local": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} - fi } # @@ -1914,8 +1907,6 @@ function kruize_remote_patch() { if [ ${cluster_type} == "minikube" ]; then - #sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} - #sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} if grep -q '"isROSEnabled": "false"' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE}; then echo "match found" sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} @@ -1923,14 +1914,10 @@ function kruize_remote_patch() { echo "Error: Match not found" >&2 exit 1 fi - #sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_MINIKUBE} elif [ ${cluster_type} == "openshift" ]; then - #sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} - #sed -i 's/"local": "true"/"local": "false"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 's/"isROSEnabled": "false"/"isROSEnabled": "true"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT} sed -i 
's/\([[:space:]]*\)\(storage:\)[[:space:]]*[0-9]\+Mi/\1\2 1Gi/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT}
 		sed -i 's/\([[:space:]]*\)\(memory:\)[[:space:]]*".*"/\1\2 "2Gi"/; s/\([[:space:]]*\)\(cpu:\)[[:space:]]*".*"/\1\2 "2"/' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT}
-		#sed -i '/"local": "false"/a \ \ \ \ "isROSEnabled": "true",' ${KRUIZE_CRC_DEPLOY_MANIFEST_OPENSHIFT}
 	fi
 }

From f9edf7c144fe29fdbe9c5c017f1704ea4b4617dd Mon Sep 17 00:00:00 2001
From: kusumachalasani
Date: Tue, 3 Dec 2024 16:58:57 +0530
Subject: [PATCH 37/85] add timers

Signed-off-by: kusumachalasani
---
 .../analyzer/services/BulkService.java        |  55 +++--
 .../analyzer/workerimpl/BulkJobManager.java   | 223 +++++++++++-------
 .../common/datasource/DataSourceManager.java  |  36 ++-
 .../com/autotune/utils/MetricsConfig.java     |  14 ++
 4 files changed, 214 insertions(+), 114 deletions(-)

diff --git a/src/main/java/com/autotune/analyzer/services/BulkService.java b/src/main/java/com/autotune/analyzer/services/BulkService.java
index 4f507f51a..8fb29f5b4 100644
--- a/src/main/java/com/autotune/analyzer/services/BulkService.java
+++ b/src/main/java/com/autotune/analyzer/services/BulkService.java
@@ -18,9 +18,11 @@
 import com.autotune.analyzer.serviceObjects.BulkInput;
 import com.autotune.analyzer.serviceObjects.BulkJobStatus;
 import com.autotune.analyzer.workerimpl.BulkJobManager;
+import com.autotune.utils.MetricsConfig;
 import com.fasterxml.jackson.databind.ObjectMapper;
 import com.fasterxml.jackson.databind.ser.impl.SimpleBeanPropertyFilter;
 import com.fasterxml.jackson.databind.ser.impl.SimpleFilterProvider;
+import io.micrometer.core.instrument.Timer;
 import org.json.JSONObject;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -65,6 +67,8 @@ public void init(ServletConfig config) throws ServletException {
      */
     @Override
     protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
+        String statusValue = "failure";
+        Timer.Sample timerJobStatus = Timer.start(MetricsConfig.meterRegistry());
         try {
             String jobID = req.getParameter(JOB_ID);
             String verboseParam = req.getParameter(VERBOSE);
@@ -107,12 +111,18 @@ protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws Se
                     objectMapper.setFilterProvider(filters);
                     String jsonResponse = objectMapper.writeValueAsString(jobDetails);
                     resp.getWriter().write(jsonResponse);
+                    statusValue = "success";
                 } catch (Exception e) {
                     e.printStackTrace();
                 }
             }
         } catch (Exception e) {
             e.printStackTrace();
+        } finally {
+            if (null != timerJobStatus) {
+                MetricsConfig.timerJobStatus = MetricsConfig.timerBJobStatus.tag("status", statusValue).register(MetricsConfig.meterRegistry());
+                timerJobStatus.stop(MetricsConfig.timerJobStatus);
+            }
         }
     }

@@ -124,28 +134,37 @@ protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws Se
      */
     @Override
     protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
-        // Set response type
-        response.setContentType(JSON_CONTENT_TYPE);
-        response.setCharacterEncoding(CHARACTER_ENCODING);
+        String statusValue = "failure";
+        Timer.Sample timerCreateBulkJob = Timer.start(MetricsConfig.meterRegistry());
+        try {
+            // Set response type
+            response.setContentType(JSON_CONTENT_TYPE);
+            response.setCharacterEncoding(CHARACTER_ENCODING);

-        // Create ObjectMapper instance
-        ObjectMapper objectMapper = new ObjectMapper();
+            // Create ObjectMapper instance
+            ObjectMapper objectMapper = new ObjectMapper();

-        // Read the request payload and map to RequestPayload class
-        BulkInput payload = objectMapper.readValue(request.getInputStream(), BulkInput.class);
+            // Read the request payload and map to RequestPayload class
+            BulkInput payload = objectMapper.readValue(request.getInputStream(), BulkInput.class);

-        // Generate a unique jobID
-        String jobID = UUID.randomUUID().toString();
-        BulkJobStatus jobStatus = new BulkJobStatus(jobID, IN_PROGRESS, Instant.now());
-        jobStatusMap.put(jobID, jobStatus);
-        // Submit the job to be processed asynchronously
-        executorService.submit(new BulkJobManager(jobID, jobStatus, payload));
+            // Generate a unique jobID
+            String jobID = UUID.randomUUID().toString();
+            BulkJobStatus jobStatus = new BulkJobStatus(jobID, IN_PROGRESS, Instant.now());
+            jobStatusMap.put(jobID, jobStatus);
+            // Submit the job to be processed asynchronously
+            executorService.submit(new BulkJobManager(jobID, jobStatus, payload));

-        // Just sending a simple success response back
-        // Return the jobID to the user
-        JSONObject jsonObject = new JSONObject();
-        jsonObject.put(JOB_ID, jobID);
-        response.getWriter().write(jsonObject.toString());
+            // Just sending a simple success response back
+            // Return the jobID to the user
+            JSONObject jsonObject = new JSONObject();
+            jsonObject.put(JOB_ID, jobID);
+            response.getWriter().write(jsonObject.toString());
+        } finally {
+            if (null != timerCreateBulkJob) {
+                MetricsConfig.timerCreateBulkJob = MetricsConfig.timerBCreateBulkJob.tag("status", statusValue).register(MetricsConfig.meterRegistry());
+                timerCreateBulkJob.stop(MetricsConfig.timerCreateBulkJob);
+            }
+        }
     }
diff --git a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java
index 48cc0026b..8eef423a3 100644
--- a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java
+++ b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java
@@ -27,6 +27,7 @@
 import com.autotune.operator.KruizeDeploymentInfo;
 import com.autotune.utils.GenericRestApiClient;
 import com.autotune.utils.KruizeConstants;
+import com.autotune.utils.MetricsConfig;
 import com.autotune.utils.Utils;
 import com.fasterxml.jackson.core.JsonProcessingException;
 import com.google.gson.Gson;
@@ -48,6 +49,7 @@
 import java.util.*;
 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.Executors;
+import java.util.concurrent.TimeUnit;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;
@@ -118,6 +120,9 @@ private static Map parseLabelString(String labelString) {
     @Override
     public void run() {
+        String statusValue = "failure";
+        MetricsConfig.activeJobs.incrementAndGet();
+        io.micrometer.core.instrument.Timer.Sample timerRunJob = Timer.start(MetricsConfig.meterRegistry());
         DataSourceMetadataInfo metadataInfo = null;
         DataSourceManager dataSourceManager = new DataSourceManager();
         DataSourceInfo datasource = null;
@@ -153,82 +158,108 @@ public void run() {
             } else {
                 ExecutorService createExecutor = Executors.newFixedThreadPool(bulk_thread_pool_size);
                 ExecutorService generateExecutor = Executors.newFixedThreadPool(bulk_thread_pool_size);
-                for (CreateExperimentAPIObject apiObject : createExperimentAPIObjectMap.values()) {
-                    DataSourceInfo finalDatasource = datasource;
-                    createExecutor.submit(() -> {
-                        String experiment_name = apiObject.getExperimentName();
-                        BulkJobStatus.Experiment experiment = jobData.addExperiment(experiment_name);
-                        try {
-                            // send request to createExperiment API for experiment creation
-                            GenericRestApiClient apiClient = new GenericRestApiClient(finalDatasource);
-                            apiClient.setBaseURL(KruizeDeploymentInfo.experiments_url);
-                            GenericRestApiClient.HttpResponseWrapper responseCode;
-                            boolean expriment_exists = false;
+                try {
+                    for (CreateExperimentAPIObject apiObject : createExperimentAPIObjectMap.values()) {
+                        DataSourceInfo finalDatasource = datasource;
+                        createExecutor.submit(() -> {
+                            String experiment_name = apiObject.getExperimentName();
+                            BulkJobStatus.Experiment experiment = jobData.addExperiment(experiment_name);
                             try {
-                                responseCode = apiClient.callKruizeAPI("[" + new Gson().toJson(apiObject) + "]");
-                                LOGGER.debug("API Response code: {}", responseCode);
-                                if (responseCode.getStatusCode() == HttpURLConnection.HTTP_CREATED) {
-                                    expriment_exists = true;
-                                } else if (responseCode.getStatusCode() == HttpURLConnection.HTTP_CONFLICT) {
-                                    expriment_exists = true;
-                                } else {
-                                    experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, responseCode.getResponseBody().toString(), responseCode.getStatusCode()));
-                                }
-                            } catch (Exception e) {
-                                e.printStackTrace();
-                                experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_BAD_REQUEST));
-                            } finally {
-                                if (!expriment_exists) {
-                                    LOGGER.info("Processing experiment {}", jobData.getProcessed_experiments());
-                                    jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1);
-                                }
-                                synchronized (new Object()) {
-                                    if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) {
-                                        setFinalJobStatus(COMPLETED, null, null, finalDatasource);
+                                // send request to createExperiment API for experiment creation
+                                GenericRestApiClient apiClient = new GenericRestApiClient(finalDatasource);
+                                apiClient.setBaseURL(KruizeDeploymentInfo.experiments_url);
+                                GenericRestApiClient.HttpResponseWrapper responseCode;
+                                boolean expriment_exists = false;
+                                try {
+                                    responseCode = apiClient.callKruizeAPI("[" + new Gson().toJson(apiObject) + "]");
+                                    LOGGER.debug("API Response code: {}", responseCode);
+                                    if (responseCode.getStatusCode() == HttpURLConnection.HTTP_CREATED) {
+                                        expriment_exists = true;
+                                    } else if (responseCode.getStatusCode() == HttpURLConnection.HTTP_CONFLICT) {
+                                        expriment_exists = true;
+                                    } else {
+                                        experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, responseCode.getResponseBody().toString(), responseCode.getStatusCode()));
+                                    }
+                                } catch (Exception e) {
+                                    e.printStackTrace();
+                                    experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_BAD_REQUEST));
+                                } finally {
+                                    if (!expriment_exists) {
+                                        LOGGER.info("Processing experiment {}", jobData.getProcessed_experiments());
+                                        jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1);
+                                    }
+                                    synchronized (new Object()) {
+                                        if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) {
+                                            setFinalJobStatus(COMPLETED, null, null, finalDatasource);
+                                        }
                                     }
                                 }
-                            }
-                            if (expriment_exists) {
-                                generateExecutor.submit(() -> {
-                                    // send request to generateRecommendations API
-                                    GenericRestApiClient recommendationApiClient = new GenericRestApiClient(finalDatasource);
-                                    String encodedExperimentName;
-                                    encodedExperimentName = URLEncoder.encode(experiment_name, StandardCharsets.UTF_8);
-                                    recommendationApiClient.setBaseURL(String.format(KruizeDeploymentInfo.recommendations_url, encodedExperimentName));
-                                    GenericRestApiClient.HttpResponseWrapper recommendationResponseCode = null;
-                                    try {
-                                        recommendationResponseCode = recommendationApiClient.callKruizeAPI(null);
-                                        LOGGER.debug("API Response code: {}", recommendationResponseCode);
-                                        if (recommendationResponseCode.getStatusCode() == HttpURLConnection.HTTP_CREATED) {
-                                            experiment.getRecommendations().setStatus(NotificationConstants.Status.PROCESSED);
-                                        } else {
+                                if (expriment_exists) {
+                                    generateExecutor.submit(() -> {
+                                        // send request to generateRecommendations API
+                                        GenericRestApiClient recommendationApiClient = new GenericRestApiClient(finalDatasource);
+                                        String encodedExperimentName;
+                                        encodedExperimentName = URLEncoder.encode(experiment_name, StandardCharsets.UTF_8);
+                                        recommendationApiClient.setBaseURL(String.format(KruizeDeploymentInfo.recommendations_url, encodedExperimentName));
+                                        GenericRestApiClient.HttpResponseWrapper recommendationResponseCode = null;
+                                        try {
+                                            recommendationResponseCode = recommendationApiClient.callKruizeAPI(null);
+                                            LOGGER.debug("API Response code: {}", recommendationResponseCode);
+                                            if (recommendationResponseCode.getStatusCode() == HttpURLConnection.HTTP_CREATED) {
+                                                experiment.getRecommendations().setStatus(NotificationConstants.Status.PROCESSED);
+                                            } else {
+                                                experiment.getRecommendations().setStatus(NotificationConstants.Status.FAILED);
+                                                experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, recommendationResponseCode.getResponseBody().toString(), recommendationResponseCode.getStatusCode()));
+                                            }
+                                        } catch (Exception e) {
+                                            e.printStackTrace();
                                             experiment.getRecommendations().setStatus(NotificationConstants.Status.FAILED);
-                                            experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, recommendationResponseCode.getResponseBody().toString(), recommendationResponseCode.getStatusCode()));
-                                        }
-                                    } catch (Exception e) {
-                                        e.printStackTrace();
-                                        experiment.getRecommendations().setStatus(NotificationConstants.Status.FAILED);
-                                        experiment.getRecommendations().setNotifications(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_INTERNAL_ERROR));
-                                    } finally {
-                                        jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1);
-                                        synchronized (new Object()) {
-                                            if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) {
-                                                setFinalJobStatus(COMPLETED, null, null, finalDatasource);
+                                            experiment.getRecommendations().setNotifications(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_INTERNAL_ERROR));
+                                        } finally {
+                                            jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1);
+                                            synchronized (new Object()) {
+                                                if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) {
+                                                    setFinalJobStatus(COMPLETED, null, null, finalDatasource);
+                                                }
                                             }
                                         }
-                                    }
-                                });
-                            }
-                        } catch (Exception e) {
-                            e.printStackTrace();
-                            experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_INTERNAL_ERROR));
-                            jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1);
-                            if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) {
-                                setFinalJobStatus(COMPLETED, null, null, finalDatasource);
+                                    });
+                                }
+                            } catch (Exception e) {
+                                e.printStackTrace();
+                                experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_INTERNAL_ERROR));
+                                jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1);
+                                if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) {
+                                    setFinalJobStatus(COMPLETED, null, null, finalDatasource);
+                                }
                             }
-                        }
-                    });
+                        });
+                    }
+                } finally {
+                    createExecutor.shutdown();
+                    while (!createExecutor.isTerminated()) {
+                        try {
+                            createExecutor.awaitTermination(1, TimeUnit.MINUTES);
+                        } catch (InterruptedException e) {
+                            Thread.currentThread().interrupt();
+                            break;
+                        }
+                    }
+
+                    generateExecutor.shutdown();
+                    while (!generateExecutor.isTerminated()) {
+                        try {
+                            generateExecutor.awaitTermination(1, TimeUnit.MINUTES);
+                        } catch (InterruptedException e) {
+                            Thread.currentThread().interrupt();
+                            break;
+                        }
+                    }
+
+                    if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) {
+                        statusValue = "success";
+                    }
+                }
             }
         }
     }
@@ -249,6 +280,12 @@ public void run() {
             LOGGER.error(e.getMessage());
             e.printStackTrace();
             setFinalJobStatus(FAILED, String.valueOf(HttpURLConnection.HTTP_INTERNAL_ERROR), new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_INTERNAL_ERROR), datasource);
+        } finally {
+            if (null != timerRunJob) {
+                MetricsConfig.timerRunJob = MetricsConfig.timerBRunJob.tag("status", statusValue).register(MetricsConfig.meterRegistry());
+                timerRunJob.stop(MetricsConfig.timerRunJob);
+            }
+            MetricsConfig.activeJobs.decrementAndGet();
         }
     }

@@ -286,34 +323,43 @@ public void setFinalJobStatus(String status, String notificationKey, BulkJobStat
     }

     Map getExperimentMap(String labelString, BulkJobStatus jobData, DataSourceMetadataInfo metadataInfo, DataSourceInfo datasource) throws Exception {
-        Map createExperimentAPIObjectMap = new HashMap<>();
-        Collection dataSourceCollection = metadataInfo.getDataSourceHashMap().values();
-        for (DataSource ds : dataSourceCollection) {
-            HashMap clusterHashMap = ds.getDataSourceClusterHashMap();
-            for (DataSourceCluster dsc : clusterHashMap.values()) {
-                HashMap namespaceHashMap = dsc.getDataSourceNamespaceHashMap();
-                for (DataSourceNamespace namespace : namespaceHashMap.values()) {
-                    HashMap dataSourceWorkloadHashMap = namespace.getDataSourceWorkloadHashMap();
-                    if (dataSourceWorkloadHashMap != null) {
-                        for (DataSourceWorkload dsw : dataSourceWorkloadHashMap.values()) {
-                            HashMap dataSourceContainerHashMap = dsw.getDataSourceContainerHashMap();
-                            if (dataSourceContainerHashMap != null) {
-                                for (DataSourceContainer dc : dataSourceContainerHashMap.values()) {
-                                    // Experiment name - dynamically constructed
-                                    String experiment_name = frameExperimentName(labelString, dsc, namespace, dsw, dc);
-                                    // create JSON to be passed in the createExperimentAPI
-                                    List createExperimentAPIObjectList = new ArrayList<>();
-                                    CreateExperimentAPIObject apiObject = prepareCreateExperimentJSONInput(dc, dsc, dsw, namespace,
-                                            experiment_name, createExperimentAPIObjectList);
-                                    createExperimentAPIObjectMap.put(experiment_name, apiObject);
+        String statusValue = "failure";
+        Timer.Sample timerGetExpMap = Timer.start(MetricsConfig.meterRegistry());
+        try {
+            Map createExperimentAPIObjectMap = new HashMap<>();
+            Collection dataSourceCollection = metadataInfo.getDataSourceHashMap().values();
+            for (DataSource ds : dataSourceCollection) {
+                HashMap clusterHashMap = ds.getDataSourceClusterHashMap();
+                for (DataSourceCluster dsc : clusterHashMap.values()) {
+                    HashMap namespaceHashMap = dsc.getDataSourceNamespaceHashMap();
+                    for (DataSourceNamespace namespace : namespaceHashMap.values()) {
+                        HashMap dataSourceWorkloadHashMap = namespace.getDataSourceWorkloadHashMap();
+                        if (dataSourceWorkloadHashMap != null) {
+                            for (DataSourceWorkload dsw : dataSourceWorkloadHashMap.values()) {
+                                HashMap dataSourceContainerHashMap = dsw.getDataSourceContainerHashMap();
+                                if (dataSourceContainerHashMap != null) {
+                                    for (DataSourceContainer dc : dataSourceContainerHashMap.values()) {
+                                        // Experiment name - dynamically constructed
+                                        String experiment_name = frameExperimentName(labelString, dsc, namespace, dsw, dc);
+                                        // create JSON to be passed in the createExperimentAPI
+                                        List createExperimentAPIObjectList = new ArrayList<>();
+                                        CreateExperimentAPIObject apiObject = prepareCreateExperimentJSONInput(dc, dsc, dsw, namespace,
+                                                experiment_name, createExperimentAPIObjectList);
+                                        createExperimentAPIObjectMap.put(experiment_name, apiObject);
+                                    }
                                 }
                             }
                         }
                     }
                 }
             }
+            return createExperimentAPIObjectMap;
+        } finally {
+            if (null != timerGetExpMap) {
+                MetricsConfig.timerGetExpMap = MetricsConfig.timerBGetExpMap.tag("status", statusValue).register(MetricsConfig.meterRegistry());
+                timerGetExpMap.stop(MetricsConfig.timerGetExpMap);
+            }
         }
-        return createExperimentAPIObjectMap;
     }

     private String getLabels(BulkInput.FilterWrapper filter) {
@@ -461,3 +507,4 @@ public String frameExperimentName(String labelString, DataSourceCluster dataSour
         return experimentName;
     }
 }
+
diff --git a/src/main/java/com/autotune/common/datasource/DataSourceManager.java b/src/main/java/com/autotune/common/datasource/DataSourceManager.java
index a8401970c..a71d80083 100644
--- a/src/main/java/com/autotune/common/datasource/DataSourceManager.java
+++ b/src/main/java/com/autotune/common/datasource/DataSourceManager.java
@@ -24,6 +24,8 @@
 import com.autotune.database.dao.ExperimentDAOImpl;
 import com.autotune.database.service.ExperimentDBService;
 import com.autotune.utils.KruizeConstants;
+import com.autotune.utils.MetricsConfig;
+import io.micrometer.core.instrument.Timer;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -65,15 +67,25 @@ public DataSourceManager() {
      * @return
      */
     public DataSourceMetadataInfo importMetadataFromDataSource(DataSourceInfo dataSourceInfo, String uniqueKey, long startTime, long endTime, int steps) throws DataSourceDoesNotExist, IOException, NoSuchAlgorithmException, KeyStoreException, KeyManagementException {
-        if (null == dataSourceInfo) {
-            throw new DataSourceDoesNotExist(KruizeConstants.DataSourceConstants.DataSourceErrorMsgs.MISSING_DATASOURCE_INFO);
-        }
-        DataSourceMetadataInfo dataSourceMetadataInfo = dataSourceMetadataOperator.createDataSourceMetadata(dataSourceInfo, uniqueKey, startTime, endTime, steps);
-        if (null == dataSourceMetadataInfo) {
-            LOGGER.error(KruizeConstants.DataSourceConstants.DataSourceMetadataErrorMsgs.DATASOURCE_METADATA_INFO_NOT_AVAILABLE, "for datasource {}" + dataSourceInfo.getName());
-            return null;
+        String statusValue = "failure";
+        io.micrometer.core.instrument.Timer.Sample timerImportMetadata = Timer.start(MetricsConfig.meterRegistry());
+        try {
+            if (null == dataSourceInfo) {
+                throw new DataSourceDoesNotExist(KruizeConstants.DataSourceConstants.DataSourceErrorMsgs.MISSING_DATASOURCE_INFO);
+            }
+            DataSourceMetadataInfo dataSourceMetadataInfo = dataSourceMetadataOperator.createDataSourceMetadata(dataSourceInfo, uniqueKey, startTime, endTime, steps);
+            if (null == dataSourceMetadataInfo) {
+                LOGGER.error(KruizeConstants.DataSourceConstants.DataSourceMetadataErrorMsgs.DATASOURCE_METADATA_INFO_NOT_AVAILABLE, "for datasource {}" + dataSourceInfo.getName());
+                return null;
+            }
+            statusValue = "success";
+            return dataSourceMetadataInfo;
+        } finally {
+            if (null != timerImportMetadata) {
+                MetricsConfig.timerImportMetadata = MetricsConfig.timerBImportMetadata.tag("status", statusValue).register(MetricsConfig.meterRegistry());
+                timerImportMetadata.stop(MetricsConfig.timerImportMetadata);
+            }
         }
-        return dataSourceMetadataInfo;
     }

     /**
@@ -84,6 +96,8 @@ public DataSourceMetadataInfo importMetadataFromDataSo
      * @throws DataSourceDoesNotExist Thrown when the provided data source information is null.
      */
     public DataSourceMetadataInfo getMetadataFromDataSource(DataSourceInfo dataSource) {
+        String statusValue = "failure";
+        io.micrometer.core.instrument.Timer.Sample timerGetMetadata = Timer.start(MetricsConfig.meterRegistry());
         try {
             if (null == dataSource) {
                 throw new DataSourceDoesNotExist(KruizeConstants.DataSourceConstants.DataSourceErrorMsgs.MISSING_DATASOURCE_INFO);
             }
@@ -94,11 +108,17 @@ public DataSourceMetadataInfo getMetadataFromDataSourc
                 LOGGER.error(KruizeConstants.DataSourceConstants.DataSourceMetadataErrorMsgs.DATASOURCE_METADATA_INFO_NOT_AVAILABLE, "for datasource {}" + dataSourceName);
                 return null;
             }
+            statusValue = "success";
             return dataSourceMetadataInfo;
         } catch (DataSourceDoesNotExist e) {
             LOGGER.error(e.getMessage());
         } catch (Exception e) {
             LOGGER.error("Loading saved datasource metadata failed: {} ", e.getMessage());
+        } finally {
+            if (null != timerGetMetadata) {
+                MetricsConfig.timerGetMetadata = MetricsConfig.timerBGetMetadata.tag("status", statusValue).register(MetricsConfig.meterRegistry());
+                timerGetMetadata.stop(MetricsConfig.timerGetMetadata);
+            }
         }
         return null;
     }
diff --git a/src/main/java/com/autotune/utils/MetricsConfig.java b/src/main/java/com/autotune/utils/MetricsConfig.java
index 002d1411a..c0a99e2e6 100644
--- a/src/main/java/com/autotune/utils/MetricsConfig.java
+++ b/src/main/java/com/autotune/utils/MetricsConfig.java
@@ -17,6 +17,8 @@ public class MetricsConfig {
     public static Timer timerLoadAllRec, timerLoadAllExp, timerLoadAllResults;
     public static Timer timerAddRecDB, timerAddResultsDB, timerAddExpDB, timerAddBulkResultsDB;
     public static Timer timerAddPerfProfileDB, timerLoadPerfProfileName, timerLoadAllPerfProfiles;
+    public static Timer timerImportMetadata, timerGetMetadata;
+    public static Timer timerJobStatus, timerCreateBulkJob, timerGetExpMap, timerCreateBulkExp, timerGenerateBulkRec, timerRunJob;
     public static Counter timerKruizeNotifications;
     public static Timer.Builder timerBListRec, timerBListExp, timerBCreateExp, timerBUpdateResults, timerBUpdateRecommendations;
     public static Timer.Builder timerBLoadRecExpName, timerBLoadResultsExpName, timerBLoadExpName, timerBLoadRecExpNameDate, timerBBoxPlots;
@@ -27,6 +29,8 @@ public class MetricsConfig {
     public static PrometheusMeterRegistry meterRegistry;
     public static Timer timerListDS, timerImportDSMetadata, timerListDSMetadata;
     public static Timer.Builder timerBListDS, timerBImportDSMetadata, timerBListDSMetadata;
+    public static Timer.Builder timerBImportMetadata, timerBGetMetadata;
+    public static Timer.Builder timerBJobStatus, timerBCreateBulkJob, timerBGetExpMap, timerBCreateBulkExp, timerBGenerateBulkRec, timerBRunJob;
     private static MetricsConfig INSTANCE;
     public String API_METRIC_DESC = "Time taken for Kruize APIs";
     public String DB_METRIC_DESC = "Time taken for KruizeDB methods";
@@ -62,6 +66,16 @@ private MetricsConfig() {
         timerBImportDSMetadata = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "dsmetadata").tag("method", "POST");
         timerBListDSMetadata = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "dsmetadata").tag("method", "GET");
         timerBKruizeNotifications = Counter.builder("KruizeNotifications").description("Kruize notifications").tag("api", "updateRecommendations");
+
+        timerBImportMetadata = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "datasources").tag("method", "importMetadata");
+        timerBGetMetadata = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "datasources").tag("method", "getMetadata");
+        timerBJobStatus = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "jobStatus");
+        timerBCreateBulkJob = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "createBulkJob");
+        timerBGetExpMap = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "getExperimentMap");
+        timerBCreateBulkExp = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "createBulkExperiment");
+        timerBGenerateBulkRec = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "generateBulkRecommendation");
+        timerBRunJob = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "runBulkJob");
+
         new ClassLoaderMetrics().bindTo(meterRegistry);
         new ProcessorMetrics().bindTo(meterRegistry);
         new JvmGcMetrics().bindTo(meterRegistry);

From 06fe1056c34b11e0b149cf01b9c4b92ec53f31b0 Mon Sep 17 00:00:00 2001
From: kusumachalasani
Date: Wed, 4 Dec 2024 10:22:28 +0530
Subject: [PATCH 38/85] update createJob timer status

Signed-off-by: kusumachalasani
---
 src/main/java/com/autotune/analyzer/services/BulkService.java | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/main/java/com/autotune/analyzer/services/BulkService.java b/src/main/java/com/autotune/analyzer/services/BulkService.java
index 8fb29f5b4..c695bc70a 100644
--- a/src/main/java/com/autotune/analyzer/services/BulkService.java
+++ b/src/main/java/com/autotune/analyzer/services/BulkService.java
@@ -159,6 +159,7 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response)
             JSONObject jsonObject = new JSONObject();
             jsonObject.put(JOB_ID, jobID);
             response.getWriter().write(jsonObject.toString());
+            statusValue = "success";
         } finally {
             if (null != timerCreateBulkJob) {
                 MetricsConfig.timerCreateBulkJob = MetricsConfig.timerBCreateBulkJob.tag("status", statusValue).register(MetricsConfig.meterRegistry());

From e037b3b1faebdcb74b89201fc2ead37b9c7bb590 Mon Sep 17 00:00:00 2001
From: Shekhar Saxena
Date: Tue, 10 Dec 2024 11:55:08 +0530
Subject: [PATCH 39/85] updating vpa recommender key

Signed-off-by: Shekhar Saxena
---
 .../java/com/autotune/analyzer/utils/AnalyzerConstants.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java
index d9a4cc650..135a5ece3 100644
--- a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java
+++ b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java
@@ -692,7 +692,7 @@ private SupportedUpdaters() {
     public static final class VPA {
         public static final String VPA_PLURAL = "VerticalPodAutoscaler";
         public static final String RECOMMENDERS = "recommenders";
-        public static final String RECOMMENDER_KEY = "recommender";
+        public static final String RECOMMENDER_KEY = "name";
         public static final String RECOMMENDER_NAME = "Kruize";
         public static final String VPA_API_VERSION = "autoscaling.k8s.io/v1";
         public static final String VPA_TARGET_REF_API_VERSION = "apps/v1";

From e4e3df40d5b00ac2a3503ca66a519f0b5fe04cff Mon Sep 17 00:00:00 2001
From: kusumachalasani
Date: Tue, 10 Dec 2024 12:02:46 +0530
Subject: [PATCH 40/85] resolve conflicts

Signed-off-by: kusumachalasani
---
 .../analyzer/workerimpl/BulkJobManager.java | 19 +++++++++++--------
 .../com/autotune/utils/MetricsConfig.java   | 17 ++++++++++++-----
 2 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java
index 8eef423a3..da479724c 100644
--- a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java
+++ b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java
@@ -50,6 +50,7 @@
 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.Executors;
 import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicReference;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;
@@ -165,26 +166,29 @@ public void run() {
                             String experiment_name = apiObject.getExperimentName();
                             BulkJobStatus.Experiment experiment = jobData.addExperiment(experiment_name);
                             try {
-                                // send request to createExperiment API for experiment creation
                                 GenericRestApiClient apiClient = new GenericRestApiClient(finalDatasource);
                                 apiClient.setBaseURL(KruizeDeploymentInfo.experiments_url);
                                 GenericRestApiClient.HttpResponseWrapper responseCode;
-                                boolean expriment_exists = false;
+                                boolean experiment_exists = false;
                                 try {
                                     responseCode = apiClient.callKruizeAPI("[" + new Gson().toJson(apiObject) + "]");
                                     LOGGER.debug("API Response code: {}", responseCode);
                                     if (responseCode.getStatusCode() == HttpURLConnection.HTTP_CREATED) {
-                                        expriment_exists = true;
+                                        experiment_exists = true;
                                     } else if (responseCode.getStatusCode() == HttpURLConnection.HTTP_CONFLICT) {
-                                        expriment_exists = true;
-                                    } else {
+                                        experiment_exists = true;
+                                    } else if (responseCode.getStatusCode() == HttpURLConnection.HTTP_CONFLICT) {
+                                        experiment_exists = true;
+                                    } else {
+                                        jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1);
                                         experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, responseCode.getResponseBody().toString(), responseCode.getStatusCode()));
                                     }
                                 } catch (Exception e) {
                                     e.printStackTrace();
                                     experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_BAD_REQUEST));
                                 } finally {
-                                    if (!expriment_exists) {
+                                    if (!experiment_exists) {
                                         LOGGER.info("Processing experiment {}", jobData.getProcessed_experiments());
                                         jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1);
                                     }
@@ -193,9 +197,8 @@ public void run() {
                                             setFinalJobStatus(COMPLETED, null, null, finalDatasource);
                                         }
                                     }
-                                }
-
-                                if (expriment_exists) {
+                                }
+                                if (experiment_exists) {
                                     generateExecutor.submit(() -> {
                                         // send request to generateRecommendations API
                                         GenericRestApiClient recommendationApiClient = new GenericRestApiClient(finalDatasource);
diff --git a/src/main/java/com/autotune/utils/MetricsConfig.java b/src/main/java/com/autotune/utils/MetricsConfig.java
index c0a99e2e6..897f45848 100644
--- a/src/main/java/com/autotune/utils/MetricsConfig.java
+++ b/src/main/java/com/autotune/utils/MetricsConfig.java
@@ -1,6 +1,8 @@
 package com.autotune.utils;

 import io.micrometer.core.instrument.Counter;
+import io.micrometer.core.instrument.Gauge;
+import io.micrometer.core.instrument.Metrics;
 import io.micrometer.core.instrument.Timer;
 import io.micrometer.core.instrument.binder.jvm.ClassLoaderMetrics;
 import io.micrometer.core.instrument.binder.jvm.JvmGcMetrics;
@@ -10,6 +12,8 @@
 import io.micrometer.prometheus.PrometheusConfig;
 import io.micrometer.prometheus.PrometheusMeterRegistry;

+import java.util.concurrent.atomic.AtomicInteger;
+
 public class MetricsConfig {

     public static Timer timerListRec, timerListExp, timerCreateExp, timerUpdateResults, timerUpdateRecomendations;
@@ -19,13 +23,13 @@ public class MetricsConfig {
     public static Timer timerAddPerfProfileDB, timerLoadPerfProfileName, timerLoadAllPerfProfiles;
     public static Timer timerImportMetadata, timerGetMetadata;
     public static Timer timerJobStatus, timerCreateBulkJob, timerGetExpMap, timerCreateBulkExp, timerGenerateBulkRec, timerRunJob;
-    public static Counter timerKruizeNotifications;
+    public static Counter timerKruizeNotifications , timerBulkJobs;
     public static Timer.Builder timerBListRec, timerBListExp, timerBCreateExp, timerBUpdateResults, timerBUpdateRecommendations;
     public static Timer.Builder timerBLoadRecExpName, timerBLoadResultsExpName, timerBLoadExpName, timerBLoadRecExpNameDate, timerBBoxPlots;
     public static Timer.Builder timerBLoadAllRec, timerBLoadAllExp, timerBLoadAllResults;
     public static Timer.Builder timerBAddRecDB, timerBAddResultsDB, timerBAddExpDB, timerBAddBulkResultsDB;
     public static Timer.Builder timerBAddPerfProfileDB, timerBLoadPerfProfileName, timerBLoadAllPerfProfiles;
-    public static Counter.Builder timerBKruizeNotifications;
+    public static Counter.Builder timerBKruizeNotifications, timerBBulkJobs;
     public static PrometheusMeterRegistry meterRegistry;
     public static Timer timerListDS, timerImportDSMetadata, timerListDSMetadata;
     public static Timer.Builder timerBListDS, timerBImportDSMetadata, timerBListDSMetadata;
@@ -35,6 +39,8 @@ public class MetricsConfig {
     public String API_METRIC_DESC = "Time taken for Kruize APIs";
     public String DB_METRIC_DESC = "Time taken for KruizeDB methods";
     public String METHOD_METRIC_DESC = "Time taken for Kruize methods";
+    public static final AtomicInteger activeJobs = new AtomicInteger(0);
+    public static Gauge.Builder timerBBulkRunJobs;

     private MetricsConfig() {
         meterRegistry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
@@ -72,16 +78,17 @@ private MetricsConfig() {
         timerBJobStatus = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "jobStatus");
         timerBCreateBulkJob = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "createBulkJob");
         timerBGetExpMap = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "getExperimentMap");
-        timerBCreateBulkExp = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "createBulkExperiment");
-        timerBGenerateBulkRec = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "generateBulkRecommendation");
+        //timerBCreateBulkExp = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "createBulkExperiment");
+        //timerBGenerateBulkRec = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "generateBulkRecommendation");
         timerBRunJob = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "runBulkJob");
+        timerBBulkRunJobs = Gauge.builder("kruizeAPI_active_jobs_count", activeJobs, AtomicInteger::get).description("No.of bulk jobs running").tags("api", "bulk", "method", "runBulkJob" , "status", "running");
+        timerBBulkRunJobs.register(meterRegistry);

         new ClassLoaderMetrics().bindTo(meterRegistry);
         new ProcessorMetrics().bindTo(meterRegistry);
         new JvmGcMetrics().bindTo(meterRegistry);
         new JvmMemoryMetrics().bindTo(meterRegistry);
         meterRegistry.config().namingConvention(NamingConvention.dot);
-
     }

     public static PrometheusMeterRegistry meterRegistry() {

From 7082f51452ae48ac557c1946eced6bb88910036b Mon Sep 17 00:00:00 2001
From: kusumachalasani
Date: Thu, 5 Dec 2024 23:16:40 +0530
Subject: [PATCH 41/85] update bulk metrics

Signed-off-by: kusumachalasani
---
 scripts/kruize_metrics.py | 57 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 55 insertions(+), 2 deletions(-)

diff --git a/scripts/kruize_metrics.py b/scripts/kruize_metrics.py
index 00cba2647..5f7016ca1 100644
--- a/scripts/kruize_metrics.py
+++ b/scripts/kruize_metrics.py
@@ -25,7 +25,7 @@
 import os
 import argparse

-csv_headers = ["timestamp","listRecommendations_count_success","listExperiments_count_success","createExperiment_count_success","updateResults_count_success","updateRecommendations_count_success","generatePlots_count_success","loadRecommendationsByExperimentName_count_success","loadRecommendationsByExperimentNameAndDate_count_success","loadResultsByExperimentName_count_success","loadExperimentByName_count_success","addRecommendationToDB_count_success","addResultToDB_count_success","addBulkResultsToDBAndFetchFailedResults_count_success","addExperimentToDB_count_success","addPerformanceProfileToDB_count_success","loadPerformanceProfileByName_count_success","loadAllPerformanceProfiles_count_success","listRecommendations_count_failure","listExperiments_count_failure","createExperiment_count_failure","updateResults_count_failure","updateRecommendations_count_failure","generatePlots_count_failure","loadRecommendationsByExperimentName_count_failure","loadRecommendationsByExperimentNameAndDate_count_failure","loadResultsByExperimentName_count_failure","loadExperimentByName_count_failure","addRecommendationToDB_count_failure","addResultToDB_count_failu
re","addBulkResultsToDBAndFetchFailedResults_count_failure","addExperimentToDB_count_failure","addPerformanceProfileToDB_count_failure","loadPerformanceProfileByName_count_failure","loadAllPerformanceProfiles_count_failure","listRecommendations_sum_success","listExperiments_sum_success","createExperiment_sum_success","updateResults_sum_success","updateRecommendations_sum_success","generatePlots_sum_success","loadRecommendationsByExperimentName_sum_success","loadRecommendationsByExperimentNameAndDate_sum_success","loadResultsByExperimentName_sum_success","loadExperimentByName_sum_success","addRecommendationToDB_sum_success","addResultToDB_sum_success","addBulkResultsToDBAndFetchFailedResults_sum_success","addExperimentToDB_sum_success","addPerformanceProfileToDB_sum_success","loadPerformanceProfileByName_sum_success","loadAllPerformanceProfiles_sum_success","listRecommendations_sum_failure","listExperiments_sum_failure","createExperiment_sum_failure","updateResults_sum_failure","updateRecommendations_sum_failure","generatePlots_sum_failure","loadRecommendationsByExperimentName_sum_failure","loadRecommendationsByExperimentNameAndDate_sum_failure","loadResultsByExperimentName_sum_failure","loadExperimentByName_sum_failure","addRecommendationToDB_sum_failure","addResultToDB_sum_failure","addBulkResultsToDBAndFetchFailedResults_sum_failure","addExperimentToDB_sum_failure","addPerformanceProfileToDB_sum_failure","loadPerformanceProfileByName_sum_failure","loadAllPerformanceProfiles_sum_failure","loadAllRecommendations_sum_failure","loadAllExperiments_sum_failure","loadAllResults_sum_failure","loadAllRecommendations_sum_success","loadAllExperiments_sum_success","loadAllResults_sum_success","listRecommendations_max_success","listExperiments_max_success","createExperiment_max_success","updateResults_max_success","updateRecommendations_max_success","generatePlots_max_success","loadRecommendationsByExperimentName_max_success","loadRecommendationsByExperimentNameAndDate_max_suc
cess","loadResultsByExperimentName_max_success","loadExperimentByName_max_success","addRecommendationToDB_max_success","addResultToDB_max_success","addBulkResultsToDBAndFetchFailedResults_max_success","addExperimentToDB_max_success","addPerformanceProfileToDB_max_success","loadPerformanceProfileByName_max_success","loadAllPerformanceProfiles_max_success","kruizedb_cpu_max","kruizedb_memory","kruize_cpu_max","kruize_memory","kruize_results","db_size","updateResultsPerCall_success","updateRecommendationsPerCall_success","updateRecommendations_notifications_total"] +csv_headers = ["timestamp","listRecommendations_count_success","listExperiments_count_success","createExperiment_count_success","updateResults_count_success","updateRecommendations_count_success","createBulkJob_count_success", "bulkJobs_count_running" , "jobStatus_count_success", "bulk_getExperimentMap_count_success", "runBulkJob_count_success","importMetadata_count_success", "generatePlots_count_success","loadRecommendationsByExperimentName_count_success","loadRecommendationsByExperimentNameAndDate_count_success","loadResultsByExperimentName_count_success","loadExperimentByName_count_success","addRecommendationToDB_count_success","addResultToDB_count_success","addBulkResultsToDBAndFetchFailedResults_count_success","addExperimentToDB_count_success","addPerformanceProfileToDB_count_success","loadPerformanceProfileByName_count_success","loadAllPerformanceProfiles_count_success","listRecommendations_count_failure","listExperiments_count_failure","createExperiment_count_failure","updateResults_count_failure","updateRecommendations_count_failure","createBulkJob_count_failure","jobStatus_count_failure","runBulkJob_count_failure","importMetadata_count_failure","generatePlots_count_failure","loadRecommendationsByExperimentName_count_failure","loadRecommendationsByExperimentNameAndDate_count_failure","loadResultsByExperimentName_count_failure","loadExperimentByName_count_failure","addRecommendationToDB_count_failure
","addResultToDB_count_failure","addBulkResultsToDBAndFetchFailedResults_count_failure","addExperimentToDB_count_failure","addPerformanceProfileToDB_count_failure","loadPerformanceProfileByName_count_failure","loadAllPerformanceProfiles_count_failure","listRecommendations_sum_success","listExperiments_sum_success","createExperiment_sum_success","updateResults_sum_success","updateRecommendations_sum_success","createBulkJob_sum_success", "jobStatus_sum_success", "bulk_getExperimentMap_sum_success", "runBulkJob_sum_success","importMetadata_sum_success", "generatePlots_sum_success","loadRecommendationsByExperimentName_sum_success","loadRecommendationsByExperimentNameAndDate_sum_success","loadResultsByExperimentName_sum_success","loadExperimentByName_sum_success","addRecommendationToDB_sum_success","addResultToDB_sum_success","addBulkResultsToDBAndFetchFailedResults_sum_success","addExperimentToDB_sum_success","addPerformanceProfileToDB_sum_success","loadPerformanceProfileByName_sum_success","loadAllPerformanceProfiles_sum_success","listRecommendations_sum_failure","listExperiments_sum_failure","createExperiment_sum_failure","updateResults_sum_failure","updateRecommendations_sum_failure","createBulkJob_sum_failure", "jobStatus_sum_failure", "bulk_getExperimentMap_sum_failure", 
"runBulkJob_sum_failure","importMetadata_sum_failure","generatePlots_sum_failure","loadRecommendationsByExperimentName_sum_failure","loadRecommendationsByExperimentNameAndDate_sum_failure","loadResultsByExperimentName_sum_failure","loadExperimentByName_sum_failure","addRecommendationToDB_sum_failure","addResultToDB_sum_failure","addBulkResultsToDBAndFetchFailedResults_sum_failure","addExperimentToDB_sum_failure","addPerformanceProfileToDB_sum_failure","loadPerformanceProfileByName_sum_failure","loadAllPerformanceProfiles_sum_failure","loadAllRecommendations_sum_failure","loadAllExperiments_sum_failure","loadAllResults_sum_failure","loadAllRecommendations_sum_success","loadAllExperiments_sum_success","loadAllResults_sum_success","listRecommendations_max_success","listExperiments_max_success","createExperiment_max_success","updateResults_max_success","updateRecommendations_max_success","createBulkJob_max_success", "jobStatus_max_success", "bulk_getExperimentMap_max_success", "runBulkJob_max_success","importMetadata_max_success","generatePlots_max_success","loadRecommendationsByExperimentName_max_success","loadRecommendationsByExperimentNameAndDate_max_success","loadResultsByExperimentName_max_success","loadExperimentByName_max_success","addRecommendationToDB_max_success","addResultToDB_max_success","addBulkResultsToDBAndFetchFailedResults_max_success","addExperimentToDB_max_success","addPerformanceProfileToDB_max_success","loadPerformanceProfileByName_max_success","loadAllPerformanceProfiles_max_success","kruizedb_cpu_max","kruizedb_memory","kruize_cpu_max","kruize_memory","kruize_results","db_size","updateResultsPerCall_success","updateRecommendationsPerCall_success","BulkJobPerCall_success", "updateRecommendations_notifications_total"] queries_map_total = { @@ -34,6 +34,12 @@ "createExperiment_count_success": "sum((kruizeAPI_count{api=\"createExperiment\",application=\"Kruize\",status=\"success\"}))", "updateResults_count_success": 
"sum((kruizeAPI_count{api=\"updateResults\",application=\"Kruize\",status=\"success\"}))", "updateRecommendations_count_success": "sum((kruizeAPI_count{api=\"updateRecommendations\",application=\"Kruize\",status=\"success\"}))", + "createBulkJob_count_success": "sum((kruizeAPI_count{api=\"bulk\",application=\"Kruize\",method=\"createBulkJob\",status=\"success\"}))", + "bulkJobs_count_running": "sum((kruizeAPI_active_jobs_count{api=\"bulk\",application=\"Kruize\",method=\"runBulkJob\",status=\"running\"}))", + "jobStatus_count_success": "sum((kruizeAPI_count{api=\"bulk\",application=\"Kruize\",method=\"jobStatus\",status=\"success\"}))", + "bulk_getExperimentMap_count_success": "sum((kruizeAPI_count{api=\"bulk\",application=\"Kruize\",method=\"getExperimentMap\",status=\"success\"}))", + "runBulkJob_count_success": "sum((kruizeAPI_count{api=\"bulk\",application=\"Kruize\",method=\"runBulkJob\",status=\"success\"}))", + "importMetadata_count_success": "sum((kruizeAPI_count{api=\"datasources\",application=\"Kruize\",method=\"importMetadata\",status=\"success\"}))", "generatePlots_count_success": "sum((KruizeMethod_count{method=\"generatePlots\",application=\"Kruize\",status=\"success\"}))", "loadRecommendationsByExperimentName_count_success": "sum((kruizeDB_count{method=\"loadRecommendationsByExperimentName\",application=\"Kruize\",status=\"success\"}))", "loadRecommendationsByExperimentNameAndDate_count_success": "sum((kruizeDB_count{method=\"loadRecommendationsByExperimentNameAndDate\",application=\"Kruize\",status=\"success\"}))", @@ -51,6 +57,10 @@ "createExperiment_count_failure": "sum((kruizeAPI_count{api=\"createExperiment\",application=\"Kruize\",status=\"failure\"}))", "updateResults_count_failure": "sum((kruizeAPI_count{api=\"updateResults\",application=\"Kruize\",status=\"failure\"}))", "updateRecommendations_count_failure": "sum((kruizeAPI_count{api=\"updateRecommendations\",application=\"Kruize\",status=\"failure\"}))", + "createBulkJob_count_failure": 
"sum((kruizeAPI_count{api=\"bulk\",application=\"Kruize\",method=\"createBulkJob\",status=\"failure\"}))", + "jobStatus_count_failure": "sum((kruizeAPI_count{api=\"bulk\",application=\"Kruize\",method=\"jobStatus\",status=\"failure\"}))", + "runBulkJob_count_failure": "sum((kruizeAPI_count{api=\"bulk\",application=\"Kruize\",method=\"runBulkJob\",status=\"failure\"}))", + "importMetadata_count_failure": "sum((kruizeAPI_count{api=\"datasources\",application=\"Kruize\",method=\"importMetadata\",status=\"failure\"}))", "generatePlots_count_failure": "sum((KruizeMethod_count{method=\"generatePlots\",application=\"Kruize\",status=\"failure\"}))", "loadRecommendationsByExperimentName_count_failure": "sum((kruizeDB_count{method=\"loadRecommendationsByExperimentName\",application=\"Kruize\",status=\"failure\"}))", "loadRecommendationsByExperimentNameAndDate_count_failure": "sum((kruizeDB_count{method=\"loadRecommendationsByExperimentNameAndDate\",application=\"Kruize\",status=\"failure\"}))", @@ -68,6 +78,11 @@ "createExperiment_sum_success": "sum((kruizeAPI_sum{api=\"createExperiment\",application=\"Kruize\",status=\"success\"}))", "updateResults_sum_success": "sum((kruizeAPI_sum{api=\"updateResults\",application=\"Kruize\",status=\"success\"}))", "updateRecommendations_sum_success": "sum((kruizeAPI_sum{api=\"updateRecommendations\",application=\"Kruize\",status=\"success\"}))", + "createBulkJob_sum_success": "sum((kruizeAPI_sum{api=\"bulk\",application=\"Kruize\",method=\"createBulkJob\",status=\"success\"}))", + "jobStatus_sum_success": "sum((kruizeAPI_sum{api=\"bulk\",application=\"Kruize\",method=\"jobStatus\",status=\"success\"}))", + "bulk_getExperimentMap_sum_success": "sum((kruizeAPI_sum{api=\"bulk\",application=\"Kruize\",method=\"getExperimentMap\",status=\"success\"}))", + "runBulkJob_sum_success": "sum((kruizeAPI_sum{api=\"bulk\",application=\"Kruize\",method=\"runBulkJob\",status=\"success\"}))", + "importMetadata_sum_success": 
"sum((kruizeAPI_sum{api=\"datasources\",application=\"Kruize\",method=\"importMetadata\",status=\"success\"}))", "generatePlots_sum_success": "sum((KruizeMethod_sum{method=\"generatePlots\",application=\"Kruize\",status=\"success\"}))", "loadRecommendationsByExperimentName_sum_success": "sum((kruizeDB_sum{method=\"loadRecommendationsByExperimentName\",application=\"Kruize\",status=\"success\"}))", "loadRecommendationsByExperimentNameAndDate_sum_success": "sum((kruizeDB_sum{method=\"loadRecommendationsByExperimentNameAndDate\",application=\"Kruize\",status=\"success\"}))", @@ -85,6 +100,11 @@ "createExperiment_sum_failure": "sum((kruizeAPI_sum{api=\"createExperiment\",application=\"Kruize\",status=\"failure\"}))", "updateResults_sum_failure": "sum((kruizeAPI_sum{api=\"updateResults\",application=\"Kruize\",status=\"failure\"}))", "updateRecommendations_sum_failure": "sum((kruizeAPI_sum{api=\"updateRecommendations\",application=\"Kruize\",status=\"failure\"}))", + "createBulkJob_sum_failure": "sum((kruizeAPI_sum{api=\"bulk\",application=\"Kruize\",method=\"createBulkJob\",status=\"failure\"}))", + "jobStatus_sum_failure": "sum((kruizeAPI_sum{api=\"bulk\",application=\"Kruize\",method=\"jobStatus\",status=\"failure\"}))", + "bulk_getExperimentMap_sum_failure": "sum((kruizeAPI_sum{api=\"bulk\",application=\"Kruize\",method=\"getExperimentMap\",status=\"failure\"}))", + "runBulkJob_sum_failure": "sum((kruizeAPI_sum{api=\"bulk\",application=\"Kruize\",method=\"runBulkJob\",status=\"failure\"}))", + "importMetadata_sum_failure": "sum((kruizeAPI_sum{api=\"datasources\",application=\"Kruize\",method=\"importMetadata\",status=\"failure\"}))", "generatePlots_sum_failure": "sum((KruizeMethod_sum{method=\"generatePlots\",application=\"Kruize\",status=\"failure\"}))", "loadRecommendationsByExperimentName_sum_failure": "sum((kruizeDB_sum{method=\"loadRecommendationsByExperimentName\",application=\"Kruize\",status=\"failure\"}))", 
"loadRecommendationsByExperimentNameAndDate_sum_failure": "sum((kruizeDB_sum{method=\"loadRecommendationsByExperimentNameAndDate\",application=\"Kruize\",status=\"failure\"}))", @@ -108,6 +128,11 @@ "createExperiment_max_success": "max(max_over_time(kruizeAPI_max{{api=\"createExperiment\",application=\"Kruize\",status=\"success\"}}[6h]))", "updateResults_max_success": "max(max_over_time(kruizeAPI_max{{api=\"updateResults\",application=\"Kruize\",status=\"success\"}}[6h]))", "updateRecommendations_max_success": "max(max_over_time(kruizeAPI_max{{api=\"updateRecommendations\",application=\"Kruize\",status=\"success\"}}[6h]))", + "createBulkJob_max_success": "max(max_over_time(kruizeAPI_max{{api=\"bulk\",application=\"Kruize\",method=\"createBulkJob\",status=\"success\"}}[6h]))", + "jobStatus_max_success": "max(max_over_time(kruizeAPI_max{{api=\"bulk\",application=\"Kruize\",method=\"jobStatus\",status=\"success\"}}[6h]))", + "bulk_getExperimentMap_max_success": "max(max_over_time(kruizeAPI_max{{api=\"bulk\",application=\"Kruize\",method=\"getExperimentMap\",status=\"success\"}}[6h]))", + "runBulkJob_max_success": "max(max_over_time(kruizeAPI_max{{api=\"bulk\",application=\"Kruize\",method=\"runBulkJob\",status=\"success\"}}[6h]))", + "importMetadata_max_success": "max(max_over_time(kruizeAPI_max{{api=\"datasources\",application=\"Kruize\",method=\"importMetadata\",status=\"success\"}}[6h]))", "generatePlots_max_success": "max(max_over_time(KruizeMethod_max{{method=\"generatePlots\",application=\"Kruize\",status=\"success\"}}[6h]))", "loadRecommendationsByExperimentName_max_success": "max(max_over_time(kruizeDB_max{{method=\"loadRecommendationsByExperimentName\",application=\"Kruize\",status=\"success\"}}[6h]))", "loadRecommendationsByExperimentNameAndDate_max_success": "max(max_over_time(kruizeDB_max{{method=\"loadRecommendationsByExperimentNameAndDate\",application=\"Kruize\",status=\"success\"}}[6h]))", @@ -204,7 +229,15 @@ def 
run_queries(map_type,server,prometheus_url=None): results_map["updateRecommendationsPerCall_success"] = sum_success / count_success except ValueError: print("Error: Unable to convert values to floats.") - + if "runBulkJob_sum_success" in results_map and "runBulkJob_count_success" in results_map: + if results_map["runBulkJob_sum_success"] and results_map["runBulkJob_count_success"]: + try: + sum_success = round(float(results_map["runBulkJob_sum_success"]),10) + count_success = round(float(results_map["runBulkJob_count_success"]),10) + if count_success != 0: + results_map["BulkJobPerCall_success"] = sum_success / count_success + except ValueError: + print("Error: Unable to convert values to floats.") except Exception as e: print(f"AN ERROR OCCURED: {e}") sys.exit(1) @@ -320,6 +353,12 @@ def main(argv): "createExperiment_count_success": f"sum(increase(kruizeAPI_count{{api=\"createExperiment\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", "updateResults_count_success": f"sum(increase(kruizeAPI_count{{api=\"updateResults\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", "updateRecommendations_count_success": f"sum(increase(kruizeAPI_count{{api=\"updateRecommendations\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", + "createBulkJob_count_success": f"sum(increase(kruizeAPI_count{{api=\"bulk\",application=\"Kruize\",method=\"createBulkJob\",status=\"success\"}}[{time_duration}]))", + "bulkJobs_count_running": f"sum(increase(kruizeAPI_active_jobs_count{{api=\"bulk\",application=\"Kruize\",method=\"runBulkJob\",status=\"running\"}}[{time_duration}]))", + "jobStatus_count_success": f"sum(increase(kruizeAPI_count{{api=\"bulk\",application=\"Kruize\",method=\"jobStatus\",status=\"success\"}}[{time_duration}]))", + "bulk_getExperimentMap_count_success": f"sum(increase(kruizeAPI_count{{api=\"bulk\",application=\"Kruize\",method=\"getExperimentMap\",status=\"success\"}}[{time_duration}]))", + "runBulkJob_count_success": 
f"sum(increase(kruizeAPI_count{{api=\"bulk\",application=\"Kruize\",method=\"runBulkJob\",status=\"success\"}}[{time_duration}]))", + "importMetadata_count_success": f"sum(increase(kruizeAPI_count{{api=\"datasources\",application=\"Kruize\",method=\"importMetadata\",status=\"success\"}}[{time_duration}]))", "generatePlots_count_success": f"sum(increase(KruizeMethod_count{{method=\"generatePlots\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", "loadRecommendationsByExperimentName_count_success": f"sum(increase(kruizeDB_count{{method=\"loadRecommendationsByExperimentName\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", "loadRecommendationsByExperimentNameAndDate_count_success": f"sum(increase(kruizeDB_count{{method=\"loadRecommendationsByExperimentNameAndDate\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", @@ -337,6 +376,10 @@ def main(argv): "createExperiment_count_failure": f"sum(increase(kruizeAPI_count{{api=\"createExperiment\",application=\"Kruize\",status=\"failure\"}}[{time_duration}]))", "updateResults_count_failure": f"sum(increase(kruizeAPI_count{{api=\"updateResults\",application=\"Kruize\",status=\"failure\"}}[{time_duration}]))", "updateRecommendations_count_failure": f"sum(increase(kruizeAPI_count{{api=\"updateRecommendations\",application=\"Kruize\",status=\"failure\"}}[{time_duration}]))", + "createBulkJob_count_failure": f"sum(increase(kruizeAPI_count{{api=\"bulk\",application=\"Kruize\",method=\"createBulkJob\",status=\"failure\"}}[{time_duration}]))", + "jobStatus_count_failure": f"sum(increase(kruizeAPI_count{{api=\"bulk\",application=\"Kruize\",method=\"jobStatus\",status=\"failure\"}}[{time_duration}]))", + "runBulkJob_count_failure": f"sum(increase(kruizeAPI_count{{api=\"bulk\",application=\"Kruize\",method=\"runBulkJob\",status=\"failure\"}}[{time_duration}]))", + "importMetadata_count_failure": 
f"sum(increase(kruizeAPI_count{{api=\"datasources\",application=\"Kruize\",method=\"importMetadata\",status=\"failure\"}}[{time_duration}]))", "generatePlots_count_failure": f"sum(increase(KruizeMethod_count{{method=\"generatePlots\",application=\"Kruize\",status=\"failure\"}}[{time_duration}]))", "loadRecommendationsByExperimentName_count_failure": f"sum(increase(kruizeDB_count{{method=\"loadRecommendationsByExperimentName\",application=\"Kruize\",status=\"failure\"}}[{time_duration}]))", "loadRecommendationsByExperimentNameAndDate_count_failure": f"sum(increase(kruizeDB_count{{method=\"loadRecommendationsByExperimentNameAndDate\",application=\"Kruize\",status=\"failure\"}}[{time_duration}]))", @@ -354,6 +397,11 @@ def main(argv): "createExperiment_sum_success": f"sum(increase(kruizeAPI_sum{{api=\"createExperiment\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", "updateResults_sum_success": f"sum(increase(kruizeAPI_sum{{api=\"updateResults\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", "updateRecommendations_sum_success": f"sum(increase(kruizeAPI_sum{{api=\"updateRecommendations\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", + "createBulkJob_sum_success": f"sum(increase(kruizeAPI_sum{{api=\"bulk\",application=\"Kruize\",method=\"createBulkJob\",status=\"success\"}}[{time_duration}]))", + "jobStatus_sum_success": f"sum(increase(kruizeAPI_sum{{api=\"bulk\",application=\"Kruize\",method=\"jobStatus\",status=\"success\"}}[{time_duration}]))", + "bulk_getExperimentMap_sum_success": f"sum(increase(kruizeAPI_sum{{api=\"bulk\",application=\"Kruize\",method=\"getExperimentMap\",status=\"success\"}}[{time_duration}]))", + "runBulkJob_sum_success": f"sum(increase(kruizeAPI_sum{{api=\"bulk\",application=\"Kruize\",method=\"runBulkJob\",status=\"success\"}}[{time_duration}]))", + "importMetadata_sum_success": 
f"sum(increase(kruizeAPI_sum{{api=\"datasources\",application=\"Kruize\",method=\"importMetadata\",status=\"success\"}}[{time_duration}]))", "generatePlots_sum_success": f"sum(increase(KruizeMethod_sum{{method=\"generatePlots\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", "loadRecommendationsByExperimentName_sum_success": f"sum(increase(kruizeDB_sum{{method=\"loadRecommendationsByExperimentName\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", "loadRecommendationsByExperimentNameAndDate_sum_success": f"sum(increase(kruizeDB_sum{{method=\"loadRecommendationsByExperimentNameAndDate\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", @@ -371,6 +419,11 @@ def main(argv): "createExperiment_max_success": f"max(max_over_time(kruizeAPI_max{{api=\"createExperiment\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", "updateResults_max_success": f"max(max_over_time(kruizeAPI_max{{api=\"updateResults\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", "updateRecommendations_max_success": f"max(max_over_time(kruizeAPI_max{{api=\"updateRecommendations\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", + "createBulkJob_max_success": f"max(max_over_time(kruizeAPI_max{{api=\"bulk\",application=\"Kruize\",method=\"createBulkJob\",status=\"success\"}}[{time_duration}]))", + "jobStatus_max_success": f"max(max_over_time(kruizeAPI_max{{api=\"bulk\",application=\"Kruize\",method=\"jobStatus\",status=\"success\"}}[{time_duration}]))", + "bulk_getExperimentMap_max_success": f"max(max_over_time(kruizeAPI_max{{api=\"bulk\",application=\"Kruize\",method=\"getExperimentMap\",status=\"success\"}}[{time_duration}]))", + "runBulkJob_max_success": f"max(max_over_time(kruizeAPI_max{{api=\"bulk\",application=\"Kruize\",method=\"runBulkJob\",status=\"success\"}}[{time_duration}]))", + "importMetadata_max_success": 
f"max(max_over_time(kruizeAPI_max{{api=\"datasources\",application=\"Kruize\",method=\"importMetadata\",status=\"success\"}}[{time_duration}]))", "generatePlots_max_success": f"max(max_over_time(KruizeMethod_max{{method=\"generatePlots\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", "loadRecommendationsByExperimentName_max_success": f"max(max_over_time(kruizeDB_max{{method=\"loadRecommendationsByExperimentName\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", "loadRecommendationsByExperimentNameAndDate_max_success": f"max(max_over_time(kruizeDB_max{{method=\"loadRecommendationsByExperimentNameAndDate\",application=\"Kruize\",status=\"success\"}}[{time_duration}]))", From ee0bc9a15747848de0944f60fd113849be9db4d5 Mon Sep 17 00:00:00 2001 From: kusumachalasani Date: Fri, 6 Dec 2024 14:39:35 +0530 Subject: [PATCH 42/85] remove commented out code Signed-off-by: kusumachalasani --- .../java/com/autotune/analyzer/workerimpl/BulkJobManager.java | 4 +++- src/main/java/com/autotune/utils/MetricsConfig.java | 2 -- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java index da479724c..0246f3695 100644 --- a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java +++ b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java @@ -166,6 +166,7 @@ public void run() { String experiment_name = apiObject.getExperimentName(); BulkJobStatus.Experiment experiment = jobData.addExperiment(experiment_name); try { + // send request to createExperiment API for experiment creation GenericRestApiClient apiClient = new GenericRestApiClient(finalDatasource); apiClient.setBaseURL(KruizeDeploymentInfo.experiments_url); GenericRestApiClient.HttpResponseWrapper responseCode; @@ -198,7 +199,8 @@ public void run() { } } } - if (experiment_exists) { + + if (expriment_exists) { generateExecutor.submit(() 
-> { // send request to generateRecommendations API GenericRestApiClient recommendationApiClient = new GenericRestApiClient(finalDatasource); diff --git a/src/main/java/com/autotune/utils/MetricsConfig.java b/src/main/java/com/autotune/utils/MetricsConfig.java index 897f45848..b7afa12f7 100644 --- a/src/main/java/com/autotune/utils/MetricsConfig.java +++ b/src/main/java/com/autotune/utils/MetricsConfig.java @@ -78,8 +78,6 @@ private MetricsConfig() { timerBJobStatus = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "jobStatus"); timerBCreateBulkJob = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "createBulkJob"); timerBGetExpMap = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "getExperimentMap"); - //timerBCreateBulkExp = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "createBulkExperiment"); - //timerBGenerateBulkRec = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "generateBulkRecommendation"); timerBRunJob = Timer.builder("kruizeAPI").description(API_METRIC_DESC).tag("api", "bulk").tag("method", "runBulkJob"); timerBBulkRunJobs = Gauge.builder("kruizeAPI_active_jobs_count", activeJobs, AtomicInteger::get).description("No.of bulk jobs running").tags("api", "bulk", "method", "runBulkJob" , "status", "running"); timerBBulkRunJobs.register(meterRegistry); From baae1006ea10b11bfbe2ed30c8316ea9ade7912f Mon Sep 17 00:00:00 2001 From: kusumachalasani Date: Fri, 6 Dec 2024 14:45:39 +0530 Subject: [PATCH 43/85] fix typo Signed-off-by: kusumachalasani --- .../java/com/autotune/analyzer/workerimpl/BulkJobManager.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java index 0246f3695..86e351f7d 100644 --- 
a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java +++ b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java @@ -200,7 +200,7 @@ public void run() { } } - if (expriment_exists) { + if (experiment_exists) { generateExecutor.submit(() -> { // send request to generateRecommendations API GenericRestApiClient recommendationApiClient = new GenericRestApiClient(finalDatasource); From c64090fbf1c876abc7bffaf461b623ddc3bd4c96 Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Tue, 10 Dec 2024 12:19:06 +0530 Subject: [PATCH 44/85] updating logs Signed-off-by: Shekhar Saxena --- .../analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java b/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java index 7440e8655..3c47c499f 100644 --- a/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java +++ b/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java @@ -299,7 +299,7 @@ public void createVpaObject(KruizeObject kruizeObject) throws UnableToCreateVPAE try { // checks if updater is installed or not if (isUpdaterInstalled()) { - LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CREATEING_VPA, kruizeObject.getExperimentName()); + LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CREATEING_VPA, kruizeObject.getExperimentName())); // updating recommender to Kruize for VPA Object Map additionalVpaObjectProps = getAdditionalVpaObjectProps(); From c3c0c7de155204a05d2153b18fe4e2477f7a8e18 Mon Sep 17 00:00:00 2001 From: kusumachalasani Date: Tue, 10 Dec 2024 13:23:18 +0530 Subject: [PATCH 45/85] resolve conflicts Signed-off-by: kusumachalasani --- .../analyzer/workerimpl/BulkJobManager.java | 25 ++++++++----------- 1 file changed, 10 insertions(+), 15 deletions(-) diff 
--git a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java index 86e351f7d..d032e2b50 100644 --- a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java +++ b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java @@ -15,7 +15,6 @@ *******************************************************************************/ package com.autotune.analyzer.workerimpl; - import com.autotune.analyzer.kruizeObject.RecommendationSettings; import com.autotune.analyzer.serviceObjects.*; import com.autotune.analyzer.utils.AnalyzerConstants; @@ -31,6 +30,7 @@ import com.autotune.utils.Utils; import com.fasterxml.jackson.core.JsonProcessingException; import com.google.gson.Gson; +import io.micrometer.core.instrument.Timer; import org.apache.http.conn.ConnectTimeoutException; import org.json.JSONObject; import org.slf4j.Logger; @@ -50,7 +50,6 @@ import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import java.util.concurrent.TimeUnit; -import java.util.concurrent.atomic.AtomicReference; import java.util.regex.Matcher; import java.util.regex.Pattern; @@ -58,7 +57,6 @@ import static com.autotune.utils.KruizeConstants.KRUIZE_BULK_API.*; import static com.autotune.utils.KruizeConstants.KRUIZE_BULK_API.NotificationConstants.*; - /** * The `run` method processes bulk input to create experiments and generates resource optimization recommendations. 
* It handles the creation of experiment names based on various data source components, makes HTTP POST requests @@ -123,7 +121,7 @@ private static Map parseLabelString(String labelString) { public void run() { String statusValue = "failure"; MetricsConfig.activeJobs.incrementAndGet(); - io.micrometer.core.instrument.Timer.Sample timerRunJob = Timer.start(MetricsConfig.meterRegistry()); + Timer.Sample timerRunJob = Timer.start(MetricsConfig.meterRegistry()); DataSourceMetadataInfo metadataInfo = null; DataSourceManager dataSourceManager = new DataSourceManager(); DataSourceInfo datasource = null; @@ -170,26 +168,22 @@ public void run() { GenericRestApiClient apiClient = new GenericRestApiClient(finalDatasource); apiClient.setBaseURL(KruizeDeploymentInfo.experiments_url); GenericRestApiClient.HttpResponseWrapper responseCode; - boolean experiment_exists = false; + boolean expriment_exists = false; try { responseCode = apiClient.callKruizeAPI("[" + new Gson().toJson(apiObject) + "]"); LOGGER.debug("API Response code: {}", responseCode); if (responseCode.getStatusCode() == HttpURLConnection.HTTP_CREATED) { - experiment_exists = true; - } else if (responseCode.getStatusCode() == HttpURLConnection.HTTP_CONFLICT) { - experiment_exists = true; - } else { + expriment_exists = true; } else if (responseCode.getStatusCode() == HttpURLConnection.HTTP_CONFLICT) { - experiment_exists = true; + expriment_exists = true; } else { - jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1); experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, responseCode.getResponseBody().toString(), responseCode.getStatusCode())); } } catch (Exception e) { e.printStackTrace(); experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_BAD_REQUEST)); } finally { - if (!experiment_exists) { + if (!expriment_exists) { LOGGER.info("Processing experiment {}", 
jobData.getProcessed_experiments()); jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1); } @@ -198,9 +192,9 @@ public void run() { setFinalJobStatus(COMPLETED, null, null, finalDatasource); } } - } + } - if (experiment_exists) { + if (expriment_exists) { generateExecutor.submit(() -> { // send request to generateRecommendations API GenericRestApiClient recommendationApiClient = new GenericRestApiClient(finalDatasource); @@ -242,6 +236,7 @@ public void run() { }); } } finally { + // Shutdown createExecutor and wait for it to finish createExecutor.shutdown(); while (!createExecutor.isTerminated()) { try { @@ -252,6 +247,7 @@ public void run() { } } + // Shutdown generateExecutor and wait for it to finish generateExecutor.shutdown(); while (!generateExecutor.isTerminated()) { try { @@ -512,4 +508,3 @@ public String frameExperimentName(String labelString, DataSourceCluster dataSour return experimentName; } } - From 0c1d6bc86ef329d53cb7bd3399bce01f0fa513f5 Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Wed, 11 Dec 2024 18:46:56 +0530 Subject: [PATCH 46/85] addressing review comments Signed-off-by: Shekhar Saxena --- .../updater/vpa/VpaUpdaterImpl.java | 155 +++++++++--------- .../utils/RecommendationUtils.java | 25 +++ .../utils/AnalyzerErrorConstants.java | 1 + 3 files changed, 103 insertions(+), 78 deletions(-) diff --git a/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java b/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java index 3c47c499f..69f3b893b 100644 --- a/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java +++ b/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java @@ -21,6 +21,7 @@ import com.autotune.analyzer.kruizeObject.KruizeObject; import com.autotune.analyzer.recommendations.RecommendationConfigItem; import com.autotune.analyzer.recommendations.updater.RecommendationUpdaterImpl; +import 
com.autotune.analyzer.recommendations.utils.RecommendationUtils; import com.autotune.analyzer.utils.AnalyzerConstants; import com.autotune.analyzer.utils.AnalyzerErrorConstants; import com.autotune.common.k8sObjects.K8sObject; @@ -48,7 +49,7 @@ public class VpaUpdaterImpl extends RecommendationUpdaterImpl { private static final Logger LOGGER = LoggerFactory.getLogger(VpaUpdaterImpl.class); - private static VpaUpdaterImpl vpaUpdater = new VpaUpdaterImpl(); + private static VpaUpdaterImpl vpaUpdater; private KubernetesClient kubernetesClient; private ApiextensionsAPIGroupDSL apiextensionsClient; @@ -59,7 +60,7 @@ private VpaUpdaterImpl() { this.apiextensionsClient = kubernetesClient.apiextensions(); } - public static VpaUpdaterImpl getInstance() { + public static synchronized VpaUpdaterImpl getInstance() { if (vpaUpdater == null) { vpaUpdater = new VpaUpdaterImpl(); } @@ -72,17 +73,25 @@ public static VpaUpdaterImpl getInstance() { */ @Override public boolean isUpdaterInstalled() { - LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_IF_UPDATER_INSTALLED, - AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA); - // checking if VPA CRD is present or not - CustomResourceDefinitionList crdList = apiextensionsClient.v1().customResourceDefinitions().list(); - boolean isVpaInstalled = crdList.getItems().stream().anyMatch(crd -> AnalyzerConstants.RecommendationUpdaterConstants.VPA.VPA_PLURAL.equalsIgnoreCase(crd.getSpec().getNames().getKind())); - if (isVpaInstalled) { - LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.FOUND_UPDATER_INSTALLED, AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA); - } else { - LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.UPDATER_NOT_INSTALLED); + try { + LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_IF_UPDATER_INSTALLED, + AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA); + // 
checking if VPA CRD is present or not + boolean isVpaInstalled = false; + CustomResourceDefinitionList crdList = apiextensionsClient.v1().customResourceDefinitions().list(); + if (null != crdList && null != crdList.getItems() && !crdList.getItems().isEmpty()) { + isVpaInstalled = crdList.getItems().stream().anyMatch(crd -> AnalyzerConstants.RecommendationUpdaterConstants.VPA.VPA_PLURAL.equalsIgnoreCase(crd.getSpec().getNames().getKind())); + } + if (isVpaInstalled) { + LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.FOUND_UPDATER_INSTALLED, AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA); + } else { + LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.UPDATER_NOT_INSTALLED); + } + return isVpaInstalled; + } catch (Exception e) { + LOGGER.error(e.getMessage()); + return false; } - return isVpaInstalled; } /** @@ -93,19 +102,25 @@ public boolean isUpdaterInstalled() { */ private boolean checkIfVpaIsPresent(String vpaName) { try { - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_IF_VPA_PRESENT, vpaName)); - NamespacedVerticalPodAutoscalerClient client = new DefaultVerticalPodAutoscalerClient(); - VerticalPodAutoscalerList vpas = client.v1().verticalpodautoscalers().inAnyNamespace().list(); - - // TODO:// later we can also check here is the recommender is Kruize to confirm - for (VerticalPodAutoscaler vpa : vpas.getItems()) { - if (vpaName.equals(vpa.getMetadata().getName())) { - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_FOUND, vpaName)); - return true; + if (null == vpaName || vpaName.isEmpty()) { + throw new Exception(AnalyzerErrorConstants.RecommendationUpdaterErrors.INVALID_VPA_NAME); + } else { + LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_IF_VPA_PRESENT, vpaName)); + NamespacedVerticalPodAutoscalerClient client = new DefaultVerticalPodAutoscalerClient(); + 
VerticalPodAutoscalerList vpas = client.v1().verticalpodautoscalers().inAnyNamespace().list(); + + if (null != vpas && null != vpas.getItems() && !vpas.getItems().isEmpty()) { + // TODO:// later we can also check here is the recommender is Kruize to confirm + for (VerticalPodAutoscaler vpa : vpas.getItems()) { + if (vpaName.equals(vpa.getMetadata().getName())) { + LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_FOUND, vpaName)); + return true; + } + } } + LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_NOT_FOUND, vpaName)); + return false; } - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_NOT_FOUND, vpaName)); - return false; } catch (Exception e) { LOGGER.error("Error while checking VPA presence: " + e.getMessage(), e); return false; @@ -121,18 +136,24 @@ private boolean checkIfVpaIsPresent(String vpaName) { */ private VerticalPodAutoscaler getVpaIsPresent(String vpaName) { try { - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_IF_VPA_PRESENT, vpaName)); - NamespacedVerticalPodAutoscalerClient client = new DefaultVerticalPodAutoscalerClient(); - VerticalPodAutoscalerList vpas = client.v1().verticalpodautoscalers().inAnyNamespace().list(); - - for (VerticalPodAutoscaler vpa : vpas.getItems()) { - if (vpaName.equals(vpa.getMetadata().getName())) { - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_FOUND, vpaName)); - return vpa; + if (null == vpaName || vpaName.isEmpty()) { + throw new Exception(AnalyzerErrorConstants.RecommendationUpdaterErrors.INVALID_VPA_NAME); + } else { + LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_IF_VPA_PRESENT, vpaName)); + NamespacedVerticalPodAutoscalerClient client = new DefaultVerticalPodAutoscalerClient(); + VerticalPodAutoscalerList vpas = 
client.v1().verticalpodautoscalers().inAnyNamespace().list(); + + if (null != vpas && null != vpas.getItems() && !vpas.getItems().isEmpty()) { + for (VerticalPodAutoscaler vpa : vpas.getItems()) { + if (vpaName.equals(vpa.getMetadata().getName())) { + LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_FOUND, vpaName)); + return vpa; + } + } } + LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_NOT_FOUND, vpaName)); + return null; } - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_NOT_FOUND, vpaName)); - return null; } catch (Exception e) { LOGGER.error("Error while checking VPA presence: " + e.getMessage(), e); return null; @@ -152,7 +173,9 @@ private VerticalPodAutoscaler getVpaIsPresent(String vpaName) { public void applyResourceRecommendationsForExperiment(KruizeObject kruizeObject) throws ApplyRecommendationsError { try { // checking if VPA is installed or not - if (isUpdaterInstalled()) { + if (!isUpdaterInstalled()) { + LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.UPDATER_NOT_INSTALLED); + } else { String expName = kruizeObject.getExperimentName(); boolean vpaPresent = checkIfVpaIsPresent(expName); @@ -192,9 +215,6 @@ public void applyResourceRecommendationsForExperiment(KruizeObject kruizeObject) } } } - - } else { - LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.UPDATER_NOT_INSTALLED); } } catch (Exception e) { throw new ApplyRecommendationsError(e.getMessage()); @@ -208,13 +228,15 @@ private List convertRecommendationsToContainerPo List containerRecommendations = new ArrayList<>(); for (Map.Entry containerDataEntry : containerDataMap.entrySet()) { - // fetcing container data + // fetching container data ContainerData containerData = containerDataEntry.getValue(); String containerName = containerData.getContainer_name(); HashMap recommendationData = 
containerData.getContainerRecommendations().getData(); // checking if recommendation data is present - if (recommendationData != null) { + if (null == recommendationData) { + LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.RECOMMENDATION_DATA_NOT_PRESENT); + } else { for (MappedRecommendationForTimestamp value : recommendationData.values()) { /* * Fetching Short Term Cost Recommendations By Default @@ -228,11 +250,15 @@ private List convertRecommendationsToContainerPo Double cpuRecommendationValue = recommendationsConfig.get(AnalyzerConstants.ResourceSetting.requests).get(AnalyzerConstants.RecommendationItem.CPU).getAmount(); Double memoryRecommendationValue = recommendationsConfig.get(AnalyzerConstants.ResourceSetting.requests).get(AnalyzerConstants.RecommendationItem.MEMORY).getAmount(); - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.RECOMMENDATION_VALUE, "CPU", containerName, cpuRecommendationValue)); - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.RECOMMENDATION_VALUE, "MEMORY", containerName, memoryRecommendationValue)); + LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.RECOMMENDATION_VALUE, + AnalyzerConstants.RecommendationItem.CPU, containerName, cpuRecommendationValue)); + LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.RECOMMENDATION_VALUE, + AnalyzerConstants.RecommendationItem.MEMORY, containerName, memoryRecommendationValue)); - String cpuRecommendationValueForVpa = resource2str("CPU", cpuRecommendationValue); - String memoryRecommendationValueForVpa = resource2str("MEMORY", memoryRecommendationValue); + String cpuRecommendationValueForVpa = RecommendationUtils.resource2str(AnalyzerConstants.RecommendationItem.CPU.toString(), + cpuRecommendationValue); + String memoryRecommendationValueForVpa = RecommendationUtils.resource2str(AnalyzerConstants.RecommendationItem.MEMORY.toString(), + 
memoryRecommendationValue); // creating container resource vpa object RecommendedContainerResources recommendedContainerResources = new RecommendedContainerResources(); @@ -240,18 +266,18 @@ private List convertRecommendationsToContainerPo // setting target values Map target = new HashMap<>(); - target.put("cpu", new Quantity(cpuRecommendationValueForVpa)); - target.put("memory", new Quantity(memoryRecommendationValueForVpa)); + target.put(AnalyzerConstants.RecommendationItem.CPU.toString(), new Quantity(cpuRecommendationValueForVpa)); + target.put(AnalyzerConstants.RecommendationItem.MEMORY.toString(), new Quantity(memoryRecommendationValueForVpa)); // setting lower bound values Map lowerBound = new HashMap<>(); - lowerBound.put("cpu", new Quantity(cpuRecommendationValueForVpa)); - lowerBound.put("memory", new Quantity(memoryRecommendationValueForVpa)); + lowerBound.put(AnalyzerConstants.RecommendationItem.CPU.toString(), new Quantity(cpuRecommendationValueForVpa)); + lowerBound.put(AnalyzerConstants.RecommendationItem.MEMORY.toString(), new Quantity(memoryRecommendationValueForVpa)); // setting upper bound values Map upperBound = new HashMap<>(); - upperBound.put("cpu", new Quantity(cpuRecommendationValueForVpa)); - upperBound.put("memory", new Quantity(memoryRecommendationValueForVpa)); + upperBound.put(AnalyzerConstants.RecommendationItem.CPU.toString(), new Quantity(cpuRecommendationValueForVpa)); + upperBound.put(AnalyzerConstants.RecommendationItem.MEMORY.toString(), new Quantity(memoryRecommendationValueForVpa)); recommendedContainerResources.setLowerBound(lowerBound); recommendedContainerResources.setTarget(target); @@ -259,38 +285,11 @@ private List convertRecommendationsToContainerPo containerRecommendations.add(recommendedContainerResources); } - } else { - LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.RECOMMENDATION_DATA_NOT_PRESENT); } } return containerRecommendations; } - /** - * This function converts the cpu and memory values to 
VPA desired format - */ - public static String resource2str(String resource, double value) { - if (resource.equalsIgnoreCase("CPU")) { - // cpu related conversions - if (value < 1) { - return (int) (value * 1000) + "m"; - } else { - return String.valueOf(value); - } - } else { - // memory related conversions - if (value < 1024) { - return (int) value + "B"; - } else if (value < 1024 * 1024) { - return (int) (value / 1024) + "k"; - } else if (value < 1024 * 1024 * 1024) { - return (int) (value / 1024 / 1024) + "Mi"; - } else { - return (int) (value / 1024 / 1024 / 1024) + "Gi"; - } - } - } - /* * Creates a Vertical Pod Autoscaler (VPA) object in the specified namespace * for the given deployment and containers. @@ -306,8 +305,8 @@ public void createVpaObject(KruizeObject kruizeObject) throws UnableToCreateVPAE // updating controlled resources List controlledResources = new ArrayList<>(); - controlledResources.add("cpu"); - controlledResources.add("memory"); + controlledResources.add(AnalyzerConstants.RecommendationItem.CPU.toString()); + controlledResources.add(AnalyzerConstants.RecommendationItem.MEMORY.toString()); // updating container policies for (K8sObject k8sObject: kruizeObject.getKubernetes_objects()) { diff --git a/src/main/java/com/autotune/analyzer/recommendations/utils/RecommendationUtils.java b/src/main/java/com/autotune/analyzer/recommendations/utils/RecommendationUtils.java index ce65ce5aa..8bb996695 100644 --- a/src/main/java/com/autotune/analyzer/recommendations/utils/RecommendationUtils.java +++ b/src/main/java/com/autotune/analyzer/recommendations/utils/RecommendationUtils.java @@ -371,5 +371,30 @@ public static String getSupportedModelBasedOnModelName(String modelName) { return null; } + + /** + * This function converts the cpu and memory values to VPA desired format + */ + public static String resource2str(String resource, double value) { + if (resource.equalsIgnoreCase(AnalyzerConstants.RecommendationItem.CPU.toString())) { + // cpu related 
conversions + if (value < 1) { + return (int) (value * 1000) + "m"; + } else { + return String.valueOf(value); + } + } else { + // memory related conversions + if (value < 1024) { + return (int) value + "B"; + } else if (value < 1024 * 1024) { + return (int) (value / 1024) + "k"; + } else if (value < 1024 * 1024 * 1024) { + return (int) (value / 1024 / 1024) + "Mi"; + } else { + return (int) (value / 1024 / 1024 / 1024) + "Gi"; + } + } + } } diff --git a/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java b/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java index 951e80987..de16b4faf 100644 --- a/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java +++ b/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java @@ -296,5 +296,6 @@ private RecommendationUpdaterErrors() { public static final String GENERATE_RECOMMNEDATION_FAILED = "Failed to generate recommendations for experiment: {}"; public static final String UPDATER_NOT_INSTALLED = "Updater is not installed."; public static final String RECOMMENDATION_DATA_NOT_PRESENT = "Recommendations are not present for the experiment."; + public static final String INVALID_VPA_NAME = "VPA name cannot be null or empty."; } } From f837746398c0f732aa0a41b5712dc17bf7ca2e0f Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Thu, 12 Dec 2024 12:35:47 +0530 Subject: [PATCH 47/85] replacing sync func to sync block Signed-off-by: Shekhar Saxena --- .../recommendations/updater/vpa/VpaUpdaterImpl.java | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java b/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java index 69f3b893b..b2ced9db5 100644 --- a/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java +++ b/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java @@ -61,9 +61,16 
@@ private VpaUpdaterImpl() { } public static synchronized VpaUpdaterImpl getInstance() { - if (vpaUpdater == null) { - vpaUpdater = new VpaUpdaterImpl(); + if (null != vpaUpdater) { + return vpaUpdater; } + + synchronized (VpaUpdaterImpl.class) { + if (null == vpaUpdater) { + vpaUpdater = new VpaUpdaterImpl(); + } + } + return vpaUpdater; } From 5bb1b8a02902c4d434cc31e38f0f1a0c80b5fcb4 Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Thu, 12 Dec 2024 12:38:03 +0530 Subject: [PATCH 48/85] removing sync from func Signed-off-by: Shekhar Saxena --- .../analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java b/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java index b2ced9db5..d17453c9e 100644 --- a/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java +++ b/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java @@ -60,7 +60,7 @@ private VpaUpdaterImpl() { this.apiextensionsClient = kubernetesClient.apiextensions(); } - public static synchronized VpaUpdaterImpl getInstance() { + public static VpaUpdaterImpl getInstance() { if (null != vpaUpdater) { return vpaUpdater; } From f9cba87c54415e3411253d97465ddff73ad7b7b2 Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Fri, 13 Dec 2024 11:44:09 +0530 Subject: [PATCH 49/85] fixing generate recommendations api Signed-off-by: Shekhar Saxena --- .../analyzer/recommendations/engine/RecommendationEngine.java | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java b/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java index 077998c03..cf1bb91b2 100644 --- a/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java +++ 
b/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java @@ -2408,8 +2408,8 @@ public List filterMetricsBasedOnExpTypeAndK8sObject(PerformanceProfile m // Include metrics based on experiment_type, kubernetes_object and exclude maxDate metric return !name.equals(maxDateQuery) && ( - (experimentType.equals(AnalyzerConstants.ExperimentTypes.NAMESPACE_EXPERIMENT) && kubernetes_object.equals(namespace)) || - (experimentType.equals(AnalyzerConstants.ExperimentTypes.CONTAINER_EXPERIMENT) && kubernetes_object.equals(container)) + (experimentType.name().equalsIgnoreCase(AnalyzerConstants.ExperimentTypes.NAMESPACE_EXPERIMENT) && kubernetes_object.equals(namespace)) || + (experimentType.name().equalsIgnoreCase(AnalyzerConstants.ExperimentTypes.CONTAINER_EXPERIMENT) && kubernetes_object.equals(container)) ); }) .toList(); From 88742cf9a05d4608919e50a853fe773795cc75bb Mon Sep 17 00:00:00 2001 From: Saad Khan Date: Fri, 13 Dec 2024 14:46:20 +0530 Subject: [PATCH 50/85] fix conflicts Signed-off-by: Saad Khan --- .../autotune/common/datasource/DataSourceMetadataOperator.java | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/main/java/com/autotune/common/datasource/DataSourceMetadataOperator.java b/src/main/java/com/autotune/common/datasource/DataSourceMetadataOperator.java index b74b4c63e..4fb1552e1 100644 --- a/src/main/java/com/autotune/common/datasource/DataSourceMetadataOperator.java +++ b/src/main/java/com/autotune/common/datasource/DataSourceMetadataOperator.java @@ -217,7 +217,7 @@ public DataSourceMetadataInfo processQueriesAndPopulateDataSourceMetadataInfo(Da LOGGER.info("containerQuery: {}", containerQuery); JsonArray namespacesDataResultArray = fetchQueryResults(dataSourceInfo, namespaceQuery, startTime, endTime, steps); - LOGGER.debug("namespacesDataResultArray: {}", namespacesDataResultArray); + LOGGER.info("namespacesDataResultArray: {}", namespacesDataResultArray); if 
(!op.validateResultArray(namespacesDataResultArray)) { dataSourceMetadataInfo = dataSourceDetailsHelper.createDataSourceMetadataInfoObject(dataSourceName, null); } else { @@ -226,6 +226,7 @@ public DataSourceMetadataInfo processQueriesAndPopulateDataSourceMetadataInfo(Da * Value: DataSourceNamespace object corresponding to a namespace */ HashMap datasourceNamespaces = dataSourceDetailsHelper.getActiveNamespaces(namespacesDataResultArray); + LOGGER.info("datasourceNamespaces: {}", datasourceNamespaces.keySet()); dataSourceMetadataInfo = dataSourceDetailsHelper.createDataSourceMetadataInfoObject(dataSourceName, datasourceNamespaces); /** From f90273947caae79e3d9c280c23b346e22bf4d34c Mon Sep 17 00:00:00 2001 From: Chandrakala Subramanyam Date: Fri, 13 Dec 2024 15:13:35 +0530 Subject: [PATCH 51/85] Updated ubi9 version Signed-off-by: Chandrakala Subramanyam --- Dockerfile.autotune | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Dockerfile.autotune b/Dockerfile.autotune index cabb9c367..1f019d6bd 100644 --- a/Dockerfile.autotune +++ b/Dockerfile.autotune @@ -16,7 +16,7 @@ ########################################################## # Build Docker Image ########################################################## -FROM registry.access.redhat.com/ubi9/ubi-minimal:9.5 as mvnbuild-jdk21 +FROM registry.access.redhat.com/ubi9/ubi-minimal:9.5-1733767867 as mvnbuild-jdk21 ARG USER=autotune ARG AUTOTUNE_HOME=/home/$USER @@ -48,7 +48,7 @@ RUN jlink --strip-debug --compress 2 --no-header-files --no-man-pages --module-p # Runtime Docker Image ########################################################## # Use ubi-minimal as the base image -FROM registry.access.redhat.com/ubi9/ubi-minimal:9.5 +FROM registry.access.redhat.com/ubi9/ubi-minimal:9.5-1733767867 ARG AUTOTUNE_VERSION ARG USER=autotune From 9d484afec6ff6c25bb289e0aff22a389eec057f3 Mon Sep 17 00:00:00 2001 From: Saad Khan Date: Fri, 13 Dec 2024 18:13:22 +0530 Subject: [PATCH 52/85] update filter examples in 
the design doc Signed-off-by: Saad Khan --- design/BulkAPI.md | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/design/BulkAPI.md b/design/BulkAPI.md index 566dd5de1..d9d1864c4 100644 --- a/design/BulkAPI.md +++ b/design/BulkAPI.md @@ -28,9 +28,9 @@ progress of the job. { "filter": { "exclude": { - "namespace": ["cadvisor", "openshift-tuning", "openshift-monitoring", "thanos-bench"], - "workload": ["osd-rebalance-infra-nodes-28887030", "blackbox-exporter", "thanos-query"], - "containers": ["tfb-0", "alertmanager"], + "namespace": ["openshift-.*"], + "workload": [], + "containers": [], "labels": { "org_id": "ABCOrga", "source_id": "ZZZ", @@ -38,9 +38,9 @@ progress of the job. } }, "include": { - "namespace": ["cadvisor", "openshift-tuning", "openshift-monitoring", "thanos-bench"], - "workload": ["osd-rebalance-infra-nodes-28887030", "blackbox-exporter", "thanos-query"], - "containers": ["tfb-0", "alertmanager"], + "namespace": ["openshift-tuning"], + "workload": [], + "containers": [], "labels": { "org_id": "ABCOrga", "source_id": "ZZZ", @@ -111,11 +111,15 @@ The specified time range determines the period over which the data is analyzed t #### 2. **Request Payload with `exclude` filter specified:** -- **`exclude`** filters out namespaces like `"cadvisor"` and workloads like `"blackbox-exporter"`, along with containers and labels that match the specified values. So, we'll generate create experiments and generate recommendations for every namespace, workload and containers except those. +- **`exclude`** As shown in the example above, it filters out all namespaces starting with the name `openshift-` . So, we'll create experiments and generate recommendations for every namespace except those. #### 3. **Request Payload with `include` filter specified:** -- **`include`** explicitly selects the namespaces, workloads, containers, and labels to be queried. So, for only those we'll create experiments and get the recommendations. 
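The include/exclude filter semantics documented in this BulkAPI.md hunk (exclude patterns such as `openshift-.*` drop matching namespaces, while an include entry like `openshift-tuning` re-admits a specific one) can be sketched as follows. This is an illustrative standalone approximation, not Kruize's actual implementation; the class and method names are invented:

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of the BulkAPI namespace filter semantics described above.
public class NamespaceFilterSketch {
    public static boolean isSelected(String namespace,
                                     List<String> includePatterns,
                                     List<String> excludePatterns) {
        // An explicit include match always wins, matching the "both include and
        // exclude specified" example above (openshift-tuning survives openshift-.*).
        if (includePatterns.stream().anyMatch(p -> Pattern.matches(p, namespace))) {
            return true;
        }
        // With a non-empty include list, anything not explicitly included is dropped.
        if (!includePatterns.isEmpty()) {
            return false;
        }
        // Otherwise keep the namespace unless an exclude pattern matches it.
        return excludePatterns.stream().noneMatch(p -> Pattern.matches(p, namespace));
    }

    public static void main(String[] args) {
        List<String> include = List.of("openshift-tuning");
        List<String> exclude = List.of("openshift-.*");
        System.out.println(isSelected("openshift-tuning", include, exclude));     // true
        System.out.println(isSelected("openshift-monitoring", include, exclude)); // false
    }
}
```

Under these assumed semantics, experiments would be created only for namespaces that pass the filter; how Kruize actually resolves overlapping include/exclude patterns may differ.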
+- **`include`** As shown in the example above, it filters out the namespace `openshift-`. So, we'll create experiments and generate recommendations for every namespace starting with the specified name. + +#### 3. **Request Payload with both `include` and `exclude` filter specified:** + +- **`include`** As shown in the example above, it filters out all namespaces starting with the name `openshift-` but includes the `openshift-tuning` one. So, we'll create experiments and generate recommendations for the `openshift-tuning` namespace. ### GET Request: From 223b155807e4c19a84cd18fee3e917a5448b4d95 Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Fri, 13 Dec 2024 18:27:19 +0530 Subject: [PATCH 53/85] reducing logging events Signed-off-by: Shekhar Saxena --- .../updater/RecommendationUpdaterImpl.java | 4 +-- .../updater/vpa/VpaUpdaterImpl.java | 26 +++++++++---------- 2 files changed, 15 insertions(+), 15 deletions(-) diff --git a/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterImpl.java b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterImpl.java index 76d1695c0..f423dcd3c 100644 --- a/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterImpl.java +++ b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterImpl.java @@ -68,7 +68,7 @@ public boolean isUpdaterInstalled() { @Override public KruizeObject generateResourceRecommendationsForExperiment(String experimentName) { try { - LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.GENERATING_RECOMMENDATIONS, experimentName); + LOGGER.debug(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.GENERATING_RECOMMENDATIONS, experimentName); // generating latest recommendations for experiment RecommendationEngine recommendationEngine = new RecommendationEngine(experimentName, null, null); int calCount = 0; @@ -76,7 +76,7 @@ public KruizeObject 
generateResourceRecommendationsForExperiment(String experime if (validationMessage.isEmpty()) { KruizeObject kruizeObject = recommendationEngine.prepareRecommendations(calCount, null); if (kruizeObject.getValidation_data().isSuccess()) { - LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.GENERATED_RECOMMENDATIONS, experimentName); + LOGGER.debug(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.GENERATED_RECOMMENDATIONS, experimentName); return kruizeObject; } else { throw new Exception(kruizeObject.getValidation_data().getMessage()); diff --git a/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java b/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java index d17453c9e..40e9ddb75 100644 --- a/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java +++ b/src/main/java/com/autotune/analyzer/recommendations/updater/vpa/VpaUpdaterImpl.java @@ -81,7 +81,7 @@ public static VpaUpdaterImpl getInstance() { @Override public boolean isUpdaterInstalled() { try { - LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_IF_UPDATER_INSTALLED, + LOGGER.debug(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_IF_UPDATER_INSTALLED, AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA); // checking if VPA CRD is present or not boolean isVpaInstalled = false; @@ -90,7 +90,7 @@ public boolean isUpdaterInstalled() { isVpaInstalled = crdList.getItems().stream().anyMatch(crd -> AnalyzerConstants.RecommendationUpdaterConstants.VPA.VPA_PLURAL.equalsIgnoreCase(crd.getSpec().getNames().getKind())); } if (isVpaInstalled) { - LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.FOUND_UPDATER_INSTALLED, AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA); + LOGGER.debug(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.FOUND_UPDATER_INSTALLED, 
AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA); } else { LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.UPDATER_NOT_INSTALLED); } @@ -112,7 +112,7 @@ private boolean checkIfVpaIsPresent(String vpaName) { if (null == vpaName || vpaName.isEmpty()) { throw new Exception(AnalyzerErrorConstants.RecommendationUpdaterErrors.INVALID_VPA_NAME); } else { - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_IF_VPA_PRESENT, vpaName)); + LOGGER.debug(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_IF_VPA_PRESENT, vpaName)); NamespacedVerticalPodAutoscalerClient client = new DefaultVerticalPodAutoscalerClient(); VerticalPodAutoscalerList vpas = client.v1().verticalpodautoscalers().inAnyNamespace().list(); @@ -120,12 +120,12 @@ private boolean checkIfVpaIsPresent(String vpaName) { // TODO:// later we can also check here is the recommender is Kruize to confirm for (VerticalPodAutoscaler vpa : vpas.getItems()) { if (vpaName.equals(vpa.getMetadata().getName())) { - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_FOUND, vpaName)); + LOGGER.debug(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_FOUND, vpaName)); return true; } } } - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_NOT_FOUND, vpaName)); + LOGGER.error(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_NOT_FOUND, vpaName)); return false; } } catch (Exception e) { @@ -146,19 +146,19 @@ private VerticalPodAutoscaler getVpaIsPresent(String vpaName) { if (null == vpaName || vpaName.isEmpty()) { throw new Exception(AnalyzerErrorConstants.RecommendationUpdaterErrors.INVALID_VPA_NAME); } else { - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_IF_VPA_PRESENT, vpaName)); + 
LOGGER.debug(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_IF_VPA_PRESENT, vpaName)); NamespacedVerticalPodAutoscalerClient client = new DefaultVerticalPodAutoscalerClient(); VerticalPodAutoscalerList vpas = client.v1().verticalpodautoscalers().inAnyNamespace().list(); if (null != vpas && null != vpas.getItems() && !vpas.getItems().isEmpty()) { for (VerticalPodAutoscaler vpa : vpas.getItems()) { if (vpaName.equals(vpa.getMetadata().getName())) { - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_FOUND, vpaName)); + LOGGER.debug(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_FOUND, vpaName)); return vpa; } } } - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_NOT_FOUND, vpaName)); + LOGGER.error(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_WITH_NAME_NOT_FOUND, vpaName)); return null; } } catch (Exception e) { @@ -217,7 +217,7 @@ public void applyResourceRecommendationsForExperiment(KruizeObject kruizeObject) .getName()) .patchStatus(vpaObject); - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_PATCHED, + LOGGER.debug(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.VPA_PATCHED, vpaObject.getMetadata().getName())); } } @@ -257,9 +257,9 @@ private List convertRecommendationsToContainerPo Double cpuRecommendationValue = recommendationsConfig.get(AnalyzerConstants.ResourceSetting.requests).get(AnalyzerConstants.RecommendationItem.CPU).getAmount(); Double memoryRecommendationValue = recommendationsConfig.get(AnalyzerConstants.ResourceSetting.requests).get(AnalyzerConstants.RecommendationItem.MEMORY).getAmount(); - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.RECOMMENDATION_VALUE, + 
LOGGER.debug(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.RECOMMENDATION_VALUE, AnalyzerConstants.RecommendationItem.CPU, containerName, cpuRecommendationValue)); - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.RECOMMENDATION_VALUE, + LOGGER.debug(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.RECOMMENDATION_VALUE, AnalyzerConstants.RecommendationItem.MEMORY, containerName, memoryRecommendationValue)); String cpuRecommendationValueForVpa = RecommendationUtils.resource2str(AnalyzerConstants.RecommendationItem.CPU.toString(), @@ -305,7 +305,7 @@ public void createVpaObject(KruizeObject kruizeObject) throws UnableToCreateVPAE try { // checks if updater is installed or not if (isUpdaterInstalled()) { - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CREATEING_VPA, kruizeObject.getExperimentName())); + LOGGER.debug(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CREATEING_VPA, kruizeObject.getExperimentName())); // updating recommender to Kruize for VPA Object Map additionalVpaObjectProps = getAdditionalVpaObjectProps(); @@ -349,7 +349,7 @@ public void createVpaObject(KruizeObject kruizeObject) throws UnableToCreateVPAE .build(); kubernetesClient.resource(vpa).inNamespace(k8sObject.getNamespace()).createOrReplace(); - LOGGER.info(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CREATED_VPA, kruizeObject.getExperimentName())); + LOGGER.debug(String.format(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CREATED_VPA, kruizeObject.getExperimentName())); } } else { From d6289aa57400b9276090562150736d8f81043439 Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Fri, 13 Dec 2024 18:45:20 +0530 Subject: [PATCH 54/85] code refactor Signed-off-by: Shekhar Saxena --- .../kruizeObject/ExperimentUseCaseType.java | 26 ++++++++++++------- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git 
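Patch 53 above demotes per-experiment log lines from INFO to DEBUG. The saving is fully realized only when message construction is also deferred: SLF4J's parameterized form is cheap when the level is off, while an eager `String.format(...)` pays the formatting cost regardless. A minimal stdlib sketch of that deferral (the `MiniLogger` class is hypothetical, not Kruize code):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical mini-logger: the message Supplier runs only when the level is
// enabled, mirroring how parameterized/lazy logging skips work for disabled
// DEBUG lines.
class MiniLogger {
    private final boolean debugEnabled;

    MiniLogger(boolean debugEnabled) { this.debugEnabled = debugEnabled; }

    void debug(Supplier<String> message) {
        if (debugEnabled) {
            System.out.println(message.get()); // message is built only here
        }
    }
}

public class LazyLoggingDemo {
    public static void main(String[] args) {
        AtomicInteger formatCalls = new AtomicInteger();
        Supplier<String> expensive = () -> {
            formatCalls.incrementAndGet(); // count how often formatting actually runs
            return String.format("Generating recommendations for experiment: %s", "exp-1");
        };

        new MiniLogger(false).debug(expensive); // DEBUG off: supplier never invoked
        new MiniLogger(true).debug(expensive);  // DEBUG on: formatted once

        if (formatCalls.get() != 1) throw new AssertionError("expected exactly 1 format call");
        System.out.println("format calls: " + formatCalls.get()); // prints: format calls: 1
    }
}
```

`java.util.logging` offers the same idiom natively via `logger.log(Level.FINE, supplier)`, which only invokes the supplier when FINE is enabled.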
a/src/main/java/com/autotune/analyzer/kruizeObject/ExperimentUseCaseType.java b/src/main/java/com/autotune/analyzer/kruizeObject/ExperimentUseCaseType.java index 01cae6a23..30a34fb50 100644 --- a/src/main/java/com/autotune/analyzer/kruizeObject/ExperimentUseCaseType.java +++ b/src/main/java/com/autotune/analyzer/kruizeObject/ExperimentUseCaseType.java @@ -39,16 +39,22 @@ public ExperimentUseCaseType(KruizeObject kruizeObject) throws Exception { throw new Exception("Invalid Mode " + kruizeObject.getMode() + " for target cluster as Remote."); } } else if (kruizeObject.getTarget_cluster().equalsIgnoreCase(AnalyzerConstants.LOCAL)) { - if (kruizeObject.getMode().equalsIgnoreCase(AnalyzerConstants.MONITOR)) { - setLocal_monitoring(true); - } else if (kruizeObject.getMode().equalsIgnoreCase(AnalyzerConstants.EXPERIMENT)) { - setLocal_experiment(true); - } else if (kruizeObject.getMode().equalsIgnoreCase(AnalyzerConstants.RECREATE)) { - setLocal_monitoring(true); - } else if (kruizeObject.getMode().equalsIgnoreCase(AnalyzerConstants.AUTO)) { - setLocal_monitoring(true); - } else { - throw new Exception("Invalid Mode " + kruizeObject.getMode() + " for target cluster as Local."); + switch (kruizeObject.getMode().toLowerCase()) { + case AnalyzerConstants.MONITOR: + setLocal_monitoring(true); + break; + + case AnalyzerConstants.EXPERIMENT: + setLocal_experiment(true); + break; + + case AnalyzerConstants.RECREATE: + case AnalyzerConstants.AUTO: + setLocal_monitoring(true); + break; + + default: + throw new Exception("Invalid Mode " + kruizeObject.getMode() + " for target cluster as Local."); } } else { throw new Exception("Invalid Target cluster type"); From 4a098a8ec7f29c2adf906f1e3afe7e32614141b7 Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Fri, 13 Dec 2024 18:53:24 +0530 Subject: [PATCH 55/85] code refactoring Signed-off-by: Shekhar Saxena --- .../analyzer/kruizeObject/ExperimentUseCaseType.java | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git 
a/src/main/java/com/autotune/analyzer/kruizeObject/ExperimentUseCaseType.java b/src/main/java/com/autotune/analyzer/kruizeObject/ExperimentUseCaseType.java index 30a34fb50..3d437eb90 100644 --- a/src/main/java/com/autotune/analyzer/kruizeObject/ExperimentUseCaseType.java +++ b/src/main/java/com/autotune/analyzer/kruizeObject/ExperimentUseCaseType.java @@ -41,6 +41,8 @@ public ExperimentUseCaseType(KruizeObject kruizeObject) throws Exception { } else if (kruizeObject.getTarget_cluster().equalsIgnoreCase(AnalyzerConstants.LOCAL)) { switch (kruizeObject.getMode().toLowerCase()) { case AnalyzerConstants.MONITOR: + case AnalyzerConstants.RECREATE: + case AnalyzerConstants.AUTO: setLocal_monitoring(true); break; @@ -48,11 +50,6 @@ public ExperimentUseCaseType(KruizeObject kruizeObject) throws Exception { setLocal_experiment(true); break; - case AnalyzerConstants.RECREATE: - case AnalyzerConstants.AUTO: - setLocal_monitoring(true); - break; - default: throw new Exception("Invalid Mode " + kruizeObject.getMode() + " for target cluster as Local."); } From 4b7c4a7a0fdf64e4f146bec3cdf56ff8786f1d66 Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Tue, 10 Dec 2024 02:15:13 +0530 Subject: [PATCH 56/85] adding support for updater service Signed-off-by: Shekhar Saxena --- src/main/java/com/autotune/Autotune.java | 7 ++ .../analyzer/kruizeObject/KruizeObject.java | 10 ++ .../updater/RecommendationUpdaterService.java | 92 +++++++++++++++++++ .../analyzer/utils/AnalyzerConstants.java | 2 + .../utils/AnalyzerErrorConstants.java | 1 + 5 files changed, 112 insertions(+) create mode 100644 src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java diff --git a/src/main/java/com/autotune/Autotune.java b/src/main/java/com/autotune/Autotune.java index a76e3820a..000f990df 100644 --- a/src/main/java/com/autotune/Autotune.java +++ b/src/main/java/com/autotune/Autotune.java @@ -21,6 +21,7 @@ import 
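Patches 54 and 55 replace the if/else chain with a switch whose grouped case labels fall through to one handler. A self-contained sketch of the final shape (literal mode strings stand in for the `AnalyzerConstants` fields, and the method name is illustrative):

```java
// Sketch of the refactored mode dispatch: "monitor", "recreate" and "auto"
// share one branch via case fall-through, "experiment" gets its own, and
// anything else is rejected.
public class ModeSwitchDemo {
    static String classify(String mode) {
        switch (mode.toLowerCase()) {
            case "monitor":
            case "recreate":
            case "auto":
                return "local_monitoring";
            case "experiment":
                return "local_experiment";
            default:
                throw new IllegalArgumentException("Invalid Mode " + mode + " for target cluster as Local.");
        }
    }

    public static void main(String[] args) {
        if (!"local_monitoring".equals(classify("AUTO"))) throw new AssertionError();
        if (!"local_monitoring".equals(classify("recreate"))) throw new AssertionError();
        if (!"local_experiment".equals(classify("experiment"))) throw new AssertionError();
        System.out.println("mode classification ok");
    }
}
```

Note that case labels in a String switch must be compile-time constants, which the `static final String` fields in `AnalyzerConstants` satisfy.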
com.autotune.analyzer.exceptions.MonitoringAgentNotFoundException; import com.autotune.analyzer.exceptions.MonitoringAgentNotSupportedException; import com.autotune.analyzer.performanceProfiles.MetricProfileCollection; +import com.autotune.analyzer.recommendations.updater.RecommendationUpdaterService; import com.autotune.analyzer.utils.AnalyzerConstants; import com.autotune.common.datasource.DataSourceCollection; import com.autotune.common.datasource.DataSourceInfo; @@ -133,6 +134,8 @@ public static void main(String[] args) { checkAvailableDataSources(); // load available metric profiles from db loadMetricProfilesFromDB(); + // start updater service + startRecommendationUpdaterService(); } // close the existing session factory before recreating @@ -288,4 +291,8 @@ private static void executeDDLs(String ddlFileName) throws Exception { LOGGER.info(DBConstants.DB_MESSAGES.DB_LIVELINESS_PROBE_SUCCESS); } + // starts the recommendation updater service + private static void startRecommendationUpdaterService() { + RecommendationUpdaterService.initiateUpdaterService(); + } } diff --git a/src/main/java/com/autotune/analyzer/kruizeObject/KruizeObject.java b/src/main/java/com/autotune/analyzer/kruizeObject/KruizeObject.java index 28f18a3e4..3badf4b51 100644 --- a/src/main/java/com/autotune/analyzer/kruizeObject/KruizeObject.java +++ b/src/main/java/com/autotune/analyzer/kruizeObject/KruizeObject.java @@ -51,6 +51,8 @@ public final class KruizeObject implements ExperimentTypeAware { private String datasource; @SerializedName(KruizeConstants.JSONKeys.EXPERIMENT_TYPE) //TODO: to be used in future private AnalyzerConstants.ExperimentType experimentType; + @SerializedName("default_updater") + private String defaultUpdater; private String namespace; // TODO: Currently adding it at this level with an assumption that there is only one entry in k8s object needs to be changed private String mode; //Todo convert into Enum @SerializedName("target_cluster") @@ -310,6 +312,14 @@ public 
void setExperimentType(AnalyzerConstants.ExperimentType experimentType) { } + public String getDefaultUpdater() { + return defaultUpdater; + } + + public void setDefaultUpdater(String defaultUpdater) { + this.defaultUpdater = defaultUpdater; + } + @Override public String toString() { // Creating a temporary cluster name as we allow null for cluster name now diff --git a/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java new file mode 100644 index 000000000..7a21f9228 --- /dev/null +++ b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java @@ -0,0 +1,92 @@ +/******************************************************************************* + * Copyright (c) 2024 Red Hat, IBM Corporation and others. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ *******************************************************************************/ + +package com.autotune.analyzer.recommendations.updater; + +import com.autotune.analyzer.experiment.ExperimentInterface; +import com.autotune.analyzer.experiment.ExperimentInterfaceImpl; +import com.autotune.analyzer.kruizeObject.KruizeObject; +import com.autotune.analyzer.recommendations.updater.vpa.VpaUpdaterImpl; +import com.autotune.analyzer.utils.AnalyzerConstants; +import com.autotune.analyzer.utils.AnalyzerErrorConstants; +import com.autotune.database.service.ExperimentDBService; +import com.autotune.database.table.KruizeExperimentEntry; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.Executors; +import java.util.concurrent.ScheduledExecutorService; +import java.util.concurrent.TimeUnit; +import java.util.stream.Collectors; + +public class RecommendationUpdaterService { + + private static final Logger LOGGER = LoggerFactory.getLogger(RecommendationUpdaterService.class); + + public static void initiateUpdaterService() { + try { + ScheduledExecutorService executorService = Executors.newSingleThreadScheduledExecutor(); + + LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.STARTING_SERVICE); + executorService.scheduleAtFixedRate(() -> { + try { + RecommendationUpdaterImpl updater = new RecommendationUpdaterImpl(); + Map experiments = getAutoModeExperiments(); + for (Map.Entry experiment : experiments.entrySet()) { + KruizeObject kruizeObject = updater.generateResourceRecommendationsForExperiment(experiment.getValue().getExperimentName()); + // TODO:// add default updater in kruizeObject and check if GPU recommendations are present + if (kruizeObject.getDefaultUpdater().isEmpty() || kruizeObject.getDefaultUpdater() == null) { + 
kruizeObject.setDefaultUpdater(AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA); + } + + if (kruizeObject.getDefaultUpdater().equalsIgnoreCase(AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA)) { + VpaUpdaterImpl vpaUpdater = VpaUpdaterImpl.getInstance(); + vpaUpdater.applyResourceRecommendationsForExperiment(kruizeObject); + } + } + } catch (Exception e) { + LOGGER.error(e.getMessage()); + } + }, 0, AnalyzerConstants.RecommendationUpdaterConstants.DEFAULT_SLEEP_INTERVAL, TimeUnit.SECONDS); + } catch (Exception e) { + LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.UPDTAER_SERVICE_START_ERROR + e.getMessage()); + } + } + + private static Map getAutoModeExperiments() { + try { + Map mainKruizeExperimentMap = new ConcurrentHashMap<>(); + new ExperimentDBService().loadAllExperiments(mainKruizeExperimentMap); + // filter map to only include entries where mode is auto or recreate + Map filteredMap = mainKruizeExperimentMap.entrySet().stream() + .filter(entry -> { + String mode = entry.getValue().getMode(); + return AnalyzerConstants.AUTO.equalsIgnoreCase(mode) || AnalyzerConstants.RECREATE.equalsIgnoreCase(mode); + }) + .collect(Collectors.toConcurrentMap(Map.Entry::getKey, Map.Entry::getValue)); + + return filteredMap; + } catch (Exception e) { + LOGGER.error(e.getMessage()); + return new HashMap<>(); + } + } + +} diff --git a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java index c644389b2..c06064df7 100644 --- a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java +++ b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java @@ -684,6 +684,7 @@ private RecommendationUpdaterConstants() { } + public static final int DEFAULT_SLEEP_INTERVAL = 120; public static final class SupportedUpdaters { public static final String VPA = "vpa"; @@ -719,6 +720,7 @@ public static final class InfoMsgs { public static final 
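The updater service above is built on `ScheduledExecutorService.scheduleAtFixedRate`, where the try/catch inside the task is load-bearing: per the JDK contract, an exception that escapes the task body suppresses all subsequent executions. A minimal, self-contained sketch (a latch stands in for the recommendation work; the intervals are illustrative, not Kruize's defaults):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class UpdaterLoopDemo {
    public static void main(String[] args) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch threeRuns = new CountDownLatch(3);

        executor.scheduleAtFixedRate(() -> {
            try {
                threeRuns.countDown(); // stand-in for "generate and apply recommendations"
            } catch (Exception e) {
                // swallowing here keeps one failed cycle from cancelling the schedule
                System.err.println(e.getMessage());
            }
        }, 0, 10, TimeUnit.MILLISECONDS); // initial delay, then fixed period

        try {
            if (!threeRuns.await(5, TimeUnit.SECONDS)) throw new AssertionError("task never ran 3 times");
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        } finally {
            executor.shutdownNow();
        }
        System.out.println("scheduled task ran at least 3 times");
    }
}
```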
String VPA_PATCHED = "VPA object with name %s is patched successfully with recommendations."; public static final String CREATEING_VPA = "Creating VPA with name: %s"; public static final String CREATED_VPA = "Created VPA with name: %s"; + public static final String STARTING_SERVICE = "Starting recommendation updater."; private InfoMsgs() { } diff --git a/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java b/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java index de16b4faf..b5a092594 100644 --- a/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java +++ b/src/main/java/com/autotune/analyzer/utils/AnalyzerErrorConstants.java @@ -292,6 +292,7 @@ private RecommendationUpdaterErrors() { } + public static final String UPDTAER_SERVICE_START_ERROR = "Error occurred while initializing RecommendationUpdaterService."; public static final String UNSUPPORTED_UPDATER_TYPE = "Updater type %s is not supported."; public static final String GENERATE_RECOMMNEDATION_FAILED = "Failed to generate recommendations for experiment: {}"; public static final String UPDATER_NOT_INSTALLED = "Updater is not installed."; From 5c67a50ce94735c66cbcaa146c71b0ff84967c8a Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Tue, 10 Dec 2024 12:01:42 +0530 Subject: [PATCH 57/85] adding support for updater service Signed-off-by: Shekhar Saxena --- .../updater/RecommendationUpdaterService.java | 11 +++++--- .../analyzer/utils/AnalyzerConstants.java | 4 ++- .../autotune/database/dao/ExperimentDAO.java | 3 +++ .../database/dao/ExperimentDAOImpl.java | 23 ++++++++++++++++ .../autotune/database/helper/DBConstants.java | 1 + .../database/service/ExperimentDBService.java | 26 +++++++++++++++++++ 6 files changed, 63 insertions(+), 5 deletions(-) diff --git a/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java index 
7a21f9228..ffe6559ce 100644 --- a/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java +++ b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java @@ -52,7 +52,7 @@ public static void initiateUpdaterService() { for (Map.Entry experiment : experiments.entrySet()) { KruizeObject kruizeObject = updater.generateResourceRecommendationsForExperiment(experiment.getValue().getExperimentName()); // TODO:// add default updater in kruizeObject and check if GPU recommendations are present - if (kruizeObject.getDefaultUpdater().isEmpty() || kruizeObject.getDefaultUpdater() == null) { + if (kruizeObject.getDefaultUpdater() == null) { kruizeObject.setDefaultUpdater(AnalyzerConstants.RecommendationUpdaterConstants.SupportedUpdaters.VPA); } @@ -61,10 +61,13 @@ public static void initiateUpdaterService() { vpaUpdater.applyResourceRecommendationsForExperiment(kruizeObject); } } + LOGGER.info("Done"); } catch (Exception e) { LOGGER.error(e.getMessage()); } - }, 0, AnalyzerConstants.RecommendationUpdaterConstants.DEFAULT_SLEEP_INTERVAL, TimeUnit.SECONDS); + }, AnalyzerConstants.RecommendationUpdaterConstants.DEFAULT_INITIAL_DELAY, + AnalyzerConstants.RecommendationUpdaterConstants.DEFAULT_SLEEP_INTERVAL, + TimeUnit.SECONDS); } catch (Exception e) { LOGGER.error(AnalyzerErrorConstants.RecommendationUpdaterErrors.UPDTAER_SERVICE_START_ERROR + e.getMessage()); } @@ -72,8 +75,9 @@ public static void initiateUpdaterService() { private static Map getAutoModeExperiments() { try { + LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_AUTO_EXP); Map mainKruizeExperimentMap = new ConcurrentHashMap<>(); - new ExperimentDBService().loadAllExperiments(mainKruizeExperimentMap); + new ExperimentDBService().loadAllLMExperiments(mainKruizeExperimentMap); // filter map to only include entries where mode is auto or recreate Map filteredMap = mainKruizeExperimentMap.entrySet().stream() .filter(entry 
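The one-line change above (`isEmpty() || == null` becoming just `== null`) fixes an ordering bug: the original expression dereferenced `getDefaultUpdater()` before testing it for null, so an unset updater would throw a `NullPointerException` instead of falling back to VPA. The null test must short-circuit first. A hypothetical sketch of the safe idiom (names are illustrative, not Kruize's):

```java
public class NullCheckOrderDemo {
    // Safe ordering: the null check runs first, so isEmpty() is never called on
    // null. The reversed order (configured.isEmpty() || configured == null)
    // would throw a NullPointerException for a null input.
    static String defaultUpdater(String configured) {
        if (configured == null || configured.isEmpty()) {
            return "vpa"; // fall back to the default updater
        }
        return configured;
    }

    public static void main(String[] args) {
        if (!"vpa".equals(defaultUpdater(null))) throw new AssertionError();
        if (!"vpa".equals(defaultUpdater(""))) throw new AssertionError();
        if (!"custom".equals(defaultUpdater("custom"))) throw new AssertionError();
        System.out.println("null-safe default: " + defaultUpdater(null)); // prints: null-safe default: vpa
    }
}
```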
-> { @@ -81,7 +85,6 @@ private static Map getAutoModeExperiments() { return AnalyzerConstants.AUTO.equalsIgnoreCase(mode) || AnalyzerConstants.RECREATE.equalsIgnoreCase(mode); }) .collect(Collectors.toConcurrentMap(Map.Entry::getKey, Map.Entry::getValue)); - return filteredMap; } catch (Exception e) { LOGGER.error(e.getMessage()); diff --git a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java index c06064df7..ac40d8791 100644 --- a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java +++ b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java @@ -684,7 +684,8 @@ private RecommendationUpdaterConstants() { } - public static final int DEFAULT_SLEEP_INTERVAL = 120; + public static final int DEFAULT_SLEEP_INTERVAL = 60; + public static final int DEFAULT_INITIAL_DELAY = 30; public static final class SupportedUpdaters { public static final String VPA = "vpa"; @@ -721,6 +722,7 @@ public static final class InfoMsgs { public static final String CREATEING_VPA = "Creating VPA with name: %s"; public static final String CREATED_VPA = "Created VPA with name: %s"; public static final String STARTING_SERVICE = "Starting recommendation updater."; + public static final String CHECKING_AUTO_EXP = "Searching for experiments with auto or recreate mode."; private InfoMsgs() { } diff --git a/src/main/java/com/autotune/database/dao/ExperimentDAO.java b/src/main/java/com/autotune/database/dao/ExperimentDAO.java index 29d49083d..07770bc64 100644 --- a/src/main/java/com/autotune/database/dao/ExperimentDAO.java +++ b/src/main/java/com/autotune/database/dao/ExperimentDAO.java @@ -43,6 +43,9 @@ public interface ExperimentDAO { // If Kruize object restarts load all experiment which are in inprogress public List loadAllExperiments() throws Exception; + // If Kruize object restarts load all local monitoring experiments which are in inprogress + public List loadAllLMExperiments() throws 
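`getAutoModeExperiments()` above keeps only experiments whose mode is `auto` or `recreate` using the stream filter-and-collect idiom. A self-contained sketch with plain strings standing in for `KruizeObject` (names are illustrative):

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

public class AutoModeFilterDemo {
    public static void main(String[] args) {
        Map<String, String> experimentModes = new ConcurrentHashMap<>();
        experimentModes.put("exp-1", "monitor");
        experimentModes.put("exp-2", "auto");
        experimentModes.put("exp-3", "recreate");

        // Keep only the entries whose value (the mode) is auto or recreate.
        Map<String, String> filtered = experimentModes.entrySet().stream()
                .filter(e -> "auto".equalsIgnoreCase(e.getValue())
                        || "recreate".equalsIgnoreCase(e.getValue()))
                .collect(Collectors.toConcurrentMap(Map.Entry::getKey, Map.Entry::getValue));

        if (filtered.size() != 2 || filtered.containsKey("exp-1")) throw new AssertionError();
        System.out.println("filtered: " + new TreeMap<>(filtered).keySet()); // prints: filtered: [exp-2, exp-3]
    }
}
```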
Exception; + // If Kruize object restarts load all results from the experiments which are in inprogress List loadAllResults() throws Exception; diff --git a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java index a2a1de630..668e69f5b 100644 --- a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java +++ b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java @@ -732,6 +732,29 @@ public List loadAllExperiments() throws Exception { return entries; } + @Override + public List loadAllLMExperiments() throws Exception { + //todo load only experimentStatus=inprogress , playback may not require completed experiments + List entries = null; + String statusValue = "failure"; + Timer.Sample timerLoadAllExp = Timer.start(MetricsConfig.meterRegistry()); + try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) { + entries = session.createQuery(SELECT_FROM_LM_EXPERIMENTS, KruizeLMExperimentEntry.class).list(); + // TODO: remove native sql query and transient + //getExperimentTypeInKruizeExperimentEntry(entries); + statusValue = "success"; + } catch (Exception e) { + LOGGER.error("Not able to load experiment due to {}", e.getMessage()); + throw new Exception("Error while loading existing experiments from database due to : " + e.getMessage()); + } finally { + if (null != timerLoadAllExp) { + MetricsConfig.timerLoadAllExp = MetricsConfig.timerBLoadAllExp.tag("status", statusValue).register(MetricsConfig.meterRegistry()); + timerLoadAllExp.stop(MetricsConfig.timerLoadAllExp); + } + } + return entries; + } + @Override public List loadAllResults() throws Exception { // TODO: load only experimentStatus=inProgress , playback may not require completed experiments diff --git a/src/main/java/com/autotune/database/helper/DBConstants.java b/src/main/java/com/autotune/database/helper/DBConstants.java index eaaf9942d..6a3bd6dc5 100644 ---
a/src/main/java/com/autotune/database/helper/DBConstants.java +++ b/src/main/java/com/autotune/database/helper/DBConstants.java @@ -6,6 +6,7 @@ public class DBConstants { public static final class SQLQUERY { public static final String SELECT_FROM_EXPERIMENTS = "from KruizeExperimentEntry"; + public static final String SELECT_FROM_LM_EXPERIMENTS = "from KruizeLMExperimentEntry"; public static final String SELECT_FROM_EXPERIMENTS_BY_EXP_NAME = "from KruizeExperimentEntry k WHERE k.experiment_name = :experimentName"; public static final String SELECT_FROM_LM_EXPERIMENTS_BY_EXP_NAME = "from KruizeLMExperimentEntry k WHERE k.experiment_name = :experimentName"; public static final String SELECT_FROM_RESULTS = "from KruizeResultsEntry"; diff --git a/src/main/java/com/autotune/database/service/ExperimentDBService.java b/src/main/java/com/autotune/database/service/ExperimentDBService.java index f631d7507..029e10f7a 100644 --- a/src/main/java/com/autotune/database/service/ExperimentDBService.java +++ b/src/main/java/com/autotune/database/service/ExperimentDBService.java @@ -77,6 +77,32 @@ public void loadAllExperiments(Map mainKruizeExperimentMap } } + public void loadAllLMExperiments(Map mainKruizeExperimentMap) throws Exception { + ExperimentInterface experimentInterface = new ExperimentInterfaceImpl(); + List entries = experimentDAO.loadAllLMExperiments(); + if (null != entries && !entries.isEmpty()) { + List createExperimentAPIObjects = DBHelpers.Converters.KruizeObjectConverters.convertLMExperimentEntryToCreateExperimentAPIObject(entries); + if (null != createExperimentAPIObjects && !createExperimentAPIObjects.isEmpty()) { + List kruizeExpList = new ArrayList<>(); + + int failureThreshHold = createExperimentAPIObjects.size(); + int failureCount = 0; + for (CreateExperimentAPIObject createExperimentAPIObject : createExperimentAPIObjects) { + KruizeObject kruizeObject = 
Converters.KruizeObjectConverters.convertCreateExperimentAPIObjToKruizeObject(createExperimentAPIObject); + if (null != kruizeObject) { + kruizeExpList.add(kruizeObject); + } else { + failureCount++; + } + } + if (failureThreshHold > 0 && failureCount == failureThreshHold) { + throw new Exception("None of the experiments are able to load from DB."); + } + experimentInterface.addExperimentToLocalStorage(mainKruizeExperimentMap, kruizeExpList); + } + } + } + public void loadAllResults(Map mainKruizeExperimentMap) throws Exception { ExperimentInterface experimentInterface = new ExperimentInterfaceImpl(); KruizeObject kruizeObject; From 69ad30e54ca866ff25887c2a0a4ad74b45d2cf17 Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Tue, 10 Dec 2024 12:20:28 +0530 Subject: [PATCH 58/85] removing comments Signed-off-by: Shekhar Saxena --- .../recommendations/updater/RecommendationUpdaterService.java | 1 - 1 file changed, 1 deletion(-) diff --git a/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java index ffe6559ce..48e2eac35 100644 --- a/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java +++ b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java @@ -61,7 +61,6 @@ public static void initiateUpdaterService() { vpaUpdater.applyResourceRecommendationsForExperiment(kruizeObject); } } - LOGGER.info("Done"); } catch (Exception e) { LOGGER.error(e.getMessage()); } From d28c8a418b4774308081e8c44047ff0d3acfcc61 Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Fri, 13 Dec 2024 19:08:33 +0530 Subject: [PATCH 59/85] updating info logs to debug Signed-off-by: Shekhar Saxena --- .../recommendations/updater/RecommendationUpdaterService.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git 
a/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java index 48e2eac35..ab821a822 100644 --- a/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java +++ b/src/main/java/com/autotune/analyzer/recommendations/updater/RecommendationUpdaterService.java @@ -74,7 +74,7 @@ public static void initiateUpdaterService() { private static Map getAutoModeExperiments() { try { - LOGGER.info(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_AUTO_EXP); + LOGGER.debug(AnalyzerConstants.RecommendationUpdaterConstants.InfoMsgs.CHECKING_AUTO_EXP); Map mainKruizeExperimentMap = new ConcurrentHashMap<>(); new ExperimentDBService().loadAllLMExperiments(mainKruizeExperimentMap); // filter map to only include entries where mode is auto or recreate From 68c0ee329fe8ee8aeaada2f11c44bbcde2613270 Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Mon, 16 Dec 2024 11:46:14 +0530 Subject: [PATCH 60/85] removing disable-verification function Signed-off-by: Shekhar Saxena --- .../java/com/autotune/utils/HttpUtils.java | 26 ------------------- 1 file changed, 26 deletions(-) diff --git a/src/main/java/com/autotune/utils/HttpUtils.java b/src/main/java/com/autotune/utils/HttpUtils.java index ef92a089b..5cceba53f 100644 --- a/src/main/java/com/autotune/utils/HttpUtils.java +++ b/src/main/java/com/autotune/utils/HttpUtils.java @@ -91,32 +91,6 @@ private static String getDataFromConnection(HttpURLConnection connection) throws return response.toString(); } - public static void disableSSLVerification() { - TrustManager[] dummyTrustManager = new TrustManager[]{new X509TrustManager() { - public X509Certificate[] getAcceptedIssuers() { - return null; - } - - public void checkClientTrusted(X509Certificate[] certs, String authType) { } - - public void checkServerTrusted(X509Certificate[] certs, String authType) { }
}}; - - HostnameVerifier allHostsValid = (hostname, session) -> true; - - SSLContext sslContext = null; - try { - sslContext = SSLContext.getInstance("TLSv1.2"); - sslContext.init(null, dummyTrustManager, new java.security.SecureRandom()); - } catch (NoSuchAlgorithmException | KeyManagementException e) { - e.printStackTrace(); - } - - assert sslContext != null; - HttpsURLConnection.setDefaultSSLSocketFactory(sslContext.getSocketFactory()); - HttpsURLConnection.setDefaultHostnameVerifier(allHostsValid); - } - public static String postRequest(URL url, String content) { try { URLConnection connection = url.openConnection(); From e7c18b6c61062d76fbca36bd66f218f3d5ad3133 Mon Sep 17 00:00:00 2001 From: Saad Khan Date: Mon, 16 Dec 2024 14:26:08 +0530 Subject: [PATCH 61/85] fix issue when all experiments creation is skipped in case of some null namespaces Signed-off-by: Saad Khan --- .../data/dataSourceMetadata/DataSourceMetadataHelper.java | 2 +- src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/main/java/com/autotune/common/data/dataSourceMetadata/DataSourceMetadataHelper.java b/src/main/java/com/autotune/common/data/dataSourceMetadata/DataSourceMetadataHelper.java index 18a22bd57..c562cf1b4 100644 --- a/src/main/java/com/autotune/common/data/dataSourceMetadata/DataSourceMetadataHelper.java +++ b/src/main/java/com/autotune/common/data/dataSourceMetadata/DataSourceMetadataHelper.java @@ -427,7 +427,7 @@ public void updateContainerDataSourceMetadataInfoObject(String dataSourceName, D if (null == dataSourceNamespace) { LOGGER.debug(KruizeConstants.DataSourceConstants.DataSourceMetadataErrorMsgs.INVALID_DATASOURCE_METADATA_NAMESPACE); - return; + continue; } // Iterate over workloads in namespaceWorkloadMap diff --git a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java index a2a1de630..ca2094838 100644 --- 
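Patch 61's `return` → `continue` swap is the entire fix: inside the loop over namespaces, `return` on the first null entry abandoned all remaining valid namespaces, which is how experiment creation for everything got skipped. `continue` drops only the offending entry. A minimal sketch of the difference (hypothetical names, not the Kruize types):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ContinueVsReturnDemo {
    // With `return` in place of `continue`, only "default" would be processed
    // and "openshift-tuning" would be silently dropped.
    static List<String> processAll(List<String> namespaces) {
        List<String> processed = new ArrayList<>();
        for (String ns : namespaces) {
            if (ns == null) {
                continue; // skip just this invalid namespace, keep iterating
            }
            processed.add(ns);
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> processed = processAll(Arrays.asList("default", null, "openshift-tuning"));
        if (!processed.equals(Arrays.asList("default", "openshift-tuning"))) throw new AssertionError();
        System.out.println("processed: " + processed); // prints: processed: [default, openshift-tuning]
    }
}
```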
a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java +++ b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java @@ -985,7 +985,7 @@ public KruizeRecommendationEntry loadRecommendationsByExperimentNameAndDate(Stri getExperimentTypeInSingleKruizeRecommendationsEntry(recommendationEntries); statusValue = "success"; } catch (NoResultException e) { - LOGGER.debug("Generating new recommendation for Experiment name : %s interval_end_time: %S", experimentName, interval_end_time); + LOGGER.debug("Generating new recommendation for Experiment name : {} interval_end_time: {}", experimentName, interval_end_time); } catch (Exception e) { LOGGER.error("Not able to load recommendations due to {}", e.getMessage()); recommendationEntries = null; From 79c2196b8de242f33385eef680eb261ab31c49af Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Fri, 13 Dec 2024 22:59:47 +0530 Subject: [PATCH 62/85] list Recommendations to support both rm and lm Signed-off-by: msvinaykumar --- migrations/kruize_local_ddl.sql | 2 +- .../engine/RecommendationEngine.java | 8 +- .../services/GenerateRecommendations.java | 2 +- .../services/ListRecommendations.java | 21 +- .../analyzer/utils/AnalyzerConstants.java | 1 + .../analyzer/workerimpl/BulkJobManager.java | 3 +- .../autotune/database/dao/ExperimentDAO.java | 11 + .../database/dao/ExperimentDAOImpl.java | 189 +++++++++--- .../autotune/database/helper/DBConstants.java | 13 +- .../autotune/database/helper/DBHelpers.java | 282 +++++++++++++----- .../database/init/KruizeHibernateUtil.java | 2 + .../database/service/ExperimentDBService.java | 97 +++++- .../table/KruizeRecommendationEntry.java | 13 + .../table/lm/KruizeLMRecommendationEntry.java | 82 +++++ 14 files changed, 583 insertions(+), 143 deletions(-) create mode 100644 src/main/java/com/autotune/database/table/lm/KruizeLMRecommendationEntry.java diff --git a/migrations/kruize_local_ddl.sql b/migrations/kruize_local_ddl.sql index fc5474b26..645dfbab1 100644 --- 
a/migrations/kruize_local_ddl.sql +++ b/migrations/kruize_local_ddl.sql @@ -5,4 +5,4 @@ create table IF NOT EXISTS kruize_dsmetadata (id serial, version varchar(255), d alter table kruize_lm_experiments add column metadata_id bigint references kruize_dsmetadata(id); alter table if exists kruize_lm_experiments add constraint UK_lm_experiment_name unique (experiment_name); create table IF NOT EXISTS kruize_metric_profiles (api_version varchar(255), kind varchar(255), metadata jsonb, name varchar(255) not null, k8s_type varchar(255), profile_version float(53) not null, slo jsonb, primary key (name)); -alter table kruize_recommendations add column experiment_type varchar(255); +create table IF NOT EXISTS kruize_lm_recommendations (interval_end_time timestamp(6) not null, experiment_name varchar(255) not null, cluster_name varchar(255), extended_data jsonb, version varchar(255),experiment_type varchar(255), primary key (experiment_name, interval_end_time)) PARTITION BY RANGE (interval_end_time); diff --git a/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java b/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java index cf1bb91b2..8f4331a1d 100644 --- a/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java +++ b/src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java @@ -197,13 +197,13 @@ private KruizeObject createKruizeObject(String target_cluster) { KruizeObject kruizeObject = new KruizeObject(); try { - if (KruizeDeploymentInfo.is_ros_enabled){ - if(null == target_cluster || target_cluster.equalsIgnoreCase(AnalyzerConstants.REMOTE)){ + if (KruizeDeploymentInfo.is_ros_enabled) { + if (null == target_cluster || target_cluster.equalsIgnoreCase(AnalyzerConstants.REMOTE)) { new ExperimentDBService().loadExperimentFromDBByName(mainKruizeExperimentMAP, experimentName); - }else{ + } else { new 
ExperimentDBService().loadLMExperimentFromDBByName(mainKruizeExperimentMAP, experimentName); } - }else{ + } else { new ExperimentDBService().loadLMExperimentFromDBByName(mainKruizeExperimentMAP, experimentName); } diff --git a/src/main/java/com/autotune/analyzer/services/GenerateRecommendations.java b/src/main/java/com/autotune/analyzer/services/GenerateRecommendations.java index 28b681af6..de9c8a3c2 100644 --- a/src/main/java/com/autotune/analyzer/services/GenerateRecommendations.java +++ b/src/main/java/com/autotune/analyzer/services/GenerateRecommendations.java @@ -102,7 +102,7 @@ protected void doPost(HttpServletRequest request, HttpServletResponse response) // validate and create KruizeObject if successful String validationMessage = recommendationEngine.validate_local(); if (validationMessage.isEmpty()) { - KruizeObject kruizeObject = recommendationEngine.prepareRecommendations(calCount, null); + KruizeObject kruizeObject = recommendationEngine.prepareRecommendations(calCount, AnalyzerConstants.LOCAL); // todo target cluster is set to LOCAL always if (kruizeObject.getValidation_data().isSuccess()) { LOGGER.debug("UpdateRecommendations API request count: {} success", calCount); interval_end_time = Utils.DateUtils.getTimeStampFrom(KruizeConstants.DateFormats.STANDARD_JSON_DATE_FORMAT, diff --git a/src/main/java/com/autotune/analyzer/services/ListRecommendations.java b/src/main/java/com/autotune/analyzer/services/ListRecommendations.java index ee533905f..73cfb272f 100644 --- a/src/main/java/com/autotune/analyzer/services/ListRecommendations.java +++ b/src/main/java/com/autotune/analyzer/services/ListRecommendations.java @@ -84,19 +84,26 @@ protected void doGet(HttpServletRequest request, HttpServletResponse response) t String experimentName = request.getParameter(AnalyzerConstants.ServiceConstants.EXPERIMENT_NAME); String latestRecommendation = request.getParameter(AnalyzerConstants.ServiceConstants.LATEST); String monitoringEndTime = 
request.getParameter(KruizeConstants.JSONKeys.MONITORING_END_TIME); + String rm = request.getParameter(AnalyzerConstants.ServiceConstants.RM); Timestamp monitoringEndTimestamp = null; Map mKruizeExperimentMap = new ConcurrentHashMap(); - ; boolean getLatest = true; boolean checkForTimestamp = false; boolean error = false; + boolean rmTable = false; if (null != latestRecommendation && !latestRecommendation.isEmpty() && latestRecommendation.equalsIgnoreCase(AnalyzerConstants.BooleanString.FALSE) ) { getLatest = false; } + if (null != rm + && !rm.isEmpty() + && rm.equalsIgnoreCase(AnalyzerConstants.BooleanString.TRUE) + ) { + rmTable = true; + } List kruizeObjectList = new ArrayList<>(); try { // Check if experiment name is passed @@ -104,7 +111,11 @@ protected void doGet(HttpServletRequest request, HttpServletResponse response) t // trim the experiment name to remove whitespaces experimentName = experimentName.trim(); try { - new ExperimentDBService().loadExperimentAndRecommendationsFromDBByName(mKruizeExperimentMap, experimentName); + if (rmTable) { + new ExperimentDBService().loadExperimentAndRecommendationsFromDBByName(mKruizeExperimentMap, experimentName); + } else { + new ExperimentDBService().loadLMExperimentAndRecommendationsFromDBByName(mKruizeExperimentMap, experimentName); + } } catch (Exception e) { LOGGER.error("Loading saved experiment {} failed: {} ", experimentName, e.getMessage()); } @@ -151,7 +162,11 @@ protected void doGet(HttpServletRequest request, HttpServletResponse response) t } } else { try { - new ExperimentDBService().loadAllExperimentsAndRecommendations(mKruizeExperimentMap); + if (rmTable) { + new ExperimentDBService().loadAllExperimentsAndRecommendations(mKruizeExperimentMap); + } else { + new ExperimentDBService().loadAllLMExperimentsAndRecommendations(mKruizeExperimentMap); + } } catch (Exception e) { LOGGER.error("Loading saved experiment {} failed: {} ", experimentName, e.getMessage()); } diff --git 
a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java index ac40d8791..db256caaa 100644 --- a/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java +++ b/src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java @@ -438,6 +438,7 @@ public static final class ServiceConstants { public static final String CLUSTER_NAME = "cluster_name"; public static final String VERBOSE = "verbose"; public static final String FALSE = "false"; + public static final String RM = "rm"; private ServiceConstants() { } diff --git a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java index d032e2b50..893180218 100644 --- a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java +++ b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java @@ -57,6 +57,7 @@ import static com.autotune.utils.KruizeConstants.KRUIZE_BULK_API.*; import static com.autotune.utils.KruizeConstants.KRUIZE_BULK_API.NotificationConstants.*; + /** * The `run` method processes bulk input to create experiments and generates resource optimization recommendations. 
* It handles the creation of experiment names based on various data source components, makes HTTP POST requests @@ -121,7 +122,7 @@ private static Map parseLabelString(String labelString) { public void run() { String statusValue = "failure"; MetricsConfig.activeJobs.incrementAndGet(); - Timer.Sample timerRunJob = Timer.start(MetricsConfig.meterRegistry()); + io.micrometer.core.instrument.Timer.Sample timerRunJob = Timer.start(MetricsConfig.meterRegistry()); DataSourceMetadataInfo metadataInfo = null; DataSourceManager dataSourceManager = new DataSourceManager(); DataSourceInfo datasource = null; diff --git a/src/main/java/com/autotune/database/dao/ExperimentDAO.java b/src/main/java/com/autotune/database/dao/ExperimentDAO.java index 07770bc64..a918c71de 100644 --- a/src/main/java/com/autotune/database/dao/ExperimentDAO.java +++ b/src/main/java/com/autotune/database/dao/ExperimentDAO.java @@ -6,6 +6,7 @@ import com.autotune.common.data.ValidationOutputData; import com.autotune.database.table.*; import com.autotune.database.table.lm.KruizeLMExperimentEntry; +import com.autotune.database.table.lm.KruizeLMRecommendationEntry; import java.sql.Timestamp; import java.util.List; @@ -25,6 +26,10 @@ public interface ExperimentDAO { // Add recommendation to DB public ValidationOutputData addRecommendationToDB(KruizeRecommendationEntry recommendationEntry); + // Add recommendation to DB + public ValidationOutputData addRecommendationToDB(KruizeLMRecommendationEntry recommendationEntry); + + // Add Performance Profile to DB public ValidationOutputData addPerformanceProfileToDB(KruizePerformanceProfileEntry kruizePerformanceProfileEntry); @@ -52,6 +57,8 @@ public interface ExperimentDAO { // If Kruize restarts load all recommendations List loadAllRecommendations() throws Exception; + List loadAllLMRecommendations() throws Exception; + // If Kruize restarts load all performance profiles List loadAllPerformanceProfiles() throws Exception; @@ -75,6 +82,8 @@ public interface 
ExperimentDAO { // Load all recommendations of a particular experiment List loadRecommendationsByExperimentName(String experimentName) throws Exception; + // Load all recommendations of a particular experiment + List loadLMRecommendationsByExperimentName(String experimentName) throws Exception; // Load a single Performance Profile based on name List loadPerformanceProfileByName(String performanceProfileName) throws Exception; @@ -88,6 +97,8 @@ public interface ExperimentDAO { // Load all recommendations of a particular experiment and interval end Time KruizeRecommendationEntry loadRecommendationsByExperimentNameAndDate(String experimentName, String cluster_name, Timestamp interval_end_time) throws Exception; + KruizeLMRecommendationEntry loadLMRecommendationsByExperimentNameAndDate(String experimentName, String cluster_name, Timestamp interval_end_time) throws Exception; + // Get KruizeResult Record List getKruizeResultsEntry(String experiment_name, String cluster_name, Timestamp interval_start_time, Timestamp interval_end_time) throws Exception; diff --git a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java index 35e0269f2..597c51f60 100644 --- a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java +++ b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java @@ -25,6 +25,7 @@ import com.autotune.database.init.KruizeHibernateUtil; import com.autotune.database.table.*; import com.autotune.database.table.lm.KruizeLMExperimentEntry; +import com.autotune.database.table.lm.KruizeLMRecommendationEntry; import com.autotune.utils.KruizeConstants; import com.autotune.utils.MetricsConfig; import io.micrometer.core.instrument.Timer; @@ -384,9 +385,48 @@ public ValidationOutputData addRecommendationToDB(KruizeRecommendationEntry reco tx = session.beginTransaction(); session.persist(recommendationEntry); tx.commit(); - if (null == recommendationEntry.getExperimentType() || 
recommendationEntry.getExperimentType().isEmpty()) { - updateExperimentTypeInKruizeRecommendationEntry(recommendationEntry); - } + validationOutputData.setSuccess(true); + statusValue = "success"; + } else { + tx = session.beginTransaction(); + existingRecommendationEntry.setExtended_data(recommendationEntry.getExtended_data()); + session.merge(existingRecommendationEntry); + tx.commit(); + validationOutputData.setSuccess(true); + statusValue = "success"; + } + } catch (Exception e) { + LOGGER.error("Not able to save recommendation due to {}", e.getMessage()); + if (tx != null) tx.rollback(); + e.printStackTrace(); + validationOutputData.setSuccess(false); + validationOutputData.setMessage(e.getMessage()); + //todo save error to API_ERROR_LOG + } + } catch (Exception e) { + LOGGER.error("Not able to save recommendation due to {}", e.getMessage()); + } finally { + if (null != timerAddRecDB) { + MetricsConfig.timerAddRecDB = MetricsConfig.timerBAddRecDB.tag("status", statusValue).register(MetricsConfig.meterRegistry()); + timerAddRecDB.stop(MetricsConfig.timerAddRecDB); + } + } + return validationOutputData; + } + + @Override + public ValidationOutputData addRecommendationToDB(KruizeLMRecommendationEntry recommendationEntry) { + ValidationOutputData validationOutputData = new ValidationOutputData(false, null, null); + Transaction tx = null; + String statusValue = "failure"; + Timer.Sample timerAddRecDB = Timer.start(MetricsConfig.meterRegistry()); + try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) { + try { + KruizeLMRecommendationEntry existingRecommendationEntry = loadLMRecommendationsByExperimentNameAndDate(recommendationEntry.getExperiment_name(), recommendationEntry.getCluster_name(), recommendationEntry.getInterval_end_time()); + if (null == existingRecommendationEntry) { + tx = session.beginTransaction(); + session.persist(recommendationEntry); + tx.commit(); validationOutputData.setSuccess(true); statusValue = "success"; } else { 
@@ -717,6 +757,27 @@ public List loadAllExperiments() throws Exception { Timer.Sample timerLoadAllExp = Timer.start(MetricsConfig.meterRegistry()); try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) { entries = session.createQuery(DBConstants.SQLQUERY.SELECT_FROM_EXPERIMENTS, KruizeExperimentEntry.class).list(); + statusValue = "success"; + } catch (Exception e) { + LOGGER.error("Not able to load experiment due to {}", e.getMessage()); + throw new Exception("Error while loading existing experiments from database due to : " + e.getMessage()); + } finally { + if (null != timerLoadAllExp) { + MetricsConfig.timerLoadAllExp = MetricsConfig.timerBLoadAllExp.tag("status", statusValue).register(MetricsConfig.meterRegistry()); + timerLoadAllExp.stop(MetricsConfig.timerLoadAllExp); + } + } + return entries; + } + + @Override + public List loadAllLMExperiments() throws Exception { + //todo load only experimentStatus=inprogress , playback may not require completed experiments + List entries = null; + String statusValue = "failure"; + Timer.Sample timerLoadAllExp = Timer.start(MetricsConfig.meterRegistry()); + try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) { + entries = session.createQuery(DBConstants.SQLQUERY.SELECT_FROM_LM_EXPERIMENTS, KruizeLMExperimentEntry.class).list(); // TODO: remove native sql query and transient //getExperimentTypeInKruizeExperimentEntry(entries); statusValue = "success"; @@ -799,6 +860,28 @@ public List loadAllRecommendations() throws Exception return recommendationEntries; } + @Override + public List loadAllLMRecommendations() throws Exception { + List recommendationEntries = null; + String statusValue = "failure"; + Timer.Sample timerLoadAllRec = Timer.start(MetricsConfig.meterRegistry()); + try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) { + recommendationEntries = session.createQuery( + DBConstants.SQLQUERY.SELECT_FROM_LM_RECOMMENDATIONS,
KruizeLMRecommendationEntry.class).list(); + statusValue = "success"; + } catch (Exception e) { + LOGGER.error("Not able to load recommendations due to {}", e.getMessage()); + throw new Exception("Error while loading existing recommendations from database due to : " + e.getMessage()); + } finally { + if (null != timerLoadAllRec) { + MetricsConfig.timerLoadAllRec = MetricsConfig.timerBLoadAllRec.tag("status", statusValue).register(MetricsConfig.meterRegistry()); + timerLoadAllRec.stop(MetricsConfig.timerLoadAllRec); + } + } + return recommendationEntries; + } + @Override public List loadAllPerformanceProfiles() throws Exception { String statusValue = "failure"; @@ -973,7 +1056,27 @@ public List loadRecommendationsByExperimentName(Strin try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) { recommendationEntries = session.createQuery(DBConstants.SQLQUERY.SELECT_FROM_RECOMMENDATIONS_BY_EXP_NAME, KruizeRecommendationEntry.class) .setParameter("experimentName", experimentName).list(); - getExperimentTypeInKruizeRecommendationsEntry(recommendationEntries); + statusValue = "success"; + } catch (Exception e) { + LOGGER.error("Not able to load recommendations due to {}", e.getMessage()); + throw new Exception("Error while loading existing recommendations from database due to : " + e.getMessage()); + } finally { + if (null != timerLoadRecExpName) { + MetricsConfig.timerLoadRecExpName = MetricsConfig.timerBLoadRecExpName.tag("status", statusValue).register(MetricsConfig.meterRegistry()); + timerLoadRecExpName.stop(MetricsConfig.timerLoadRecExpName); + } + } + return recommendationEntries; + } + + @Override + public List loadLMRecommendationsByExperimentName(String experimentName) throws Exception { + List recommendationEntries = null; + String statusValue = "failure"; + Timer.Sample timerLoadRecExpName = Timer.start(MetricsConfig.meterRegistry()); + try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) { + recommendationEntries 
= session.createQuery(DBConstants.SQLQUERY.SELECT_FROM_LM_RECOMMENDATIONS_BY_EXP_NAME, KruizeLMRecommendationEntry.class) + .setParameter("experimentName", experimentName).list(); statusValue = "success"; } catch (Exception e) { LOGGER.error("Not able to load recommendations due to {}", e.getMessage()); @@ -1005,7 +1108,40 @@ public KruizeRecommendationEntry loadRecommendationsByExperimentNameAndDate(Stri if (cluster_name != null) kruizeRecommendationEntryQuery.setParameter(CLUSTER_NAME, cluster_name); recommendationEntries = kruizeRecommendationEntryQuery.getSingleResult(); - getExperimentTypeInSingleKruizeRecommendationsEntry(recommendationEntries); + statusValue = "success"; + } catch (NoResultException e) { + LOGGER.debug("Generating new recommendation for Experiment name : {} interval_end_time: {}", experimentName, interval_end_time); + } catch (Exception e) { + LOGGER.error("Not able to load recommendations due to {}", e.getMessage()); + recommendationEntries = null; + throw new Exception("Error while loading existing recommendations from database due to : " + e.getMessage()); + } finally { + if (null != timerLoadRecExpNameDate) { + MetricsConfig.timerLoadRecExpNameDate = MetricsConfig.timerBLoadRecExpNameDate.tag("status", statusValue).register(MetricsConfig.meterRegistry()); + timerLoadRecExpNameDate.stop(MetricsConfig.timerLoadRecExpNameDate); + } + } + return recommendationEntries; + } + + @Override + public KruizeLMRecommendationEntry loadLMRecommendationsByExperimentNameAndDate(String experimentName, String cluster_name, Timestamp interval_end_time) throws Exception { + KruizeLMRecommendationEntry recommendationEntries = null; + String statusValue = "failure"; + String clusterCondtionSql = ""; + if (cluster_name != null) + clusterCondtionSql = String.format(" and k.%s = :%s ", KruizeConstants.JSONKeys.CLUSTER_NAME, KruizeConstants.JSONKeys.CLUSTER_NAME); + else + clusterCondtionSql = String.format(" and k.%s is null ",
KruizeConstants.JSONKeys.CLUSTER_NAME); + + Timer.Sample timerLoadRecExpNameDate = Timer.start(MetricsConfig.meterRegistry()); + try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) { + Query kruizeRecommendationEntryQuery = session.createQuery(SELECT_FROM_LM_RECOMMENDATIONS_BY_EXP_NAME_AND_END_TIME + clusterCondtionSql, KruizeLMRecommendationEntry.class) + .setParameter(KruizeConstants.JSONKeys.EXPERIMENT_NAME, experimentName) + .setParameter(KruizeConstants.JSONKeys.INTERVAL_END_TIME, interval_end_time); + if (cluster_name != null) + kruizeRecommendationEntryQuery.setParameter(CLUSTER_NAME, cluster_name); + recommendationEntries = kruizeRecommendationEntryQuery.getSingleResult(); statusValue = "success"; } catch (NoResultException e) { LOGGER.debug("Generating new recommendation for Experiment name : %s interval_end_time: %S", experimentName, interval_end_time); @@ -1266,54 +1402,13 @@ private void updateExperimentTypeInKruizeExperimentEntry(KruizeExperimentEntry k } }*/ - private void getExperimentTypeInKruizeRecommendationsEntry(List entries) throws Exception { - for (KruizeRecommendationEntry recomEntry : entries) { - getExperimentTypeInSingleKruizeRecommendationsEntry(recomEntry); - } - } private void getExperimentTypeInSingleKruizeRecommendationsEntry(KruizeRecommendationEntry recomEntry) throws Exception { List expEntries = loadExperimentByName(recomEntry.getExperiment_name()); - if (null != expEntries && !expEntries.isEmpty()) { - if (isTargetCluserLocal(expEntries.get(0).getTarget_cluster())) { - try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) { - String sql = DBConstants.SQLQUERY.SELECT_RECOMMENDATIONS_EXP_TYPE; - Query query = session.createNativeQuery(sql); - // set experiment_type parameter in sql query - query.setParameter("experiment_name", recomEntry.getExperiment_name()); - List exType = query.getResultList(); - if (null != exType && !exType.isEmpty()) { - 
recomEntry.setExperimentType(exType.get(0)); - } - } catch (Exception e) { - LOGGER.error("Not able to get experiment type in recommendation entry due to {}", e.getMessage()); - throw new Exception("Error while updating experiment type to recommendation due to : " + e.getMessage()); - } - } - } - } - private void updateExperimentTypeInKruizeRecommendationEntry(KruizeRecommendationEntry recommendationEntry) throws Exception { - List entries = loadExperimentByName(recommendationEntry.getExperiment_name()); - if (null != entries && !entries.isEmpty()) { - if (isTargetCluserLocal(entries.get(0).getTarget_cluster())) { - try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) { - Transaction tx = session.beginTransaction(); - String sql = DBConstants.SQLQUERY.UPDATE_RECOMMENDATIONS_EXP_TYPE; - Query query = session.createNativeQuery(sql); - query.setParameter("experiment_type", recommendationEntry.getExperimentType()); - query.setParameter("experiment_name", recommendationEntry.getExperiment_name()); - query.setParameter("interval_end_time", recommendationEntry.getInterval_end_time()); - query.executeUpdate(); - tx.commit(); - } catch (Exception e) { - LOGGER.error("Not able to update experiment type in recommendation entry due to {}", e.getMessage()); - throw new Exception("Error while updating experiment type to recommendation due to : " + e.getMessage()); - } - } - } } + private boolean isTargetCluserLocal(String targetCluster) { if (AnalyzerConstants.LOCAL.equalsIgnoreCase(targetCluster)) { return true; diff --git a/src/main/java/com/autotune/database/helper/DBConstants.java b/src/main/java/com/autotune/database/helper/DBConstants.java index 6a3bd6dc5..2e9b55f93 100644 --- a/src/main/java/com/autotune/database/helper/DBConstants.java +++ b/src/main/java/com/autotune/database/helper/DBConstants.java @@ -51,12 +51,19 @@ public static final class SQLQUERY { "k.interval_end_time = (SELECT MAX(e.interval_end_time) FROM KruizeResultsEntry e where 
e.experiment_name = :%s ) ", KruizeConstants.JSONKeys.EXPERIMENT_NAME, KruizeConstants.JSONKeys.EXPERIMENT_NAME); public static final String SELECT_FROM_RECOMMENDATIONS_BY_EXP_NAME = String.format("from KruizeRecommendationEntry k WHERE k.experiment_name = :experimentName"); + public static final String SELECT_FROM_LM_RECOMMENDATIONS_BY_EXP_NAME = String.format("from KruizeLMRecommendationEntry k WHERE k.experiment_name = :experimentName"); public static final String SELECT_FROM_RECOMMENDATIONS_BY_EXP_NAME_AND_END_TIME = String.format( "from KruizeRecommendationEntry k WHERE " + "k.experiment_name = :%s and " + "k.interval_end_time= :%s ", KruizeConstants.JSONKeys.EXPERIMENT_NAME, KruizeConstants.JSONKeys.INTERVAL_END_TIME); + public static final String SELECT_FROM_LM_RECOMMENDATIONS_BY_EXP_NAME_AND_END_TIME = String.format( + "from KruizeLMRecommendationEntry k WHERE " + + "k.experiment_name = :%s and " + + "k.interval_end_time= :%s ", + KruizeConstants.JSONKeys.EXPERIMENT_NAME, KruizeConstants.JSONKeys.INTERVAL_END_TIME); public static final String SELECT_FROM_RECOMMENDATIONS = "from KruizeRecommendationEntry"; + public static final String SELECT_FROM_LM_RECOMMENDATIONS = "from KruizeLMRecommendationEntry"; public static final String SELECT_FROM_PERFORMANCE_PROFILE = "from KruizePerformanceProfileEntry"; public static final String SELECT_FROM_PERFORMANCE_PROFILE_BY_NAME = "from KruizePerformanceProfileEntry k WHERE k.name = :name"; public static final String SELECT_FROM_METRIC_PROFILE = "from KruizeMetricProfileEntry"; @@ -78,17 +85,13 @@ public static final class SQLQUERY { " WHERE container->>'container_name' = :container_name" + " AND container->>'container_image_name' = :container_image_name" + " ))"; - public static final String UPDATE_EXPERIMENT_EXP_TYPE = "UPDATE kruize_experiments SET experiment_type = :experiment_type WHERE experiment_name = :experiment_name"; - public static final String UPDATE_RECOMMENDATIONS_EXP_TYPE = "UPDATE kruize_recommendations 
SET experiment_type = :experiment_type WHERE experiment_name = :experiment_name and interval_end_time = :interval_end_time"; - public static final String SELECT_EXPERIMENT_EXP_TYPE = "SELECT experiment_type from kruize_experiments WHERE experiment_id = :experiment_id"; - public static final String SELECT_RECOMMENDATIONS_EXP_TYPE = "SELECT experiment_type from kruize_recommendations WHERE experiment_name = :experiment_name"; - } public static final class TABLE_NAMES { public static final String KRUIZE_EXPERIMENTS = "kruize_experiments"; public static final String KRUIZE_RESULTS = "kruize_results"; public static final String KRUIZE_RECOMMENDATIONS = "kruize_recommendations"; + public static final String KRUIZE_LM_RECOMMENDATIONS = "kruize_lm_recommendations"; public static final String KRUIZE_PERFORMANCE_PROFILES = "kruize_performance_profiles"; } diff --git a/src/main/java/com/autotune/database/helper/DBHelpers.java b/src/main/java/com/autotune/database/helper/DBHelpers.java index e5cbc53cd..cc92b9d27 100644 --- a/src/main/java/com/autotune/database/helper/DBHelpers.java +++ b/src/main/java/com/autotune/database/helper/DBHelpers.java @@ -41,6 +41,7 @@ import com.autotune.common.k8sObjects.K8sObject; import com.autotune.database.table.*; import com.autotune.database.table.lm.KruizeLMExperimentEntry; +import com.autotune.database.table.lm.KruizeLMRecommendationEntry; import com.autotune.utils.KruizeConstants; import com.autotune.utils.Utils; import com.fasterxml.jackson.core.JsonProcessingException; @@ -209,8 +210,6 @@ public static void setRecommendationsToKruizeObject(List namespaceRecommendations = clonedNamespaceData.getNamespaceRecommendations().getData(); + if (null != monitoringEndTime && namespaceRecommendations.containsKey(monitoringEndTime)) { matchFound = true; NamespaceAPIObject namespaceAPIObject = null; @@ -559,6 +692,14 @@ public static ListRecommendationsAPIObject getListRecommendationAPIObjectForDB(K continue; HashMap recommendations = 
clonedContainerData.getContainerRecommendations().getData(); + if (null != monitoringEndTime && !recommendations.containsKey(monitoringEndTime)) { + try { + Timestamp endInterval = containerData.getContainerRecommendations().getData().keySet().stream().max(Timestamp::compareTo).get(); + monitoringEndTime = endInterval; + } catch (Exception e) { + LOGGER.error("Error while converting ContainerData to Timestamp, not able to save date into recommendation table: " + e.getMessage()); + } + } if (null != monitoringEndTime && recommendations.containsKey(monitoringEndTime)) { matchFound = true; ContainerAPIObject containerAPIObject = null; @@ -594,71 +735,6 @@ public static ListRecommendationsAPIObject getListRecommendationAPIObjectForDB(K return listRecommendationsAPIObject; } - public static KruizeRecommendationEntry convertKruizeObjectTORecommendation(KruizeObject kruizeObject, Timestamp monitoringEndTime) { - KruizeRecommendationEntry kruizeRecommendationEntry = null; - Boolean checkForTimestamp = false; - Boolean getLatest = true; - Gson gson = new GsonBuilder() - .disableHtmlEscaping() - .setPrettyPrinting() - .enableComplexMapKeySerialization() - .setDateFormat(KruizeConstants.DateFormats.STANDARD_JSON_DATE_FORMAT) - .registerTypeAdapter(Date.class, new GsonUTCDateAdapter()) - .registerTypeAdapter(AnalyzerConstants.RecommendationItem.class, new RecommendationItemAdapter()) - .registerTypeAdapter(DeviceDetails.class, new DeviceDetailsAdapter()) - .create(); - try { - ListRecommendationsAPIObject listRecommendationsAPIObject = getListRecommendationAPIObjectForDB( - kruizeObject, monitoringEndTime); - if (null == listRecommendationsAPIObject) { - return null; - } - LOGGER.debug(new GsonBuilder() - .setPrettyPrinting() - .registerTypeAdapter(AnalyzerConstants.RecommendationItem.class, new RecommendationItemAdapter()) - .registerTypeAdapter(DeviceDetails.class, new DeviceDetailsAdapter()) - .create() - .toJson(listRecommendationsAPIObject)); -
kruizeRecommendationEntry = new KruizeRecommendationEntry(); - kruizeRecommendationEntry.setVersion(KruizeConstants.KRUIZE_RECOMMENDATION_API_VERSION.LATEST.getVersionNumber()); - kruizeRecommendationEntry.setExperiment_name(listRecommendationsAPIObject.getExperimentName()); - kruizeRecommendationEntry.setCluster_name(listRecommendationsAPIObject.getClusterName()); - //kruizeRecommendationEntry.setExperimentType(listRecommendationsAPIObject.getExperimentType()); - - Timestamp endInterval = null; - // todo : what happens if two k8 objects or Containers with different timestamp - for (KubernetesAPIObject k8sObject : listRecommendationsAPIObject.getKubernetesObjects()) { - if (listRecommendationsAPIObject.isNamespaceExperiment()) { - endInterval = k8sObject.getNamespaceAPIObjects().getnamespaceRecommendations().getData().keySet().stream().max(Timestamp::compareTo).get(); - } else { - for (ContainerAPIObject containerAPIObject : k8sObject.getContainerAPIObjects()) { - endInterval = containerAPIObject.getContainerRecommendations().getData().keySet().stream().max(Timestamp::compareTo).get(); - break; - } - } - } - kruizeRecommendationEntry.setInterval_end_time(endInterval); - Map k8sObjectsMap = Map.of(KruizeConstants.JSONKeys.KUBERNETES_OBJECTS, listRecommendationsAPIObject.getKubernetesObjects()); - String k8sObjectString = gson.toJson(k8sObjectsMap); - ObjectMapper objectMapper = new ObjectMapper(); - DateFormat df = new SimpleDateFormat(KruizeConstants.DateFormats.STANDARD_JSON_DATE_FORMAT); - objectMapper.setDateFormat(df); - try { - kruizeRecommendationEntry.setExtended_data( - objectMapper.readTree( - k8sObjectString - ) - ); - } catch (JsonProcessingException e) { - throw new Exception("Error while creating Extended data due to : " + e.getMessage()); - } - } catch (Exception e) { - kruizeRecommendationEntry = null; - LOGGER.error("Error while converting KruizeObject to KruizeRecommendationEntry due to {}", e.getMessage()); - e.printStackTrace(); - } - return 
kruizeRecommendationEntry; - } public static List convertLMExperimentEntryToCreateExperimentAPIObject(List entries) throws Exception { List createExperimentAPIObjects = new ArrayList<>(); @@ -671,6 +747,9 @@ public static List convertLMExperimentEntryToCreateEx CreateExperimentAPIObject apiObj = new Gson().fromJson(extended_data_rawJson, CreateExperimentAPIObject.class); apiObj.setExperiment_id(entry.getExperiment_id()); apiObj.setStatus(entry.getStatus()); + apiObj.setTargetCluster(entry.getTarget_cluster()); + apiObj.setMode(entry.getMode()); + apiObj.setExperimentType(entry.getExperiment_type()); createExperimentAPIObjects.add(apiObj); } catch (Exception e) { LOGGER.error("Error in converting to apiObj from db object due to : {}", e.getMessage()); @@ -708,6 +787,7 @@ public static List convertExperimentEntryToCreateExpe return createExperimentAPIObjects; } + public static List convertResultEntryToUpdateResultsAPIObject(List kruizeResultsEntries) { ObjectMapper mapper = new ObjectMapper(); DateFormat df = new SimpleDateFormat(KruizeConstants.DateFormats.STANDARD_JSON_DATE_FORMAT); @@ -845,6 +925,74 @@ public static List convertRecommendationEntryToRec return listRecommendationsAPIObjectList; } + public static List convertLMRecommendationEntryToRecommendationAPIObject( + List kruizeRecommendationEntryList) throws InvalidConversionOfRecommendationEntryException { + if (null == kruizeRecommendationEntryList) + return null; + if (kruizeRecommendationEntryList.size() == 0) + return null; + Gson gson = new GsonBuilder() + .disableHtmlEscaping() + .setPrettyPrinting() + .enableComplexMapKeySerialization() + .setDateFormat(KruizeConstants.DateFormats.STANDARD_JSON_DATE_FORMAT) + .registerTypeAdapter(Date.class, new GsonUTCDateAdapter()) + .registerTypeAdapter(AnalyzerConstants.RecommendationItem.class, new RecommendationItemAdapter()) + .registerTypeAdapter(DeviceDetails.class, new DeviceDetailsAdapter()) + .create(); + List listRecommendationsAPIObjectList = new 
ArrayList<>(); + for (KruizeLMRecommendationEntry kruizeRecommendationEntry : kruizeRecommendationEntryList) { + // Check if instance of KruizeRecommendationEntry is null + if (null == kruizeRecommendationEntry) { + // Throw an exception stating it cannot be null + throw new InvalidConversionOfRecommendationEntryException( + String.format( + AnalyzerErrorConstants.ConversionErrors.KruizeRecommendationError.NOT_NULL, + KruizeRecommendationEntry.class.getSimpleName() + ) + ); + } + // Create an Object Mapper to extract value from JSON Node + ObjectMapper objectMapper = new ObjectMapper(); + DateFormat df = new SimpleDateFormat(KruizeConstants.DateFormats.STANDARD_JSON_DATE_FORMAT); + objectMapper.setDateFormat(df); + // Create a holder for recommendation object to save the result from object mapper + ListRecommendationsAPIObject listRecommendationsAPIObject = null; + JsonNode extendedData = kruizeRecommendationEntry.getExtended_data().get(KruizeConstants.JSONKeys.KUBERNETES_OBJECTS); + if (null == extendedData) + continue; + try { + // If successful, the object mapper returns the list recommendation API Object + List kubernetesAPIObjectList = new ArrayList<>(); + if (extendedData.isArray()) { + for (JsonNode node : extendedData) { + KubernetesAPIObject kubernetesAPIObject = gson.fromJson(objectMapper.writeValueAsString(node), KubernetesAPIObject.class); + if (null != kubernetesAPIObject) { + kubernetesAPIObjectList.add(kubernetesAPIObject); + } else { + LOGGER.debug("GSON failed to convert the DB Json object in convertRecommendationEntryToRecommendationAPIObject"); + } + } + } + if (null != kubernetesAPIObjectList) { + listRecommendationsAPIObject = new ListRecommendationsAPIObject(); + listRecommendationsAPIObject.setApiVersion(kruizeRecommendationEntry.getVersion()); + listRecommendationsAPIObject.setKubernetesObjects(kubernetesAPIObjectList); + listRecommendationsAPIObject.setExperimentName(kruizeRecommendationEntry.getExperiment_name()); + 
listRecommendationsAPIObject.setClusterName(kruizeRecommendationEntry.getCluster_name()); + } + } catch (JsonProcessingException e) { + e.printStackTrace(); + LOGGER.debug(e.getMessage()); + } + if (null != listRecommendationsAPIObject) + listRecommendationsAPIObjectList.add(listRecommendationsAPIObject); + } + if (listRecommendationsAPIObjectList.isEmpty()) + return null; + return listRecommendationsAPIObjectList; + } + public static KruizePerformanceProfileEntry convertPerfProfileObjToPerfProfileDBObj(PerformanceProfile performanceProfile) { KruizePerformanceProfileEntry kruizePerformanceProfileEntry = null; try { diff --git a/src/main/java/com/autotune/database/init/KruizeHibernateUtil.java b/src/main/java/com/autotune/database/init/KruizeHibernateUtil.java index 7d41041f1..841ac99a9 100644 --- a/src/main/java/com/autotune/database/init/KruizeHibernateUtil.java +++ b/src/main/java/com/autotune/database/init/KruizeHibernateUtil.java @@ -18,6 +18,7 @@ import com.autotune.database.table.*; import com.autotune.database.table.lm.KruizeLMExperimentEntry; +import com.autotune.database.table.lm.KruizeLMRecommendationEntry; import com.autotune.operator.KruizeDeploymentInfo; import org.hibernate.Session; import org.hibernate.SessionFactory; @@ -59,6 +60,7 @@ public static void buildSessionFactory() { configuration.addAnnotatedClass(KruizePerformanceProfileEntry.class); if (KruizeDeploymentInfo.local) { configuration.addAnnotatedClass(KruizeLMExperimentEntry.class); + configuration.addAnnotatedClass(KruizeLMRecommendationEntry.class); configuration.addAnnotatedClass(KruizeDataSourceEntry.class); configuration.addAnnotatedClass(KruizeDSMetadataEntry.class); configuration.addAnnotatedClass(KruizeMetricProfileEntry.class); diff --git a/src/main/java/com/autotune/database/service/ExperimentDBService.java b/src/main/java/com/autotune/database/service/ExperimentDBService.java index 029e10f7a..370cf632c 100644 --- 
a/src/main/java/com/autotune/database/service/ExperimentDBService.java +++ b/src/main/java/com/autotune/database/service/ExperimentDBService.java @@ -34,6 +34,7 @@ import com.autotune.database.helper.DBHelpers; import com.autotune.database.table.*; import com.autotune.database.table.lm.KruizeLMExperimentEntry; +import com.autotune.database.table.lm.KruizeLMRecommendationEntry; import com.autotune.operator.KruizeDeploymentInfo; import com.autotune.operator.KruizeOperator; import org.slf4j.Logger; @@ -42,6 +43,8 @@ import java.sql.Timestamp; import java.util.*; +import static com.autotune.operator.KruizeDeploymentInfo.is_ros_enabled; + public class ExperimentDBService { private static final long serialVersionUID = 1L; private static final Logger LOGGER = LoggerFactory.getLogger(ExperimentDBService.class); @@ -129,6 +132,26 @@ public void loadAllResults(Map mainKruizeExperimentMap) th } } + public void loadAllLMRecommendations(Map mainKruizeExperimentMap) throws Exception { + ExperimentInterface experimentInterface = new ExperimentInterfaceImpl(); + // Load Recommendations from DB and save to local + List recommendationEntries = experimentDAO.loadAllLMRecommendations(); + if (null != recommendationEntries && !recommendationEntries.isEmpty()) { + List recommendationsAPIObjects = null; + try { + recommendationsAPIObjects = DBHelpers.Converters.KruizeObjectConverters + .convertLMRecommendationEntryToRecommendationAPIObject(recommendationEntries); + } catch (InvalidConversionOfRecommendationEntryException e) { + e.printStackTrace(); + } + if (null != recommendationsAPIObjects && !recommendationsAPIObjects.isEmpty()) { + experimentInterface.addRecommendationsToLocalStorage(mainKruizeExperimentMap, + recommendationsAPIObjects, + true); + } + } + } + public void loadAllRecommendations(Map mainKruizeExperimentMap) throws Exception { ExperimentInterface experimentInterface = new ExperimentInterfaceImpl(); @@ -144,7 +167,6 @@ public void loadAllRecommendations(Map 
mainKruizeExperimen e.printStackTrace(); } if (null != recommendationsAPIObjects && !recommendationsAPIObjects.isEmpty()) { - experimentInterface.addRecommendationsToLocalStorage(mainKruizeExperimentMap, recommendationsAPIObjects, true); @@ -228,12 +250,32 @@ public void loadRecommendationsFromDBByName(Map mainKruize } } + public void loadLMRecommendationsFromDBByName(Map mainKruizeExperimentMap, String experimentName) throws Exception { + ExperimentInterface experimentInterface = new ExperimentInterfaceImpl(); + // Load Recommendations from DB and save to local + List recommendationEntries = experimentDAO.loadLMRecommendationsByExperimentName(experimentName); + if (null != recommendationEntries && !recommendationEntries.isEmpty()) { + List recommendationsAPIObjects + = null; + try { + recommendationsAPIObjects = DBHelpers.Converters.KruizeObjectConverters + .convertLMRecommendationEntryToRecommendationAPIObject(recommendationEntries); + } catch (InvalidConversionOfRecommendationEntryException e) { + e.printStackTrace(); + } + if (null != recommendationsAPIObjects && !recommendationsAPIObjects.isEmpty()) { + experimentInterface.addRecommendationsToLocalStorage(mainKruizeExperimentMap, + recommendationsAPIObjects, + true); + } + } + } + public ValidationOutputData addExperimentToDB(CreateExperimentAPIObject createExperimentAPIObject) { ValidationOutputData validationOutputData = new ValidationOutputData(false, null, null); try { KruizeLMExperimentEntry kruizeLMExperimentEntry = DBHelpers.Converters.KruizeObjectConverters.convertCreateAPIObjToExperimentDBObj(createExperimentAPIObject); - LOGGER.debug("is_ros_enabled:{} , targetCluster:{} ", KruizeDeploymentInfo.is_ros_enabled, createExperimentAPIObject.getTargetCluster()); - if (KruizeDeploymentInfo.is_ros_enabled && createExperimentAPIObject.getTargetCluster().equalsIgnoreCase(AnalyzerConstants.REMOTE)) { + if (is_ros_enabled && createExperimentAPIObject.getTargetCluster().equalsIgnoreCase(AnalyzerConstants.REMOTE)) 
{ KruizeExperimentEntry oldKruizeExperimentEntry = new KruizeExperimentEntry(kruizeLMExperimentEntry); validationOutputData = this.experimentDAO.addExperimentToDB(oldKruizeExperimentEntry); } else { @@ -277,10 +319,22 @@ public ValidationOutputData addRecommendationToDB(Map expe LOGGER.error("Trying to locate Recommendation for non existent experiment: " + kruizeObject.getExperimentName()); return validationOutputData; // todo: need to set the correct message } - KruizeRecommendationEntry kr = DBHelpers.Converters.KruizeObjectConverters. - convertKruizeObjectTORecommendation(kruizeObject, interval_end_time); - if (null != kr) { - if (KruizeDeploymentInfo.local == true) { //todo this code will be removed + + if (KruizeDeploymentInfo.is_ros_enabled && kruizeObject.getTarget_cluster().equalsIgnoreCase(AnalyzerConstants.REMOTE)) { + KruizeRecommendationEntry kr = DBHelpers.Converters.KruizeObjectConverters. + convertKruizeObjectTORecommendation(kruizeObject, interval_end_time); + if (null != kr) { + ValidationOutputData tempValObj = new ExperimentDAOImpl().addRecommendationToDB(kr); + if (!tempValObj.isSuccess()) { + validationOutputData.setSuccess(false); + String errMsg = String.format("Experiment name : %s , Interval end time : %s | ", kruizeObject.getExperimentName(), interval_end_time); + validationOutputData.setMessage(validationOutputData.getMessage() + errMsg); + } + } + } else { + KruizeLMRecommendationEntry kr = DBHelpers.Converters.KruizeObjectConverters. 
+ convertKruizeObjectTOLMRecommendation(kruizeObject, interval_end_time); + if (null != kr) { // Create a Calendar object and set the time with the timestamp Calendar localDateTime = Calendar.getInstance(TimeZone.getTimeZone("UTC")); localDateTime.setTime(kr.getInterval_end_time()); @@ -288,19 +342,20 @@ public ValidationOutputData addRecommendationToDB(Map expe int dayOfTheMonth = localDateTime.get(Calendar.DAY_OF_MONTH); try { synchronized (new Object()) { - dao.addPartitions(DBConstants.TABLE_NAMES.KRUIZE_RECOMMENDATIONS, String.format("%02d", localDateTime.get(Calendar.MONTH) + 1), String.valueOf(localDateTime.get(Calendar.YEAR)), dayOfTheMonth, DBConstants.PARTITION_TYPES.BY_DAY); + dao.addPartitions(DBConstants.TABLE_NAMES.KRUIZE_LM_RECOMMENDATIONS, String.format("%02d", localDateTime.get(Calendar.MONTH) + 1), String.valueOf(localDateTime.get(Calendar.YEAR)), dayOfTheMonth, DBConstants.PARTITION_TYPES.BY_DAY); } } catch (Exception e) { LOGGER.warn(e.getMessage()); } - } - ValidationOutputData tempValObj = new ExperimentDAOImpl().addRecommendationToDB(kr); - if (!tempValObj.isSuccess()) { - validationOutputData.setSuccess(false); - String errMsg = String.format("Experiment name : %s , Interval end time : %s | ", kruizeObject.getExperimentName(), interval_end_time); - validationOutputData.setMessage(validationOutputData.getMessage() + errMsg); + ValidationOutputData tempValObj = new ExperimentDAOImpl().addRecommendationToDB(kr); + if (!tempValObj.isSuccess()) { + validationOutputData.setSuccess(false); + String errMsg = String.format("Experiment name : %s , Interval end time : %s | ", kruizeObject.getExperimentName(), interval_end_time); + validationOutputData.setMessage(validationOutputData.getMessage() + errMsg); + } } } + if (validationOutputData.getMessage().equals("")) validationOutputData.setSuccess(true); return validationOutputData; @@ -435,6 +490,13 @@ public void loadExperimentAndRecommendationsFromDBByName(Map mainKruizeExperimentMap, String 
experimentName) throws Exception { + + loadLMExperimentFromDBByName(mainKruizeExperimentMap, experimentName); + + loadLMRecommendationsFromDBByName(mainKruizeExperimentMap, experimentName); + } + public void loadPerformanceProfileFromDBByName(Map performanceProfileMap, String performanceProfileName) throws Exception { List entries = experimentDAO.loadPerformanceProfileByName(performanceProfileName); if (null != entries && !entries.isEmpty()) { @@ -479,6 +541,13 @@ public void loadAllExperimentsAndRecommendations(Map mainK loadAllRecommendations(mainKruizeExperimentMap); } + public void loadAllLMExperimentsAndRecommendations(Map mainKruizeExperimentMap) throws Exception { + + loadAllLMExperiments(mainKruizeExperimentMap); + + loadAllLMRecommendations(mainKruizeExperimentMap); + } + public boolean updateExperimentStatus(KruizeObject kruizeObject, AnalyzerConstants.ExperimentStatus status) { kruizeObject.setStatus(status); // TODO update into database diff --git a/src/main/java/com/autotune/database/table/KruizeRecommendationEntry.java b/src/main/java/com/autotune/database/table/KruizeRecommendationEntry.java index d3d490f0c..355590361 100644 --- a/src/main/java/com/autotune/database/table/KruizeRecommendationEntry.java +++ b/src/main/java/com/autotune/database/table/KruizeRecommendationEntry.java @@ -1,5 +1,6 @@ package com.autotune.database.table; +import com.autotune.database.table.lm.KruizeLMRecommendationEntry; import com.fasterxml.jackson.databind.JsonNode; import jakarta.persistence.*; import org.hibernate.annotations.JdbcTypeCode; @@ -30,6 +31,18 @@ public class KruizeRecommendationEntry { @Transient private String experiment_type; + public KruizeRecommendationEntry(KruizeLMRecommendationEntry recommendationEntry) { + this.experiment_name = recommendationEntry.getExperiment_name(); + this.interval_end_time = recommendationEntry.getInterval_end_time(); + this.cluster_name = recommendationEntry.getCluster_name(); + this.extended_data = 
recommendationEntry.getExtended_data(); + this.version = recommendationEntry.getVersion(); + } + + public KruizeRecommendationEntry() { + + } + public String getExperiment_name() { return experiment_name; } diff --git a/src/main/java/com/autotune/database/table/lm/KruizeLMRecommendationEntry.java b/src/main/java/com/autotune/database/table/lm/KruizeLMRecommendationEntry.java new file mode 100644 index 000000000..061a7b74d --- /dev/null +++ b/src/main/java/com/autotune/database/table/lm/KruizeLMRecommendationEntry.java @@ -0,0 +1,82 @@ +package com.autotune.database.table.lm; + +import com.fasterxml.jackson.databind.JsonNode; +import jakarta.persistence.Entity; +import jakarta.persistence.Id; +import jakarta.persistence.Index; +import jakarta.persistence.Table; +import org.hibernate.annotations.JdbcTypeCode; +import org.hibernate.type.SqlTypes; + +import java.sql.Timestamp; + +@Entity +@Table(name = "kruize_lm_recommendations", indexes = { + @Index( + name = "idx_recommendation_experiment_name", + columnList = "experiment_name", + unique = false), + @Index( + name = "idx_recommendation_interval_end_time", + columnList = "interval_end_time", + unique = false) +}) +public class KruizeLMRecommendationEntry { + private String version; + @Id + private String experiment_name; + @Id + private Timestamp interval_end_time; + private String cluster_name; + @JdbcTypeCode(SqlTypes.JSON) + private JsonNode extended_data; + private String experiment_type; + + public String getExperiment_name() { + return experiment_name; + } + + public void setExperiment_name(String experiment_name) { + this.experiment_name = experiment_name; + } + + public Timestamp getInterval_end_time() { + return interval_end_time; + } + + public void setInterval_end_time(Timestamp interval_end_time) { + this.interval_end_time = interval_end_time; + } + + public String getCluster_name() { + return cluster_name; + } + + public void setCluster_name(String cluster_name) { + this.cluster_name = cluster_name; + } 
+ + public JsonNode getExtended_data() { + return extended_data; + } + + public void setExtended_data(JsonNode extended_data) { + this.extended_data = extended_data; + } + + public String getVersion() { + return version; + } + + public void setVersion(String version) { + this.version = version; + } + + public String getExperimentType() { + return experiment_type; + } + + public void setExperimentType(String experimentType) { + this.experiment_type = experimentType; + } +} From e542a9d24038c25386bfe9e39f9e9461b8d78a7d Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Sun, 15 Dec 2024 14:14:48 +0530 Subject: [PATCH 63/85] github checks fix Signed-off-by: msvinaykumar --- tests/scripts/helpers/kruize.py | 14 ++++++++++---- .../rest_apis/test_e2e_workflow.py | 13 ++++++++----- 2 files changed, 18 insertions(+), 9 deletions(-) diff --git a/tests/scripts/helpers/kruize.py b/tests/scripts/helpers/kruize.py index 0e8035073..50985cc6a 100644 --- a/tests/scripts/helpers/kruize.py +++ b/tests/scripts/helpers/kruize.py @@ -150,10 +150,12 @@ def update_recommendations(experiment_name, startTime, endTime): # Description: This function obtains the recommendations from Kruize Autotune using listRecommendations API # Input Parameters: experiment name, flag indicating latest result and monitoring end time -def list_recommendations(experiment_name=None, latest=None, monitoring_end_time=None): +def list_recommendations(experiment_name=None, latest=None, monitoring_end_time=None, rm=False): PARAMS = "" print("\nListing the recommendations...") url = URL + "/listRecommendations" + if rm: + url += "?rm=true" print("URL = ", url) if experiment_name == None: @@ -391,6 +393,7 @@ def create_metric_profile(metric_profile_json_file): print(response.text) return response + # Description: This function deletes the metric profile # Input Parameters: metric profile input json def delete_metric_profile(input_json_file, invalid_header=False): @@ -447,6 +450,7 @@ def list_metric_profiles(name=None, 
verbose=None, logging=True): print("\n************************************************************") return response + # Description: This function generates recommendation for the given experiment_name def generate_recommendations(experiment_name): print("\n************************************************************") @@ -464,6 +468,7 @@ def generate_recommendations(experiment_name): print("\n************************************************************") return response + def post_bulk_api(input_json_file): print("\n************************************************************") print("Sending POST request to URL: ", f"{URL}/bulk") @@ -477,18 +482,19 @@ def post_bulk_api(input_json_file): print("Response JSON: ", response.json()) return response -def get_bulk_job_status(job_id,verbose=False): + +def get_bulk_job_status(job_id, verbose=False): print("\n************************************************************") url_basic = f"{URL}/bulk?job_id={job_id}" url_verbose = f"{URL}/bulk?job_id={job_id}&verbose={verbose}" getJobIDURL = url_basic if verbose: getJobIDURL = url_verbose - print("Sending GET request to URL ( verbose=",verbose," ): ", getJobIDURL) + print("Sending GET request to URL ( verbose=", verbose, " ): ", getJobIDURL) curl_command_verbose = f"curl -X GET '{getJobIDURL}'" print("Equivalent cURL command : ", curl_command_verbose) response = requests.get(url_verbose) print("Verbose GET Response Status Code: ", response.status_code) print("Verbose GET Response JSON: ", response.json()) - return response \ No newline at end of file + return response diff --git a/tests/scripts/remote_monitoring_tests/rest_apis/test_e2e_workflow.py b/tests/scripts/remote_monitoring_tests/rest_apis/test_e2e_workflow.py index a956ac732..27d22f82f 100644 --- a/tests/scripts/remote_monitoring_tests/rest_apis/test_e2e_workflow.py +++ b/tests/scripts/remote_monitoring_tests/rest_apis/test_e2e_workflow.py @@ -18,6 +18,7 @@ import pytest import sys + sys.path.append("../../") from 
helpers.fixtures import * @@ -120,10 +121,11 @@ def test_list_recommendations_multiple_exps_from_diff_json_files(cluster_type): data = response.json() assert response.status_code == SUCCESS_STATUS_CODE assert data[0]['experiment_name'] == experiment_name - assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE][ + assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][ + NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE][ 'message'] == RECOMMENDATIONS_AVAILABLE - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) if response.status_code == SUCCESS_200_STATUS_CODE: recommendation_json = response.json() recommendation_section = recommendation_json[0]["kubernetes_objects"][0]["containers"][0][ @@ -133,13 +135,14 @@ def test_list_recommendations_multiple_exps_from_diff_json_files(cluster_type): assert INFO_RECOMMENDATIONS_AVAILABLE_CODE in high_level_notifications data_section = recommendation_section["data"] short_term_recommendation = \ - data_section[end_time.strftime("%Y-%m-%dT%H:%M:%S.%fZ")[:-4] + "Z"]["recommendation_terms"]["short_term"] + data_section[end_time.strftime("%Y-%m-%dT%H:%M:%S.%fZ")[:-4] + "Z"]["recommendation_terms"][ + "short_term"] short_term_notifications = short_term_recommendation["notifications"] for notification in short_term_notifications.values(): assert notification["type"] != "error" # Invoke list recommendations for the specified experiment - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) assert response.status_code == SUCCESS_200_STATUS_CODE list_reco_json = response.json() @@ -157,7 +160,7 @@ def test_list_recommendations_multiple_exps_from_diff_json_files(cluster_type): # Invoke list recommendations for a non-existing experiment experiment_name = "Non-existing-exp" - response = 
list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) assert response.status_code == ERROR_STATUS_CODE data = response.json() From ad2936cbaa13fee6b39832b9069f684f263f11d0 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Mon, 16 Dec 2024 18:44:51 +0530 Subject: [PATCH 64/85] resolving conflict Signed-off-by: msvinaykumar --- .../database/dao/ExperimentDAOImpl.java | 24 +------------------ 1 file changed, 1 insertion(+), 23 deletions(-) diff --git a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java index 597c51f60..b772a62ea 100644 --- a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java +++ b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java @@ -793,29 +793,7 @@ public List loadAllLMExperiments() throws Exception { return entries; } - @Override - public List loadAllLMExperiments() throws Exception { - //todo load only experimentStatus=inprogress , playback may not require completed experiments - List entries = null; - String statusValue = "failure"; - Timer.Sample timerLoadAllExp = Timer.start(MetricsConfig.meterRegistry()); - try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) { - entries = session.createQuery(SELECT_FROM_LM_EXPERIMENTS, KruizeLMExperimentEntry.class).list(); - // TODO: remove native sql query and transient - //getExperimentTypeInKruizeExperimentEntry(entries); - statusValue = "success"; - } catch (Exception e) { - LOGGER.error("Not able to load experiment due to {}", e.getMessage()); - throw new Exception("Error while loading exsisting experiments from database due to : " + e.getMessage()); - } finally { - if (null != timerLoadAllExp) { - MetricsConfig.timerLoadAllExp = MetricsConfig.timerBLoadAllExp.tag("status", statusValue).register(MetricsConfig.meterRegistry()); - timerLoadAllExp.stop(MetricsConfig.timerLoadAllExp); - } - } - return entries; - } - + @Override public List 
loadAllResults() throws Exception { // TODO: load only experimentStatus=inProgress , playback may not require completed experiments From 514de698cdbd2379acfb2a7db4a7c22a40c947fc Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Fri, 13 Dec 2024 22:59:47 +0530 Subject: [PATCH 65/85] list Recommendations to support both rm and lm Signed-off-by: msvinaykumar --- src/main/java/com/autotune/database/dao/ExperimentDAO.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/main/java/com/autotune/database/dao/ExperimentDAO.java b/src/main/java/com/autotune/database/dao/ExperimentDAO.java index a918c71de..f4beb40bd 100644 --- a/src/main/java/com/autotune/database/dao/ExperimentDAO.java +++ b/src/main/java/com/autotune/database/dao/ExperimentDAO.java @@ -48,7 +48,7 @@ public interface ExperimentDAO { // If Kruize object restarts load all experiment which are in inprogress public List loadAllExperiments() throws Exception; - // If Kruize object restarts load all local monitoring experiments which are in inprogress + public List loadAllLMExperiments() throws Exception; // If Kruize object restarts load all results from the experiments which are in inprogress From f2f1d48874c2d2a3aa32d8391503b89c4b1693a1 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Sun, 15 Dec 2024 15:57:26 +0530 Subject: [PATCH 66/85] listExperiments fix for concurrent rm and lm Signed-off-by: msvinaykumar --- .../analyzer/services/ListExperiments.java | 52 ++++++-- .../autotune/database/dao/ExperimentDAO.java | 2 + .../database/dao/ExperimentDAOImpl.java | 32 +++++ .../autotune/database/helper/DBConstants.java | 9 ++ .../database/service/ExperimentDBService.java | 20 +++ .../autotune/utils/KruizeSupportedTypes.java | 119 ++++++++---------- 6 files changed, 153 insertions(+), 81 deletions(-) diff --git a/src/main/java/com/autotune/analyzer/services/ListExperiments.java b/src/main/java/com/autotune/analyzer/services/ListExperiments.java index b8ca71447..cd8631aee 100644 --- 
a/src/main/java/com/autotune/analyzer/services/ListExperiments.java +++ b/src/main/java/com/autotune/analyzer/services/ListExperiments.java @@ -107,12 +107,14 @@ protected void doGet(HttpServletRequest request, HttpServletResponse response) t String latest = request.getParameter(LATEST); String recommendations = request.getParameter(KruizeConstants.JSONKeys.RECOMMENDATIONS); String experimentName = request.getParameter(EXPERIMENT_NAME); + String rm = request.getParameter(AnalyzerConstants.ServiceConstants.RM); String requestBody = request.getReader().lines().collect(Collectors.joining(System.lineSeparator())); StringBuilder clusterName = new StringBuilder(); List kubernetesAPIObjectList = new ArrayList<>(); boolean isJSONValid = true; Map mKruizeExperimentMap = new ConcurrentHashMap<>(); boolean error = false; + boolean rmTable = false; // validate Query params Set invalidParams = new HashSet<>(); for (String param : request.getParameterMap().keySet()) { @@ -120,6 +122,12 @@ protected void doGet(HttpServletRequest request, HttpServletResponse response) t invalidParams.add(param); } } + if (null != rm + && !rm.isEmpty() + && rm.equalsIgnoreCase(AnalyzerConstants.BooleanString.TRUE) + ) { + rmTable = true; + } try { if (invalidParams.isEmpty()) { // Set default values if absent @@ -142,13 +150,21 @@ protected void doGet(HttpServletRequest request, HttpServletResponse response) t // parse the requestBody JSON into corresponding classes parseInputJSON(requestBody, clusterName, kubernetesAPIObjectList); try { - new ExperimentDBService().loadExperimentFromDBByInputJSON(mKruizeExperimentMap, clusterName, kubernetesAPIObjectList); + if (rmTable) + new ExperimentDBService().loadExperimentFromDBByInputJSON(mKruizeExperimentMap, clusterName, kubernetesAPIObjectList); + else { + new ExperimentDBService().loadLMExperimentFromDBByInputJSON(mKruizeExperimentMap, clusterName, kubernetesAPIObjectList); + } } catch (Exception e) { LOGGER.error("Failed to load saved experiment data: 
{} ", e.getMessage()); } } else { // Fetch experiments data from the DB and check if the requested experiment exists - loadExperimentsFromDatabase(mKruizeExperimentMap, experimentName); + if (rmTable) { + loadExperimentsFromDatabase(mKruizeExperimentMap, experimentName); + } else { + loadLMExperimentsFromDatabase(mKruizeExperimentMap, experimentName); + } } // Check if experiment exists if (experimentName != null && !mKruizeExperimentMap.containsKey(experimentName)) { @@ -161,18 +177,18 @@ protected void doGet(HttpServletRequest request, HttpServletResponse response) t ); } if (!error) { - // create Gson Object - Gson gsonObj = createGsonObject(); + // create Gson Object + Gson gsonObj = createGsonObject(); - // Modify the JSON response here based on query params. - gsonStr = buildResponseBasedOnQuery(mKruizeExperimentMap, gsonObj, results, recommendations, latest, experimentName); - if (gsonStr.isEmpty()) { - gsonStr = generateDefaultResponse(); - } - response.getWriter().println(gsonStr); - response.getWriter().close(); - statusValue = "success"; + // Modify the JSON response here based on query params. 
+ gsonStr = buildResponseBasedOnQuery(mKruizeExperimentMap, gsonObj, results, recommendations, latest, experimentName); + if (gsonStr.isEmpty()) { + gsonStr = generateDefaultResponse(); } + response.getWriter().println(gsonStr); + response.getWriter().close(); + statusValue = "success"; + } } catch (Exception e) { LOGGER.error("Exception: " + e.getMessage()); e.printStackTrace(); @@ -278,6 +294,18 @@ private void loadExperimentsFromDatabase(Map mKruizeExperi } } + private void loadLMExperimentsFromDatabase(Map mKruizeExperimentMap, String experimentName) { + try { + if (experimentName == null || experimentName.isEmpty()) + new ExperimentDBService().loadAllLMExperiments(mKruizeExperimentMap); + else + new ExperimentDBService().loadLMExperimentFromDBByName(mKruizeExperimentMap, experimentName); + + } catch (Exception e) { + LOGGER.error("Failed to load saved experiment data: {} ", e.getMessage()); + } + } + private Gson createGsonObject() { return new GsonBuilder() .disableHtmlEscaping() diff --git a/src/main/java/com/autotune/database/dao/ExperimentDAO.java b/src/main/java/com/autotune/database/dao/ExperimentDAO.java index f4beb40bd..b8cc9c897 100644 --- a/src/main/java/com/autotune/database/dao/ExperimentDAO.java +++ b/src/main/java/com/autotune/database/dao/ExperimentDAO.java @@ -108,6 +108,8 @@ public interface ExperimentDAO { List loadExperimentFromDBByInputJSON(StringBuilder clusterName, KubernetesAPIObject kubernetesAPIObject) throws Exception; + List loadLMExperimentFromDBByInputJSON(StringBuilder clusterName, KubernetesAPIObject kubernetesAPIObject) throws Exception; + // Load all the datasources List loadAllDataSources() throws Exception; diff --git a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java index b772a62ea..47c5d5b31 100644 --- a/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java +++ b/src/main/java/com/autotune/database/dao/ExperimentDAOImpl.java @@ -986,6 
+986,38 @@ public List<KruizeExperimentEntry> loadExperimentFromDBByInputJSON(StringBuilder return entries; } + @Override + public List<KruizeLMExperimentEntry> loadLMExperimentFromDBByInputJSON(StringBuilder clusterName, KubernetesAPIObject kubernetesAPIObject) throws Exception { + // TODO: load only experimentStatus=inprogress; playback may not require completed experiments + List<KruizeLMExperimentEntry> entries; + String statusValue = "failure"; + Timer.Sample timerLoadExpName = Timer.start(MetricsConfig.meterRegistry()); + try (Session session = KruizeHibernateUtil.getSessionFactory().openSession()) { + // assuming there will be only one container + ContainerAPIObject containerAPIObject = kubernetesAPIObject.getContainerAPIObjects().get(0); + // Set parameters for KubernetesObject and Container + Query<KruizeLMExperimentEntry> query = session.createNativeQuery(SELECT_FROM_LM_EXPERIMENTS_BY_INPUT_JSON, KruizeLMExperimentEntry.class); + query.setParameter(CLUSTER_NAME, clusterName.toString()); + query.setParameter(KruizeConstants.JSONKeys.NAME, kubernetesAPIObject.getName()); + query.setParameter(KruizeConstants.JSONKeys.NAMESPACE, kubernetesAPIObject.getNamespace()); + query.setParameter(KruizeConstants.JSONKeys.TYPE, kubernetesAPIObject.getType()); + query.setParameter(KruizeConstants.JSONKeys.CONTAINER_NAME, containerAPIObject.getContainer_name()); + query.setParameter(KruizeConstants.JSONKeys.CONTAINER_IMAGE_NAME, containerAPIObject.getContainer_image_name()); + + entries = query.getResultList(); + statusValue = "success"; + } catch (Exception e) { + LOGGER.error("Error fetching experiment data: {}", e.getMessage()); + throw new Exception("Error while fetching experiment data from database: " + e.getMessage()); + } finally { + if (null != timerLoadExpName) { + MetricsConfig.timerLoadExpName = MetricsConfig.timerBLoadExpName.tag("status", statusValue).register(MetricsConfig.meterRegistry()); + timerLoadExpName.stop(MetricsConfig.timerLoadExpName); + } + } + return entries; + } + @Override public List<KruizeResultsEntry> loadResultsByExperimentName(String experimentName,
String cluster_name, Timestamp calculated_start_time, Timestamp interval_end_time) throws Exception { diff --git a/src/main/java/com/autotune/database/helper/DBConstants.java b/src/main/java/com/autotune/database/helper/DBConstants.java index 2e9b55f93..fdcacc163 100644 --- a/src/main/java/com/autotune/database/helper/DBConstants.java +++ b/src/main/java/com/autotune/database/helper/DBConstants.java @@ -85,6 +85,15 @@ public static final class SQLQUERY { " WHERE container->>'container_name' = :container_name" + " AND container->>'container_image_name' = :container_image_name" + " ))"; + public static final String SELECT_FROM_LM_EXPERIMENTS_BY_INPUT_JSON = "SELECT * FROM kruize_lm_experiments WHERE cluster_name = :cluster_name " + + "AND EXISTS (SELECT 1 FROM jsonb_array_elements(extended_data->'kubernetes_objects') AS kubernetes_object" + + " WHERE kubernetes_object->>'name' = :name " + + " AND kubernetes_object->>'namespace' = :namespace " + + " AND kubernetes_object->>'type' = :type " + + " AND EXISTS (SELECT 1 FROM jsonb_array_elements(kubernetes_object->'containers') AS container" + + " WHERE container->>'container_name' = :container_name" + + " AND container->>'container_image_name' = :container_image_name" + + " ))"; } public static final class TABLE_NAMES { diff --git a/src/main/java/com/autotune/database/service/ExperimentDBService.java b/src/main/java/com/autotune/database/service/ExperimentDBService.java index 370cf632c..1572c4873 100644 --- a/src/main/java/com/autotune/database/service/ExperimentDBService.java +++ b/src/main/java/com/autotune/database/service/ExperimentDBService.java @@ -476,6 +476,26 @@ public void loadExperimentFromDBByInputJSON(Map mKruizeExp } } + public void loadLMExperimentFromDBByInputJSON(Map<String, KruizeObject> mKruizeExperimentMap, StringBuilder clusterName, List<KubernetesAPIObject> kubernetesAPIObjectList) throws Exception { + ExperimentInterface experimentInterface = new ExperimentInterfaceImpl(); + // assuming there will be only one Kubernetes object +
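The SELECT_FROM_LM_EXPERIMENTS_BY_INPUT_JSON query above filters `kruize_lm_experiments` rows with nested EXISTS clauses over the `extended_data` JSONB column: first matching a kubernetes object by name/namespace/type, then a container by name and image. A minimal in-memory Python sketch of the same filter semantics — purely illustrative, since the real filter runs inside PostgreSQL:

```python
# Mirror of the nested-EXISTS JSONB filter: an experiment matches when some
# kubernetes_object in extended_data matches name/namespace/type AND some
# container inside it matches container_name/container_image_name.

def matches_experiment(extended_data, name, namespace, k8s_type,
                       container_name, container_image):
    """Return True if extended_data satisfies the same predicate as the SQL."""
    for obj in extended_data.get("kubernetes_objects", []):
        if (obj.get("name") == name and obj.get("namespace") == namespace
                and obj.get("type") == k8s_type):
            for c in obj.get("containers", []):
                if (c.get("container_name") == container_name
                        and c.get("container_image_name") == container_image):
                    return True
    return False
```

Pushing this predicate into SQL (rather than loading all rows and filtering in Java) lets PostgreSQL's JSONB operators prune non-matching experiments before they cross the wire.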
KubernetesAPIObject kubernetesAPIObject = kubernetesAPIObjectList.get(0); + List<KruizeLMExperimentEntry> entries = experimentDAO.loadLMExperimentFromDBByInputJSON(clusterName, kubernetesAPIObject); + if (null != entries && !entries.isEmpty()) { + List<CreateExperimentAPIObject> createExperimentAPIObjects = DBHelpers.Converters.KruizeObjectConverters.convertLMExperimentEntryToCreateExperimentAPIObject(entries); + if (!createExperimentAPIObjects.isEmpty()) { + List<KruizeObject> kruizeExpList = new ArrayList<>(); + for (CreateExperimentAPIObject createExperimentAPIObject : createExperimentAPIObjects) { + KruizeObject kruizeObject = Converters.KruizeObjectConverters.convertCreateExperimentAPIObjToKruizeObject(createExperimentAPIObject); + if (null != kruizeObject) { + kruizeExpList.add(kruizeObject); + } + } + experimentInterface.addExperimentToLocalStorage(mKruizeExperimentMap, kruizeExpList); + } + } + } + public void loadExperimentAndResultsFromDBByName(Map<String, KruizeObject> mainKruizeExperimentMap, String experimentName) throws Exception { loadExperimentFromDBByName(mainKruizeExperimentMap, experimentName); diff --git a/src/main/java/com/autotune/utils/KruizeSupportedTypes.java b/src/main/java/com/autotune/utils/KruizeSupportedTypes.java index d297efd2f..638fd2e5b 100644 --- a/src/main/java/com/autotune/utils/KruizeSupportedTypes.java +++ b/src/main/java/com/autotune/utils/KruizeSupportedTypes.java @@ -22,73 +22,54 @@ /** * Supported types to both Autotune and KruizeLayer objects */ -public class KruizeSupportedTypes -{ - private KruizeSupportedTypes() { } - - public static final Set<String> DIRECTIONS_SUPPORTED = - new HashSet<>(Arrays.asList("minimize", "maximize")); - - public static final Set<String> MONITORING_AGENTS_SUPPORTED = - new HashSet<>(Arrays.asList("prometheus")); - - public static final Set<String> MODES_SUPPORTED = - new HashSet<>(Arrays.asList("experiment", "monitor")); - - public static final Set<String> TARGET_CLUSTERS_SUPPORTED = - new HashSet<>(Arrays.asList("local", "remote")); - - public static final Set<String> PRESENCE_SUPPORTED = - new
HashSet<>(Arrays.asList("always", "", null)); - - public static final Set<String> SLO_CLASSES_SUPPORTED = - new HashSet<>(Arrays.asList("throughput", "response_time", "resource_usage")); - - public static final Set<String> LAYERS_SUPPORTED = - new HashSet<>(Arrays.asList("container", "hotspot", "quarkus")); - - public static final Set<String> VALUE_TYPES_SUPPORTED = - new HashSet<>(Arrays.asList("double", "int", "string", "categorical")); - - public static final Set<String> CLUSTER_TYPES_SUPPORTED = - new HashSet<>(Arrays.asList("kubernetes")); - - public static final Set<String> K8S_TYPES_SUPPORTED = - new HashSet<>(Arrays.asList("minikube", "openshift", "icp", null)); - - public static final Set<String> AUTH_TYPES_SUPPORTED = - new HashSet<>(Arrays.asList("saml", "oidc", "", null)); - - public static final Set<String> LOGGING_TYPES_SUPPORTED = - new HashSet<>(Arrays.asList("all", "debug", "error", "info", "off", "warn")); - - public static final Set<String> HPO_ALGOS_SUPPORTED = - new HashSet<>(Arrays.asList("optuna_tpe", "optuna_tpe_multivariate", "optuna_skopt", null)); - - public static final Set<String> MATH_OPERATORS_SUPPORTED = - new HashSet<>(Arrays.asList("+", "-", "*", "/", "^","%","sin", "cos", "tan", "log")); - - public static final Set<String> OBJECTIVE_FUNCTION_LIST = - new HashSet<>(Arrays.asList("(( throughput / transaction_response_time) / max_response_time) * 100", - "request_sum/request_count", - "(1.25 * request_count) - (1.5 * (request_sum / request_count)) - (0.25 * request_max)", - "((request_count / (request_sum / request_count)) / request_max) * 100")); - - public static final Set<String> KUBERNETES_OBJECTS_SUPPORTED = - new HashSet<>(Arrays.asList("deployment", "pod", "container", "namespace")); - - public static final Set<String> DSMETADATA_QUERY_PARAMS_SUPPORTED = new HashSet<>(Arrays.asList( - "datasource", "cluster_name", "namespace", "verbose" - )); - - public static final Set<String> SUPPORTED_FORMATS = - new HashSet<>(Arrays.asList("cores", "m", "Bytes", "bytes", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "Ki", "Mi", "Gi", "Ti", "Pi",
"Ei", "kB", "KB", "MB", "GB", "TB", "PB", "EB", "K", "k", "M", "G", "T", "P", "E")); - - public static final Set<String> QUERY_PARAMS_SUPPORTED = new HashSet<>(Arrays.asList( - "experiment_name", "results", "recommendations", "latest" - )); - - public static final Set<String> LIST_METRIC_PROFILES_QUERY_PARAMS_SUPPORTED = new HashSet<>(Arrays.asList( - "name", "verbose" - )); +public class KruizeSupportedTypes { + public static final Set<String> DIRECTIONS_SUPPORTED = + new HashSet<>(Arrays.asList("minimize", "maximize")); + public static final Set<String> MONITORING_AGENTS_SUPPORTED = + new HashSet<>(Arrays.asList("prometheus")); + public static final Set<String> MODES_SUPPORTED = + new HashSet<>(Arrays.asList("experiment", "monitor")); + public static final Set<String> TARGET_CLUSTERS_SUPPORTED = + new HashSet<>(Arrays.asList("local", "remote")); + public static final Set<String> PRESENCE_SUPPORTED = + new HashSet<>(Arrays.asList("always", "", null)); + public static final Set<String> SLO_CLASSES_SUPPORTED = + new HashSet<>(Arrays.asList("throughput", "response_time", "resource_usage")); + public static final Set<String> LAYERS_SUPPORTED = + new HashSet<>(Arrays.asList("container", "hotspot", "quarkus")); + public static final Set<String> VALUE_TYPES_SUPPORTED = + new HashSet<>(Arrays.asList("double", "int", "string", "categorical")); + public static final Set<String> CLUSTER_TYPES_SUPPORTED = + new HashSet<>(Arrays.asList("kubernetes")); + public static final Set<String> K8S_TYPES_SUPPORTED = + new HashSet<>(Arrays.asList("minikube", "openshift", "icp", null)); + public static final Set<String> AUTH_TYPES_SUPPORTED = + new HashSet<>(Arrays.asList("saml", "oidc", "", null)); + public static final Set<String> LOGGING_TYPES_SUPPORTED = + new HashSet<>(Arrays.asList("all", "debug", "error", "info", "off", "warn")); + public static final Set<String> HPO_ALGOS_SUPPORTED = + new HashSet<>(Arrays.asList("optuna_tpe", "optuna_tpe_multivariate", "optuna_skopt", null)); + public static final Set<String> MATH_OPERATORS_SUPPORTED = + new HashSet<>(Arrays.asList("+", "-", "*", "/", "^", "%", "sin",
"cos", "tan", "log")); + public static final Set<String> OBJECTIVE_FUNCTION_LIST = + new HashSet<>(Arrays.asList("(( throughput / transaction_response_time) / max_response_time) * 100", + "request_sum/request_count", + "(1.25 * request_count) - (1.5 * (request_sum / request_count)) - (0.25 * request_max)", + "((request_count / (request_sum / request_count)) / request_max) * 100")); + public static final Set<String> KUBERNETES_OBJECTS_SUPPORTED = + new HashSet<>(Arrays.asList("deployment", "pod", "container", "namespace")); + public static final Set<String> DSMETADATA_QUERY_PARAMS_SUPPORTED = new HashSet<>(Arrays.asList( + "datasource", "cluster_name", "namespace", "verbose" + )); + public static final Set<String> SUPPORTED_FORMATS = + new HashSet<>(Arrays.asList("cores", "m", "Bytes", "bytes", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "kB", "KB", "MB", "GB", "TB", "PB", "EB", "K", "k", "M", "G", "T", "P", "E")); + public static final Set<String> QUERY_PARAMS_SUPPORTED = new HashSet<>(Arrays.asList( + "experiment_name", "results", "recommendations", "latest", "rm" + )); + public static final Set<String> LIST_METRIC_PROFILES_QUERY_PARAMS_SUPPORTED = new HashSet<>(Arrays.asList( + "name", "verbose" + )); + + private KruizeSupportedTypes() { + } } From 97968771257661e467f9f977e1a19c79eb9c8513 Mon Sep 17 00:00:00 2001 From: Dinakar Guniguntala Date: Tue, 17 Dec 2024 18:55:24 +0530 Subject: [PATCH 67/85] Bump mvp_demo to Kruize version v0.3 Signed-off-by: Dinakar Guniguntala --- .../BYODB-installation/minikube/kruize-crc-minikube.yaml | 2 +- .../BYODB-installation/openshift/kruize-crc-openshift.yaml | 2 +- .../aks/kruize-crc-aks.yaml | 6 +++--- .../minikube/kruize-crc-minikube.yaml | 6 +++--- .../openshift/kruize-crc-openshift.yaml | 6 +++--- pom.xml | 2 +- 6 files changed, 12 insertions(+), 12 deletions(-) diff --git a/manifests/crc/BYODB-installation/minikube/kruize-crc-minikube.yaml b/manifests/crc/BYODB-installation/minikube/kruize-crc-minikube.yaml index 3b7cce4f1..06c5e75f5
100644 --- a/manifests/crc/BYODB-installation/minikube/kruize-crc-minikube.yaml +++ b/manifests/crc/BYODB-installation/minikube/kruize-crc-minikube.yaml @@ -91,7 +91,7 @@ spec: done containers: - name: kruize - image: kruize/autotune_operator:0.2 + image: kruize/autotune_operator:0.3 imagePullPolicy: Always volumeMounts: - name: config-volume diff --git a/manifests/crc/BYODB-installation/openshift/kruize-crc-openshift.yaml b/manifests/crc/BYODB-installation/openshift/kruize-crc-openshift.yaml index 052d267df..8b8cd80f7 100644 --- a/manifests/crc/BYODB-installation/openshift/kruize-crc-openshift.yaml +++ b/manifests/crc/BYODB-installation/openshift/kruize-crc-openshift.yaml @@ -119,7 +119,7 @@ spec: done containers: - name: kruize - image: kruize/autotune_operator:0.2 + image: kruize/autotune_operator:0.3 imagePullPolicy: Always volumeMounts: - name: config-volume diff --git a/manifests/crc/default-db-included-installation/aks/kruize-crc-aks.yaml b/manifests/crc/default-db-included-installation/aks/kruize-crc-aks.yaml index 3d21a4acc..ba4c86b37 100644 --- a/manifests/crc/default-db-included-installation/aks/kruize-crc-aks.yaml +++ b/manifests/crc/default-db-included-installation/aks/kruize-crc-aks.yaml @@ -154,7 +154,7 @@ spec: spec: containers: - name: kruize - image: quay.io/kruize/autotune_operator:0.2 + image: quay.io/kruize/autotune_operator:0.3 imagePullPolicy: Always volumeMounts: - name: config-volume @@ -219,7 +219,7 @@ spec: spec: containers: - name: kruizecronjob - image: quay.io/kruize/autotune_operator:0.2 + image: quay.io/kruize/autotune_operator:0.3 imagePullPolicy: Always volumeMounts: - name: config-volume @@ -345,7 +345,7 @@ spec: spec: containers: - name: kruizedeletejob - image: quay.io/kruize/autotune_operator:0.2 + image: quay.io/kruize/autotune_operator:0.3 imagePullPolicy: Always volumeMounts: - name: config-volume diff --git a/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml 
b/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml index 69ad63f59..f584b7e83 100644 --- a/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml +++ b/manifests/crc/default-db-included-installation/minikube/kruize-crc-minikube.yaml @@ -233,7 +233,7 @@ spec: done containers: - name: kruize - image: quay.io/kruize/autotune_operator:0.2 + image: quay.io/kruize/autotune_operator:0.3 imagePullPolicy: Always volumeMounts: - name: config-volume @@ -298,7 +298,7 @@ spec: spec: containers: - name: kruizecronjob - image: quay.io/kruize/autotune_operator:0.2 + image: quay.io/kruize/autotune_operator:0.3 imagePullPolicy: Always volumeMounts: - name: config-volume @@ -424,7 +424,7 @@ spec: spec: containers: - name: kruizedeletejob - image: quay.io/kruize/autotune_operator:0.2 + image: quay.io/kruize/autotune_operator:0.3 imagePullPolicy: Always volumeMounts: - name: config-volume diff --git a/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml b/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml index cf5b9c56b..62b4d08aa 100644 --- a/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml +++ b/manifests/crc/default-db-included-installation/openshift/kruize-crc-openshift.yaml @@ -298,7 +298,7 @@ spec: done containers: - name: kruize - image: quay.io/kruize/autotune_operator:0.2 + image: quay.io/kruize/autotune_operator:0.3 imagePullPolicy: Always volumeMounts: - name: config-volume @@ -370,7 +370,7 @@ spec: spec: containers: - name: kruizecronjob - image: quay.io/kruize/autotune_operator:0.2 + image: quay.io/kruize/autotune_operator:0.3 imagePullPolicy: Always volumeMounts: - name: config-volume @@ -411,7 +411,7 @@ spec: spec: containers: - name: kruizedeletejob - image: quay.io/kruize/autotune_operator:0.2 + image: quay.io/kruize/autotune_operator:0.3 imagePullPolicy: Always volumeMounts: - name: config-volume diff --git 
a/pom.xml b/pom.xml index 7d9f8655a..bd63d88f9 100644 --- a/pom.xml +++ b/pom.xml @@ -6,7 +6,7 @@ org.autotune autotune - 0.2 + 0.3 7.0.0 20240303 From dd531eb3dd1b8208c4f7e3f3cbe57bac70cfeea5 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Mon, 16 Dec 2024 17:53:49 +0530 Subject: [PATCH 68/85] Concurrent RM and LM Functional test scripts Signed-off-by: msvinaykumar --- .../kruize_pod_restart_test.py | 15 +- .../rest_apis/test_list_recommendations.py | 327 ++++++++++-------- .../rest_apis/test_update_recommendations.py | 19 +- 3 files changed, 208 insertions(+), 153 deletions(-) diff --git a/tests/scripts/local_monitoring_tests/fault_tolerant_test/kruize_pod_restart_test.py b/tests/scripts/local_monitoring_tests/fault_tolerant_test/kruize_pod_restart_test.py index b6b56d1dd..758f395ff 100644 --- a/tests/scripts/local_monitoring_tests/fault_tolerant_test/kruize_pod_restart_test.py +++ b/tests/scripts/local_monitoring_tests/fault_tolerant_test/kruize_pod_restart_test.py @@ -14,21 +14,24 @@ limitations under the License. """ -import sys, getopt +import getopt import json import os +import sys import time + sys.path.append("../../") from helpers.kruize import * from helpers.utils import * from helpers.generate_rm_jsons import * + def main(argv): cluster_type = "minikube" results_dir = "." 
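The fault-tolerant test script above parses its command-line flags with `getopt`, using the option string `"h:c:a:u:r:"`. A self-contained sketch of that parsing loop — the defaults mirror the script, and the helper name `parse_args` is a hypothetical wrapper, not part of the test suite:

```python
import getopt

def parse_args(argv):
    """Parse -c (cluster type), -a (kruize route), -r (results dir),
    mirroring the getopt loop in kruize_pod_restart_test.py."""
    cluster_type = "minikube"   # script default
    server_ip_addr = None       # only needed on openshift with exposed service
    results_dir = "."           # script default
    # Every option in "h:c:a:u:r:" takes a mandatory argument.
    opts, _ = getopt.getopt(argv, "h:c:a:u:r:")
    for opt, arg in opts:
        if opt == "-c":
            cluster_type = arg
        elif opt == "-a":
            server_ip_addr = arg
        elif opt == "-r":
            results_dir = arg
    return cluster_type, server_ip_addr, results_dir
```

Note that `getopt.getopt` raises `GetoptError` on an unknown flag, which the script catches to print its usage message and exit.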
failed = 0 try: - opts, args = getopt.getopt(argv,"h:c:a:u:r:") + opts, args = getopt.getopt(argv, "h:c:a:u:r:") except getopt.GetoptError: print("kruize_pod_restart_test.py -c -a -r ") print("Note: -a option is required only on openshift when kruize service is exposed") @@ -43,7 +46,6 @@ def main(argv): server_ip_addr = arg elif opt == '-r': results_dir = arg - print(f"Cluster type = {cluster_type}") print(f"Results dir = {results_dir}") @@ -110,7 +112,7 @@ def main(argv): experiment_name = None response = list_experiments(results, recommendations, latest, experiment_name) if response.status_code == SUCCESS_200_STATUS_CODE: - list_exp_json = response.json() + list_exp_json = response.json() else: print(f"listExperiments failed!") failed = 1 @@ -122,7 +124,7 @@ def main(argv): experiment_name = None latest = "false" interval_end_time = None - response = list_recommendations(experiment_name, latest, interval_end_time) + response = list_recommendations(experiment_name, latest, interval_end_time, rm=True) if response.status_code == SUCCESS_200_STATUS_CODE: list_reco_json_file_before = list_reco_json_dir + '/list_reco_json_before.json' write_json_data_to_file(list_reco_json_file_before, response.json()) @@ -193,7 +195,7 @@ def main(argv): # Fetch the recommendations for all the experiments latest = "false" interval_end_time = None - response = list_recommendations(experiment_name, latest, interval_end_time) + response = list_recommendations(experiment_name, latest, interval_end_time, rm=True) if response.status_code == SUCCESS_200_STATUS_CODE: list_reco_json_file_after = list_reco_json_dir + '/list_reco_json_after.json' write_json_data_to_file(list_reco_json_file_after, response.json()) @@ -268,5 +270,6 @@ def main(argv): print("Test Passed! 
Check the logs for details") sys.exit(0) + if __name__ == '__main__': main(sys.argv[1:]) diff --git a/tests/scripts/remote_monitoring_tests/rest_apis/test_list_recommendations.py b/tests/scripts/remote_monitoring_tests/rest_apis/test_list_recommendations.py index 9e2dd8c17..d22695861 100644 --- a/tests/scripts/remote_monitoring_tests/rest_apis/test_list_recommendations.py +++ b/tests/scripts/remote_monitoring_tests/rest_apis/test_list_recommendations.py @@ -18,6 +18,7 @@ import pytest import sys + sys.path.append("../../") from helpers.all_terms_list_reco_json_schema import all_terms_list_reco_json_schema @@ -54,10 +55,10 @@ ("medium_term_test_192_data_points_non_contiguous", 192, medium_term_list_reco_json_schema, 48.0, 0, False), ("long_term_test_768_data_points_non_continguous", 768, long_term_list_reco_json_schema, 192.0, 0, False), # Uncomment below in future when monitoring_end_time to updateRecommendations need not have result uploaded with the same end_time - #("short_term_test_2_data_points_end_time_after_1hr", 2, list_reco_json_schema, 0.5, 60), - #("medium_term_test_192_data_points_end_time_after_1hr", 192, medium_term_list_reco_json_schema, 48, 60), - #("long_term_test_768_data_points_end_time_after_1hr", 768, long_term_list_reco_json_schema, 192, 60), - #("long_term_test_769_data_points_end_time_after_1hr", 769, long_term_list_reco_json_schema, 192.25, 60), + # ("short_term_test_2_data_points_end_time_after_1hr", 2, list_reco_json_schema, 0.5, 60), + # ("medium_term_test_192_data_points_end_time_after_1hr", 192, medium_term_list_reco_json_schema, 48, 60), + # ("long_term_test_768_data_points_end_time_after_1hr", 768, long_term_list_reco_json_schema, 192, 60), + # ("long_term_test_769_data_points_end_time_after_1hr", 769, long_term_list_reco_json_schema, 192.25, 60), ] term_input_for_missing_terms = [ # test_name, num_res, reco_json_schema, expected_duration_in_hours, increment_end_time_by, logging, @@ -68,21 +69,30 @@ ("only_short_term_min_recomm", 2, 
short_term_list_reco_json_schema, 0.5, 0, True, False, False, True), ("only_medium_term_min_recomm", 192, medium_term_list_reco_json_schema, 48.0, 0, False, False, True, False), ("only_long_term_min_recomm", 768, long_term_list_reco_json_schema, 192.0, 0, False, True, False, False), - ("short_term_and_medium_term_min_recomm", 192, short_and_medium_term_list_reco_json_schema, 24.0, 0, False, False, True, True), - ("short_term_and_long_term_min_recomm", 768, short_and_long_term_list_reco_json_schema, 24.0, 0, False, True, False, True), - ("medium_term_and_long_term_min_recomm", 768, medium_and_long_term_list_reco_json_schema, 168.0, 0, False, True, True, False) + ("short_term_and_medium_term_min_recomm", 192, short_and_medium_term_list_reco_json_schema, 24.0, 0, False, False, + True, True), + ("short_term_and_long_term_min_recomm", 768, short_and_long_term_list_reco_json_schema, 24.0, 0, False, True, False, + True), + ("medium_term_and_long_term_min_recomm", 768, medium_and_long_term_list_reco_json_schema, 168.0, 0, False, True, + True, False) ] term_input_for_missing_terms_non_contiguous = [ # test_name, num_res, reco_json_schema, expected_duration_in_hours, increment_end_time_by # non-contiguous scenarios ("all_terms_min_recomm_non_contiguous", 768, all_terms_list_reco_json_schema, 192, 15, False, True, True, True), - ("only_short_term_min_recomm_non_contiguous", 2, short_term_list_reco_json_schema, 0.5, 15, True, False, False, True), - ("only_medium_term_min_recomm_non_contiguous", 192, medium_term_list_reco_json_schema, 48.0, 15, False, False, True, False), - ("only_long_term_min_recomm_non_contiguous", 768, long_term_list_reco_json_schema, 192.0, 15, False, True, False, False), - ("short_term_and_medium_term_min_recomm_non_contiguous", 192, short_and_medium_term_list_reco_json_schema, 24.0, 15, False, False, True, True), - ("short_term_and_long_term_min_recomm_non_contiguous", 768, short_and_long_term_list_reco_json_schema, 24.0, 15, False, True, False, True), - 
("medium_term_and_long_term_min_recomm_non_contiguous", 768, medium_and_long_term_list_reco_json_schema, 168.0, 15, False, True, True, False) + ("only_short_term_min_recomm_non_contiguous", 2, short_term_list_reco_json_schema, 0.5, 15, True, False, False, + True), + ("only_medium_term_min_recomm_non_contiguous", 192, medium_term_list_reco_json_schema, 48.0, 15, False, False, True, + False), + ("only_long_term_min_recomm_non_contiguous", 768, long_term_list_reco_json_schema, 192.0, 15, False, True, False, + False), + ("short_term_and_medium_term_min_recomm_non_contiguous", 192, short_and_medium_term_list_reco_json_schema, 24.0, 15, + False, False, True, True), + ("short_term_and_long_term_min_recomm_non_contiguous", 768, short_and_long_term_list_reco_json_schema, 24.0, 15, + False, True, False, True), + ("medium_term_and_long_term_min_recomm_non_contiguous", 768, medium_and_long_term_list_reco_json_schema, 168.0, 15, + False, True, True, False) ] invalid_term_input = [ @@ -94,61 +104,65 @@ term_input_exceeding_limit = [ ("short_term_test_non_contiguous_2_data_points_exceeding_24_hours", 2, list_reco_json_schema, 0.5, 1440, True), - ("medium_term_test_non_contiguous_192_data_points_exceeding_7_days", 192, medium_term_list_reco_json_schema, 48.0, 420, False), - ("long_term_test_non_contiguous_768_data_points_exceeding_15_days", 768, long_term_list_reco_json_schema, 192.0, 360, False) + ("medium_term_test_non_contiguous_192_data_points_exceeding_7_days", 192, medium_term_list_reco_json_schema, 48.0, + 420, False), + ( + "long_term_test_non_contiguous_768_data_points_exceeding_15_days", 768, long_term_list_reco_json_schema, 192.0, + 360, + False) ] profile_notifications = [ - ("cpu_zero_test",1,True, [ - {"cpuRequest" : {'sum':5 , "avg":5 , "min":5 , "max":5 , "format": "cores"}}, - {"cpuLimit" : {'sum':5 , "avg":5 , "min":5 , "max":5 , "format": "cores"}}, - {"cpuUsage" : {'sum':0 , "avg":0 , "min":0 , "max":0 , "format": "cores"}}, - {"cpuThrottle" : {'sum':0 , 
"avg":0 , "min":0 , "max":0 , "format": "cores"}} - ], - NOTIFICATION_CODE_FOR_CPU_RECORDS_ARE_ZERO,NOTIFICATION_CODE_FOR_CPU_RECORDS_ARE_ZERO_MESSAGE - ), - ("cpu_usage_less_than_millicore_test",1,True, [ - {"cpuRequest" : {'sum':5 , "avg":5 , "min":5 , "max":5 , "format": "cores"}}, - {"cpuLimit" : {'sum':5 , "avg":5 , "min":5 , "max":5 , "format": "cores"}}, - {"cpuUsage" : {'sum':0.000001 , "avg":0.000001 , "min":0.000001 , "max":0.000001 , "format": "cores"}}, - {"cpuThrottle" : {'sum':0.000001 , "avg":0.000001 , "min":0.000001 , "max":0.000001 , "format": "cores"}} - ], - NOTIFICATION_CODE_FOR_CPU_RECORDS_ARE_IDLE,NOTIFICATION_CODE_FOR_CPU_RECORDS_ARE_IDLE_MESSAGE - ), - ("memory_zero_test",1,True, [ - {"memoryRequest" : {'sum':100 , "avg":100 , "min":100 , "max":100 , "format": "MiB"}}, - {"memoryLimit" : {'sum':100 , "avg":100 , "min":100 , "max":100 , "format": "MiB"}}, - {"memoryUsage" : {'sum':0 , "avg":0 , "min":0 , "max":0 , "format": "MiB"}}, - {"memoryRSS" : {'sum':0 , "avg":0 , "min":0 , "max":0 , "format": "MiB"}} - ], - NOTIFICATION_CODE_FOR_MEMORY_RECORDS_ARE_ZERO,NOTIFICATION_CODE_FOR_MEMORY_RECORDS_ARE_ZERO_MESSAGE - ) - , - ("cpu_memory_zero_test",1,True, [ - {"memoryRequest" : {'sum':100 , "avg":100 , "min":100 , "max":100 , "format": "MiB"}}, - {"memoryLimit" : {'sum':100 , "avg":100 , "min":100 , "max":100 , "format": "MiB"}}, - {"memoryUsage" : {'sum':0 , "avg":0 , "min":0 , "max":0 , "format": "MiB"}}, - {"memoryRSS" : {'sum':0 , "avg":0 , "min":0 , "max":0 , "format": "MiB"}}, - {"cpuRequest" : {'sum':5 , "avg":5 , "min":5 , "max":5 , "format": "cores"}}, - {"cpuLimit" : {'sum':5 , "avg":5 , "min":5 , "max":5 , "format": "cores"}}, - {"cpuUsage" : {'sum':0 , "avg":0 , "min":0 , "max":0 , "format": "cores"}}, - {"cpuThrottle" : {'sum':0 , "avg":0 , "min":0 , "max":0 , "format": "cores"}} - ], - NOTIFICATION_CODE_FOR_CPU_RECORDS_ARE_ZERO,NOTIFICATION_CODE_FOR_CPU_RECORDS_ARE_ZERO_MESSAGE - ) + ("cpu_zero_test", 1, True, [ + {"cpuRequest": 
{'sum': 5, "avg": 5, "min": 5, "max": 5, "format": "cores"}}, + {"cpuLimit": {'sum': 5, "avg": 5, "min": 5, "max": 5, "format": "cores"}}, + {"cpuUsage": {'sum': 0, "avg": 0, "min": 0, "max": 0, "format": "cores"}}, + {"cpuThrottle": {'sum': 0, "avg": 0, "min": 0, "max": 0, "format": "cores"}} + ], + NOTIFICATION_CODE_FOR_CPU_RECORDS_ARE_ZERO, NOTIFICATION_CODE_FOR_CPU_RECORDS_ARE_ZERO_MESSAGE + ), + ("cpu_usage_less_than_millicore_test", 1, True, [ + {"cpuRequest": {'sum': 5, "avg": 5, "min": 5, "max": 5, "format": "cores"}}, + {"cpuLimit": {'sum': 5, "avg": 5, "min": 5, "max": 5, "format": "cores"}}, + {"cpuUsage": {'sum': 0.000001, "avg": 0.000001, "min": 0.000001, "max": 0.000001, "format": "cores"}}, + {"cpuThrottle": {'sum': 0.000001, "avg": 0.000001, "min": 0.000001, "max": 0.000001, "format": "cores"}} + ], + NOTIFICATION_CODE_FOR_CPU_RECORDS_ARE_IDLE, NOTIFICATION_CODE_FOR_CPU_RECORDS_ARE_IDLE_MESSAGE + ), + ("memory_zero_test", 1, True, [ + {"memoryRequest": {'sum': 100, "avg": 100, "min": 100, "max": 100, "format": "MiB"}}, + {"memoryLimit": {'sum': 100, "avg": 100, "min": 100, "max": 100, "format": "MiB"}}, + {"memoryUsage": {'sum': 0, "avg": 0, "min": 0, "max": 0, "format": "MiB"}}, + {"memoryRSS": {'sum': 0, "avg": 0, "min": 0, "max": 0, "format": "MiB"}} + ], + NOTIFICATION_CODE_FOR_MEMORY_RECORDS_ARE_ZERO, NOTIFICATION_CODE_FOR_MEMORY_RECORDS_ARE_ZERO_MESSAGE + ) + , + ("cpu_memory_zero_test", 1, True, [ + {"memoryRequest": {'sum': 100, "avg": 100, "min": 100, "max": 100, "format": "MiB"}}, + {"memoryLimit": {'sum': 100, "avg": 100, "min": 100, "max": 100, "format": "MiB"}}, + {"memoryUsage": {'sum': 0, "avg": 0, "min": 0, "max": 0, "format": "MiB"}}, + {"memoryRSS": {'sum': 0, "avg": 0, "min": 0, "max": 0, "format": "MiB"}}, + {"cpuRequest": {'sum': 5, "avg": 5, "min": 5, "max": 5, "format": "cores"}}, + {"cpuLimit": {'sum': 5, "avg": 5, "min": 5, "max": 5, "format": "cores"}}, + {"cpuUsage": {'sum': 0, "avg": 0, "min": 0, "max": 0, "format": 
"cores"}}, + {"cpuThrottle": {'sum': 0, "avg": 0, "min": 0, "max": 0, "format": "cores"}} + ], + NOTIFICATION_CODE_FOR_CPU_RECORDS_ARE_ZERO, NOTIFICATION_CODE_FOR_CPU_RECORDS_ARE_ZERO_MESSAGE + ) , - ("memory_cpu_zero_test",1,True, [ - {"memoryRequest" : {'sum':100 , "avg":100 , "min":100 , "max":100 , "format": "MiB"}}, - {"memoryLimit" : {'sum':100 , "avg":100 , "min":100 , "max":100 , "format": "MiB"}}, - {"memoryUsage" : {'sum':0 , "avg":0 , "min":0 , "max":0 , "format": "MiB"}}, - {"memoryRSS" : {'sum':0 , "avg":0 , "min":0 , "max":0 , "format": "MiB"}}, - {"cpuRequest" : {'sum':5 , "avg":5 , "min":5 , "max":5 , "format": "cores"}}, - {"cpuLimit" : {'sum':5 , "avg":5 , "min":5 , "max":5 , "format": "cores"}}, - {"cpuUsage" : {'sum':0 , "avg":0 , "min":0 , "max":0 , "format": "cores"}}, - {"cpuThrottle" : {'sum':0 , "avg":0 , "min":0 , "max":0 , "format": "cores"}} - ], - NOTIFICATION_CODE_FOR_MEMORY_RECORDS_ARE_ZERO,NOTIFICATION_CODE_FOR_MEMORY_RECORDS_ARE_ZERO_MESSAGE - ) + ("memory_cpu_zero_test", 1, True, [ + {"memoryRequest": {'sum': 100, "avg": 100, "min": 100, "max": 100, "format": "MiB"}}, + {"memoryLimit": {'sum': 100, "avg": 100, "min": 100, "max": 100, "format": "MiB"}}, + {"memoryUsage": {'sum': 0, "avg": 0, "min": 0, "max": 0, "format": "MiB"}}, + {"memoryRSS": {'sum': 0, "avg": 0, "min": 0, "max": 0, "format": "MiB"}}, + {"cpuRequest": {'sum': 5, "avg": 5, "min": 5, "max": 5, "format": "cores"}}, + {"cpuLimit": {'sum': 5, "avg": 5, "min": 5, "max": 5, "format": "cores"}}, + {"cpuUsage": {'sum': 0, "avg": 0, "min": 0, "max": 0, "format": "cores"}}, + {"cpuThrottle": {'sum': 0, "avg": 0, "min": 0, "max": 0, "format": "cores"}} + ], + NOTIFICATION_CODE_FOR_MEMORY_RECORDS_ARE_ZERO, NOTIFICATION_CODE_FOR_MEMORY_RECORDS_ARE_ZERO_MESSAGE + ) ] @@ -196,7 +210,7 @@ def test_list_recommendations_single_result(cluster_type): json_data = json.load(open(input_json_file)) experiment_name = json_data[0]['experiment_name'] - response = 
list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) list_reco_json = response.json() assert response.status_code == SUCCESS_200_STATUS_CODE @@ -256,7 +270,7 @@ def test_list_recommendations_without_parameters(cluster_type): # Get the experiment name experiment_name = None - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) list_reco_json = response.json() assert response.status_code == SUCCESS_200_STATUS_CODE @@ -310,7 +324,7 @@ def test_list_recommendations_invalid_exp(cluster_type): # Get the experiment name experiment_name = "xyz" - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) data = response.json() print(data) @@ -346,7 +360,7 @@ def test_list_recommendations_without_results(cluster_type): json_data = json.load(open(input_json_file)) experiment_name = json_data[0]['experiment_name'] - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) list_reco_json = response.json() assert response.status_code == SUCCESS_200_STATUS_CODE @@ -401,9 +415,9 @@ def test_list_recommendations_single_exp_multiple_results(cluster_type): assert response.status_code == SUCCESS_STATUS_CODE assert data[0]['experiment_name'] == experiment_name assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][ - NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE]['message'] == RECOMMENDATIONS_AVAILABLE + NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE]['message'] == RECOMMENDATIONS_AVAILABLE - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) list_reco_json = response.json() assert response.status_code == SUCCESS_200_STATUS_CODE @@ -428,7 +442,9 @@ def test_list_recommendations_single_exp_multiple_results(cluster_type): @pytest.mark.sanity -@pytest.mark.parametrize("memory_format_type", 
["bytes", "Bytes", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "kB", "KB", "MB", "GB", "TB", "PB", "EB", "K", "k", "M", "G", "T", "P", "E"]) +@pytest.mark.parametrize("memory_format_type", + ["bytes", "Bytes", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "Ki", "Mi", "Gi", "Ti", "Pi", + "Ei", "kB", "KB", "MB", "GB", "TB", "PB", "EB", "K", "k", "M", "G", "T", "P", "E"]) @pytest.mark.parametrize("cpu_format_type", ["cores", "m"]) def test_list_recommendations_supported_metric_formats(memory_format_type, cpu_format_type, cluster_type): """ @@ -451,7 +467,6 @@ def test_list_recommendations_supported_metric_formats(memory_format_type, cpu_f # Update results for the experiment result_json_file = "../json_files/multiple_results_single_exp.json" - # Update the memory format and cpu format result_json = read_json_data_from_file(result_json_file) @@ -479,16 +494,16 @@ def test_list_recommendations_supported_metric_formats(memory_format_type, cpu_f # Get the experiment name json_data = json.load(open(input_json_file)) experiment_name = json_data[0]['experiment_name'] - end_time = "2023-04-14T23:59:20.982Z" + end_time = "2023-04-14T23:59:20.982Z" response = update_recommendations(experiment_name, None, end_time) data = response.json() assert response.status_code == SUCCESS_STATUS_CODE assert data[0]['experiment_name'] == experiment_name assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][ - NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE]['message'] == RECOMMENDATIONS_AVAILABLE + NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE]['message'] == RECOMMENDATIONS_AVAILABLE - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) list_reco_json = response.json() assert response.status_code == SUCCESS_200_STATUS_CODE @@ -587,11 +602,12 @@ def test_list_recommendations_multiple_exps_from_diff_json_files_2(cluster_type) data = response.json() assert 
response.status_code == SUCCESS_STATUS_CODE assert data[0]['experiment_name'] == experiment_name - assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE][ + assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][ + NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE][ 'message'] == RECOMMENDATIONS_AVAILABLE # Invoke list recommendations for the specified experiment - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) assert response.status_code == SUCCESS_200_STATUS_CODE list_reco_json = response.json() @@ -678,10 +694,11 @@ def test_list_recommendations_exp_name_and_latest(latest, cluster_type): data = response.json() assert response.status_code == SUCCESS_STATUS_CODE assert data[0]['experiment_name'] == experiment_name - assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE][ + assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][ + NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE][ 'message'] == RECOMMENDATIONS_AVAILABLE - response = list_recommendations(experiment_name, latest) + response = list_recommendations(experiment_name, latest, rm=True) list_reco_json = response.json() assert response.status_code == SUCCESS_200_STATUS_CODE @@ -759,7 +776,7 @@ def test_list_recommendations_exp_name_and_monitoring_end_time_invalid(monitorin experiment_name = json_data[0]['experiment_name'] latest = None - response = list_recommendations(experiment_name, latest, monitoring_end_time) + response = list_recommendations(experiment_name, latest, monitoring_end_time, rm=True) list_reco_json = response.json() print(list_reco_json['message']) @@ -828,11 +845,12 @@ def test_list_recommendations_exp_name_and_monitoring_end_time(test_name, monito data = response.json() assert 
response.status_code == SUCCESS_STATUS_CODE assert data[0]['experiment_name'] == experiment_name - assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE][ + assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][ + NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE][ 'message'] == RECOMMENDATIONS_AVAILABLE latest = None - response = list_recommendations(experiment_name, latest, monitoring_end_time) + response = list_recommendations(experiment_name, latest, monitoring_end_time, rm=True) list_reco_json = response.json() @@ -924,7 +942,7 @@ def test_list_recommendations_multiple_exps_with_missing_metrics(cluster_type): json_data = json.load(open(create_exp_json_file)) experiment_name = json_data[0]['experiment_name'] - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) list_reco_json = response.json() assert response.status_code == SUCCESS_200_STATUS_CODE @@ -947,8 +965,11 @@ def test_list_recommendations_multiple_exps_with_missing_metrics(cluster_type): @pytest.mark.extended -@pytest.mark.parametrize("test_name, num_days, reco_json_schema, expected_duration_in_hours, latest, logging", reco_term_input) -def test_list_recommendations_for_diff_reco_terms_with_only_latest(test_name, num_days, reco_json_schema, expected_duration_in_hours, latest, logging, cluster_type): +@pytest.mark.parametrize("test_name, num_days, reco_json_schema, expected_duration_in_hours, latest, logging", + reco_term_input) +def test_list_recommendations_for_diff_reco_terms_with_only_latest(test_name, num_days, reco_json_schema, + expected_duration_in_hours, latest, logging, + cluster_type): """ Test Description: This test validates list recommendations for all the terms for multiple experiments posted using different json files and query with only the parameter latest and with both latest=true and latest=false @@ 
-1021,7 +1042,7 @@ def test_list_recommendations_for_diff_reco_terms_with_only_latest(test_name, nu json_data = json.load(open(create_exp_json_file)) experiment_name = json_data[0]['experiment_name'] - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) assert response.status_code == SUCCESS_200_STATUS_CODE list_of_result_json_arr.append(result_json_arr) @@ -1034,7 +1055,7 @@ def test_list_recommendations_for_diff_reco_terms_with_only_latest(test_name, nu NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE][ 'message'] == RECOMMENDATIONS_AVAILABLE experiment_name = None - response = list_recommendations(experiment_name, latest) + response = list_recommendations(experiment_name, latest, rm=True) list_reco_json = response.json() assert response.status_code == SUCCESS_200_STATUS_CODE @@ -1169,7 +1190,7 @@ def test_list_recommendations_notification_codes(cluster_type: str): json_data = json.load(open(create_exp_json_file)) experiment_name = json_data[0]['experiment_name'] - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) assert response.status_code == SUCCESS_200_STATUS_CODE ############################################################################################# @@ -1226,7 +1247,8 @@ def test_list_recommendations_notification_codes(cluster_type: str): assert "config" in short_term_recommendation["recommendation_engines"]["performance"] short_term_recommendation_config = short_term_recommendation["recommendation_engines"]["cost"]["config"] - short_term_recommendation_variation = short_term_recommendation["recommendation_engines"]["cost"]["variation"] + short_term_recommendation_variation = short_term_recommendation["recommendation_engines"]["cost"][ + "variation"] response = update_recommendations(experiment_name, None, end_time) data = response.json() @@ -1237,9 +1259,12 @@ def test_list_recommendations_notification_codes(cluster_type: str): 
NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE]['message'] == RECOMMENDATIONS_AVAILABLE assert short_term_recommendation['notifications'][NOTIFICATION_CODE_FOR_COST_RECOMMENDATIONS_AVAILABLE][ 'message'] == COST_RECOMMENDATIONS_AVAILABLE - assert short_term_recommendation['notifications'][NOTIFICATION_CODE_FOR_PERFORMANCE_RECOMMENDATIONS_AVAILABLE][ - 'message'] == PERFORMANCE_RECOMMENDATIONS_AVAILABLE - validate_variation(recommendation_current, short_term_recommendation_config, short_term_recommendation_variation) + assert \ + short_term_recommendation['notifications'][ + NOTIFICATION_CODE_FOR_PERFORMANCE_RECOMMENDATIONS_AVAILABLE][ + 'message'] == PERFORMANCE_RECOMMENDATIONS_AVAILABLE + validate_variation(recommendation_current, short_term_recommendation_config, + short_term_recommendation_variation) # Delete the experiments for i in range(num_exps): @@ -1250,17 +1275,21 @@ def test_list_recommendations_notification_codes(cluster_type: str): def validate_error_msgs(j: int, status_message, cname, experiment_name): - if j == 96: - assert status_message == UPDATE_RESULTS_INVALID_METRIC_VALUE_ERROR_MSG + CPU_REQUEST + CONTAINER_AND_EXPERIMENT_NAME % (cname, experiment_name) + assert status_message == UPDATE_RESULTS_INVALID_METRIC_VALUE_ERROR_MSG + CPU_REQUEST + CONTAINER_AND_EXPERIMENT_NAME % ( + cname, experiment_name) elif j == 97: - assert status_message == UPDATE_RESULTS_INVALID_METRIC_VALUE_ERROR_MSG + MEMORY_REQUEST + CONTAINER_AND_EXPERIMENT_NAME % (cname, experiment_name) + assert status_message == UPDATE_RESULTS_INVALID_METRIC_VALUE_ERROR_MSG + MEMORY_REQUEST + CONTAINER_AND_EXPERIMENT_NAME % ( + cname, experiment_name) elif j == 98: - assert status_message == UPDATE_RESULTS_INVALID_METRIC_VALUE_ERROR_MSG + CPU_LIMIT + CONTAINER_AND_EXPERIMENT_NAME % (cname, experiment_name) + assert status_message == UPDATE_RESULTS_INVALID_METRIC_VALUE_ERROR_MSG + CPU_LIMIT + CONTAINER_AND_EXPERIMENT_NAME % ( + cname, experiment_name) elif j == 99: - assert status_message 
== UPDATE_RESULTS_INVALID_METRIC_VALUE_ERROR_MSG + MEMORY_LIMIT + CONTAINER_AND_EXPERIMENT_NAME % (cname, experiment_name) + assert status_message == UPDATE_RESULTS_INVALID_METRIC_VALUE_ERROR_MSG + MEMORY_LIMIT + CONTAINER_AND_EXPERIMENT_NAME % ( + cname, experiment_name) elif j > 100: - assert status_message == UPDATE_RESULTS_INVALID_METRIC_FORMAT_ERROR_MSG + CONTAINER_AND_EXPERIMENT_NAME % (cname, experiment_name) + assert status_message == UPDATE_RESULTS_INVALID_METRIC_FORMAT_ERROR_MSG + CONTAINER_AND_EXPERIMENT_NAME % ( + cname, experiment_name) @pytest.mark.negative @@ -1446,7 +1475,8 @@ def test_invalid_list_recommendations_notification_codes(cluster_type: str): if j in range(96, 104): assert response.status_code == ERROR_STATUS_CODE assert data['status'] == ERROR_STATUS - validate_error_msgs(j, data['data'][0]['errors'][0]['message'], container_name_to_update, experiment_name) + validate_error_msgs(j, data['data'][0]['errors'][0]['message'], container_name_to_update, + experiment_name) else: assert response.status_code == SUCCESS_STATUS_CODE assert data['status'] == SUCCESS_STATUS @@ -1462,7 +1492,7 @@ def test_invalid_list_recommendations_notification_codes(cluster_type: str): json_data = json.load(open(create_exp_json_file)) experiment_name = json_data[0]['experiment_name'] - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) assert response.status_code == SUCCESS_200_STATUS_CODE ############################################################################################# @@ -1514,7 +1544,8 @@ def test_invalid_list_recommendations_notification_codes(cluster_type: str): assert "config" in short_term_recommendation["recommendation_engines"]["performance"] short_term_recommendation_config = short_term_recommendation["recommendation_engines"]["cost"]["config"] - short_term_recommendation_variation = short_term_recommendation["recommendation_engines"]["cost"]["variation"] + 
short_term_recommendation_variation = short_term_recommendation["recommendation_engines"]["cost"][ + "variation"] if j == 104: response = update_recommendations(experiment_name, None, end_time) @@ -1590,8 +1621,9 @@ def validate_term_recommendations(data, end_time, term): @pytest.mark.sanity -@pytest.mark.parametrize("test_name, num_res, reco_json_schema, expected_duration_in_hours, increment_end_time_by, logging", - term_input) +@pytest.mark.parametrize( + "test_name, num_res, reco_json_schema, expected_duration_in_hours, increment_end_time_by, logging", + term_input) def test_list_recommendations_term_min_data_threshold(test_name, num_res, reco_json_schema, expected_duration_in_hours, increment_end_time_by, logging, cluster_type): """ @@ -1752,7 +1784,7 @@ def test_list_recommendations_term_min_data_threshold(test_name, num_res, reco_j experiment_name = None latest = True - response = list_recommendations(experiment_name, latest) + response = list_recommendations(experiment_name, latest, rm=True) list_reco_json = response.json() assert response.status_code == SUCCESS_200_STATUS_CODE @@ -1793,7 +1825,8 @@ def test_list_recommendations_term_min_data_threshold(test_name, num_res, reco_j @pytest.mark.negative -@pytest.mark.parametrize("test_name, num_res, reco_json_schema, expected_duration_in_hours, logging", invalid_term_input) +@pytest.mark.parametrize("test_name, num_res, reco_json_schema, expected_duration_in_hours, logging", + invalid_term_input) def test_list_recommendations_invalid_term_min_data_threshold(test_name, num_res, reco_json_schema, expected_duration_in_hours, logging, cluster_type): """ @@ -1936,7 +1969,7 @@ def test_list_recommendations_invalid_term_min_data_threshold(test_name, num_res experiment_name = None latest = True - response = list_recommendations(experiment_name, latest) + response = list_recommendations(experiment_name, latest, rm=True) list_reco_json = response.json() assert response.status_code == SUCCESS_200_STATUS_CODE @@ -1982,8 
+2015,9 @@ def test_list_recommendations_invalid_term_min_data_threshold(test_name, num_res @pytest.mark.negative -@pytest.mark.parametrize("test_name, num_res, reco_json_schema, expected_duration_in_hours, increment_end_time_by, logging", - term_input_exceeding_limit) +@pytest.mark.parametrize( + "test_name, num_res, reco_json_schema, expected_duration_in_hours, increment_end_time_by, logging", + term_input_exceeding_limit) def test_list_recommendations_min_data_threshold_exceeding_max_duration(test_name, num_res, reco_json_schema, expected_duration_in_hours, increment_end_time_by, logging, cluster_type): @@ -2043,7 +2077,6 @@ def test_list_recommendations_min_data_threshold_exceeding_max_duration(test_nam else: start_time = end_time - result_json[0]['interval_start_time'] = start_time end_time = increment_timestamp_by_given_mins(start_time, 15) result_json[0]['interval_end_time'] = end_time @@ -2132,7 +2165,7 @@ def test_list_recommendations_min_data_threshold_exceeding_max_duration(test_nam experiment_name = None latest = True - response = list_recommendations(experiment_name, latest) + response = list_recommendations(experiment_name, latest, rm=True) list_reco_json = response.json() assert response.status_code == SUCCESS_200_STATUS_CODE @@ -2173,7 +2206,8 @@ def test_list_recommendations_min_data_threshold_exceeding_max_duration(test_nam @pytest.mark.sanity @pytest.mark.parametrize("test_name, num_res, reco_json_schema, expected_duration_in_hours, increment_end_time_by, " - "logging, long_term_present, medium_term_present, short_term_present", term_input_for_missing_terms) + "logging, long_term_present, medium_term_present, short_term_present", + term_input_for_missing_terms) def test_list_recommendations_for_missing_terms(test_name, num_res, reco_json_schema, expected_duration_in_hours, increment_end_time_by, logging, long_term_present, medium_term_present, short_term_present, cluster_type): @@ -2335,7 +2369,7 @@ def 
test_list_recommendations_for_missing_terms(test_name, num_res, reco_json_sc experiment_name = None latest = True - response = list_recommendations(experiment_name, latest) + response = list_recommendations(experiment_name, latest, rm=True) list_reco_json = response.json() assert response.status_code == SUCCESS_200_STATUS_CODE @@ -2535,7 +2569,7 @@ def test_list_recommendations_for_missing_terms_non_contiguous(test_name, num_re experiment_name = None latest = True - response = list_recommendations(experiment_name, latest) + response = list_recommendations(experiment_name, latest, rm=True) list_reco_json = response.json() assert response.status_code == SUCCESS_200_STATUS_CODE @@ -2590,6 +2624,7 @@ def validate_and_assert_term_recommendations(data, end_time, term): assert_notification_presence(data, end_time, term, TERMS_NOTIFICATION_CODES[term]) validate_term_recommendations(data, end_time, term) + @pytest.mark.sanity def test_list_recommendations_cpu_mem_optimised(cluster_type: str): """ @@ -2610,7 +2645,7 @@ def test_list_recommendations_cpu_mem_optimised(cluster_type: str): # Create experiment using the specified json num_exps = 1 - num_res = 1450 # 15 days + 10 entries buffer + num_res = 1450 # 15 days + 10 entries buffer for i in range(num_exps): create_exp_json_file = "/tmp/create_exp_" + str(i) + ".json" @@ -2736,7 +2771,7 @@ def test_list_recommendations_cpu_mem_optimised(cluster_type: str): json_data = json.load(open(create_exp_json_file)) experiment_name = json_data[0]['experiment_name'] - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) assert response.status_code == SUCCESS_200_STATUS_CODE recommendation_json = response.json() @@ -2773,9 +2808,9 @@ def test_list_recommendations_cpu_mem_optimised(cluster_type: str): short_term_recommendation = data_section[str(end_time)]["recommendation_terms"]["short_term"] medium_term_recommendation = None long_term_recommendation = None - if j > 671: # 7 days 
+ if j > 671: # 7 days medium_term_recommendation = data_section[str(end_time)]["recommendation_terms"]["medium_term"] - if j > 1439: # 15 days + if j > 1439: # 15 days long_term_recommendation = data_section[str(end_time)]["recommendation_terms"]["long_term"] if INFO_COST_RECOMMENDATIONS_AVAILABLE_CODE in short_term_recommendation["notifications"]: @@ -2806,24 +2841,31 @@ def test_list_recommendations_cpu_mem_optimised(cluster_type: str): current=recommendation_current, profile="performance") - short_term_recommendation_cost_notifications = short_term_recommendation["recommendation_engines"]["cost"]["notifications"] - short_term_recommendation_perf_notifications = short_term_recommendation["recommendation_engines"]["performance"]["notifications"] - - check_optimised_codes(short_term_recommendation_cost_notifications, short_term_recommendation_perf_notifications) + short_term_recommendation_cost_notifications = \ + short_term_recommendation["recommendation_engines"]["cost"]["notifications"] + short_term_recommendation_perf_notifications = \ + short_term_recommendation["recommendation_engines"]["performance"]["notifications"] + check_optimised_codes(short_term_recommendation_cost_notifications, + short_term_recommendation_perf_notifications) if j > 672: - medium_term_recommendation_cost_notifications = medium_term_recommendation["recommendation_engines"]["cost"]["notifications"] - medium_term_recommendation_perf_notifications = medium_term_recommendation["recommendation_engines"]["performance"]["notifications"] + medium_term_recommendation_cost_notifications = \ + medium_term_recommendation["recommendation_engines"]["cost"]["notifications"] + medium_term_recommendation_perf_notifications = \ + medium_term_recommendation["recommendation_engines"]["performance"]["notifications"] - check_optimised_codes(medium_term_recommendation_cost_notifications, medium_term_recommendation_perf_notifications) + check_optimised_codes(medium_term_recommendation_cost_notifications, + 
medium_term_recommendation_perf_notifications) if j > 1439: - long_term_recommendation_cost_notifications = long_term_recommendation["recommendation_engines"]["cost"]["notifications"] - long_term_recommendation_perf_notifications = long_term_recommendation["recommendation_engines"]["performance"]["notifications"] - - check_optimised_codes(long_term_recommendation_cost_notifications, long_term_recommendation_perf_notifications) + long_term_recommendation_cost_notifications = \ + long_term_recommendation["recommendation_engines"]["cost"]["notifications"] + long_term_recommendation_perf_notifications = \ + long_term_recommendation["recommendation_engines"]["performance"]["notifications"] + check_optimised_codes(long_term_recommendation_cost_notifications, + long_term_recommendation_perf_notifications) # Delete the experiments for i in range(num_exps): @@ -2834,14 +2876,15 @@ def test_list_recommendations_cpu_mem_optimised(cluster_type: str): @pytest.mark.sanity -@pytest.mark.parametrize("test_name,num_days,logging,update_metrics,code,message",profile_notifications) -def test_list_recommendations_profile_notifications(test_name,num_days,logging,update_metrics,code,message,cluster_type: str): +@pytest.mark.parametrize("test_name,num_days,logging,update_metrics,code,message", profile_notifications) +def test_list_recommendations_profile_notifications(test_name, num_days, logging, update_metrics, code, message, + cluster_type: str): """ Test Description: Check if notifications are generated at profile level if cpu_usage is less than millicore """ input_json_file = "../json_files/create_exp.json" result_json_file = "../json_files/update_results.json" - print ("Test Name --- %s " %(test_name) ) + print("Test Name --- %s " % (test_name)) find = [] json_data = json.load(open(input_json_file)) @@ -2906,12 +2949,12 @@ def test_list_recommendations_profile_notifications(test_name,num_days,logging,u response = update_recommendations(experiment_name, None, end_time) data = 
response.json() assert response.status_code == SUCCESS_STATUS_CODE - validate_recommendations_notifications(experiment_name,end_time,code,message,data) + validate_recommendations_notifications(experiment_name, end_time, code, message, data) - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) assert response.status_code == SUCCESS_200_STATUS_CODE data = response.json() - validate_recommendations_notifications(experiment_name,end_time,code,message,data) + validate_recommendations_notifications(experiment_name, end_time, code, message, data) # Delete the experiments for i in range(num_exps): @@ -2920,12 +2963,16 @@ def test_list_recommendations_profile_notifications(test_name,num_days,logging,u response = delete_experiment(json_file) print("delete exp = ", response.status_code) -def validate_recommendations_notifications(experiment_name,end_time,code,message,data): + +def validate_recommendations_notifications(experiment_name, end_time, code, message, data): assert data[0]['experiment_name'] == experiment_name assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][ NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE][ 'message'] == RECOMMENDATIONS_AVAILABLE - short_term_recommendation = data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['data'][str(end_time)]["recommendation_terms"]["short_term"] + short_term_recommendation = \ + data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['data'][str(end_time)][ + "recommendation_terms"][ + "short_term"] assert short_term_recommendation['notifications'][NOTIFICATION_CODE_FOR_COST_RECOMMENDATIONS_AVAILABLE][ 'message'] == COST_RECOMMENDATIONS_AVAILABLE @@ -2933,7 +2980,9 @@ def validate_recommendations_notifications(experiment_name,end_time,code,message 'message'] == PERFORMANCE_RECOMMENDATIONS_AVAILABLE assert 
short_term_recommendation['recommendation_engines']['cost']['notifications'][code]['message'] == message - assert short_term_recommendation['recommendation_engines']['performance']['notifications'][code]['message'] == message + assert short_term_recommendation['recommendation_engines']['performance']['notifications'][code][ + 'message'] == message + @pytest.mark.sanity def test_list_recommendations_job_type_exp(cluster_type): @@ -2973,9 +3022,9 @@ def test_list_recommendations_job_type_exp(cluster_type): assert response.status_code == SUCCESS_STATUS_CODE assert data[0]['experiment_name'] == experiment_name assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][ - NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE]['message'] == RECOMMENDATIONS_AVAILABLE + NOTIFICATION_CODE_FOR_RECOMMENDATIONS_AVAILABLE]['message'] == RECOMMENDATIONS_AVAILABLE - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) list_reco_json = response.json() assert response.status_code == SUCCESS_200_STATUS_CODE diff --git a/tests/scripts/remote_monitoring_tests/rest_apis/test_update_recommendations.py b/tests/scripts/remote_monitoring_tests/rest_apis/test_update_recommendations.py index da643a78a..64e018dd7 100644 --- a/tests/scripts/remote_monitoring_tests/rest_apis/test_update_recommendations.py +++ b/tests/scripts/remote_monitoring_tests/rest_apis/test_update_recommendations.py @@ -15,6 +15,7 @@ """ import pytest import sys + sys.path.append("../../") from helpers.fixtures import * from helpers.kruize import * @@ -102,7 +103,7 @@ def test_update_valid_recommendations_after_results_after_create_exp(cluster_typ assert data[0]['experiment_name'] == experiment_name assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications']['111000'][ 'message'] == 'Recommendations Are Available' - response = list_recommendations(experiment_name) + response = 
list_recommendations(experiment_name, rm=True) if response.status_code == SUCCESS_200_STATUS_CODE: recommendation_json = response.json() recommendation_section = recommendation_json[0]["kubernetes_objects"][0]["containers"][0][ @@ -124,7 +125,7 @@ def test_update_valid_recommendations_after_results_after_create_exp(cluster_typ 'message'] == 'Recommendations Are Available' # Invoke list recommendations for the specified experiment - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) assert response.status_code == SUCCESS_200_STATUS_CODE list_reco_json = response.json() @@ -228,7 +229,7 @@ def test_plots_with_no_recommendations_in_some_terms(cluster_type): assert data[0]['experiment_name'] == experiment_name assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications']['111000'][ 'message'] == 'Recommendations Are Available' - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) if response.status_code == SUCCESS_200_STATUS_CODE: recommendation_json = response.json() recommendation_section = recommendation_json[0]["kubernetes_objects"][0]["containers"][0][ @@ -250,7 +251,7 @@ def test_plots_with_no_recommendations_in_some_terms(cluster_type): 'message'] == 'Recommendations Are Available' # Invoke list recommendations for the specified experiment - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) assert response.status_code == SUCCESS_200_STATUS_CODE list_reco_json = response.json() @@ -351,9 +352,10 @@ def test_update_valid_recommendations_just_endtime_input_after_results_after_cre data = response.json() assert response.status_code == SUCCESS_STATUS_CODE assert data[0]['experiment_name'] == experiment_name - assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][INFO_RECOMMENDATIONS_AVAILABLE_CODE][ + assert 
data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][ + INFO_RECOMMENDATIONS_AVAILABLE_CODE][ 'message'] == RECOMMENDATIONS_AVAILABLE - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) if response.status_code == SUCCESS_200_STATUS_CODE: recommendation_json = response.json() recommendation_section = recommendation_json[0]["kubernetes_objects"][0]["containers"][0][ @@ -371,11 +373,12 @@ def test_update_valid_recommendations_just_endtime_input_after_results_after_cre data = response.json() assert response.status_code == SUCCESS_STATUS_CODE assert data[0]['experiment_name'] == experiment_name - assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][INFO_RECOMMENDATIONS_AVAILABLE_CODE][ + assert data[0]['kubernetes_objects'][0]['containers'][0]['recommendations']['notifications'][ + INFO_RECOMMENDATIONS_AVAILABLE_CODE][ 'message'] == RECOMMENDATIONS_AVAILABLE # Invoke list recommendations for the specified experiment - response = list_recommendations(experiment_name) + response = list_recommendations(experiment_name, rm=True) assert response.status_code == SUCCESS_200_STATUS_CODE list_reco_json = response.json() From 65602f727948409d10f4e4a95945afa600c1c2a8 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Tue, 17 Dec 2024 16:32:50 +0530 Subject: [PATCH 69/85] Review comments incorporated Signed-off-by: msvinaykumar --- .../fault_tolerant_test/kruize_pod_restart_test.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/scripts/local_monitoring_tests/fault_tolerant_test/kruize_pod_restart_test.py b/tests/scripts/local_monitoring_tests/fault_tolerant_test/kruize_pod_restart_test.py index 758f395ff..89225fbad 100644 --- a/tests/scripts/local_monitoring_tests/fault_tolerant_test/kruize_pod_restart_test.py +++ b/tests/scripts/local_monitoring_tests/fault_tolerant_test/kruize_pod_restart_test.py @@ -124,7 +124,7 @@ def
main(argv): experiment_name = None latest = "false" interval_end_time = None - response = list_recommendations(experiment_name, latest, interval_end_time, rm=True) + response = list_recommendations(experiment_name, latest, interval_end_time) if response.status_code == SUCCESS_200_STATUS_CODE: list_reco_json_file_before = list_reco_json_dir + '/list_reco_json_before.json' write_json_data_to_file(list_reco_json_file_before, response.json()) From 234aa670050c54cc99b922dc30a1982e3a08351e Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Wed, 18 Dec 2024 10:52:55 +0530 Subject: [PATCH 70/85] Review comments incorporated Signed-off-by: msvinaykumar --- .../fault_tolerant_test/kruize_pod_restart_test.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/scripts/local_monitoring_tests/fault_tolerant_test/kruize_pod_restart_test.py b/tests/scripts/local_monitoring_tests/fault_tolerant_test/kruize_pod_restart_test.py index 89225fbad..9fc03eeb3 100644 --- a/tests/scripts/local_monitoring_tests/fault_tolerant_test/kruize_pod_restart_test.py +++ b/tests/scripts/local_monitoring_tests/fault_tolerant_test/kruize_pod_restart_test.py @@ -195,7 +195,7 @@ def main(argv): # Fetch the recommendations for all the experiments latest = "false" interval_end_time = None - response = list_recommendations(experiment_name, latest, interval_end_time, rm=True) + response = list_recommendations(experiment_name, latest, interval_end_time) if response.status_code == SUCCESS_200_STATUS_CODE: list_reco_json_file_after = list_reco_json_dir + '/list_reco_json_after.json' write_json_data_to_file(list_reco_json_file_after, response.json()) From 8872576e5076f6c6941fa2408486fe6da31da8b7 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Tue, 17 Dec 2024 12:49:29 +0530 Subject: [PATCH 71/85] FaultTolerance test update Signed-off-by: msvinaykumar --- tests/scripts/helpers/kruize.py | 8 +++- .../kruize_pod_restart_test.py | 41 +++++++++++-------- 2 files changed, 29 insertions(+), 20
deletions(-) diff --git a/tests/scripts/helpers/kruize.py b/tests/scripts/helpers/kruize.py index 50985cc6a..82c4486ab 100644 --- a/tests/scripts/helpers/kruize.py +++ b/tests/scripts/helpers/kruize.py @@ -229,7 +229,7 @@ def create_performance_profile(perf_profile_json_file): # Description: This function obtains the experiments from Kruize Autotune using listExperiments API # Input Parameters: None -def list_experiments(results=None, recommendations=None, latest=None, experiment_name=None): +def list_experiments(results=None, recommendations=None, latest=None, experiment_name=None, rm=False): print("\nListing the experiments...") query_params = {} @@ -245,7 +245,11 @@ def list_experiments(results=None, recommendations=None, latest=None, experiment query_string = "&".join(f"{key}={value}" for key, value in query_params.items()) url = URL + "/listExperiments" - if query_string: + if rm: + url += "?rm=true" + if query_string: + url += "&" + query_string + else: url += "?" + query_string print("URL = ", url) response = requests.get(url) diff --git a/tests/scripts/remote_monitoring_tests/fault_tolerant_tests/kruize_pod_restart_test.py b/tests/scripts/remote_monitoring_tests/fault_tolerant_tests/kruize_pod_restart_test.py index 50118a982..f77ae3481 100644 --- a/tests/scripts/remote_monitoring_tests/fault_tolerant_tests/kruize_pod_restart_test.py +++ b/tests/scripts/remote_monitoring_tests/fault_tolerant_tests/kruize_pod_restart_test.py @@ -14,15 +14,18 @@ limitations under the License. """ -import sys, getopt +import getopt import json import os +import sys import time + sys.path.append("../../") from helpers.kruize import * from helpers.utils import * from helpers.generate_rm_jsons import * + def main(argv): cluster_type = "minikube" results_dir = "." 
@@ -30,14 +33,16 @@ def main(argv): num_exps = 1 failed = 0 try: - opts, args = getopt.getopt(argv,"h:c:a:u:r:d:") + opts, args = getopt.getopt(argv, "h:c:a:u:r:d:") except getopt.GetoptError: - print("kruize_pod_restart_test.py -c -a -u -d -r ") + print( + "kruize_pod_restart_test.py -c -a -u -d -r ") print("Note: -a option is required only on openshift when kruize service is exposed") sys.exit(2) for opt, arg in opts: if opt == '-h': - print("kruize_pod_restart_test.py -c -a -u -d -r ") + print( + "kruize_pod_restart_test.py -c -a -u -d -r ") sys.exit(0) elif opt == '-c': cluster_type = arg @@ -49,7 +54,6 @@ def main(argv): results_dir = arg elif opt == '-d': iterations = int(arg) - print(f"Cluster type = {cluster_type}") print(f"No. of experiments = {num_exps}") @@ -72,11 +76,10 @@ def main(argv): split = False split_count = 1 - # Post 100 results num_res = 100 - for i in range(1, iterations+1): + for i in range(1, iterations + 1): print("\n*************************") print(f"Iteration {i}...") print("*************************\n") @@ -92,13 +95,15 @@ def main(argv): if i == 1: new_timestamp = None - create_update_results_jsons(csv_filename, split, split_count, result_json_dir, num_exps, num_res, new_timestamp) + create_update_results_jsons(csv_filename, split, split_count, result_json_dir, num_exps, num_res, + new_timestamp) start_ts = get_datetime() else: # Increment the time by 1505 mins for the next set of data timestamps new_timestamp = increment_timestamp_by_given_mins(start_ts, 1505) start_ts = new_timestamp - create_update_results_jsons(csv_filename, split, split_count, result_json_dir, num_exps, num_res, new_timestamp) + create_update_results_jsons(csv_filename, split, split_count, result_json_dir, num_exps, num_res, + new_timestamp) reco_json_dir = results_dir + "/reco_jsons" + "_iter" + str(i) os.mkdir(reco_json_dir) @@ -107,7 +112,7 @@ def main(argv): # create the experiment and post it create_exp_json_file = exp_json_dir + "/create_exp_" + 
str(exp_num) + ".json" create_experiment(create_exp_json_file) - + # Obtain the experiment name json_data = json.load(open(create_exp_json_file)) @@ -123,12 +128,12 @@ def main(argv): interval_end_time = json_data[0]['interval_end_time'] # sleep for a while before fetching recommendations for the experiments - #time.sleep(1) + # time.sleep(1) # Fetch the recommendations for all the experiments latest = None reco = update_recommendations(experiment_name, latest, interval_end_time) - filename = reco_json_dir + '/update_reco_' + str(res_num) + '_' + str(exp_num) + '.json' + filename = reco_json_dir + '/update_reco_' + str(res_num) + '_' + str(exp_num) + '.json' write_json_data_to_file(filename, reco.json()) # Fetch listExperiments @@ -138,9 +143,9 @@ def main(argv): recommendations = "true" latest = "false" experiment_name = None - response = list_experiments(results, recommendations, latest, experiment_name) + response = list_experiments(results, recommendations, latest, experiment_name, rm=True) if response.status_code == SUCCESS_200_STATUS_CODE: - list_exp_json = response.json() + list_exp_json = response.json() else: print(f"listExperiments failed!") failed = 1 @@ -152,7 +157,7 @@ def main(argv): experiment_name = None latest = "false" interval_end_time = None - response = list_recommendations(experiment_name, latest, interval_end_time) + response = list_recommendations(experiment_name, latest, interval_end_time, rm=True) if response.status_code == SUCCESS_200_STATUS_CODE: list_reco_json_file_before = list_reco_json_dir + '/list_reco_json_before_' + str(i) + '.json' write_json_data_to_file(list_reco_json_file_before, response.json()) @@ -182,7 +187,7 @@ def main(argv): recommendations = "true" latest = "false" experiment_name = None - response = list_experiments(results, recommendations, latest, experiment_name) + response = list_experiments(results, recommendations, latest, experiment_name, rm=True) if response.status_code == SUCCESS_200_STATUS_CODE: 
list_exp_json = response.json() else: @@ -195,7 +200,7 @@ def main(argv): # Fetch the recommendations for all the experiments latest = "false" interval_end_time = None - response = list_recommendations(experiment_name, latest, interval_end_time) + response = list_recommendations(experiment_name, latest, interval_end_time, rm=True) if response.status_code == SUCCESS_200_STATUS_CODE: list_reco_json_file_after = list_reco_json_dir + '/list_reco_json_after_' + str(i) + '.json' write_json_data_to_file(list_reco_json_file_after, response.json()) @@ -204,7 +209,6 @@ def main(argv): failed = 1 sys.exit(1) - # Compare the listExperiments before and after kruize pod restart result = compare_json_files(list_exp_json_file_before, list_exp_json_file_after) if result == True: @@ -236,5 +240,6 @@ def main(argv): print("Test Passed! Check the logs for details") sys.exit(0) + if __name__ == '__main__': main(sys.argv[1:]) From d2b24146f12c91759e2d0c1b5bbfdc5206def9f7 Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Wed, 18 Dec 2024 17:11:59 +0530 Subject: [PATCH 72/85] resolving deserialization issue with experiment type Signed-off-by: Shekhar Saxena --- .../CreateExperimentAPIObject.java | 2 ++ .../analyzer/utils/ExperimentTypeUtil.java | 18 ++++++++++++++++++ 2 files changed, 20 insertions(+) diff --git a/src/main/java/com/autotune/analyzer/serviceObjects/CreateExperimentAPIObject.java b/src/main/java/com/autotune/analyzer/serviceObjects/CreateExperimentAPIObject.java index 76ae02f69..2602c6290 100644 --- a/src/main/java/com/autotune/analyzer/serviceObjects/CreateExperimentAPIObject.java +++ b/src/main/java/com/autotune/analyzer/serviceObjects/CreateExperimentAPIObject.java @@ -23,6 +23,7 @@ import com.autotune.common.data.ValidationOutputData; import com.autotune.common.k8sObjects.TrialSettings; import com.autotune.utils.KruizeConstants; +import com.google.gson.annotations.JsonAdapter; import com.google.gson.annotations.SerializedName; import java.util.List; @@ -50,6 +51,7 @@ 
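The fault-tolerance test above snapshots `listExperiments` and `listRecommendations` output to JSON files before and after the Kruize pod restart, then compares the pair. A sketch of such a comparison step (the real `compare_json_files` lives in the test helpers and may differ):

```python
import json

def compare_json_files(file_before, file_after):
    """Return True when two JSON dumps are semantically identical.

    Parsing before comparing ignores formatting differences (key order,
    whitespace) that a byte-level diff would flag as mismatches.
    """
    with open(file_before) as f1, open(file_after) as f2:
        return json.load(f1) == json.load(f2)
```

Comparing parsed objects rather than raw bytes is what lets the restart test tolerate a re-serialized response as long as the recommendations themselves are unchanged.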
public class CreateExperimentAPIObject extends BaseSO implements ExperimentTypeA @SerializedName(KruizeConstants.JSONKeys.DATASOURCE) //TODO: to be used in future private String datasource; @SerializedName(KruizeConstants.JSONKeys.EXPERIMENT_TYPE) //TODO: to be used in future + @JsonAdapter(ExperimentTypeUtil.ExperimentTypeDeserializer.class) private AnalyzerConstants.ExperimentType experimentType; private AnalyzerConstants.ExperimentStatus status; private String experiment_id; // this id is UUID and getting set at createExperiment API diff --git a/src/main/java/com/autotune/analyzer/utils/ExperimentTypeUtil.java b/src/main/java/com/autotune/analyzer/utils/ExperimentTypeUtil.java index 51b567617..7425f9566 100644 --- a/src/main/java/com/autotune/analyzer/utils/ExperimentTypeUtil.java +++ b/src/main/java/com/autotune/analyzer/utils/ExperimentTypeUtil.java @@ -16,6 +16,13 @@ package com.autotune.analyzer.utils; +import com.google.gson.JsonDeserializationContext; +import com.google.gson.JsonDeserializer; +import com.google.gson.JsonElement; +import com.google.gson.JsonParseException; + +import java.lang.reflect.Type; + /** * This class contains utility functions to determine experiment type */ @@ -27,4 +34,15 @@ public static boolean isContainerExperiment(AnalyzerConstants.ExperimentType exp public static boolean isNamespaceExperiment(AnalyzerConstants.ExperimentType experimentType) { return experimentType != null && AnalyzerConstants.ExperimentType.NAMESPACE.equals(experimentType); } + + public class ExperimentTypeDeserializer implements JsonDeserializer { + @Override + public AnalyzerConstants.ExperimentType deserialize(JsonElement json, Type typeOfT, JsonDeserializationContext context) throws JsonParseException { + String experimentType = json.getAsString(); + if (experimentType != null) { + return AnalyzerConstants.ExperimentType.valueOf(experimentType.toUpperCase()); + } + return null; + } + } } From 900e3653d802f441fb09abbe5abcff25e0661cfe Mon Sep 17 00:00:00 
2001 From: Shekhar Saxena Date: Wed, 18 Dec 2024 17:44:42 +0530 Subject: [PATCH 73/85] adding a serializer function for experiment type Signed-off-by: Shekhar Saxena --- .../analyzer/kruizeObject/KruizeObject.java | 2 ++ .../serviceObjects/CreateExperimentAPIObject.java | 2 +- .../ListRecommendationsAPIObject.java | 2 ++ .../analyzer/utils/ExperimentTypeUtil.java | 15 ++++++++++----- 4 files changed, 15 insertions(+), 6 deletions(-) diff --git a/src/main/java/com/autotune/analyzer/kruizeObject/KruizeObject.java b/src/main/java/com/autotune/analyzer/kruizeObject/KruizeObject.java index 3badf4b51..d11be2c3d 100644 --- a/src/main/java/com/autotune/analyzer/kruizeObject/KruizeObject.java +++ b/src/main/java/com/autotune/analyzer/kruizeObject/KruizeObject.java @@ -26,6 +26,7 @@ import com.autotune.utils.KruizeConstants; import com.autotune.utils.KruizeSupportedTypes; import com.autotune.utils.Utils; +import com.google.gson.annotations.JsonAdapter; import com.google.gson.annotations.SerializedName; import io.fabric8.kubernetes.api.model.ObjectReference; @@ -50,6 +51,7 @@ public final class KruizeObject implements ExperimentTypeAware { @SerializedName("datasource") private String datasource; @SerializedName(KruizeConstants.JSONKeys.EXPERIMENT_TYPE) //TODO: to be used in future + @JsonAdapter(ExperimentTypeUtil.ExperimentTypeSerializer.class) private AnalyzerConstants.ExperimentType experimentType; @SerializedName("default_updater") private String defaultUpdater; diff --git a/src/main/java/com/autotune/analyzer/serviceObjects/CreateExperimentAPIObject.java b/src/main/java/com/autotune/analyzer/serviceObjects/CreateExperimentAPIObject.java index 2602c6290..803be140c 100644 --- a/src/main/java/com/autotune/analyzer/serviceObjects/CreateExperimentAPIObject.java +++ b/src/main/java/com/autotune/analyzer/serviceObjects/CreateExperimentAPIObject.java @@ -51,7 +51,7 @@ public class CreateExperimentAPIObject extends BaseSO implements ExperimentTypeA 
@SerializedName(KruizeConstants.JSONKeys.DATASOURCE) //TODO: to be used in future private String datasource; @SerializedName(KruizeConstants.JSONKeys.EXPERIMENT_TYPE) //TODO: to be used in future - @JsonAdapter(ExperimentTypeUtil.ExperimentTypeDeserializer.class) + @JsonAdapter(ExperimentTypeUtil.ExperimentTypeSerializer.class) private AnalyzerConstants.ExperimentType experimentType; private AnalyzerConstants.ExperimentStatus status; private String experiment_id; // this id is UUID and getting set at createExperiment API diff --git a/src/main/java/com/autotune/analyzer/serviceObjects/ListRecommendationsAPIObject.java b/src/main/java/com/autotune/analyzer/serviceObjects/ListRecommendationsAPIObject.java index ad53ce8a5..69f4a3fec 100644 --- a/src/main/java/com/autotune/analyzer/serviceObjects/ListRecommendationsAPIObject.java +++ b/src/main/java/com/autotune/analyzer/serviceObjects/ListRecommendationsAPIObject.java @@ -19,6 +19,7 @@ import com.autotune.analyzer.utils.ExperimentTypeAware; import com.autotune.analyzer.utils.ExperimentTypeUtil; import com.autotune.utils.KruizeConstants; +import com.google.gson.annotations.JsonAdapter; import com.google.gson.annotations.SerializedName; import java.util.List; @@ -27,6 +28,7 @@ public class ListRecommendationsAPIObject extends BaseSO implements ExperimentTy @SerializedName(KruizeConstants.JSONKeys.CLUSTER_NAME) private String clusterName; @SerializedName(KruizeConstants.JSONKeys.EXPERIMENT_TYPE) + @JsonAdapter(ExperimentTypeUtil.ExperimentTypeSerializer.class) private AnalyzerConstants.ExperimentType experimentType; @SerializedName(KruizeConstants.JSONKeys.KUBERNETES_OBJECTS) diff --git a/src/main/java/com/autotune/analyzer/utils/ExperimentTypeUtil.java b/src/main/java/com/autotune/analyzer/utils/ExperimentTypeUtil.java index 7425f9566..70d85089a 100644 --- a/src/main/java/com/autotune/analyzer/utils/ExperimentTypeUtil.java +++ b/src/main/java/com/autotune/analyzer/utils/ExperimentTypeUtil.java @@ -16,10 +16,7 @@ package 
com.autotune.analyzer.utils; -import com.google.gson.JsonDeserializationContext; -import com.google.gson.JsonDeserializer; -import com.google.gson.JsonElement; -import com.google.gson.JsonParseException; +import com.google.gson.*; import java.lang.reflect.Type; @@ -35,7 +32,15 @@ public static boolean isNamespaceExperiment(AnalyzerConstants.ExperimentType exp return experimentType != null && AnalyzerConstants.ExperimentType.NAMESPACE.equals(experimentType); } - public class ExperimentTypeDeserializer implements JsonDeserializer { + public class ExperimentTypeSerializer implements JsonSerializer, JsonDeserializer { + @Override + public JsonElement serialize(AnalyzerConstants.ExperimentType experimentType, Type typeOfT, JsonSerializationContext context) { + if (experimentType != null) { + return new JsonPrimitive(experimentType.name().toLowerCase()); + } + return null; + } + @Override public AnalyzerConstants.ExperimentType deserialize(JsonElement json, Type typeOfT, JsonDeserializationContext context) throws JsonParseException { String experimentType = json.getAsString(); From 271ecb0361b29bb6d93d83a2f0669cdc2f95d1d4 Mon Sep 17 00:00:00 2001 From: Shekhar Saxena Date: Wed, 18 Dec 2024 23:47:43 +0530 Subject: [PATCH 74/85] fixing listExperiments api Signed-off-by: Shekhar Saxena --- .../analyzer/serviceObjects/Converters.java | 162 +++++++++--------- .../analyzer/services/ListExperiments.java | 36 ++-- 2 files changed, 109 insertions(+), 89 deletions(-) diff --git a/src/main/java/com/autotune/analyzer/serviceObjects/Converters.java b/src/main/java/com/autotune/analyzer/serviceObjects/Converters.java index 176d678d0..ae2e6e9a9 100644 --- a/src/main/java/com/autotune/analyzer/serviceObjects/Converters.java +++ b/src/main/java/com/autotune/analyzer/serviceObjects/Converters.java @@ -142,86 +142,9 @@ public static ListRecommendationsAPIObject convertKruizeObjectToListRecommendati kubernetesAPIObject = new KubernetesAPIObject(k8sObject.getName(), k8sObject.getType(), 
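Patch 73 above registers one Gson adapter that serializes the experiment-type enum to its lowercase name and parses it back case-insensitively via `toUpperCase()`. The same round-trip can be sketched in Python (the enum members here are assumptions for illustration, not Kruize's actual `AnalyzerConstants.ExperimentType`):

```python
from enum import Enum

class ExperimentType(Enum):
    CONTAINER = "container"
    NAMESPACE = "namespace"

def serialize_experiment_type(exp_type):
    # Emit the lowercase enum name, as the Gson serializer does.
    return None if exp_type is None else exp_type.name.lower()

def deserialize_experiment_type(raw):
    # Accept any casing on input, as the Gson deserializer's
    # valueOf(raw.toUpperCase()) does.
    return None if raw is None else ExperimentType[raw.upper()]
```

Normalizing case on both sides is what keeps `"namespace"`, `"Namespace"`, and `"NAMESPACE"` in a createExperiment payload all mapping to the same enum constant.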
k8sObject.getNamespace()); // namespace recommendations experiment type if (kruizeObject.isNamespaceExperiment()) { - NamespaceAPIObject namespaceAPIObject; - NamespaceData clonedNamespaceData = Utils.getClone(k8sObject.getNamespaceData(), NamespaceData.class); - - if (null != clonedNamespaceData) { - HashMap namespaceRecommendations = clonedNamespaceData.getNamespaceRecommendations().getData(); - LOGGER.info("Namespace Recommendations: " + namespaceRecommendations.toString()); - clonedNamespaceData.getNamespaceRecommendations().setData(namespaceRecommendations); - namespaceAPIObject = new NamespaceAPIObject(clonedNamespaceData.getNamespace_name(), clonedNamespaceData.getNamespaceRecommendations(), null); - kubernetesAPIObject.setNamespaceAPIObject(namespaceAPIObject); - } - } - - HashMap containerDataMap = new HashMap<>(); - List containerAPIObjects = new ArrayList<>(); - for (ContainerData containerData : k8sObject.getContainerDataMap().values()) { - ContainerAPIObject containerAPIObject; - // if a Time stamp is passed it holds the priority than latest - if (checkForTimestamp) { - // This step causes a performance degradation, need to be replaced with a better flow of creating SO's - ContainerData clonedContainerData = Utils.getClone(containerData, ContainerData.class); - if (null != clonedContainerData) { - HashMap recommendations - = clonedContainerData.getContainerRecommendations().getData(); - if (null != monitoringEndTime && recommendations.containsKey(monitoringEndTime)) { - List tempList = new ArrayList<>(); - for (Timestamp timestamp : recommendations.keySet()) { - if (!timestamp.equals(monitoringEndTime)) - tempList.add(timestamp); - } - for (Timestamp timestamp : tempList) { - recommendations.remove(timestamp); - } - clonedContainerData.getContainerRecommendations().setData(recommendations); - containerAPIObject = new ContainerAPIObject(clonedContainerData.getContainer_name(), - clonedContainerData.getContainer_image_name(), - 
clonedContainerData.getContainerRecommendations(), - null); - containerAPIObjects.add(containerAPIObject); - } - } - } else if (getLatest) { - // This step causes a performance degradation, need to be replaced with a better flow of creating SO's - ContainerData clonedContainerData = Utils.getClone(containerData, ContainerData.class); - if (null != clonedContainerData) { - HashMap recommendations - = clonedContainerData.getContainerRecommendations().getData(); - Timestamp latestTimestamp = null; - List tempList = new ArrayList<>(); - for (Timestamp timestamp : recommendations.keySet()) { - if (null == latestTimestamp) { - latestTimestamp = timestamp; - } else { - if (timestamp.after(latestTimestamp)) { - tempList.add(latestTimestamp); - latestTimestamp = timestamp; - } else { - tempList.add(timestamp); - } - } - } - for (Timestamp timestamp : tempList) { - recommendations.remove(timestamp); - } - clonedContainerData.getContainerRecommendations().setData(recommendations); - containerAPIObject = new ContainerAPIObject(clonedContainerData.getContainer_name(), - clonedContainerData.getContainer_image_name(), - clonedContainerData.getContainerRecommendations(), - null); - containerAPIObjects.add(containerAPIObject); - } - } else { - containerAPIObject = new ContainerAPIObject(containerData.getContainer_name(), - containerData.getContainer_image_name(), - containerData.getContainerRecommendations(), - null); - containerAPIObjects.add(containerAPIObject); - containerDataMap.put(containerData.getContainer_name(), containerData); - } + processNamespaceRecommendations(k8sObject, kubernetesAPIObject, checkForTimestamp, getLatest, monitoringEndTime); } - kubernetesAPIObject.setContainerAPIObjects(containerAPIObjects); + processContainerRecommendations(k8sObject, kubernetesAPIObject, checkForTimestamp, getLatest, monitoringEndTime); kubernetesAPIObjects.add(kubernetesAPIObject); } listRecommendationsAPIObject.setKubernetesObjects(kubernetesAPIObjects); @@ -231,6 +154,87 @@ 
public static ListRecommendationsAPIObject convertKruizeObjectToListRecommendati return listRecommendationsAPIObject; } + private static void processNamespaceRecommendations(K8sObject k8sObject, KubernetesAPIObject kubernetesAPIObject, + boolean checkForTimestamp, boolean getLatest, Timestamp monitoringEndTime) { + NamespaceData clonedNamespaceData = Utils.getClone(k8sObject.getNamespaceData(), NamespaceData.class); + if (clonedNamespaceData != null) { + HashMap namespaceRecommendations = clonedNamespaceData.getNamespaceRecommendations().getData(); + + if (checkForTimestamp) { + filterRecommendationsByTimestamp(namespaceRecommendations, monitoringEndTime); + } else if (getLatest) { + filterRecommendationsByLatest(namespaceRecommendations); + } + + NamespaceAPIObject namespaceAPIObject = new NamespaceAPIObject( + clonedNamespaceData.getNamespace_name(), + clonedNamespaceData.getNamespaceRecommendations(), + null); + kubernetesAPIObject.setNamespaceAPIObject(namespaceAPIObject); + } + } + + private static void processContainerRecommendations(K8sObject k8sObject, KubernetesAPIObject kubernetesAPIObject, + boolean checkForTimestamp, boolean getLatest, Timestamp monitoringEndTime) { + List containerAPIObjects = new ArrayList<>(); + + for (ContainerData containerData : k8sObject.getContainerDataMap().values()) { + ContainerData clonedContainerData = Utils.getClone(containerData, ContainerData.class); + + if (clonedContainerData != null) { + HashMap recommendations = clonedContainerData.getContainerRecommendations().getData(); + + if (checkForTimestamp) { + filterRecommendationsByTimestamp(recommendations, monitoringEndTime); + } else if (getLatest) { + filterRecommendationsByLatest(recommendations); + } + + ContainerAPIObject containerAPIObject = new ContainerAPIObject( + clonedContainerData.getContainer_name(), + clonedContainerData.getContainer_image_name(), + clonedContainerData.getContainerRecommendations(), + null); + containerAPIObjects.add(containerAPIObject); + } 
else { + containerAPIObjects.add(new ContainerAPIObject( + containerData.getContainer_name(), + containerData.getContainer_image_name(), + containerData.getContainerRecommendations(), + null)); + } + } + + kubernetesAPIObject.setContainerAPIObjects(containerAPIObjects); + } + + private static void filterRecommendationsByTimestamp(HashMap recommendations, + Timestamp monitoringEndTime) { + if (monitoringEndTime != null && recommendations.containsKey(monitoringEndTime)) { + recommendations.keySet().removeIf(timestamp -> !timestamp.equals(monitoringEndTime)); + } + } + + private static void filterRecommendationsByLatest(HashMap recommendations) { + Timestamp latestTimestamp = null; + List timestampsToRemove = new ArrayList<>(); + + for (Timestamp timestamp : recommendations.keySet()) { + if (latestTimestamp == null || timestamp.after(latestTimestamp)) { + if (latestTimestamp != null) { + timestampsToRemove.add(latestTimestamp); + } + latestTimestamp = timestamp; + } else { + timestampsToRemove.add(timestamp); + } + } + + for (Timestamp timestamp : timestampsToRemove) { + recommendations.remove(timestamp); + } + } + /** * @param containerData */ diff --git a/src/main/java/com/autotune/analyzer/services/ListExperiments.java b/src/main/java/com/autotune/analyzer/services/ListExperiments.java index cd8631aee..41c777056 100644 --- a/src/main/java/com/autotune/analyzer/services/ListExperiments.java +++ b/src/main/java/com/autotune/analyzer/services/ListExperiments.java @@ -31,6 +31,7 @@ import com.autotune.common.data.metrics.MetricResults; import com.autotune.common.data.result.ContainerData; import com.autotune.common.data.result.IntervalResults; +import com.autotune.common.data.result.NamespaceData; import com.autotune.common.data.system.info.device.DeviceDetails; import com.autotune.common.k8sObjects.K8sObject; import com.autotune.common.target.kubernetes.service.KubernetesServices; @@ -85,6 +86,15 @@ private static List convertKubernetesAPIObjectListToK8sObjectList(Lis 
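The refactor above extracts `filterRecommendationsByLatest`, which keeps only the newest timestamp entry in a recommendations map. A minimal Python sketch of the same in-place filtering (the real code operates on a Java `HashMap` keyed by `Timestamp`; here the keys are assumed to be comparable, e.g. ISO-8601 strings):

```python
def filter_recommendations_by_latest(recommendations):
    """Keep only the entry with the most recent timestamp key, in place.

    One pass finds the maximum key; every other key is then dropped,
    mirroring the two-phase remove in filterRecommendationsByLatest.
    """
    if not recommendations:
        return
    latest = max(recommendations)
    # Collect first, then delete, to avoid mutating while iterating.
    for ts in [k for k in recommendations if k != latest]:
        del recommendations[ts]
```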
containerDataMap.put(containerAPIObject.getContainer_name(), containerData); } k8sObject.setContainerDataMap(containerDataMap); + + // adding namespace recommendations to K8sObject + NamespaceData namespaceData = new NamespaceData(); + if (kubernetesAPIObject.getNamespaceAPIObjects() != null && kubernetesAPIObject.getNamespaceAPIObjects().getnamespaceRecommendations() != null) { + namespaceData.setNamespace_name(kubernetesAPIObject.getNamespace()); + namespaceData.setNamespaceRecommendations(kubernetesAPIObject.getNamespaceAPIObjects().getnamespaceRecommendations()); + k8sObject.setNamespaceData(namespaceData); + } + k8sObjectList.add(k8sObject); } return k8sObjectList; @@ -181,7 +191,7 @@ protected void doGet(HttpServletRequest request, HttpServletResponse response) t Gson gsonObj = createGsonObject(); // Modify the JSON response here based on query params. - gsonStr = buildResponseBasedOnQuery(mKruizeExperimentMap, gsonObj, results, recommendations, latest, experimentName); + gsonStr = buildResponseBasedOnQuery(mKruizeExperimentMap, gsonObj, results, recommendations, latest, experimentName, rmTable); if (gsonStr.isEmpty()) { gsonStr = generateDefaultResponse(); } @@ -364,7 +374,7 @@ private void checkPercentileInfo(Map mainKruizeExperimentM } private String buildResponseBasedOnQuery(Map mKruizeExperimentMap, Gson gsonObj, String results, - String recommendations, String latest, String experimentName) { + String recommendations, String latest, String experimentName, boolean rmTable) { // Case : default // return the response without results or recommendations if (results.equalsIgnoreCase(AnalyzerConstants.BooleanString.FALSE) && recommendations.equalsIgnoreCase(AnalyzerConstants.BooleanString.FALSE)) { @@ -376,7 +386,7 @@ private String buildResponseBasedOnQuery(Map mKruizeExperi AnalyzerConstants.BooleanString.TRUE)) { // Case: results=true , recommendations=true // fetch results and recomm. 
from the DB - loadRecommendations(mKruizeExperimentMap, experimentName); + loadRecommendations(mKruizeExperimentMap, experimentName, rmTable); buildRecommendationsResponse(mKruizeExperimentMap, latest); loadResults(mKruizeExperimentMap, experimentName); @@ -398,7 +408,7 @@ private String buildResponseBasedOnQuery(Map mKruizeExperi return gsonObj.toJson(new ArrayList<>(mKruizeExperimentMap.values())); } else { // Case: results=false , recommendations=true - loadRecommendations(mKruizeExperimentMap, experimentName); + loadRecommendations(mKruizeExperimentMap, experimentName, rmTable); buildRecommendationsResponse(mKruizeExperimentMap, latest); return gsonObj.toJson(new ArrayList<>(mKruizeExperimentMap.values())); } @@ -421,13 +431,19 @@ private void loadResults(Map mKruizeExperimentMap, String } } - private void loadRecommendations(Map mKruizeExperimentMap, String experimentName) { + private void loadRecommendations(Map mKruizeExperimentMap, String experimentName, boolean rmTable) { try { - if (experimentName == null || experimentName.isEmpty()) - new ExperimentDBService().loadAllRecommendations(mKruizeExperimentMap); - else - new ExperimentDBService().loadRecommendationsFromDBByName(mKruizeExperimentMap, experimentName); - + if (rmTable) { + if (experimentName == null || experimentName.isEmpty()) + new ExperimentDBService().loadAllRecommendations(mKruizeExperimentMap); + else + new ExperimentDBService().loadRecommendationsFromDBByName(mKruizeExperimentMap, experimentName); + } else { + if (experimentName == null || experimentName.isEmpty()) + new ExperimentDBService().loadAllLMRecommendations(mKruizeExperimentMap); + else + new ExperimentDBService().loadLMRecommendationsFromDBByName(mKruizeExperimentMap, experimentName); + } } catch (Exception e) { LOGGER.error("Failed to load saved recommendations data: {} ", e.getMessage()); } From 4ad8e36e9df3b1a6e34d06ac3c6b24572263a9a0 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Thu, 19 Dec 2024 09:31:26 +0530 Subject: 
[PATCH 75/85] Bulk demo fix Signed-off-by: msvinaykumar --- .../serviceObjects/BulkJobStatus.java | 25 ++++++++++++------- .../analyzer/services/BulkService.java | 5 +++- .../analyzer/workerimpl/BulkJobManager.java | 17 ++++++------- 3 files changed, 28 insertions(+), 19 deletions(-) diff --git a/src/main/java/com/autotune/analyzer/serviceObjects/BulkJobStatus.java b/src/main/java/com/autotune/analyzer/serviceObjects/BulkJobStatus.java index 42a10b874..a8eba253f 100644 --- a/src/main/java/com/autotune/analyzer/serviceObjects/BulkJobStatus.java +++ b/src/main/java/com/autotune/analyzer/serviceObjects/BulkJobStatus.java @@ -27,6 +27,7 @@ import java.util.Collections; import java.util.HashMap; import java.util.Map; +import java.util.concurrent.atomic.AtomicInteger; import static com.autotune.utils.KruizeConstants.KRUIZE_BULK_API.JOB_ID; import static com.autotune.utils.KruizeConstants.KRUIZE_BULK_API.NotificationConstants.Status.UNPROCESSED; @@ -41,7 +42,7 @@ public class BulkJobStatus { private String jobID; private String status; private int total_experiments; - private int processed_experiments; //todo : If the primary operations are increments or simple atomic updates, use AtomicInteger. It is designed for lock-free thread-safe access + private AtomicInteger processed_experiments; //todo : If the primary operations are increments or simple atomic updates, use AtomicInteger. 
It is designed for lock-free thread-safe access @JsonProperty("job_start_time") private String startTime; // Change to String to store formatted time @JsonProperty("job_end_time") @@ -54,6 +55,7 @@ public BulkJobStatus(String jobID, String status, Instant startTime) { this.jobID = jobID; this.status = status; setStartTime(startTime); + this.processed_experiments = new AtomicInteger(0); } @@ -144,12 +146,17 @@ public void setTotal_experiments(int total_experiments) { this.total_experiments = total_experiments; } - public synchronized int getProcessed_experiments() { + + public synchronized void incrementProcessed_experiments() { + this.processed_experiments.incrementAndGet(); + } + + public AtomicInteger getProcessed_experiments() { return processed_experiments; } - public synchronized void setProcessed_experiments(int processed_experiments) { - this.processed_experiments = processed_experiments; + public void setProcessed_experiments(int count) { + this.processed_experiments.set(count); } // Utility function to format Instant into the required UTC format @@ -194,16 +201,16 @@ public Recommendation getRecommendations() { return recommendations; } - public void setNotification(Notification notification) { - this.notification = notification; + public void setRecommendations(Recommendation recommendations) { + this.recommendations = recommendations; } public Notification getNotification() { return notification; } - public void setRecommendations(Recommendation recommendations) { - this.recommendations = recommendations; + public void setNotification(Notification notification) { + this.notification = notification; } } @@ -297,4 +304,4 @@ public void setNotifications(Notification notifications) { } } - } +} diff --git a/src/main/java/com/autotune/analyzer/services/BulkService.java b/src/main/java/com/autotune/analyzer/services/BulkService.java index c695bc70a..d22e7ac7d 100644 --- a/src/main/java/com/autotune/analyzer/services/BulkService.java +++ 
b/src/main/java/com/autotune/analyzer/services/BulkService.java @@ -109,7 +109,10 @@ protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws Se filters.addFilter("jobFilter", SimpleBeanPropertyFilter.serializeAll()); } objectMapper.setFilterProvider(filters); - String jsonResponse = objectMapper.writeValueAsString(jobDetails); + String jsonResponse = ""; + synchronized (jobDetails) { + jsonResponse = objectMapper.writeValueAsString(jobDetails); + } resp.getWriter().write(jsonResponse); statusValue = "success"; } catch (Exception e) { diff --git a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java index e8250df67..cf12052b2 100644 --- a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java +++ b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java @@ -153,8 +153,7 @@ public void run() { if (null != daterange) { metadataInfo = dataSourceManager.importMetadataFromDataSource(datasource, labelString, (Long) daterange.get(START_TIME), (Long) daterange.get(END_TIME), (Integer) daterange.get(STEPS), includeResourcesMap, excludeResourcesMap); - } - else { + } else { metadataInfo = dataSourceManager.importMetadataFromDataSource(datasource, labelString, 0, 0, 0, includeResourcesMap, excludeResourcesMap); } @@ -197,10 +196,10 @@ public void run() { } finally { if (!expriment_exists) { LOGGER.info("Processing experiment {}", jobData.getProcessed_experiments()); - jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1); + jobData.incrementProcessed_experiments(); } synchronized (new Object()) { - if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) { + if (jobData.getTotal_experiments() == jobData.getProcessed_experiments().get()) { setFinalJobStatus(COMPLETED, null, null, finalDatasource); } } @@ -228,9 +227,9 @@ public void run() { 
experiment.getRecommendations().setStatus(NotificationConstants.Status.FAILED); experiment.getRecommendations().setNotifications(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_INTERNAL_ERROR)); } finally { - jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1); + jobData.incrementProcessed_experiments(); synchronized (new Object()) { - if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) { + if (jobData.getTotal_experiments() == jobData.getProcessed_experiments().get()) { setFinalJobStatus(COMPLETED, null, null, finalDatasource); } } @@ -240,8 +239,8 @@ public void run() { } catch (Exception e) { e.printStackTrace(); experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_INTERNAL_ERROR)); - jobData.setProcessed_experiments(jobData.getProcessed_experiments() + 1); - if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) { + jobData.incrementProcessed_experiments(); + if (jobData.getTotal_experiments() == jobData.getProcessed_experiments().get()) { setFinalJobStatus(COMPLETED, null, null, finalDatasource); } } @@ -270,7 +269,7 @@ public void run() { } } - if (jobData.getTotal_experiments() == jobData.getProcessed_experiments()) { + if (jobData.getTotal_experiments() == jobData.getProcessed_experiments().get()) { statusValue = "success"; } } From 49ae6a3751b59e69f9437c8d3ef05ffe84dc1e3a Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Thu, 19 Dec 2024 11:52:12 +0530 Subject: [PATCH 76/85] incorporated review comments Signed-off-by: msvinaykumar --- .../com/autotune/analyzer/serviceObjects/BulkJobStatus.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/main/java/com/autotune/analyzer/serviceObjects/BulkJobStatus.java b/src/main/java/com/autotune/analyzer/serviceObjects/BulkJobStatus.java index a8eba253f..a4abfd735 100644 --- 
a/src/main/java/com/autotune/analyzer/serviceObjects/BulkJobStatus.java +++ b/src/main/java/com/autotune/analyzer/serviceObjects/BulkJobStatus.java @@ -147,7 +147,7 @@ public void setTotal_experiments(int total_experiments) { } - public synchronized void incrementProcessed_experiments() { + public void incrementProcessed_experiments() { this.processed_experiments.incrementAndGet(); } From 2ab10a80206dac09a1008533cd928066197b0d7e Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Thu, 19 Dec 2024 15:07:04 +0530 Subject: [PATCH 77/85] Review comments incorporated Signed-off-by: msvinaykumar --- .../autotune/analyzer/workerimpl/BulkJobManager.java | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java index cf12052b2..a299de26d 100644 --- a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java +++ b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java @@ -179,14 +179,14 @@ public void run() { GenericRestApiClient apiClient = new GenericRestApiClient(finalDatasource); apiClient.setBaseURL(KruizeDeploymentInfo.experiments_url); GenericRestApiClient.HttpResponseWrapper responseCode; - boolean expriment_exists = false; + boolean experiment_exists = false; try { responseCode = apiClient.callKruizeAPI("[" + new Gson().toJson(apiObject) + "]"); LOGGER.debug("API Response code: {}", responseCode); if (responseCode.getStatusCode() == HttpURLConnection.HTTP_CREATED) { - expriment_exists = true; + experiment_exists = true; } else if (responseCode.getStatusCode() == HttpURLConnection.HTTP_CONFLICT) { - expriment_exists = true; + experiment_exists = true; } else { experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, responseCode.getResponseBody().toString(), responseCode.getStatusCode())); } @@ -194,7 +194,7 @@ public void run() { e.printStackTrace(); 
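The other concurrency fix in this series (patch 82/85, below) replaces `synchronized (new Object())` with `synchronized (jobData)` around the completion check. Locking a freshly allocated object is a no-op: every caller gets its own monitor, so the check-then-act is not serialized and `setFinalJobStatus(COMPLETED, ...)` could fire more than once when several experiments finish near-simultaneously. A hedged sketch of the intended idiom (names are hypothetical, not Kruize's actual API):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative completion check. The monitor must be an object that all
// workers share (here, `this`, analogous to `jobData`); a brand-new
// Object() would give each caller a private, uncontended lock.
public class JobStatusSketch {
    private final int total;
    private final AtomicInteger processed = new AtomicInteger(0);
    private boolean finalStatusSet = false;

    public JobStatusSketch(int total) { this.total = total; }

    /** Marks one experiment done; returns true exactly once, for the last one. */
    public boolean experimentDone() {
        processed.incrementAndGet();
        synchronized (this) {                 // shared monitor, like synchronized (jobData)
            if (!finalStatusSet && processed.get() == total) {
                finalStatusSet = true;        // check-then-act is atomic under the lock
                return true;                  // caller would invoke setFinalJobStatus(...)
            }
        }
        return false;
    }
}
```

With `synchronized (new Object())` the guarded `if` above would provide no mutual exclusion at all, so the "fire exactly once" guarantee would be lost.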
experiment.setNotification(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_BAD_REQUEST)); } finally { - if (!expriment_exists) { + if (!experiment_exists) { LOGGER.info("Processing experiment {}", jobData.getProcessed_experiments()); jobData.incrementProcessed_experiments(); } @@ -205,7 +205,7 @@ public void run() { } } - if (expriment_exists) { + if (experiment_exists) { generateExecutor.submit(() -> { // send request to generateRecommendations API GenericRestApiClient recommendationApiClient = new GenericRestApiClient(finalDatasource); From 53d2d8df3bf7c41dbb7ec6953735db6876001d83 Mon Sep 17 00:00:00 2001 From: Saad Khan Date: Thu, 19 Dec 2024 17:45:34 +0530 Subject: [PATCH 78/85] update list recomm URL with 'rm' flag Signed-off-by: Saad Khan --- .../common/datasource/DataSourceMetadataOperator.java | 4 ++-- .../db_migration_test/db_migration_test.sh | 3 ++- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/src/main/java/com/autotune/common/datasource/DataSourceMetadataOperator.java b/src/main/java/com/autotune/common/datasource/DataSourceMetadataOperator.java index 4fb1552e1..496e5c414 100644 --- a/src/main/java/com/autotune/common/datasource/DataSourceMetadataOperator.java +++ b/src/main/java/com/autotune/common/datasource/DataSourceMetadataOperator.java @@ -217,7 +217,7 @@ public DataSourceMetadataInfo processQueriesAndPopulateDataSourceMetadataInfo(Da LOGGER.info("containerQuery: {}", containerQuery); JsonArray namespacesDataResultArray = fetchQueryResults(dataSourceInfo, namespaceQuery, startTime, endTime, steps); - LOGGER.info("namespacesDataResultArray: {}", namespacesDataResultArray); + LOGGER.debug("namespacesDataResultArray: {}", namespacesDataResultArray); if (!op.validateResultArray(namespacesDataResultArray)) { dataSourceMetadataInfo = dataSourceDetailsHelper.createDataSourceMetadataInfoObject(dataSourceName, null); } else { @@ -226,7 +226,7 @@ public DataSourceMetadataInfo 
processQueriesAndPopulateDataSourceMetadataInfo(Da * Value: DataSourceNamespace object corresponding to a namespace */ HashMap datasourceNamespaces = dataSourceDetailsHelper.getActiveNamespaces(namespacesDataResultArray); - LOGGER.info("datasourceNamespaces: {}", datasourceNamespaces.keySet()); + LOGGER.debug("datasourceNamespaces: {}", datasourceNamespaces.keySet()); dataSourceMetadataInfo = dataSourceDetailsHelper.createDataSourceMetadataInfoObject(dataSourceName, datasourceNamespaces); /** diff --git a/tests/scripts/remote_monitoring_tests/db_migration_test/db_migration_test.sh b/tests/scripts/remote_monitoring_tests/db_migration_test/db_migration_test.sh index e149ea2a2..e0047758b 100755 --- a/tests/scripts/remote_monitoring_tests/db_migration_test/db_migration_test.sh +++ b/tests/scripts/remote_monitoring_tests/db_migration_test/db_migration_test.sh @@ -166,7 +166,8 @@ do reco_json_dir="${LOG_DIR}/reco_jsons" mkdir -p ${reco_json_dir} - curl -s http://${SERVER_IP_ADDR}/listRecommendations?experiment_name=${exp_name} > ${reco_json_dir}/${exp_name}_reco.json + echo "curl -s http://${SERVER_IP_ADDR}/listRecommendations?experiment_name=${exp_name}&rm=true" + curl -s "http://${SERVER_IP_ADDR}/listRecommendations?experiment_name=${exp_name}&rm=true" > ${reco_json_dir}/${exp_name}_reco.json python3 validate_reco_json.py -f ${reco_json_dir}/${exp_name}_reco.json -e ${end_time} if [ $? 
!= 0 ]; then From 7f90b2d938fad23283b387f0fdb104b62e0428c8 Mon Sep 17 00:00:00 2001 From: Saad Khan Date: Thu, 19 Dec 2024 18:16:47 +0530 Subject: [PATCH 79/85] update the URL in the other migration file with the rm flag Signed-off-by: Saad Khan --- .../db_migration_test_without_postgres_restart.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/tests/scripts/remote_monitoring_tests/db_migration_test/db_migration_test_without_postgres_restart.sh b/tests/scripts/remote_monitoring_tests/db_migration_test/db_migration_test_without_postgres_restart.sh index 7f521da68..50cd1c027 100755 --- a/tests/scripts/remote_monitoring_tests/db_migration_test/db_migration_test_without_postgres_restart.sh +++ b/tests/scripts/remote_monitoring_tests/db_migration_test/db_migration_test_without_postgres_restart.sh @@ -170,7 +170,8 @@ do reco_json_dir="${LOG_DIR}/reco_jsons" mkdir -p ${reco_json_dir} - curl -s http://${SERVER_IP_ADDR}/listRecommendations?experiment_name=${exp_name} > ${reco_json_dir}/${exp_name}_reco.json + echo "curl -s http://${SERVER_IP_ADDR}/listRecommendations?experiment_name=${exp_name}&rm=true" + curl -s "http://${SERVER_IP_ADDR}/listRecommendations?experiment_name=${exp_name}&rm=true" > ${reco_json_dir}/${exp_name}_reco.json python3 validate_reco_json.py -f ${reco_json_dir}/${exp_name}_reco.json -e ${end_time} if [ $? 
!= 0 ]; then From 0db4124b4fc3078ec60ce5c5821d00c8ad94d043 Mon Sep 17 00:00:00 2001 From: Saad Khan Date: Thu, 19 Dec 2024 18:46:47 +0530 Subject: [PATCH 80/85] add new test plan for 0.3 Signed-off-by: Saad Khan --- tests/test_plans/test_plan_rel_0.3.md | 141 ++++++++++++++++++++++++++ 1 file changed, 141 insertions(+) create mode 100644 tests/test_plans/test_plan_rel_0.3.md diff --git a/tests/test_plans/test_plan_rel_0.3.md b/tests/test_plans/test_plan_rel_0.3.md new file mode 100644 index 000000000..f66639d1b --- /dev/null +++ b/tests/test_plans/test_plan_rel_0.3.md @@ -0,0 +1,141 @@ +# KRUIZE TEST PLAN RELEASE 0.3 + +- [INTRODUCTION](#introduction) +- [FEATURES TO BE TESTED](#features-to-be-tested) +- [BUG FIXES TO BE TESTED](#bug-fixes-to-be-tested) +- [TEST ENVIRONMENT](#test-environment) +- [TEST DELIVERABLES](#test-deliverables) + - [New Test Cases Developed](#new-test-cases-developed) + - [Regression Testing](#regresion-testing) +- [SCALABILITY TESTING](#scalability-testing) +- [RELEASE TESTING](#release-testing) +- [TEST METRICS](#test-metrics) +- [RISKS AND CONTINGENCIES](#risks-and-contingencies) +- [APPROVALS](#approvals) + +----- + +## INTRODUCTION + +This document describes the test plan for Kruize remote monitoring release 0.3 + +---- + +## FEATURES TO BE TESTED + +* Concurrent RM and LM changes +* ListExperiments & ListRecommendations update based on new DB changes +* Bulk API filtration feature +* Jetty Server version upgrade + + +------ + +## BUG FIXES TO BE TESTED + +* ListExperiments API Not Listing Recommendation +* Namespace experiments not getting created +* `No data available` issue in GenereateRecommendations API +* Bulk API test failures +* Parallel requests issue with Bulk API + +--- + +## TEST ENVIRONMENT + +* Minikube Cluster +* Openshift Cluster + +--- + +## TEST DELIVERABLES + +### New Test Cases Developed + +| # | ISSUE (NEW FEATURE) | TEST DESCRIPTION | TEST DELIVERABLES | RESULTS | COMMENTS | 
+|---|----------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------|---------|----------| +| 1 | Concurrent RM and LM changes | [New tests added](https://github.com/kruize/autotune/blob/mvp_demo/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md#authentication-test) | [](https://github.com/kruize/autotune/pull/) | | PASSED | | +| 2 | ListExperiments & ListRecommendations update based on new DB changes | | | PASSED | | +| 3 | Bulk API filtration feature | Tests will be added later, tested using the Bulk demo | | PASSED | +| 4 | Jetty Server version upgrade | | PASSED | | + +### Regression Testing + +| # | ISSUE (BUG/NEW FEATURE) | TEST CASE | RESULTS | COMMENTS | +|---|-----------------------------------------------------------|-------------------------------------------|---------|----------| +| 1 | ListExperiments API Not Listing Recommendation | Kruize local monitoring Bulk demo | PASSED | | +| 2 | Namespace experiments not getting created | Kruize local monitoring demo | PASSED | | +| 3 | `No data available` issue in GenereateRecommendations API | Kruize local monitoring demo | PASSED | | +| 4 | Bulk API test failures | Kruize local monitoring bulk service demo | PASSED | | +| 5 | Parallel requests issue with Bulk API | Kruize local monitoring bulk service demo | PASSED | | + +--- + +## SCALABILITY TESTING + +Evaluate Kruize Scalability on OCP, with 5k experiments by uploading resource usage data for 15 days and update recommendations. +Changes do not have scalability implications. 
Short scalability test will be run as part of the release testing + +Short Scalability run +- 5K exps / 15 days of results / 2 containers per exp +- Kruize replicas - 10 +- OCP - Scalelab cluster + +| Kruize Release | Exps / Results / Recos | Execution time | Latency (Max/ Avg) in seconds | | | Postgres DB size(MB) | Kruize Max CPU | Kruize Max Memory (GB) | +|----------------|------------------------|----------------|-------------------------------|---------------|----------------------|----------------------|----------------|------------------------| +| | | | UpdateRecommendations | UpdateResults | LoadResultsByExpName | | | | +| 0.1 | 5K / 72L / 3L | 5h 02 mins | 0.97 / 0.55 | 0.16 / 0.14 | 0.52 / 0.36 | 21757 | 7.3 | 33.67 | +| 0.2 | 5K / 72L / 3L | 4h 08 mins | 0.81 / 0.48 | 0.14 / 0.12 | 0.55 / 0.38 | 21749 | 4.78 | 25.31 GB | +| 0.3 | 5K / 72L / 3L | 4h 15 mins | 0.91 / 0.48 | 0.08 / 0.07 | 0.58 / 0.38 | 21751 | 6.42 | 11.25 GB | + +---- +## RELEASE TESTING + +As part of the release testing, following tests will be executed: +- [Kruize Remote monitoring Functional tests](/tests/scripts/remote_monitoring_tests/Remote_monitoring_tests.md) +- [Fault tolerant test](/tests/scripts/remote_monitoring_tests/fault_tolerant_tests.md) +- [Stress test](/tests/scripts/remote_monitoring_tests/README.md) +- [DB Migration test](/tests/scripts/remote_monitoring_tests/db_migration_test.md) +- [Recommendation and box plot values validation test](https://github.com/kruize/kruize-demos/blob/main/monitoring/remote_monitoring_demo/recommendations_infra_demo/README.md) +- [Scalability test (On openshift)](/tests/scripts/remote_monitoring_tests/scalability_test.md) - scalability test with 5000 exps / 15 days usage data +- [Kruize remote monitoring demo (On minikube)](https://github.com/kruize/kruize-demos/blob/main/monitoring/remote_monitoring_demo/README.md) +- [Kruize local monitoring demo (On 
openshift)](https://github.com/kruize/kruize-demos/blob/main/monitoring/local_monitoring_demo) +- [Kruize local monitoring Functional tests](/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md) + + +| # | TEST SUITE | EXPECTED RESULTS | ACTUAL RESULTS | COMMENTS | +|---|------------------------------------------------|------------------------------------|------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| 1 | Kruize Remote monitoring Functional testsuite | TOTAL - , PASSED - / FAILED - | TOTAL - , PASSED - / FAILED - | Intermittent issue seen [1281](https://github.com/kruize/autotune/issues/1281), [1393](https://github.com/kruize/autotune/issues/1393), existing issues - [559](https://github.com/kruize/autotune/issues/559), [610](https://github.com/kruize/autotune/issues/610) | +| 2 | Fault tolerant test | PASSED | PASSED | | +| 3 | Stress test | | | [Intermittent failure](https://github.com/kruize/autotune/issues/1106) | +| 4 | Scalability test (short run) | | | Exps - 5000, Results - 72000, execution time - 4 hours, 8 mins | +| 5 | DB Migration test | PASSED | PASSED | Tested on openshift | +| 6 | Recommendation and box plot values validations | | | Tested on minikube | +| 7 | Kruize remote monitoring demo | PASSED | PASSED | Tested manually | +| 8 | Kruize Local monitoring demo | PASSED | PASSED | | +| 9 | Kruize Local Functional tests | TOTAL - , PASSED - / FAILED - | TOTAL - , PASSED - / FAILED - | [Issue 1395](https://github.com/kruize/autotune/issues/1395), [Issue 1217](https://github.com/kruize/autotune/issues/1217), [Issue 1273](https://github.com/kruize/autotune/issues/1273) GPU accelerator test failed, failure can be ignored for now | + +--- + +## TEST METRICS + +### Test Completion Criteria + +* 
All must_fix defects identified for the release are fixed +* New features work as expected and tests have been added to validate these +* No new regressions in the functional tests +* All non-functional tests work as expected without major issues +* Documentation updates have been completed + +---- + +## RISKS AND CONTINGENCIES + +* None + +---- +## APPROVALS + +Sign-off + +---- From b410b2784fa653c678494ba4ec2958e50fbfc47b Mon Sep 17 00:00:00 2001 From: Saad Khan Date: Thu, 19 Dec 2024 23:29:24 +0530 Subject: [PATCH 81/85] add missing rm flag to fix negative test failures Signed-off-by: Saad Khan --- .../remote_monitoring_tests/rest_apis/test_update_results.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/scripts/remote_monitoring_tests/rest_apis/test_update_results.py b/tests/scripts/remote_monitoring_tests/rest_apis/test_update_results.py index 44a77a98b..5cbe2c59e 100644 --- a/tests/scripts/remote_monitoring_tests/rest_apis/test_update_results.py +++ b/tests/scripts/remote_monitoring_tests/rest_apis/test_update_results.py @@ -785,7 +785,7 @@ def test_update_results__duplicate_records_with_single_exp_multiple_results(clus results = "true" recommendations = "false" latest = "false" - response = list_experiments(results, recommendations, latest, experiment_name) + response = list_experiments(results, recommendations, latest, experiment_name, True) list_exp_json = response.json() assert response.status_code == SUCCESS_200_STATUS_CODE From df7c423d1deee3a24aa68d2545f94cbf6a7d87e0 Mon Sep 17 00:00:00 2001 From: msvinaykumar Date: Fri, 20 Dec 2024 11:52:39 +0530 Subject: [PATCH 82/85] incorporated review comments Signed-off-by: msvinaykumar --- .../java/com/autotune/analyzer/workerimpl/BulkJobManager.java | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java index a299de26d..f223510bb 100644 
--- a/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java +++ b/src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java @@ -198,7 +198,7 @@ public void run() { LOGGER.info("Processing experiment {}", jobData.getProcessed_experiments()); jobData.incrementProcessed_experiments(); } - synchronized (new Object()) { + synchronized (jobData) { if (jobData.getTotal_experiments() == jobData.getProcessed_experiments().get()) { setFinalJobStatus(COMPLETED, null, null, finalDatasource); } @@ -228,7 +228,7 @@ public void run() { experiment.getRecommendations().setNotifications(new BulkJobStatus.Notification(BulkJobStatus.NotificationType.ERROR, e.getMessage(), HttpURLConnection.HTTP_INTERNAL_ERROR)); } finally { jobData.incrementProcessed_experiments(); - synchronized (new Object()) { + synchronized (jobData) { if (jobData.getTotal_experiments() == jobData.getProcessed_experiments().get()) { setFinalJobStatus(COMPLETED, null, null, finalDatasource); } From b9d132e73f0cc04e651deb599a7c5de816e9c590 Mon Sep 17 00:00:00 2001 From: Saad Khan Date: Mon, 23 Dec 2024 18:09:04 +0530 Subject: [PATCH 83/85] update test plan with latest results Signed-off-by: Saad Khan --- tests/test_plans/test_plan_rel_0.3.md | 37 +++++++++++++-------------- 1 file changed, 18 insertions(+), 19 deletions(-) diff --git a/tests/test_plans/test_plan_rel_0.3.md b/tests/test_plans/test_plan_rel_0.3.md index f66639d1b..986f68085 100644 --- a/tests/test_plans/test_plan_rel_0.3.md +++ b/tests/test_plans/test_plan_rel_0.3.md @@ -24,7 +24,6 @@ This document describes the test plan for Kruize remote monitoring release 0.3 ## FEATURES TO BE TESTED * Concurrent RM and LM changes -* ListExperiments & ListRecommendations update based on new DB changes * Bulk API filtration feature * Jetty Server version upgrade @@ -52,12 +51,12 @@ This document describes the test plan for Kruize remote monitoring release 0.3 ### New Test Cases Developed -| # | ISSUE (NEW FEATURE) | TEST DESCRIPTION | TEST 
DELIVERABLES | RESULTS | COMMENTS | -|---|----------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------|---------|----------| -| 1 | Concurrent RM and LM changes | [New tests added](https://github.com/kruize/autotune/blob/mvp_demo/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md#authentication-test) | [](https://github.com/kruize/autotune/pull/) | | PASSED | | -| 2 | ListExperiments & ListRecommendations update based on new DB changes | | | PASSED | | -| 3 | Bulk API filtration feature | Tests will be added later, tested using the Bulk demo | | PASSED | -| 4 | Jetty Server version upgrade | | PASSED | | +| # | ISSUE (NEW FEATURE) | TEST DESCRIPTION | TEST DELIVERABLES | RESULTS | COMMENTS | +|---|------------------------------|-------------------------------------------------------|--------------------------------------------------------|---------|----------| +| 1 | Concurrent RM and LM changes | Functional suite updated | [1424](https://github.com/kruize/autotune/pull/1424) | PASSED | | +| 2 | VPA changes | Demo code added | [107](https://github.com/kruize/kruize-demos/pull/107) | PASSED | | +| 3 | Bulk API filtration feature | Tests will be added later, tested using the Bulk demo | | | | +| 4 | Jetty Server version upgrade | Quay repo scan | PASSED | PASSED | | ### Regression Testing @@ -86,7 +85,7 @@ Short Scalability run | | | | UpdateRecommendations | UpdateResults | LoadResultsByExpName | | | | | 0.1 | 5K / 72L / 3L | 5h 02 mins | 0.97 / 0.55 | 0.16 / 0.14 | 0.52 / 0.36 | 21757 | 7.3 | 33.67 | | 0.2 | 5K / 72L / 3L | 4h 08 mins | 0.81 / 0.48 | 0.14 / 0.12 | 0.55 / 0.38 | 21749 | 4.78 | 25.31 GB | -| 0.3 | 5K / 72L / 3L | 4h 15 mins | 0.91 / 0.48 | 0.08 / 0.07 | 0.58 / 0.38 | 21751 | 6.42 | 11.25 GB | +| 0.3 | 5K / 72L / 3L | 4h 36 mins | 
1.0 / 0.52 | 0.07 / 0.07 | 0.53 / 0.33 | 21753 | 7.76 | 11.74 GB | ---- ## RELEASE TESTING @@ -103,17 +102,17 @@ As part of the release testing, following tests will be executed: - [Kruize local monitoring Functional tests](/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md) -| # | TEST SUITE | EXPECTED RESULTS | ACTUAL RESULTS | COMMENTS | -|---|------------------------------------------------|------------------------------------|------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| 1 | Kruize Remote monitoring Functional testsuite | TOTAL - , PASSED - / FAILED - | TOTAL - , PASSED - / FAILED - | Intermittent issue seen [1281](https://github.com/kruize/autotune/issues/1281), [1393](https://github.com/kruize/autotune/issues/1393), existing issues - [559](https://github.com/kruize/autotune/issues/559), [610](https://github.com/kruize/autotune/issues/610) | -| 2 | Fault tolerant test | PASSED | PASSED | | -| 3 | Stress test | | | [Intermittent failure](https://github.com/kruize/autotune/issues/1106) | -| 4 | Scalability test (short run) | | | Exps - 5000, Results - 72000, execution time - 4 hours, 8 mins | -| 5 | DB Migration test | PASSED | PASSED | Tested on openshift | -| 6 | Recommendation and box plot values validations | | | Tested on minikube | -| 7 | Kruize remote monitoring demo | PASSED | PASSED | Tested manually | -| 8 | Kruize Local monitoring demo | PASSED | PASSED | | -| 9 | Kruize Local Functional tests | TOTAL - , PASSED - / FAILED - | TOTAL - , PASSED - / FAILED - | [Issue 1395](https://github.com/kruize/autotune/issues/1395), [Issue 1217](https://github.com/kruize/autotune/issues/1217), [Issue 1273](https://github.com/kruize/autotune/issues/1273) GPU accelerator test failed, failure can be 
ignored for now | +| # | TEST SUITE | EXPECTED RESULTS | ACTUAL RESULTS | COMMENTS | +|---|------------------------------------------------|-----------------------------------------|-----------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| 1 | Kruize Remote monitoring Functional testsuite | TOTAL - 359, PASSED - 316 / FAILED - 43 | TOTAL - 359, PASSED - 316 / FAILED - 43 | Intermittent issue seen [1281](https://github.com/kruize/autotune/issues/1281), [1393](https://github.com/kruize/autotune/issues/1393), existing issues - [559](https://github.com/kruize/autotune/issues/559), [610](https://github.com/kruize/autotune/issues/610) | +| 2 | Fault tolerant test | PASSED | PASSED | | +| 3 | Stress test | PASSED | PASSED | [Intermittent failure](https://github.com/kruize/autotune/issues/1106) | +| 4 | Scalability test (short run) | PASSED | PASSED | Exps - 5000, Results - 72000, execution time - 4 hours, 36 mins | +| 5 | DB Migration test | PASSED | PASSED | Tested on scalelab openshift cluster | +| 6 | Recommendation and box plot values validations | PASSED | PASSED | Tested on scalelab | +| 7 | Kruize remote monitoring demo | PASSED | PASSED | Tested manually | +| 8 | Kruize Local monitoring demo | PASSED | PASSED | Tested manually | +| 9 | Kruize Local Functional tests | TOTAL - 81 , PASSED - 78 / FAILED - 3 | TOTAL - 81 , PASSED - 61 / FAILED - 20 | [Issue 1395](https://github.com/kruize/autotune/issues/1395), [Issue 1217](https://github.com/kruize/autotune/issues/1217), [Issue 1273](https://github.com/kruize/autotune/issues/1273) GPU accelerator test failed, failure can be ignored for now | --- From 56da88ac77a90cec0be8b95f1488e96d8a3cc744 Mon Sep 17 00:00:00 2001 From: Saad Khan Date: Mon, 23 Dec 2024 18:32:03 +0530 
Subject: [PATCH 84/85] address review comments Signed-off-by: Saad Khan --- tests/test_plans/test_plan_rel_0.3.md | 33 ++++++++++++++------------- 1 file changed, 17 insertions(+), 16 deletions(-) diff --git a/tests/test_plans/test_plan_rel_0.3.md b/tests/test_plans/test_plan_rel_0.3.md index 986f68085..d5a5685ed 100644 --- a/tests/test_plans/test_plan_rel_0.3.md +++ b/tests/test_plans/test_plan_rel_0.3.md @@ -25,6 +25,7 @@ This document describes the test plan for Kruize remote monitoring release 0.3 * Concurrent RM and LM changes * Bulk API filtration feature +* Auto mode support in Kruize for VPA integration PoC * Jetty Server version upgrade @@ -54,9 +55,8 @@ This document describes the test plan for Kruize remote monitoring release 0.3 | # | ISSUE (NEW FEATURE) | TEST DESCRIPTION | TEST DELIVERABLES | RESULTS | COMMENTS | |---|------------------------------|-------------------------------------------------------|--------------------------------------------------------|---------|----------| | 1 | Concurrent RM and LM changes | Functional suite updated | [1424](https://github.com/kruize/autotune/pull/1424) | PASSED | | -| 2 | VPA changes | Demo code added | [107](https://github.com/kruize/kruize-demos/pull/107) | PASSED | | +| 2 | VPA changes PoC | Demo code added | [107](https://github.com/kruize/kruize-demos/pull/107) | PASSED | | | 3 | Bulk API filtration feature | Tests will be added later, tested using the Bulk demo | | | | -| 4 | Jetty Server version upgrade | Quay repo scan | PASSED | PASSED | | ### Regression Testing @@ -67,6 +67,7 @@ This document describes the test plan for Kruize remote monitoring release 0.3 | 3 | `No data available` issue in GenereateRecommendations API | Kruize local monitoring demo | PASSED | | | 4 | Bulk API test failures | Kruize local monitoring bulk service demo | PASSED | | | 5 | Parallel requests issue with Bulk API | Kruize local monitoring bulk service demo | PASSED | | +| 6 | Jetty Server version upgrade | Quay repo scan 
| PASSED | | --- @@ -83,9 +84,8 @@ Short Scalability run | Kruize Release | Exps / Results / Recos | Execution time | Latency (Max/ Avg) in seconds | | | Postgres DB size(MB) | Kruize Max CPU | Kruize Max Memory (GB) | |----------------|------------------------|----------------|-------------------------------|---------------|----------------------|----------------------|----------------|------------------------| | | | | UpdateRecommendations | UpdateResults | LoadResultsByExpName | | | | -| 0.1 | 5K / 72L / 3L | 5h 02 mins | 0.97 / 0.55 | 0.16 / 0.14 | 0.52 / 0.36 | 21757 | 7.3 | 33.67 | -| 0.2 | 5K / 72L / 3L | 4h 08 mins | 0.81 / 0.48 | 0.14 / 0.12 | 0.55 / 0.38 | 21749 | 4.78 | 25.31 GB | -| 0.3 | 5K / 72L / 3L | 4h 36 mins | 1.0 / 0.52 | 0.07 / 0.07 | 0.53 / 0.33 | 21753 | 7.76 | 11.74 GB | +| 0.2 | 5K / 72L / 3L | 4h 08 mins | 0.81 / 0.48 | 0.14 / 0.12 | 0.55 / 0.38 | 21749 | 4.78 | 25.31 | +| 0.3 | 5K / 72L / 3L | 4h 36 mins | 1.0 / 0.52 | 0.07 / 0.07 | 0.53 / 0.33 | 21753 | 7.76 | 11.74 | ---- ## RELEASE TESTING @@ -102,17 +102,18 @@ As part of the release testing, following tests will be executed: - [Kruize local monitoring Functional tests](/tests/scripts/local_monitoring_tests/Local_monitoring_tests.md) -| # | TEST SUITE | EXPECTED RESULTS | ACTUAL RESULTS | COMMENTS | -|---|------------------------------------------------|-----------------------------------------|-----------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| 1 | Kruize Remote monitoring Functional testsuite | TOTAL - 359, PASSED - 316 / FAILED - 43 | TOTAL - 359, PASSED - 316 / FAILED - 43 | Intermittent issue seen [1281](https://github.com/kruize/autotune/issues/1281), [1393](https://github.com/kruize/autotune/issues/1393), existing issues - 
[559](https://github.com/kruize/autotune/issues/559), [610](https://github.com/kruize/autotune/issues/610) | -| 2 | Fault tolerant test | PASSED | PASSED | | -| 3 | Stress test | PASSED | PASSED | [Intermittent failure](https://github.com/kruize/autotune/issues/1106) | -| 4 | Scalability test (short run) | PASSED | PASSED | Exps - 5000, Results - 72000, execution time - 4 hours, 36 mins | -| 5 | DB Migration test | PASSED | PASSED | Tested on scalelab openshift cluster | -| 6 | Recommendation and box plot values validations | PASSED | PASSED | Tested on scalelab | -| 7 | Kruize remote monitoring demo | PASSED | PASSED | Tested manually | -| 8 | Kruize Local monitoring demo | PASSED | PASSED | Tested manually | -| 9 | Kruize Local Functional tests | TOTAL - 81 , PASSED - 78 / FAILED - 3 | TOTAL - 81 , PASSED - 61 / FAILED - 20 | [Issue 1395](https://github.com/kruize/autotune/issues/1395), [Issue 1217](https://github.com/kruize/autotune/issues/1217), [Issue 1273](https://github.com/kruize/autotune/issues/1273) GPU accelerator test failed, failure can be ignored for now | +| # | TEST SUITE | EXPECTED RESULTS | ACTUAL RESULTS | COMMENTS | +|---|------------------------------------------------|-----------------------------------------|-----------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| 1 | Kruize Remote monitoring Functional testsuite | TOTAL - 359, PASSED - 316 / FAILED - 43 | TOTAL - 359, PASSED - 316 / FAILED - 43 | Intermittent issue seen [1281](https://github.com/kruize/autotune/issues/1281), [1393](https://github.com/kruize/autotune/issues/1393), existing issues - [559](https://github.com/kruize/autotune/issues/559), 
[610](https://github.com/kruize/autotune/issues/610) | +| 2 | Fault tolerant test | PASSED | PASSED | | +| 3 | Stress test | PASSED | | [Intermittent failure](https://github.com/kruize/autotune/issues/1106) | +| 4 | Scalability test (short run) | PASSED | PASSED | Exps - 5000, Results - 72000, execution time - 4 hours, 36 mins | +| 5 | DB Migration test | PASSED | PASSED | Tested on scalelab openshift cluster | +| 6 | Recommendation and box plot values validations | PASSED | PASSED | Tested on scalelab | +| 7 | Kruize remote monitoring demo | PASSED | PASSED | Tested manually | +| 8 | Kruize Local monitoring demo | PASSED | PASSED | Tested manually | +| 8 | Kruize Bulk demo | PASSED | PASSED | Tested manually | +| 9 | Kruize Local Functional tests | TOTAL - 81 , PASSED - 78 / FAILED - 3 | TOTAL - 81 , PASSED - 59 / FAILED - 22 | [Issue 1395](https://github.com/kruize/autotune/issues/1395), [Issue 1217](https://github.com/kruize/autotune/issues/1217), [Issue 1273](https://github.com/kruize/autotune/issues/1273) GPU accelerator test failed, failure can be ignored for now [PR 1437](https://github.com/kruize/autotune/pull/1437) added for create_exp failures which will go in 0.4 | --- From 11630ad90f221a2a01d8a44f6b22e299b4d71cd3 Mon Sep 17 00:00:00 2001 From: Saad Khan Date: Mon, 23 Dec 2024 18:33:55 +0530 Subject: [PATCH 85/85] update features/bugs section Signed-off-by: Saad Khan --- tests/test_plans/test_plan_rel_0.3.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/test_plans/test_plan_rel_0.3.md b/tests/test_plans/test_plan_rel_0.3.md index d5a5685ed..a2d4947e5 100644 --- a/tests/test_plans/test_plan_rel_0.3.md +++ b/tests/test_plans/test_plan_rel_0.3.md @@ -26,7 +26,6 @@ This document describes the test plan for Kruize remote monitoring release 0.3 * Concurrent RM and LM changes * Bulk API filtration feature * Auto mode support in Kruize for VPA integration PoC -* Jetty Server version upgrade ------ @@ -38,6 +37,7 @@ This document 
describes the test plan for Kruize remote monitoring release 0.3 * `No data available` issue in GenereateRecommendations API * Bulk API test failures * Parallel requests issue with Bulk API +* Jetty Server version upgrade ---
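A footnote on the quoting fix in the two `db_migration_test` patches earlier in this series: in an unquoted command line, the `&` separating URL query parameters is parsed by the shell as its background operator, so `curl` receives only the URL up to the `&` (the `rm=true` parameter never reaches the server) while `rm=true` runs as a separate, silently harmless command. A small sketch that simulates this with `printf` instead of a live Kruize endpoint (the host and experiment name are made-up stand-ins for the script's variables):

```shell
# Hypothetical values standing in for the test script's variables.
SERVER_IP_ADDR="127.0.0.1:8080"
exp_name="demo_exp"
url="http://${SERVER_IP_ADDR}/listRecommendations?experiment_name=${exp_name}&rm=true"

# Unquoted expansion: the inner shell sees '...&rm=true', treats '&' as the
# background operator, and the command gets only the URL up to the '&'.
seen_unquoted=$(sh -c "printf '%s' $url")     # stands in for: curl -s http://...&rm=true

# Quoted expansion: the whole query string survives as a single argument.
seen_quoted=$(sh -c "printf '%s' '$url'")     # stands in for: curl -s "http://...&rm=true"

echo "unquoted -> $seen_unquoted"
echo "quoted   -> $seen_quoted"
```

This is why the patched scripts both echo and invoke the `listRecommendations` URL inside double quotes.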