Fix FlowSpec Updating Function #3823

Merged
merged 8 commits into from
Nov 15, 2023
@@ -26,7 +26,6 @@
import com.linkedin.data.template.StringMap;
import com.typesafe.config.Config;
import com.typesafe.config.ConfigFactory;
- import com.typesafe.config.ConfigValueFactory;
import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;
import java.net.URI;
import java.net.URISyntaxException;
@@ -76,7 +75,7 @@ public class FlowSpec implements Configurable, Spec {
/** Human-readable description of the flow spec */
final String description;

- /** Flow config as a typesafe config object*/
+ /** Flow config as a typesafe config object */
final Config config;

/** Flow config as a properties collection for backwards compatibility */
@@ -125,6 +124,7 @@ public static FlowSpec.Builder builder(URI catalogURI, Properties flowProps) {
throw new RuntimeException("Unable to create a FlowSpec URI: " + e, e);
}
}
+
public void addCompilationError(String src, String dst, String errorMessage, int numberOfHops) {
this.compilationErrors.add(new CompilationError(getConfig(), src, dst, errorMessage, numberOfHops));
}
@@ -518,15 +518,5 @@ public static int maxFlowSpecUriLength() {
return URI_SCHEME.length() + ":".length() // URI separator
    + URI_PATH_SEPARATOR.length() + ServiceConfigKeys.MAX_FLOW_NAME_LENGTH + URI_PATH_SEPARATOR.length() + ServiceConfigKeys.MAX_FLOW_GROUP_LENGTH;
}
-
- /**
- * Create a new FlowSpec object with the added property defined by path and value parameters
- * @param path key for new property
- * @param value
- */
- public static FlowSpec createFlowSpecWithProperty(FlowSpec flowSpec, String path, String value) {
- Config updatedConfig = flowSpec.getConfig().withValue(path, ConfigValueFactory.fromAnyRef(value));
- return new Builder(flowSpec.getUri()).withConfig(updatedConfig).build();
Contributor

My hunch is that this .build() call is what generated all the garbage, most likely through the conversion of config to props.
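
To make the hunch concrete, here is a minimal sketch of the two approaches this PR moves between: rebuilding a spec to add one property versus passing the value alongside the unchanged spec. Everything in it is hypothetical: `Spec` is a stand-in and not Gobblin's FlowSpec, the key `flow.executionId` is assumed, and the `conversions` counter exists only to make the extra config-to-props work visible.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

final class SpecCopySketch {
  // Hypothetical key; stands in for ConfigurationKeys.FLOW_EXECUTION_ID_KEY.
  static final String EXECUTION_ID_KEY = "flow.executionId";

  /** Hypothetical immutable spec that eagerly derives Properties from its config when built. */
  static final class Spec {
    static int conversions = 0; // counts config -> Properties conversions
    final Map<String, String> config;
    final Properties props;

    Spec(Map<String, String> config) {
      this.config = Collections.unmodifiableMap(new HashMap<>(config));
      this.props = new Properties();
      this.props.putAll(this.config); // the full copy that every rebuild pays for
      conversions++;
    }

    /** Copy-on-update: returns a new Spec with one extra property. */
    Spec withProperty(String key, String value) {
      Map<String, String> updated = new HashMap<>(config);
      updated.put(key, value);
      return new Spec(updated);
    }
  }

  /** Old shape: rebuild the whole spec just to carry the execution ID. */
  static String submitByCopy(Spec spec, String flowExecutionId) {
    Spec updated = spec.withProperty(EXECUTION_ID_KEY, flowExecutionId);
    return updated.props.getProperty(EXECUTION_ID_KEY);
  }

  /** New shape: leave the spec untouched and pass the ID next to it. */
  static String submitWithOptionalId(Spec spec, Optional<String> flowExecutionId) {
    return flowExecutionId.orElseGet(() -> spec.props.getProperty(EXECUTION_ID_KEY));
  }
}
```

In this sketch, each call to `submitByCopy` triggers one more full conversion, while `submitWithOptionalId` triggers none. Whether the real `build()` behaves the same way is exactly what the comment above is guessing at.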

- }
}
}
@@ -57,12 +57,13 @@
* than epsilon and encapsulate executor communication latency including retry attempts
*
* The `event_timestamp` is the time of the flow_action event request.
- * --- Note ---
+ * --- Database event_timestamp laundering ---
* We only use the participant's local event_timestamp internally to identify the particular flow_action event, but
* after interacting with the database utilize the CURRENT_TIMESTAMP of the database to insert or keep
- * track of our event. This is to avoid any discrepancies due to clock drift between participants as well as
- * variation in local time and database time for future comparisons.
- * ---Event consolidation---
+ * track of our event, "laundering" or replacing the local timestamp with the database one. This is to avoid any
+ * discrepancies due to clock drift between participants as well as variation in local time and database time for
+ * future comparisons.
+ * --- Event consolidation ---
* Note that for the sake of simplification, we only allow one event associated with a particular flow's flow_action
* (ie: only one LAUNCH for example of flow FOO, but there can be a LAUNCH, KILL, & RESUME for flow FOO at once) during
* the time it takes to execute the flow action. In most cases, the execution time should be so negligible that this

This file was deleted.

@@ -315,6 +315,8 @@ private void handleLeadershipChange(NotificationContext changeContext) {
this.gitConfigMonitor.setActive(true);
}

+ // TODO: surround by try/catch to disconnect from Helix and fail the leader transition if DagManager is not
+ // transitioned properly
if (configuration.isDagManagerEnabled()) {
//Activate DagManager only if TopologyCatalog is initialized. If not; skip activation.
if (this.topologyCatalog.getInitComplete().getCount() == 0) {
@@ -504,9 +504,12 @@ public void handleLaunchFlowEvent(DagActionStore.DagAction launchAction) {
URI flowUri = FlowSpec.Utils.createFlowSpecUri(flowId);
FlowSpec spec = (FlowSpec) flowCatalog.getSpecs(flowUri);
Optional<Dag<JobExecutionPlan>> optionalJobExecutionPlanDag =
- this.flowCompilationValidationHelper.createExecutionPlanIfValid(spec);
+ this.flowCompilationValidationHelper.createExecutionPlanIfValid(spec, Optional.absent());
if (optionalJobExecutionPlanDag.isPresent()) {
addDag(optionalJobExecutionPlanDag.get(), true, true);
+ } else {
+ log.warn("Failed flow compilation of spec causing launch flow event to be skipped on startup. Flow {}", flowId);
+ this.dagManagerMetrics.incrementFailedLaunchCount();
Comment on lines +511 to +512
Contributor

This doesn't actually say what failed or how flow compilation failed. Is that because such a message would already have been logged?

Contributor Author

Yep, we don't have that information available here yet. We emit a flow compilation failed event that we can check:

populateFlowCompilationFailedEventMessage(eventSubmitter, flowSpec, flowMetadata);

}
// Upon handling the action, delete it so on leadership change this is not duplicated
this.dagActionStore.get().deleteDagAction(launchAction);
@@ -335,9 +335,9 @@ public void orchestrate(Spec spec, Properties jobProps, long triggerTimestampMil
Instrumented.updateTimer(this.flowOrchestrationTimer, System.nanoTime() - startTime, TimeUnit.NANOSECONDS);
}

- public void submitFlowToDagManager(FlowSpec flowSpec) throws IOException, InterruptedException {
+ public void submitFlowToDagManager(FlowSpec flowSpec, Optional<String> optionalFlowExecutionId) throws IOException, InterruptedException {
Optional<Dag<JobExecutionPlan>> optionalJobExecutionPlanDag =
- this.flowCompilationValidationHelper.createExecutionPlanIfValid(flowSpec);
+ this.flowCompilationValidationHelper.createExecutionPlanIfValid(flowSpec, optionalFlowExecutionId);
if (optionalJobExecutionPlanDag.isPresent()) {
submitFlowToDagManager(flowSpec, optionalJobExecutionPlanDag.get());
} else {
@@ -64,10 +64,14 @@ public final class FlowCompilationValidationHelper {
* For a given a flowSpec, verifies that an execution is allowed (in case there is an ongoing execution) and the
* flowspec can be compiled. If the pre-conditions hold, then a JobExecutionPlan is constructed and returned to the
* caller.
+ * @param flowSpec
+ * @param optionalFlowExecutionId for scheduled (non-ad-hoc) flows, to pass the ID "laundered" via the DB;
+ * see: {@link MysqlMultiActiveLeaseArbiter javadoc section titled
+ * `Database event_timestamp laundering`}
* @return jobExecutionPlan dag if one can be constructed for the given flowSpec
*/
- public Optional<Dag<JobExecutionPlan>> createExecutionPlanIfValid(FlowSpec flowSpec)
- throws IOException, InterruptedException {
+ public Optional<Dag<JobExecutionPlan>> createExecutionPlanIfValid(FlowSpec flowSpec,
+ Optional<String> optionalFlowExecutionId) throws IOException, InterruptedException {
Comment on lines +73 to +74
Contributor

Though not related to this PR's changes per se, just noting that Guava library upgrades cause us considerable hassle, and using Guava types as params or return types would pose a special challenge if we ever elect to shade Guava classes.

Unless there's a pressing mandate to use Guava's Optional, we should always prefer java.util.Optional. (Guava's own documentation has long recommended the JDK class where possible.)

Contributor Author

@umustafi, Nov 14, 2023

The other Optionals used in this class are Guava's and depend on intertwined Orchestrator/DagManager code. I want to avoid inconsistency within this class for now.
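
For comparison, the precedence rule that this PR adds to addFlowExecutionIdIfAbsent could be written with java.util.Optional, as suggested above. This is a minimal sketch under stated assumptions and not the merged code: the class name, the value of the field constant, and the plain String standing in for the ID read from the JobSpec are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class FlowExecutionIdSketch {
  // Hypothetical value; stands in for TimingEvent.FlowEventConstants.FLOW_EXECUTION_ID_FIELD.
  static final String FLOW_EXECUTION_ID_FIELD = "flowExecutionId";

  /**
   * Precedence: an ID already in flowMetadata wins, then the optional ID passed by the caller,
   * then the ID taken from the job spec (modelled here as a plain String).
   */
  static void addFlowExecutionIdIfAbsent(Map<String, String> flowMetadata,
      Optional<String> optionalFlowExecutionId, String jobSpecFlowExecutionId) {
    optionalFlowExecutionId.ifPresent(id -> flowMetadata.putIfAbsent(FLOW_EXECUTION_ID_FIELD, id));
    // No-op when either of the two earlier sources already supplied a value.
    flowMetadata.putIfAbsent(FLOW_EXECUTION_ID_FIELD, jobSpecFlowExecutionId);
  }

  public static void main(String[] args) {
    Map<String, String> flowMetadata = new HashMap<>();
    addFlowExecutionIdIfAbsent(flowMetadata, Optional.of("5678"), "1234");
    System.out.println(flowMetadata.get(FLOW_EXECUTION_ID_FIELD)); // prints 5678
  }
}
```

Because both writes use putIfAbsent, the order of the two calls alone encodes the priority, which is the same property the tests in this PR check against the Guava-based version.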

Config flowConfig = flowSpec.getConfig();
String flowGroup = flowConfig.getString(ConfigurationKeys.FLOW_GROUP_KEY);
String flowName = flowConfig.getString(ConfigurationKeys.FLOW_NAME_KEY);
@@ -90,7 +94,7 @@ public Optional<Dag<JobExecutionPlan>> createExecutionPlanIfValid(FlowSpec flowS
return Optional.absent();
}

- addFlowExecutionIdIfAbsent(flowMetadata, jobExecutionPlanDagOptional.get());
+ addFlowExecutionIdIfAbsent(flowMetadata, optionalFlowExecutionId, jobExecutionPlanDagOptional.get());
if (flowCompilationTimer.isPresent()) {
flowCompilationTimer.get().stop(flowMetadata);
}
@@ -177,11 +181,23 @@ public static void populateFlowCompilationFailedEventMessage(Optional<EventSubmi
}

/**
- * If it is a scheduled flow (and hence, does not have flowExecutionId in the FlowSpec) and the flow compilation is
- * successful, retrieve the flowExecutionId from the JobSpec.
+ * If it is a scheduled flow (which does not have flowExecutionId in the FlowSpec) and the flow compilation is
+ * successful, retrieve flowExecutionId from the JobSpec.
*/
public static void addFlowExecutionIdIfAbsent(Map<String,String> flowMetadata,
Dag<JobExecutionPlan> jobExecutionPlanDag) {
+ addFlowExecutionIdIfAbsent(flowMetadata, Optional.absent(), jobExecutionPlanDag);
+ }
+
+ /**
+ * If it is a scheduled flow (which does not have flowExecutionId in the FlowSpec) and the flow compilation is
+ * successful, add a flowExecutionId using the optional parameter if it exists otherwise retrieve it from the JobSpec.
+ */
+ public static void addFlowExecutionIdIfAbsent(Map<String,String> flowMetadata,
+ Optional<String> optionalFlowExecutionId, Dag<JobExecutionPlan> jobExecutionPlanDag) {
+ if (optionalFlowExecutionId.isPresent()) {
+ flowMetadata.putIfAbsent(TimingEvent.FlowEventConstants.FLOW_EXECUTION_ID_FIELD, optionalFlowExecutionId.get());
+ }
flowMetadata.putIfAbsent(TimingEvent.FlowEventConstants.FLOW_EXECUTION_ID_FIELD,
jobExecutionPlanDag.getNodes().get(0).getValue().getJobSpec().getConfigAsProperties().getProperty(
ConfigurationKeys.FLOW_EXECUTION_ID_KEY));
@@ -18,6 +18,7 @@
package org.apache.gobblin.service.monitoring;

import com.google.common.annotations.VisibleForTesting;
+ import com.google.common.base.Optional;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
@@ -30,7 +31,6 @@
import java.util.concurrent.TimeUnit;
import lombok.Getter;
import lombok.extern.slf4j.Slf4j;
- import org.apache.gobblin.configuration.ConfigurationKeys;
import org.apache.gobblin.kafka.client.DecodeableKafkaRecord;
import org.apache.gobblin.metrics.ContextAwareGauge;
import org.apache.gobblin.metrics.ContextAwareMeter;
@@ -202,9 +202,8 @@ protected void submitFlowToDagManagerHelper(String flowGroup, String flowName, S
try {
URI flowUri = FlowSpec.Utils.createFlowSpecUri(flowId);
spec = (FlowSpec) flowCatalog.getSpecs(flowUri);
- // Adds flowExecutionId to config to ensure they are consistent across hosts
- FlowSpec updatedSpec = FlowSpec.Utils.createFlowSpecWithProperty(spec, ConfigurationKeys.FLOW_EXECUTION_ID_KEY, flowExecutionId);
- this.orchestrator.submitFlowToDagManager(updatedSpec);
+ // Pass flowExecutionId to DagManager to be used for scheduled flows that do not already contain a flowExecutionId
+ this.orchestrator.submitFlowToDagManager(spec, Optional.of(flowExecutionId));
} catch (URISyntaxException e) {
log.warn("Could not create URI object for flowId {}. Exception {}", flowId, e.getMessage());
this.failedFlowLaunchSubmissions.mark();
@@ -0,0 +1,69 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.gobblin.service.modules.utils;

import com.google.common.base.Optional;
import java.net.URISyntaxException;
import java.util.HashMap;
import org.apache.gobblin.metrics.event.TimingEvent;
import org.apache.gobblin.service.modules.flowgraph.Dag;
import org.apache.gobblin.service.modules.orchestration.DagTestUtils;
import org.apache.gobblin.service.modules.spec.JobExecutionPlan;
import org.testng.Assert;
import org.testng.annotations.BeforeClass;
import org.testng.annotations.Test;


/**
* Test functionality provided by the helper class re-used between the DagManager and Orchestrator for flow compilation.
*/
public class FlowCompilationValidationHelperTest {
private String dagId = "testDag";
private Long jobSpecFlowExecutionId = 1234L;
private String newFlowExecutionId = "5678";
private String existingFlowExecutionId = "9999";
private Dag<JobExecutionPlan> jobExecutionPlanDag;

@BeforeClass
public void setup() throws URISyntaxException {
jobExecutionPlanDag = DagTestUtils.buildDag(dagId, jobSpecFlowExecutionId);

}

/*
Tests that addFlowExecutionIdIfAbsent adds flowExecutionId to a flowMetadata object when it is absent, prioritizing
the optional flowExecutionId over the one from the job spec
*/
@Test
public void testAddFlowExecutionIdWhenAbsent() {
HashMap<String, String> flowMetadata = new HashMap<>();
FlowCompilationValidationHelper.addFlowExecutionIdIfAbsent(flowMetadata, Optional.of(newFlowExecutionId), jobExecutionPlanDag);
Assert.assertEquals(flowMetadata.get(TimingEvent.FlowEventConstants.FLOW_EXECUTION_ID_FIELD), newFlowExecutionId);
}

/*
Tests that addFlowExecutionIdIfAbsent does not update an existing flowExecutionId in a flowMetadata object
*/
@Test
public void testSkipAddingFlowExecutionIdWhenPresent() {
HashMap<String, String> flowMetadata = new HashMap<>();
flowMetadata.put(TimingEvent.FlowEventConstants.FLOW_EXECUTION_ID_FIELD, existingFlowExecutionId);
FlowCompilationValidationHelper.addFlowExecutionIdIfAbsent(flowMetadata, Optional.of(newFlowExecutionId), jobExecutionPlanDag);
Assert.assertEquals(flowMetadata.get(TimingEvent.FlowEventConstants.FLOW_EXECUTION_ID_FIELD), existingFlowExecutionId);
}
}