A Dataflow template creation job printed "Template successfully created.", but the template was not created in GCS.
With debug logs enabled, the root cause turned out to be the GCS error "The specified bucket does not exist." The template creation command specified a valid bucket for stagingLocation but a non-existent bucket for templateLocation. As a result, it could upload the pipeline graph and JAR artifacts to the staging location but could not write the template file. See [1] for the log output.
The problem is that DataflowRunner prints "Template successfully created." and does not report the error. As [1] shows, the error messages appear only in debug-level logs (CONFIG, FINE, FINER, and FINEST in JUL), which are not emitted at the default log level.
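For anyone trying to reproduce the log in [1], here is a minimal sketch of one way to lower the java.util.logging threshold programmatically. The issue does not say how the debug logs were enabled, so the class name VerboseJulLogging and the enable helper are assumptions for illustration only; a logging.properties file would work as well.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class VerboseJulLogging {
    /**
     * Lowers the root logger and its handlers to the given level, so that
     * CONFIG, FINE, FINER and FINEST records are no longer filtered out.
     * (Hypothetical helper, not part of Beam.)
     */
    static Logger enable(Level level) {
        Logger root = Logger.getLogger("");
        root.setLevel(level);
        boolean hasConsole = false;
        for (Handler handler : root.getHandlers()) {
            // Handlers have their own threshold (INFO by default for ConsoleHandler).
            handler.setLevel(level);
            hasConsole |= handler instanceof ConsoleHandler;
        }
        if (!hasConsole) {
            ConsoleHandler console = new ConsoleHandler();
            console.setLevel(level);
            root.addHandler(console);
        }
        return root;
    }

    public static void main(String[] args) {
        enable(Level.ALL);
        // Records below INFO are now printed, e.g. from the HTTP client logger.
        Logger.getLogger("com.google.api.client.http").config("visible at CONFIG level");
    }
}
```

Calling enable(Level.ALL) before Pipeline.run() should make the HTTP request and response records visible, at the cost of very noisy output.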
[1]
... skipped ...
Jan 16, 2025 9:47:03 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Dataflow SDK version: 2.59.0
Jan 16, 2025 9:47:03 PM org.apache.beam.sdk.util.construction.Environments$JavaVersion forSpecification
WARNING: Unsupported Java version: 22, falling back to: 21
... skipped ...
Jan 16, 2025 9:47:05 PM com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl create
FINER: create(gs://<REDACTED>/template.json)
Jan 16, 2025 9:47:05 PM com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl getWriteGeneration
FINER: getWriteGeneration(gs://<REDACTED>/template.json, true)
Jan 16, 2025 9:47:05 PM com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl getItemInfo
FINER: getItemInfo(gs://<REDACTED>/template.json)
Jan 16, 2025 9:47:05 PM com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl getObject
FINER: getObject(gs://<REDACTED>/template.json)
Jan 16, 2025 9:47:05 PM com.google.api.client.http.HttpRequest execute
CONFIG: -------------- REQUEST --------------
GET https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata
Accept-Encoding: gzip
Authorization: <Not Logged>
User-Agent: MultimapTimerPipeline apache-beam/2.59.0 (GPN:Beam) Google-API-Java-Client/2.4.0 Google-HTTP-Java-Client/1.44.1 (gzip)
x-goog-custom-audit-job: multimaptimerpipeline-baeminbo-0117054655-6e49d6ed
x-goog-api-client: gl-java/22.0.2 gdcl/2.4.0 mac-os-x/15.2
x-cloud-trace-context: 8073c91c355dbed3a8dffbe1a9db8dcc/9140331792875765399;o=0
Jan 16, 2025 9:47:05 PM com.google.api.client.http.HttpRequest execute
CONFIG: curl -v --compressed -H 'Accept-Encoding: gzip' -H 'Authorization: <Not Logged>' -H 'User-Agent: MultimapTimerPipeline apache-beam/2.59.0 (GPN:Beam) Google-API-Java-Client/2.4.0 Google-HTTP-Java-Client/1.44.1 (gzip)' -H 'x-goog-custom-audit-job: multimaptimerpipeline-baeminbo-0117054655-6e49d6ed' -H 'x-goog-api-client: gl-java/22.0.2 gdcl/2.4.0 mac-os-x/15.2' -H 'x-cloud-trace-context: 8073c91c355dbed3a8dffbe1a9db8dcc/9140331792875765399;o=0' -- 'https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata'
Jan 16, 2025 9:47:05 PM sun.net.www.protocol.http.HttpURLConnection plainConnect0
FINEST: ProxySelector Request for https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata
Jan 16, 2025 9:47:05 PM sun.net.www.protocol.https.HttpsClient New
FINEST: Looking for HttpClient for URL https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata and proxy value of DIRECT
Jan 16, 2025 9:47:05 PM sun.net.www.http.KeepAliveCache$ClientVector get
FINEST: cached HttpClient was idle for 3716
Jan 16, 2025 9:47:05 PM sun.net.www.protocol.https.HttpsClient New
FINEST: KeepAlive stream retrieved from the cache, sun.net.www.protocol.https.HttpsClient(https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/staging%2Fpipeline-ueMS_MzNWJnRhXG05hdnWlMpUI5kQfCcy81r-I7R380.pb)
Jan 16, 2025 9:47:05 PM sun.net.www.protocol.http.HttpURLConnection plainConnect0
FINEST: Proxy used: DIRECT
Jan 16, 2025 9:47:05 PM sun.net.www.protocol.http.HttpURLConnection writeRequests
FINE: sun.net.www.MessageHeader@a0e33db 10 pairs: {GET /storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata HTTP/1.1: null}{Accept-Encoding: gzip}{Authorization: Bearer <REDACTED>>}{User-Agent: MultimapTimerPipeline apache-beam/2.59.0 (GPN:Beam) Google-API-Java-Client/2.4.0 Google-HTTP-Java-Client/1.44.1 (gzip)}{x-goog-custom-audit-job: multimaptimerpipeline-baeminbo-0117054655-6e49d6ed}{x-goog-api-client: gl-java/22.0.2 gdcl/2.4.0 mac-os-x/15.2}{x-cloud-trace-context: 8073c91c355dbed3a8dffbe1a9db8dcc/9140331792875765399;o=0}{Host: storage.googleapis.com}{Accept: */*}{Connection: keep-alive}
Jan 16, 2025 9:47:06 PM sun.net.www.http.HttpClient logFinest
FINEST: KeepAlive stream used: https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata
Jan 16, 2025 9:47:06 PM sun.net.www.protocol.http.HttpURLConnection getInputStream0
FINE: sun.net.www.MessageHeader@3ef46749 12 pairs: {null: HTTP/1.1 404 Not Found}{X-GUploader-UploadID: AFIdbgR2rWfCdgSNLific-SJEW1Du32LTPwRGWp77LCdOunj6ZiguVvs_01wGi6WORYBsUJtnCrI50Y}{Content-Type: application/json; charset=UTF-8}{Date: Fri, 17 Jan 2025 05:47:06 GMT}{Vary: Origin}{Vary: X-Origin}{Cache-Control: no-cache, no-store, max-age=0, must-revalidate}{Expires: Mon, 01 Jan 1990 00:00:00 GMT}{Pragma: no-cache}{Content-Length: 247}{Server: UploadServer}{Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000}
Jan 16, 2025 9:47:06 PM com.google.api.client.http.HttpResponse <init>
CONFIG: -------------- RESPONSE --------------
HTTP/1.1 404 Not Found
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Server: UploadServer
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
X-GUploader-UploadID: AFIdbgR2rWfCdgSNLific-SJEW1Du32LTPwRGWp77LCdOunj6ZiguVvs_01wGi6WORYBsUJtnCrI50Y
Vary: Origin
Vary: X-Origin
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Pragma: no-cache
Content-Length: 247
Date: Fri, 17 Jan 2025 05:47:06 GMT
Content-Type: application/json; charset=UTF-8
Jan 16, 2025 9:47:06 PM org.apache.beam.sdk.extensions.gcp.util.RetryHttpRequestInitializer$LoggingHttpBackOffHandler handleResponse
FINE: Request failed with code 404, performed 0 retries due to IOExceptions, performed 0 retries due to unsuccessful status codes, HTTP framework says request can be retried, (caller responsible for retrying): https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata.
Jan 16, 2025 9:47:06 PM com.google.api.client.util.LoggingByteArrayOutputStream close
CONFIG: Total: 247 bytes
Jan 16, 2025 9:47:06 PM com.google.api.client.util.LoggingByteArrayOutputStream close
CONFIG: {
"error": {
"code": 404,
"message": "The specified bucket does not exist.",
"errors": [
{
"message": "The specified bucket does not exist.",
"domain": "global",
"reason": "notFound"
}
]
}
}
Jan 16, 2025 9:47:07 PM com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl getObject
FINER: getObject(gs://<REDACTED>/template.json): not found
com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
GET https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata
{
"code": 404,
"errors": [
{
"domain": "global",
"message": "The specified bucket does not exist.",
"reason": "notFound"
}
],
"message": "The specified bucket does not exist."
}
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:118)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:37)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$3.interceptResponse(AbstractGoogleClientRequest.java:466)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1111)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:552)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:493)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:603)
at com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getObject(GoogleCloudStorageImpl.java:2229)
at com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getItemInfo(GoogleCloudStorageImpl.java:2122)
at com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getWriteGeneration(GoogleCloudStorageImpl.java:2197)
at com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.create(GoogleCloudStorageImpl.java:568)
at org.apache.beam.sdk.extensions.gcp.util.GcsUtil.create(GcsUtil.java:714)
at org.apache.beam.sdk.extensions.gcp.storage.GcsFileSystem.create(GcsFileSystem.java:155)
at org.apache.beam.sdk.extensions.gcp.storage.GcsFileSystem.create(GcsFileSystem.java:72)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:246)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:233)
at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:1438)
at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:203)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:325)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:310)
at baeminbo.MultimapTimerPipeline.main(MultimapTimerPipeline.java:97)
Jan 16, 2025 9:47:07 PM com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl getItemInfo
FINER: getItemInfo: gs://<REDACTED>/template.json: exists: no
Jan 16, 2025 9:47:13 PM com.google.cloud.hadoop.util.LoggingMediaHttpUploaderProgressListener progressChanged
FINE: Uploading: template.json
Jan 16, 2025 9:47:14 PM com.google.api.client.http.HttpRequest execute
CONFIG: -------------- REQUEST --------------
POST https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable
Accept-Encoding: gzip
Authorization: <Not Logged>
User-Agent: MultimapTimerPipeline apache-beam/2.59.0 (GPN:Beam) Google-API-Java-Client/2.4.0 Google-HTTP-Java-Client/1.44.1 (gzip)
x-goog-custom-audit-job: multimaptimerpipeline-baeminbo-0117054655-6e49d6ed
x-goog-api-client: gl-java/22.0.2 gdcl/2.4.0 mac-os-x/15.2
x-upload-content-type: text/plain
x-cloud-trace-context: 3042899c41252b26f51144930ff4cb1e/3008447248009734846;o=0
Content-Type: application/json; charset=UTF-8
Content-Length: 38
Jan 16, 2025 9:47:14 PM com.google.api.client.http.HttpRequest execute
CONFIG: curl -v --compressed -X POST -H 'Accept-Encoding: gzip' -H 'Authorization: <Not Logged>' -H 'User-Agent: MultimapTimerPipeline apache-beam/2.59.0 (GPN:Beam) Google-API-Java-Client/2.4.0 Google-HTTP-Java-Client/1.44.1 (gzip)' -H 'x-goog-custom-audit-job: multimaptimerpipeline-baeminbo-0117054655-6e49d6ed' -H 'x-goog-api-client: gl-java/22.0.2 gdcl/2.4.0 mac-os-x/15.2' -H 'x-upload-content-type: text/plain' -H 'x-cloud-trace-context: 3042899c41252b26f51144930ff4cb1e/3008447248009734846;o=0' -H 'Content-Type: application/json; charset=UTF-8' -d '@-' -- 'https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable' << $$$
Jan 16, 2025 9:47:14 PM sun.net.www.protocol.http.HttpURLConnection plainConnect0
FINEST: ProxySelector Request for https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable
Jan 16, 2025 9:47:14 PM sun.net.www.protocol.https.HttpsClient New
FINEST: Looking for HttpClient for URL https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable and proxy value of DIRECT
Jan 16, 2025 9:47:14 PM sun.net.www.protocol.https.HttpsClient <init>
FINEST: Creating new HttpsClient with url:https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable and proxy:DIRECT with connect timeout:20000
Jan 16, 2025 9:47:15 PM sun.net.www.protocol.http.HttpURLConnection plainConnect0
FINEST: Proxy used: DIRECT
Jan 16, 2025 9:47:16 PM jdk.internal.event.EventHelper logTLSHandshakeEvent
FINE: TLSHandshake: storage.googleapis.com:443, TLSv1.3, TLS_AES_256_GCM_SHA384, 2230027675
Jan 16, 2025 9:47:16 PM sun.net.www.protocol.http.HttpURLConnection writeRequests
FINE: sun.net.www.MessageHeader@67030140 13 pairs: {POST /upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable HTTP/1.1: null}{Accept-Encoding: gzip}{Authorization: Bearer <REDACTED>>}{User-Agent: MultimapTimerPipeline apache-beam/2.59.0 (GPN:Beam) Google-API-Java-Client/2.4.0 Google-HTTP-Java-Client/1.44.1 (gzip)}{x-goog-custom-audit-job: multimaptimerpipeline-baeminbo-0117054655-6e49d6ed}{x-goog-api-client: gl-java/22.0.2 gdcl/2.4.0 mac-os-x/15.2}{x-upload-content-type: text/plain}{x-cloud-trace-context: 3042899c41252b26f51144930ff4cb1e/3008447248009734846;o=0}{Content-Type: application/json; charset=UTF-8}{Host: storage.googleapis.com}{Accept: */*}{Connection: keep-alive}{Content-Length: 38}
Jan 16, 2025 9:47:25 PM com.google.api.client.util.LoggingByteArrayOutputStream close
CONFIG: Total: 38 bytes
Jan 16, 2025 9:47:25 PM com.google.api.client.util.LoggingByteArrayOutputStream close
CONFIG: {"metadata":{},"name":"template.json"}
Jan 16, 2025 9:47:26 PM sun.net.www.http.HttpClient logFinest
FINEST: KeepAlive stream used: https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable
Jan 16, 2025 9:47:26 PM sun.net.www.protocol.http.HttpURLConnection getInputStream0
FINE: sun.net.www.MessageHeader@2fb6cf1d 12 pairs: {null: HTTP/1.1 404 Not Found}{X-GUploader-UploadID: AFIdbgToxMGUfI8spHUMSMKI_alHRDQsCZVLVKM3g7-yhU36YonczlEbFyGzT9_qWkovP7YXteS9tjr4HrxGxLJipI4wFzjdvzEgDfL44wkvaQ}{Date: Fri, 17 Jan 2025 05:47:25 GMT}{Vary: Origin}{Vary: X-Origin}{Cache-Control: no-cache, no-store, max-age=0, must-revalidate}{Expires: Mon, 01 Jan 1990 00:00:00 GMT}{Pragma: no-cache}{Content-Length: 247}{Server: UploadServer}{Content-Type: text/html; charset=UTF-8}{Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000}
Jan 16, 2025 9:47:26 PM com.google.api.client.http.HttpResponse <init>
CONFIG: -------------- RESPONSE --------------
HTTP/1.1 404 Not Found
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Server: UploadServer
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
X-GUploader-UploadID: AFIdbgToxMGUfI8spHUMSMKI_alHRDQsCZVLVKM3g7-yhU36YonczlEbFyGzT9_qWkovP7YXteS9tjr4HrxGxLJipI4wFzjdvzEgDfL44wkvaQ
Vary: Origin
Vary: X-Origin
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Pragma: no-cache
Content-Length: 247
Date: Fri, 17 Jan 2025 05:47:25 GMT
Content-Type: text/html; charset=UTF-8
Jan 16, 2025 9:47:26 PM org.apache.beam.sdk.extensions.gcp.util.RetryHttpRequestInitializer$LoggingHttpBackOffHandler handleResponse
FINE: Request failed with code 404, performed 0 retries due to IOExceptions, performed 0 retries due to unsuccessful status codes, HTTP framework says request can be retried, (caller responsible for retrying): https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable.
Jan 16, 2025 9:47:26 PM org.apache.beam.sdk.extensions.gcp.util.UploadIdResponseInterceptor interceptResponse
FINE: Upload ID for url https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable on worker null is AFIdbgToxMGUfI8spHUMSMKI_alHRDQsCZVLVKM3g7-yhU36YonczlEbFyGzT9_qWkovP7YXteS9tjr4HrxGxLJipI4wFzjdvzEgDfL44wkvaQ
Jan 16, 2025 9:47:27 PM com.google.api.client.util.LoggingByteArrayOutputStream close
CONFIG: Total: 247 bytes
Jan 16, 2025 9:47:27 PM com.google.api.client.util.LoggingByteArrayOutputStream close
CONFIG: {
"error": {
"code": 404,
"message": "The specified bucket does not exist.",
"errors": [
{
"message": "The specified bucket does not exist.",
"domain": "global",
"reason": "notFound"
}
]
}
}
Jan 16, 2025 9:47:27 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Printed job specification to gs://<REDACTED>/template.json
Jan 16, 2025 9:57:03 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Template successfully created.
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
The test printed "Printed job specification to ..." and "Template successfully created." even though the write failed. This means the GCS exception was not caught and handled properly at DataflowRunner.java#L1434.
This happens because PrintWriter never throws an IOException. Its close() only sets trouble = true when an IOException occurs. A PrintWriter client must call checkError() to detect an error from any earlier method call.
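The PrintWriter behavior can be seen in isolation with a small sketch. FailingOutputStream below is a hypothetical stand-in for the GCS write channel (it is not Beam or GCS code), and writeAndCheck is an illustrative helper name.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

public class PrintWriterSwallowsErrors {
    /** Hypothetical stand-in for a broken GCS channel: every write fails. */
    static final class FailingOutputStream extends OutputStream {
        @Override
        public void write(int b) throws IOException {
            throw new IOException("The specified bucket does not exist.");
        }
    }

    /** Writes text through a PrintWriter and reports whether an error was recorded. */
    static boolean writeAndCheck(OutputStream out, String text) {
        PrintWriter writer = new PrintWriter(out);
        writer.print(text);          // never throws, even if the stream is broken
        writer.close();              // never throws either; the IOException is swallowed
        return writer.checkError();  // the only way to learn that something failed
    }

    public static void main(String[] args) {
        boolean failed = writeAndCheck(new FailingOutputStream(), "{\"name\":\"template.json\"}");
        System.out.println("error recorded: " + failed);
    }
}
```

No exception reaches the caller here, so code that logs a success message right after close() will do so regardless of the outcome.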
I believe the same issue can occur with other GCS errors as well (e.g. a GCS permission error). The GCS upload may detect the error only at close(), but the PrintWriter in DataflowRunner swallows the exception thrown by close().
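One possible direction for a fix, sketched under assumptions: check checkError() after close() and turn the swallowed failure back into an IOException. FailsOnClose is a hypothetical stream that only fails at close(), mimicking an upload that detects the error late; writeSpec and its parameters are illustrative names, not the actual DataflowRunner code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

public class CheckedTemplateWrite {
    /** Hypothetical stand-in for an upload that buffers data and only fails on close(). */
    static final class FailsOnClose extends ByteArrayOutputStream {
        @Override
        public void close() throws IOException {
            throw new IOException("upload rejected at close()");
        }
    }

    /** Writes the spec and rethrows any error that PrintWriter recorded silently. */
    static void writeSpec(OutputStream out, String spec, String location) throws IOException {
        PrintWriter writer = new PrintWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
        writer.print(spec);
        writer.close();
        // close() swallowed any IOException; checkError() is the only signal left.
        if (writer.checkError()) {
            throw new IOException("Failed to write job specification to " + location);
        }
    }

    public static void main(String[] args) {
        try {
            // The location is a made-up example path.
            writeSpec(new FailsOnClose(), "{}", "gs://example-bucket/template.json");
            System.out.println("Template successfully created.");
        } catch (IOException e) {
            System.out.println("Template creation failed: " + e.getMessage());
        }
    }
}
```

An alternative would be to write through a plain Writer inside try-with-resources, whose close() propagates the IOException directly. Which approach fits DataflowRunner best is left open here.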