
VCF to BQ Timeout Issue #699

Closed
adaykin opened this issue Jun 9, 2021 · 4 comments

adaykin commented Jun 9, 2021

Hello, we recently ran into an issue during the VCF to BQ process.

I'd like to know if there's anything that we can do differently to avoid this error since it causes a pretty big pain point for us when the VCF to BQ process fails.

The command we're running is:
/opt/gcp_variant_transforms/src/gcp_variant_transforms/vcf_to_bq.py --setup_file ./setup.py --project --input_file <vcf_blobs_storage_file> --representative_header_file <vcf_header_storage_file> --output_table <variants_output_table> --sample_lookup_optimized_output_table <samples_output_table> --sample_name_encoding WITHOUT_FILE_PATH --temp_location <temp_location> --job_name <job_name> --runner DataflowRunner --region us-central1 --append True --update_schema_on_append --use_1_based_coordinate --keep_intermediate_avro_files --experiments shuffle_mode=service

Before the exception, we received a message in the logs that says: "ERROR:root:Something unexpected happened during the loading rows to sample optimized table stage: Operation did not complete within the designated timeout.
WARNING:root:Since tables were appended, added rows cannot be reverted. You can utilize BigQuery snapshot decorators to recover your table up to 7 days ago. For more information please refer to: https://cloud.google.com/bigquery/table-decorators Here is the list of tables that you need to manually rollback:"
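
For reference, the rollback that the warning describes can also be done with standard-SQL time travel (FOR SYSTEM_TIME AS OF), the modern equivalent of the legacy table decorators it links to. A minimal sketch, assuming hypothetical project, dataset, and table names; the one-hour offset and the restored-table name are placeholders, not taken from this thread:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project id

# Recreate the variants table as it looked one hour ago, i.e. before the
# failed append. BigQuery time travel covers up to 7 days of history.
restore_sql = """
CREATE OR REPLACE TABLE `my-project.my_dataset.variants_restored` AS
SELECT *
FROM `my-project.my_dataset.variants`
  FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
"""
client.query(restore_sql).result()  # blocks until the restore query finishes
```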

The stack trace is:

Traceback (most recent call last):
File "/opt/gcp_variant_transforms/venv3/lib/python3.7/site-packages/google/api_core/retry.py", line 184, in retry_target
return target()
File "/opt/gcp_variant_transforms/venv3/lib/python3.7/site-packages/google/api_core/future/polling.py", line 84, in _done_or_raise
raise _OperationNotComplete()
google.api_core.future.polling._OperationNotComplete

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/opt/gcp_variant_transforms/venv3/lib/python3.7/site-packages/google/api_core/future/polling.py", line 104, in blocking_poll
retry
(self._done_or_raise)()
File "/opt/gcp_variant_transforms/venv3/lib/python3.7/site-packages/google/api_core/retry.py", line 286, in retry_wrapped_func
on_error=on_error,
File "/opt/gcp_variant_transforms/venv3/lib/python3.7/site-packages/google/api_core/retry.py", line 206, in retry_target
last_exc,
File "", line 3, in raise_from
google.api_core.exceptions.RetryError: Deadline of 600.0s exceeded while calling functools.partial(<bound method PollingFuture._done_or_raise of <google.cloud.bigquery.job.QueryJob object at 0x7fba110daf28>>), last exception:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/opt/gcp_variant_transforms/src/gcp_variant_transforms/vcf_to_bq.py", line 643, in
raise e
File "/opt/gcp_variant_transforms/src/gcp_variant_transforms/vcf_to_bq.py", line 631, in
run()
File "/opt/gcp_variant_transforms/src/gcp_variant_transforms/vcf_to_bq.py", line 626, in run
raise e
File "/opt/gcp_variant_transforms/src/gcp_variant_transforms/vcf_to_bq.py", line 621, in run
known_args.sample_lookup_optimized_output_table)
File "/opt/gcp_variant_transforms/src/gcp_variant_transforms/libs/partitioning.py", line 277, in copy_to_flatten_table
self._copy_to_flatten_table(full_output_table_id, cp_query)
File "/opt/gcp_variant_transforms/src/gcp_variant_transforms/libs/partitioning.py", line 170, in _copy_to_flatten_table
_ = query_job.result(timeout=600)
File "/opt/gcp_variant_transforms/venv3/lib/python3.7/site-packages/google/cloud/bigquery/job.py", line 3230, in result
super(QueryJob, self).result(retry=retry, timeout=timeout)
File "/opt/gcp_variant_transforms/venv3/lib/python3.7/site-packages/google/cloud/bigquery/job.py", line 835, in result
return super(_AsyncJob, self).result(timeout=timeout)
File "/opt/gcp_variant_transforms/venv3/lib/python3.7/site-packages/google/api_core/future/polling.py", line 125, in result
self._blocking_poll(timeout=timeout)
File "/opt/gcp_variant_transforms/venv3/lib/python3.7/site-packages/google/cloud/bigquery/job.py", line 3126, in _blocking_poll
super(QueryJob, self)._blocking_poll(timeout=timeout)
File "/opt/gcp_variant_transforms/venv3/lib/python3.7/site-packages/google/api_core/future/polling.py", line 107, in _blocking_poll
"Operation did not complete within the designated " "timeout."
concurrent.futures._base.TimeoutError: Operation did not complete within the designated timeout.

@moschetti (Member)

Can you check the BigQuery logs to see if there is information about the timeout?
http://console.cloud.google.com/logs
You can filter by time and resource type to see if there are any BigQuery errors or warnings.
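
If it helps, the same check can be scripted with the Cloud Logging Python client; a minimal sketch (the project id and time window below are placeholders, not values from this thread):

```python
from google.cloud import logging as cloud_logging

client = cloud_logging.Client(project="my-project")  # hypothetical project id

# Restrict to BigQuery warnings/errors around the time of the failed load.
log_filter = (
    'resource.type="bigquery_resource" '
    'severity>=WARNING '
    'timestamp>="2021-06-09T00:00:00Z"'
)
for entry in client.list_entries(filter_=log_filter):
    print(entry.timestamp, entry.severity, entry.log_name)
```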

@moschetti (Member)

Following up here: the cause of this is the 600s timeout on the query that generates the sample-optimized table, which fails on lower-numbered (larger) chromosomes in datasets with a large number of samples. This is due to #607, where the sample-optimized table does not support --append behavior.

The workaround is to create a container image with a longer timeout. The long-term fix will be tracked in #607.
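
For anyone building such an image before the long-term fix lands: the deadline in the stack trace comes from the query_job.result(timeout=600) call in libs/partitioning.py, so the patch is essentially raising that value. A minimal standalone sketch of the same pattern; the 3600s value and the stand-in query are assumptions to illustrate the change, not the project's actual code:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project id

# Same pattern as partitioning.py's _copy_to_flatten_table: submit the
# flattening/copy query, then wait on it with a longer deadline so large
# chromosomes with many samples can finish instead of raising TimeoutError.
query_job = client.query("SELECT 1")  # stands in for the flatten/copy query
_ = query_job.result(timeout=3600)    # 3600s is an assumed value, tune as needed
```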

@lawrenae (Collaborator)

This is fixed by PR #713.


pgrosu commented Apr 1, 2022

@lawrenae Did you mean to reference #715 instead?
