Running TFX transform fails to pip install #6955

Open
kolaente opened this issue Nov 14, 2024 · 4 comments

@kolaente

System information

  • Have I specified the code to reproduce the issue (Yes, No): Yes
  • Environment in which the code is executed: Linux, venv Jupyter Notebook
  • TensorFlow version: 2.15.1
  • TFX Version: 1.15.1
  • Python version: 3.10
  • Python dependencies (from pip freeze output):
absl-py==1.4.0
annotated-types==0.7.0
anyio==4.6.2.post1
apache-beam==2.60.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
astunparse==1.6.3
async-lru==2.0.4
async-timeout==5.0.0
attrs==23.2.0
babel==2.16.0
backcall==0.2.0
backports.tarfile==1.2.0
beautifulsoup4==4.12.3
bleach==6.2.0
cachetools==5.5.0
certifi==2024.8.30
cffi==1.17.1
charset-normalizer==3.4.0
click==8.1.7
cloudpickle==2.2.1
colorama==0.4.6
comm==0.2.2
crcmod==1.7
cryptography==43.0.3
debugpy==1.8.7
decorator==5.1.1
defusedxml==0.7.1
Deprecated==1.2.14
dill==0.3.1.1
dnspython==2.7.0
docker==4.4.4
docopt==0.6.2
docstring_parser==0.16
exceptiongroup==1.2.2
fastavro==1.9.7
fasteners==0.19
fastjsonschema==2.20.0
flatbuffers==24.3.25
fqdn==1.5.1
gast==0.6.0
google-api-core==2.22.0
google-api-python-client==1.12.11
google-apitools==0.5.31
google-auth==2.35.0
google-auth-httplib2==0.2.0
google-auth-oauthlib==1.2.1
google-cloud-aiplatform==1.71.1
google-cloud-bigquery==3.26.0
google-cloud-bigquery-storage==2.27.0
google-cloud-bigtable==2.26.0
google-cloud-core==2.4.1
google-cloud-datastore==2.20.1
google-cloud-dlp==3.25.0
google-cloud-language==2.15.0
google-cloud-pubsub==2.26.1
google-cloud-pubsublite==1.11.1
google-cloud-recommendations-ai==0.10.13
google-cloud-resource-manager==1.13.0
google-cloud-spanner==3.49.1
google-cloud-storage==2.18.2
google-cloud-videointelligence==2.14.0
google-cloud-vision==3.8.0
google-crc32c==1.6.0
google-pasta==0.2.0
google-resumable-media==2.7.2
googleapis-common-protos==1.65.0
grpc-google-iam-v1==0.13.1
grpc-interceptor==0.15.4
grpcio==1.65.5
grpcio-status==1.48.2
h11==0.14.0
h5py==3.12.1
hdfs==2.7.3
httpcore==1.0.6
httplib2==0.22.0
httpx==0.27.2
idna==3.10
importlib_metadata==8.4.0
ipykernel==6.29.5
ipython==7.34.0
ipython-genutils==0.2.0
ipywidgets==7.8.5
isoduration==20.11.0
jaraco.classes==3.4.0
jaraco.context==6.0.1
jaraco.functools==4.1.0
jedi==0.19.1
jeepney==0.8.0
Jinja2==3.1.4
joblib==1.4.2
json5==0.9.25
jsonpickle==3.3.0
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
jupyter-events==0.10.0
jupyter-lsp==2.2.5
jupyter_client==8.6.3
jupyter_core==5.7.2
jupyter_server==2.14.2
jupyter_server_terminals==0.5.3
jupyterlab==4.2.5
jupyterlab_pygments==0.3.0
jupyterlab_server==2.27.3
jupyterlab_widgets==1.1.11
keras==2.15.0
keras-tuner==1.4.7
keyring==25.5.0
keyrings.google-artifactregistry-auth==1.1.2
kt-legacy==1.0.5
kubernetes==12.0.1
libclang==18.1.1
lxml==5.3.0
Markdown==3.7
MarkupSafe==3.0.2
matplotlib-inline==0.1.7
mistune==3.0.2
ml-dtypes==0.3.2
ml-metadata==1.15.0
ml-pipelines-sdk==1.15.1
more-itertools==10.5.0
nbclient==0.10.0
nbconvert==7.16.4
nbformat==5.10.4
nest-asyncio==1.6.0
nltk==3.9.1
notebook==7.2.2
notebook_shim==0.2.4
numpy==1.26.4
nvidia-cublas-cu12==12.2.5.6
nvidia-cuda-cupti-cu12==12.2.142
nvidia-cuda-nvcc-cu12==12.2.140
nvidia-cuda-nvrtc-cu12==12.2.140
nvidia-cuda-runtime-cu12==12.2.140
nvidia-cudnn-cu12==8.9.4.25
nvidia-cufft-cu12==11.0.8.103
nvidia-curand-cu12==10.3.3.141
nvidia-cusolver-cu12==11.5.2.141
nvidia-cusparse-cu12==12.1.2.141
nvidia-nccl-cu12==2.16.5
nvidia-nvjitlink-cu12==12.2.140
oauth2client==4.1.3
oauthlib==3.2.2
objsize==0.7.0
opentelemetry-api==1.27.0
opentelemetry-sdk==1.27.0
opentelemetry-semantic-conventions==0.48b0
opt_einsum==3.4.0
orjson==3.10.11
overrides==7.7.0
packaging==24.1
pandas==1.5.3
pandocfilters==1.5.1
parso==0.8.4
pexpect==4.9.0
pickleshare==0.7.5
pillow==11.0.0
platformdirs==4.3.6
pluggy==1.5.0
portalocker==2.10.1
portpicker==1.6.0
prometheus_client==0.21.0
prompt_toolkit==3.0.48
proto-plus==1.25.0
protobuf==3.20.3
psutil==6.1.0
ptyprocess==0.7.0
pyarrow==10.0.1
pyarrow-hotfix==0.6
pyasn1==0.6.1
pyasn1_modules==0.4.1
pycparser==2.22
pydantic==2.9.2
pydantic_core==2.23.4
pydot==1.4.2
pyfarmhash==0.3.2
Pygments==2.18.0
pymongo==4.10.1
pyparsing==3.2.0
python-dateutil==2.9.0.post0
python-json-logger==2.0.7
pytz==2024.2
PyYAML==6.0.2
pyzmq==26.2.0
redis==5.2.0
referencing==0.35.1
regex==2024.9.11
requests==2.32.3
requests-oauthlib==2.0.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rouge_score==0.1.2
rpds-py==0.20.1
rsa==4.9
sacrebleu==2.4.3
scipy==1.12.0
SecretStorage==3.3.3
Send2Trash==1.8.3
shapely==2.0.6
six==1.16.0
sniffio==1.3.1
soupsieve==2.6
sqlparse==0.5.1
tabulate==0.9.0
tensorboard==2.15.2
tensorboard-data-server==0.7.2
tensorflow==2.15.1
tensorflow-data-validation==1.15.1
tensorflow-estimator==2.15.0
tensorflow-hub==0.15.0
tensorflow-io-gcs-filesystem==0.37.1
tensorflow-metadata==1.15.0
tensorflow-serving-api==2.15.1
tensorflow-transform==1.15.0
tensorflow_model_analysis==0.46.0
termcolor==2.5.0
terminado==0.18.1
tfx==1.15.1
tfx-bsl==1.15.1
tinycss2==1.4.0
tomli==2.0.2
tornado==6.4.1
tqdm==4.66.6
traitlets==5.14.3
types-python-dateutil==2.9.0.20241003
typing_extensions==4.12.2
uri-template==1.3.0
uritemplate==3.0.1
urllib3==2.2.3
wcwidth==0.2.13
webcolors==24.8.0
webencodings==0.5.1
websocket-client==1.8.0
Werkzeug==3.1.1
widgetsnbextension==3.6.10
wrapt==1.14.1
zipp==3.20.2
zstandard==0.23.0

Describe the current behavior

Following this TFX guide, running this code in a jupyter notebook cell:

import os

from tfx.components import Transform

# module_file points at the Python file that defines preprocessing_fn.
transform = Transform(
    examples=prepare_data_component.outputs['examples'],
    schema=schema_gen.outputs['schema'],
    module_file=os.path.abspath('../components/module.py'))

context.run(transform)

(examples and schema have been generated previously)

results in this error:

WARNING: There was an error checking the latest version of pip.
ERROR: Exception:
Traceback (most recent call last):
  File "PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/cli/base_command.py", line 105, in _run_wrapper
    status = _inner_run()
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/cli/base_command.py", line 96, in _inner_run
    return self.run(options, args)
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/cli/req_command.py", line 67, in wrapper
    return func(self, options, args)
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/commands/install.py", line 325, in run
    session = self.get_default_session(options)
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/cli/index_command.py", line 76, in get_default_session
    self._session = self.enter_context(self._build_session(options))
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/cli/index_command.py", line 99, in _build_session
    session = PipSession(
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/network/session.py", line 344, in __init__
    self.headers["User-Agent"] = user_agent()
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/network/session.py", line 142, in user_agent
    linux_distribution = distro.name(), distro.version(), distro.codename()
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_vendor/distro/distro.py", line 371, in version
    return _distro.version(pretty, best)
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_vendor/distro/distro.py", line 900, in version
    self.uname_attr("release"),
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_vendor/distro/distro.py", line 1088, in uname_attr
    return self._uname_info.get(attribute, "")
  File "/nix/store/si7kfwma2v8ypxnl9iyl0x2sw47fq4pc-python3-3.10.14-env/lib/python3.10/functools.py", line 981, in __get__
    val = self.func(instance)
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_vendor/distro/distro.py", line 1202, in _uname_info
    stdout = subprocess.check_output(cmd, stderr=subprocess.DEVNULL)
  File "/nix/store/si7kfwma2v8ypxnl9iyl0x2sw47fq4pc-python3-3.10.14-env/lib/python3.10/subprocess.py", line 421, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/nix/store/si7kfwma2v8ypxnl9iyl0x2sw47fq4pc-python3-3.10.14-env/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('uname', '-rs')' died with <Signals.SIGSEGV: 11>.

CalledProcessError: Command '['PROJECT_DIR/.devenv/state/venv/bin/python', '-m', 'pip', 'install', '--target', '/tmp/tmphlk7ovq8', '/tmp/tfx-interactive-2024-11-14T12_22_15.829911-sndfipr4/_wheels/tfx_user_code_Transform-0.0+6a115ee6c2805a1c5f73bc0f06dee31e25bab22f2807f05ce91a6ad75f2068aa-py3-none-any.whl']' returned non-zero exit status 2.

It looks like running pip here failed - why does it even call pip in the first place? I'm able to run pip without issues in the venv.

Describe the expected behavior

Does not crash.

Standalone code to reproduce the issue

see above

@janasangeetha janasangeetha self-assigned this Nov 18, 2024
@pritamdodeja
Contributor

That whl contains your preprocessing_fn. The idea is that the Transform component should be portable and scalable: you should be able to run the same Transform component on DataflowRunner, where each worker executes that preprocessing work, hence the whl. The uname -rs dying looks like a problem. Are you able to install that whl manually? (Before doing that, also research how to uninstall a whl, as that whl is installed in a temporary Python environment.)
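
A minimal sketch of how the manual install and the `uname -rs` check could be tried outside TFX, assuming a POSIX system. The helper names `run_and_report` and `install_wheel_to_temp` are hypothetical, and the pip arguments simply mirror the command shown in the error message above; the wheel path has to be filled in from that message.

```python
import signal
import subprocess
import sys
import tempfile


def run_and_report(cmd):
    """Run cmd and return (returncode, signal_name_or_None).

    On POSIX, subprocess reports a child killed by a signal as a
    negative return code, e.g. -11 for SIGSEGV.
    """
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    signal_name = None
    if proc.returncode < 0:
        try:
            signal_name = signal.Signals(-proc.returncode).name
        except ValueError:
            # Signal number not known to this Python build.
            signal_name = f"signal {-proc.returncode}"
    return proc.returncode, signal_name


def install_wheel_to_temp(wheel_path):
    """Install a wheel into a fresh temporary directory (hypothetical helper).

    Uses the same 'pip install --target' form as the failing command, so the
    surrounding venv is not modified and nothing needs uninstalling afterwards.
    """
    target = tempfile.mkdtemp()
    cmd = [sys.executable, "-m", "pip", "install", "--target", target, wheel_path]
    return run_and_report(cmd), target
```

Calling `run_and_report(["uname", "-rs"])` from the same notebook kernel should show whether `uname` itself crashes there, independent of pip or TFX.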

@janasangeetha
Contributor

Hi @kolaente
Thank you for reporting. I'll investigate and provide an update here.

@janasangeetha
Contributor

Hi @kolaente
I was unable to reproduce the error. Please provide more steps to reproduce the issue. Also, I am able to run the tutorial that includes the Transform component (gist). Please feel free to explore the tutorial.

@kolaente
Author

Seems like an issue with my environment. It works when running it in a jupyter notebook server.

Still, why does running the component try to install something via pip?

@janasangeetha
Contributor

Hi @kolaente
As per my understanding, when we run in a local environment, a wheel file is created for the component and then the package is installed.
@lego0901 Could you please share your thoughts.

@pritamdodeja
Contributor

Transform is a Beam-powered component, and Beam, in a local environment, can execute multiple workers as different processes. Each process needs the right packages, including the module_file containing preprocessing_fn, to execute in a distributed manner. This is the reason for the pip install. In fact, if you look at what happens when a pipeline is compiled to be executed in GCP, you will see that similar whls are generated and staged in GCS, so that the workers executing in DataflowRunner also have a similar execution environment. If you unzip the whl file, you'll see everything that is in there. However, I don't think the pip installation is the issue here in the local environment. There is something more fundamental going on, because of the SIGSEGV seen in the logs.
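
Since a wheel is a zip archive, its contents can also be listed without unpacking it. This is only a sketch: `list_wheel_contents` is a hypothetical helper, and the path passed to it would be the `tfx_user_code_Transform-...whl` file named in the error message.

```python
import zipfile


def list_wheel_contents(wheel_path):
    """Return the sorted names of all files stored in a wheel (a zip archive)."""
    with zipfile.ZipFile(wheel_path) as whl:
        return sorted(whl.namelist())
```

The listing should include the user module passed as module_file, alongside the wheel's own metadata directory.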

It is possible to run Transform locally without the pip dependency (e.g. you want to do multi-threaded vs. multi-process approach), but that won't solve your problem unfortunately.

@janasangeetha janasangeetha removed their assignment Jan 3, 2025