TFX.components.transform id #6278
Comments
I am unable to run the shared notebook; my environment crashes. Can you please make sure the example notebook works so that we can replicate the issue on our end? Thank you!
I'm not sure how to get this running. I am able to run the Jupyter notebook on a local machine, but on Colab it fails at the moment. I would appreciate any feedback on this, or if you could run it locally.
@raminmohammadi, I tried but was unable to create a local setup to test your notebook because of some permission issues. @zoyahav, can you please give some feedback on why the transform output in the TFX pipeline differs from the expected output when running the transformation outside the TFX pipeline? Thanks.
Any updates on this issue? Thanks.
If the bug is related to a specific library below, please raise an issue in the
respective repo directly:
TensorFlow Data Validation Repo
TensorFlow Model Analysis Repo
TensorFlow Transform Repo
TensorFlow Serving Repo
System information
Environment in which the code is executed (Interactive Notebook, Google Cloud, etc.): Linux, Notebook, Colab
Python dependencies (pip freeze output): requirements.txt
Describe the current behavior:
This problem only happens when I use the transform as part of TFX. I'm encountering an issue while working with the "transform" function, which processes individual input data items. Each input consists of two keys: 'entities' and 'text'.
My specific task is to perform a transformation on the "text" dimension of the input tensor, breaking it down into individual characters. For example, given the input "This is a test," I intend to follow these steps:
Split the text into character arrays: [['t', 'h', 'i', 's'], ['i', 's'], ['a'], ['t', 'e', 's', 't']]
Code 1: tf.strings.unicode_split(tf.strings.split('This is a test'), input_encoding='UTF-8')
Map each character to a dictionary, obtain its index, and pad each word to a width of 12 characters.
Code 2: tf.map_fn(get_index, text, fn_output_signature=tf.TensorSpec(shape=(1, Wlength), dtype=tf.int64, name=None))
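For reference, the two steps above can be sketched in plain Python, independent of TensorFlow and TFX, to pin down the shape the transform is expected to produce. This is only a sketch under stated assumptions: the vocabulary (`VOCAB`), the padding id (`PAD_ID`), the lowercasing step, and the helper name `encode` are all hypothetical, since the dictionary used by `get_index` is not shown here. The ids it prints will therefore not match the ones in the expected output.

```python
WORD_WIDTH = 12  # padded width per word, taken from the description above
PAD_ID = 0       # assumption: 0 is used as the padding id

# Hypothetical vocabulary: 'a' -> 1 ... 'z' -> 26. The real dictionary behind
# get_index is not shown in the issue, so these ids are illustrative only.
VOCAB = {ch: i for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz", start=1)}


def encode(text, width=WORD_WIDTH):
    """Return a nested list of shape (num_words, 1, width)."""
    rows = []
    # Assumption: text is lowercased, matching the lowercase characters in step 1.
    for word in text.lower().split():
        # Map each character to its id; unknown characters fall back to PAD_ID.
        ids = [VOCAB.get(ch, PAD_ID) for ch in word][:width]
        # Right-pad the word to the fixed width.
        ids += [PAD_ID] * (width - len(ids))
        # Wrap in an extra list to mirror the (1, Wlength) output signature.
        rows.append([ids])
    return rows


encoded = encode("This is a test")
print(len(encoded), len(encoded[0]), len(encoded[0][0]))  # 4 1 12
print(encoded[0][0][:4])  # [20, 8, 9, 19]
```

Comparing this reference against the output of the Transform component for the same input may help narrow down whether the difference comes from the per-word mapping or from how the pipeline batches the 'text' feature.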
Currently, Transform returns only a single vector that starts with 1, with the remaining entries 0:
example = [[1, 0, 0, 0, 0, 0, 0, 0, 0]]
Describe the expected behavior
The expected output should be:
<tf.Tensor: shape=(4, 1, 12), dtype=int64, numpy=
array([[[58, 20, 21, 31, 0, 0, 0, 0, 0, 0, 0, 0]],
Standalone code to reproduce the issue
Providing a bare minimum test case or step(s) to reproduce the problem will
greatly help us to debug the issue. If possible, please share a link to
Colab/Jupyter/any notebook.
https://colab.research.google.com/drive/1ap8Gycu7s--mz0VAxp4W2DphAd1HW1yi?usp=sharing
Name of your Organization (Optional)
Other info / logs
Include any logs or source code that would be helpful to diagnose the problem.
If including tracebacks, please include the full traceback. Large logs and files
should be attached.