Improve recommender tutorial by updating code to be usable outside of InteractiveContext #5885

TaylorZowtuk · 2023-05-03T21:08:56Z

URL(s) with the issue:

https://www.tensorflow.org/tfx/tutorials/tfx/recommenders

Description of issue (what needs changing):

The current notebook runs without issue and works as a starting point. For me (and, I presume, others), the natural next step is to organize the code into a more production-like pipeline, which means adapting the notebook to fit something like the templates described in this guide.

However, if one adapts the recommender tutorial to run outside of InteractiveContext, the code fails. In particular, the Channels that we pass to the Trainer component (through custom_config) are empty when the pipeline is run using LocalDagRunner. When the MovielensModel calls movies_uri.get()[0] in its constructor, the program throws a RuntimeError because we are indexing into an empty list. This happens even though the artifacts exist in the local file system and the previous components have run correctly.
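To make the failure mode above concrete, here is a minimal sketch that reproduces it without TFX. `ToyChannel`, `build_model`, and the artifact path are hypothetical stand-ins invented for illustration; they are not the real TFX classes, and the sketch only assumes that a channel's artifact list is filled in memory when an upstream component runs in the same interactive session. In this toy version the empty lookup surfaces as an IndexError, whereas the real pipeline reported a RuntimeError.

```python
class ToyChannel:
    """Hypothetical stand-in for a TFX Channel; not the real class."""

    def __init__(self):
        # Stays empty unless something populates it in this Python process.
        self._artifacts = []

    def get(self):
        return list(self._artifacts)


def build_model(custom_config):
    # Mirrors the tutorial's pattern of calling movies_uri.get()[0]
    # inside the model constructor.
    return custom_config["movies"].get()[0]


# Interactive-style run: the upstream component has already executed in this
# process, so the channel object carries its artifact (made-up path).
movies = ToyChannel()
movies._artifacts.append("/tmp/pipeline/Transform/transformed_examples/1")
interactive_result = build_model({"movies": movies})

# DAG-runner-style run: the pipeline is defined up front, so the channel
# object handed over in custom_config is still empty.
unresolved = ToyChannel()
try:
    build_model({"movies": unresolved})
    dag_error = None
except IndexError as err:
    dag_error = err
```

The point of the sketch is only that the same lookup succeeds or fails depending on whether anything populated the channel object before the model was built, not on whether the artifact exists on disk.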

I created a fork and (arbitrarily) pushed my code here to illustrate exactly what I am running.

You can see the logs from a run here. In particular, look from this line onwards and you will see what custom_config evaluates to and the error.

From my brief attempt at tracing through the TFX code for the Trainer component, it seems that the executors track or resolve the artifacts for the non-custom_config arguments (like examples, transform_graph, and schema) differently than for the custom_config arguments. That is why train_files in run_fn() is a valid path while the custom_config values are empty Channels. But I am unsure why the behavior differs depending on the orchestrator used, and what the correct way to resolve this is.

This is not a new confusion; as you can see, others have run into the same situation. Unfortunately, that question was never answered, and I was unable to find an answer in any of the TensorFlow repos/docs or in other Stack Overflow posts. I hope that this issue can clarify the correct way to approach this situation and help others avoid the same mistake in the future.

Why this should be changed:

I would like to request that the tutorial be updated because:

- I feel like the tutorial should teach users how to use TFX in a manner that works independently of the type of orchestrator.
- It will help users avoid unexpected failures when they apply what they learned from the tutorial.
- There is currently a lack of clarity on why these artifacts are empty in some situations.

TaylorZowtuk · 2023-05-03T21:11:44Z

@rcrowe-google I see you published the original tutorial. Would you be willing to share your thoughts on this, whether it's a worthwhile improvement, and possibly clarify why the code works when using the InteractiveContext but not LocalDagRunner?

rcrowe-google · 2023-05-03T23:01:03Z

Hi @TaylorZowtuk - Yes, it would be worthwhile to update the example to work in LocalDagRunner, and I should have written it that way in the first place. It's been on my list to update it for what seems like forever, and I just haven't had time yet. The code works in InteractiveContext because the artifacts are in memory, but they really should have been passed in Channels.

TaylorZowtuk · 2023-05-04T14:06:34Z

Thanks for confirming and thanks for the clarification. I appreciate you taking the time to respond @rcrowe-google.

BlakeB415 · 2023-08-26T04:06:53Z

Hello, I'm experiencing the same issue. Is there a workaround for this currently?

lukhaza · 2024-01-02T06:27:49Z

Do we have any progress on this issue? I'm experiencing the same issue.

stefandominicus-takealot · 2024-11-04T09:31:29Z

@lukhaza I believe you extended the Trainer component to ingest multiple Examples channels. Can you share any of those details here?

stefandominicus-takealot · 2024-11-19T13:08:40Z

I've extended the standard Trainer and Tuner components (as well as the google_cloud_ai_platform extensions of those components) to support additional Examples and Schema inputs. Here is a gist.

The solution is general (supporting both "item" and "query" inputs), but in the context of this thread, you would pass the dataset of unique movies into run_fn via the Trainer's item_examples and item_schema parameters, and then use fn_args.item_files, fn_args.item_schema_path and fn_args.item_data_accessor to load this dataset (just as you already use fn_args.train_files, fn_args.schema_path and fn_args.data_accessor to load your training dataset).
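As a rough illustration of the symmetry described above, the sketch below loads the item dataset the same way as the training dataset. It does not use TFX or the gist's code: `fn_args` is faked with `SimpleNamespace`, the file paths are made up, and `load_dataset` and `fake_accessor` are hypothetical helpers in which the accessor is simplified to a plain callable (the real data accessor object has a different shape).

```python
from types import SimpleNamespace


def load_dataset(files, schema_path, data_accessor):
    # Hypothetical helper: a real run_fn would build a dataset through the
    # data accessor; here the accessor is just a callable stand-in.
    return data_accessor(files, schema_path)


def run_fn(fn_args):
    # Training data, loaded from the standard Trainer inputs.
    train_ds = load_dataset(
        fn_args.train_files, fn_args.schema_path, fn_args.data_accessor)
    # Unique movies, loaded from the additional item inputs in the same way.
    movies_ds = load_dataset(
        fn_args.item_files, fn_args.item_schema_path, fn_args.item_data_accessor)
    return train_ds, movies_ds


def fake_accessor(files, schema_path):
    # Stand-in that just records what it was asked to load.
    return {"files": files, "schema": schema_path}


# Made-up paths, for illustration only.
fn_args = SimpleNamespace(
    train_files=["ratings/train/*"],
    schema_path="ratings/schema.pbtxt",
    data_accessor=fake_accessor,
    item_files=["movies/train/*"],
    item_schema_path="movies/schema.pbtxt",
    item_data_accessor=fake_accessor,
)
train_ds, movies_ds = run_fn(fn_args)
```

With this shape, the movies dataset reaches run_fn as resolved file paths rather than as a Channel inside custom_config, which is the part that was empty under LocalDagRunner.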

TaylorZowtuk added the type:docs label May 3, 2023

singhniraj08 self-assigned this May 4, 2023

singhniraj08 added the stat:awaiting response label May 4, 2023

google-ml-butler bot removed the stat:awaiting response label May 4, 2023
