TheooJ committed Jun 20, 2024
1 parent 206fe54 commit 3e4a597
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions skrub/_joiner.py
@@ -26,7 +26,6 @@
FunctionTransformer(partial(sbd.fill_nulls, value="")),
ToStr(),
HashingVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
-# TODO: Remove sparse output from Tfidf to work with TableVectorizer
TfidfTransformer(),
)
_DATETIME_ENCODER = DatetimeEncoder(resolution=None, add_total_seconds=True)
@@ -55,7 +54,8 @@ def _make_vectorizer(table, string_encoder, rescale):
In addition if `rescale` is `True`, a StandardScaler is applied to
numeric and datetime columns.
"""
-# TODO remove use of ColumnTransformer, select_dtypes & pandas-specific code
+# TODO: add Skrubber before ColumnTransformer
+# TODO: remove use of ColumnTransformer
transformers = [
(clone(string_encoder), c) for c in (s.string() | s.categorical()).expand(table)
]
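
For context on the TODO dropped in the first hunk: the string encoder chains a character n-gram `HashingVectorizer` with a `TfidfTransformer`, and both steps return SciPy sparse matrices. The sketch below is a minimal illustration using scikit-learn only. It assumes the skrub-specific steps (`fill_nulls`, `ToStr`) can be skipped by passing plain strings directly, and the sample values in `names` are hypothetical.

```python
from scipy import sparse
from sklearn.feature_extraction.text import HashingVectorizer, TfidfTransformer
from sklearn.pipeline import make_pipeline

# Stand-in for the string encoder shown in the diff. The skrub-specific
# steps (fill_nulls, ToStr) are omitted; this is an assumption made so the
# example depends only on scikit-learn.
string_encoder = make_pipeline(
    HashingVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    TfidfTransformer(),
)

# Hypothetical input values; the empty string mimics a filled-in null.
names = ["Paris", "paris ", "London", ""]

encoded = string_encoder.fit_transform(names)

# The encoder output is a SciPy sparse matrix with one row per input string
# and one column per hash bucket (HashingVectorizer defaults to 2**20).
is_sparse = sparse.issparse(encoded)
shape = encoded.shape
print(is_sparse, shape)
```

Any downstream step that expects dense arrays would need to densify or reduce this output first, which is presumably what the removed comment was tracking.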