I noticed that the norms of the embeddings produced by most models are 1. At first I assumed the default value of the `normalize_embeddings` argument in the `encode()` method was `True`, but it is actually `False`. It turns out that if `modules.json` contains a module of type `sentence_transformers.models.Normalize` (`2_Normalize`), the output embeddings are always normalized, regardless of the value of the `normalize_embeddings` argument. This happens because the `forward()` method runs all of the loaded modules.
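Here is a minimal sketch of the behavior, assuming a model whose `modules.json` includes a `2_Normalize` module (the model name below is just an illustrative choice):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Assumed example model: its modules.json lists a
# sentence_transformers.models.Normalize module (2_Normalize).
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# normalize_embeddings defaults to False, but the Normalize module still
# runs inside forward(), so the output norm is 1 either way.
emb = model.encode("hello world", normalize_embeddings=False)
print(np.linalg.norm(emb))  # ~1.0 despite normalize_embeddings=False
```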
Indeed, some models force the normalization in the architecture itself. In that case, it can't be turned off.
I don't really see why this is a big problem, though. I think most embeddings should be normalized: they're simply easier and cheaper to work with (e.g. you can use the dot product directly to compute similarities; see the sketch below).
I'm not sure how I feel about optionally disabling modules based on the `normalize_embeddings` argument, primarily because I don't know of a use case where you really don't want normalized embeddings. I'm open to your thoughts on this! If it's important for something, then I'll definitely consider your proposal.
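As a small illustration (not from the original comment) of why unit-norm embeddings are cheaper to work with: for normalized vectors, the dot product already equals the cosine similarity, so no extra norm computation is needed.

```python
import numpy as np

# Two toy vectors, manually normalized to unit length.
a = np.array([3.0, 4.0]); a /= np.linalg.norm(a)
b = np.array([1.0, 2.0]); b /= np.linalg.norm(b)

cosine = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
dot = a @ b
print(np.isclose(cosine, dot))  # True: dot product == cosine similarity
```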
Hello!
For the sake of clarity for end users, I believe the `normalize_embeddings` parameter might be misleading in this case, as some models completely ignore it. This could create confusion, making users think they have control over normalization when they actually don't.
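In the meantime, one possible workaround is to check whether normalization is baked into the model before relying on the argument. A minimal sketch, assuming `SentenceTransformer`'s sequential module layout (the model name is again illustrative):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.models import Normalize

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# SentenceTransformer subclasses torch.nn.Sequential, so iterating over
# the model yields its loaded modules in order.
has_normalize = any(isinstance(module, Normalize) for module in model)
print(has_normalize)  # True here: normalize_embeddings=False has no effect
```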