DiT now supports sequence conditions. #923

ds-hwang · 2025-01-14T05:09:02Z

When using seq2seq models with DiT, the condition may have the same sequence length as the input.
For example:

Input shape: [batch, seq_len, dim]
Condition shape: [batch, seq_len, cond_dim]

AdaptiveLayerNormModulation now supports conditions in both [batch, cond_dim] and [batch, seq_len, cond_dim] formats. It outputs conditions in the shape [batch, 1|seq_len, cond_dim]: the middle axis is 1 when the condition has no seq_len axis, and seq_len when it does.

Accordingly, DiT has been updated to handle rank-3 conditions. The codebase has also become simpler. Previously, jnp.expand_dims was scattered across many places, but now AdaptiveLayerNormModulation adjusts the rank of the condition to match the input and returns it accordingly.
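The rank adjustment described above can be sketched roughly as follows. This is a minimal illustration and not the code in axlearn/common/dit.py: the helper name `normalize_condition_rank` is hypothetical, and the demo assumes cond_dim equals dim so that the condition can be added to the input directly.

```python
import jax.numpy as jnp


def normalize_condition_rank(cond):
    """Returns `cond` as a rank-3 array of shape [batch, 1|seq_len, cond_dim].

    Hypothetical helper for illustration only.
    """
    if cond.ndim == 2:
        # [batch, cond_dim] -> [batch, 1, cond_dim]; the new axis broadcasts over seq_len.
        return jnp.expand_dims(cond, axis=1)
    if cond.ndim == 3:
        # [batch, seq_len, cond_dim] already lines up with the input.
        return cond
    raise ValueError(f"Expected a rank-2 or rank-3 condition, got shape {cond.shape}.")


batch, seq_len, dim = 2, 5, 8
x = jnp.zeros((batch, seq_len, dim))

# Assumption for the demo: cond_dim == dim.
global_cond = normalize_condition_rank(jnp.ones((batch, dim)))
seq_cond = normalize_condition_rank(jnp.ones((batch, seq_len, dim)))

# Both forms combine with the input without any expand_dims at the call site.
out_global = x + global_cond
out_seq = x + seq_cond
```

Because the rank is normalized in one place, callers can rely on ordinary broadcasting for either condition format.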

Speech detokenizer will use this DiT.

ds-hwang · 2025-01-14T05:09:20Z

@ruomingp Could you take a look? From 979

ruomingp · 2025-01-14T06:22:20Z

axlearn/common/dit.py

        """
        cfg = self.config
        x = get_activation_fn(cfg.activation)(input)
        output = self.linear(x)
+        assert output.ndim in (2, 3)


Raise a ValueError instead of using assert. Assertions should only be used to catch internal logic errors, i.e. conditions that cannot be triggered by user error.

Because I had already enabled auto-merge, this comment was not addressed before the PR merged. I submitted #981 to address it.
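The reviewer's suggestion can be sketched as follows. This is only an illustration of the pattern, not the change made in #981; the function name `check_output_rank` is hypothetical, and it accepts any array-like object exposing `ndim` and `shape`.

```python
def check_output_rank(output):
    """Raises ValueError unless `output` is rank 2 or rank 3.

    Hypothetical helper for illustration only.
    """
    if output.ndim not in (2, 3):
        # Unlike `assert`, this check still runs when Python is started with -O.
        raise ValueError(
            f"Expected a rank-2 or rank-3 output, got rank {output.ndim} "
            f"with shape {output.shape}."
        )
```

An explicit exception also gives the caller a message that names the offending shape, which helps when the bad rank comes from user-supplied conditions.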

ds-hwang requested review from ruomingp, markblee and a team as code owners January 14, 2025 05:09

ds-hwang enabled auto-merge January 14, 2025 05:09

ruomingp approved these changes Jan 14, 2025

ds-hwang added this pull request to the merge queue Jan 14, 2025

Merged via the queue into apple:main with commit a946f91 Jan 14, 2025
6 checks passed

ds-hwang deleted the dit branch January 14, 2025 06:55
