Replies: 1 comment
-
Hi @Can-Zhao, could you please help share some comments here? Thanks.
-
Hi!
The MAISI paper describes how feature maps are split into segments, with each segment allocated to a different device to perform convolutions. In autoencoderkl_maisi.py, I do see that the tensor is split on Line 247 and a convolution is applied to each segment on Line 253. However, correct me if I'm wrong, it looks like all of these convolutions run on the same device; I don't see an option to specify multiple devices, as TSP is described in the paper. If this option doesn't exist, will it be added in a future release?

Separately, the MAISI README reports that for 256x256x256 images with a latent size of 4x64x64x64, peak training memory is only about 8 GB. I'm working with smaller images and a smaller latent size, yet memory exceeds 30 GB (on a 40 GB A100), so I'm wondering whether TSP is the differentiating factor here.
Thanks!
Update: The memory issue was mainly due to incorrect use of transformations, but I'm still curious about the TSP question!