When I click "Start", the output log freezes here and the training never starts, no matter how long it runs.
This is my operating system environment:
Sytem: Ubuntu 22.04
CUDA Version: 12.04
Python Version: 3.11
GPU: NVIDIA H100 80GB HBM3
Driver Version: 550.90.07
[2025-01-10 03:47:50] [INFO] INFO create LoRA for Text Encoder 1: lora_flux.py:741
[2025-01-10 03:47:50] [INFO] INFO prepare CLIP-L for fp8: set to torch.float8_e4m3fn, set embeddings to torch.bfloat16 flux_train_network.py:511
[2025-01-10 03:47:50] [INFO] INFO create LoRA for Text Encoder 1: 72 modules. lora_flux.py:744
[2025-01-10 03:47:50] [INFO] INFO create LoRA for FLUX all blocks: 304 modules. lora_flux.py:765
[2025-01-10 03:47:50] [INFO] INFO enable LoRA for text encoder: 72 modules lora_flux.py:911
[2025-01-10 03:47:50] [INFO] INFO enable LoRA for U-Net: 304 modules lora_flux.py:916
[2025-01-10 03:47:50] [INFO] FLUX: Gradient checkpointing enabled. CPU offload: False
[2025-01-10 03:47:50] [INFO] INFO Text Encoder 1 (CLIP-L): 72 modules, LR 0.0008 lora_flux.py:1018
[2025-01-10 03:47:50] [INFO] INFO use 8-bit AdamW optimizer | {} train_util.py:4682
[2025-01-10 03:47:50] [INFO] INFO set U-Net weight dtype to torch.float8_e4m3fn train_network.py:631
[2025-01-10 03:47:50] [INFO] INFO prepare CLIP-L for fp8: set to torch.float8_e4m3fn, set embeddings to torch.bfloat16 flux_train_network.py:511
[2025-01-10 03:47:56] [INFO] running training / 学習開始
[2025-01-10 03:47:56] [INFO] num train images * repeats / 学習画像の数×繰り返し回数: 990
[2025-01-10 03:47:56] [INFO] num reg images / 正則化画像の数: 0
[2025-01-10 03:47:56] [INFO] num batches per epoch / 1epochのバッチ数: 495
[2025-01-10 03:47:56] [INFO] num epochs / epoch数: 16
[2025-01-10 03:47:56] [INFO] batch size per device / バッチサイズ: 1
[2025-01-10 03:47:56] [INFO] gradient accumulation steps / 勾配を合計するステップ数 = 1
[2025-01-10 03:47:56] [INFO] total optimization steps / 学習ステップ数: 7920
[2025-01-10 03:48:13] [INFO] 2025-01-10 03:48:13 INFO unet dtype: torch.float8_e4m3fn, device: cuda:1 train_network.py:1124
[2025-01-10 03:48:13] [INFO] INFO text_encoder [0] dtype: torch.float8_e4m3fn, device: cuda:1 train_network.py:1130
[2025-01-10 03:48:13] [INFO] INFO text_encoder [1] dtype: torch.bfloat16, device: cpu train_network.py:1130
[2025-01-10 03:48:14] [INFO] steps: 0%| | 0/7920 [00:00<?, ?it/s]2025-01-10 03:48:14 INFO unet dtype: torch.float8_e4m3fn, device: cuda:0 train_network.py:1124
[2025-01-10 03:48:14] [INFO] INFO text_encoder [0] dtype: torch.float8_e4m3fn, device: cuda:0 train_network.py:1130
[2025-01-10 03:48:14] [INFO] INFO text_encoder [1] dtype: torch.bfloat16, device: cpu train_network.py:1130
[2025-01-10 03:48:14] [INFO]
[2025-01-10 03:48:14] [INFO] epoch 1/16
[2025-01-10 03:48:14] [INFO] huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
[2025-01-10 03:48:14] [INFO] To disable this warning, you can either:
[2025-01-10 03:48:14] [INFO] - Avoid using `tokenizers` before the fork if possible
[2025-01-10 03:48:14] [INFO] - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
[2025-01-10 03:48:14] [INFO] huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
[2025-01-10 03:48:14] [INFO] To disable this warning, you can either:
[2025-01-10 03:48:14] [INFO] - Avoid using `tokenizers` before the fork if possible
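For what it's worth, the tokenizers warning above already names the workaround it expects: set TOKENIZERS_PARALLELISM before the process forks. A minimal, untested sketch of that (the variable name comes from the warning itself; whether this has anything to do with the freeze is only my assumption):

import os
# Must be set before any tokenizer is loaded, otherwise the fork warning reappears.
os.environ["TOKENIZERS_PARALLELISM"] = "false"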
When I click "Start", the output log freezes here and the training never starts, no matter how long it runs.
This is my operating system environment:
Sytem: Ubuntu 22.04
CUDA Version: 12.04
Python Version: 3.11
GPU: NVIDIA H100 80GB HBM3
Driver Version: 550.90.07
The text was updated successfully, but these errors were encountered: