Installation error / training error #416
-
Bro, I am getting the exact same error. If you find a solution, please let me know.
-
I still haven't found a solution, but reinstalling it is now giving me a different error.
-
I ran the PowerShell installation script after completing all the prerequisite steps, but during the installation a bunch of red text appears. Part of the installation output is below (it seems to be an issue with installing PyTorch). It does proceed to the accelerate config (after a suspiciously short time), but then trying to train a model returns another error.
Installation error (not the full installation output):
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu116
Collecting torch==1.12.1+cu116
Downloading https://download.pytorch.org/whl/cu116/torch-1.12.1%2Bcu116-cp310-cp310-win_amd64.whl (2388.4 MB)
- -------------------------------------- 0.1/2.4 GB 366.0 kB/s eta 1:43:43
ERROR: Exception:
Traceback (most recent call last):
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_vendor\urllib3\response.py", line 437, in _error_catcher
yield
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_vendor\urllib3\response.py", line 560, in read
data = self._fp_read(amt) if not fp_closed else b""
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_vendor\urllib3\response.py", line 526, in _fp_read
return self._fp.read(amt) if amt is not None else self._fp.read()
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_vendor\cachecontrol\filewrapper.py", line 90, in read
data = self.__fp.read(amt)
File "C:\Users\Gebruiker\AppData\Local\Programs\Python\Python310\lib\http\client.py", line 465, in read
s = self.fp.read(amt)
File "C:\Users\Gebruiker\AppData\Local\Programs\Python\Python310\lib\socket.py", line 705, in readinto
return self._sock.recv_into(b)
File "C:\Users\Gebruiker\AppData\Local\Programs\Python\Python310\lib\ssl.py", line 1274, in recv_into
return self.read(nbytes, buffer)
File "C:\Users\Gebruiker\AppData\Local\Programs\Python\Python310\lib\ssl.py", line 1130, in read
return self._sslobj.read(len, buffer)
TimeoutError: The read operation timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\cli\base_command.py", line 160, in exc_logging_wrapper
status = run_func(*args)
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\cli\req_command.py", line 247, in wrapper
return func(self, options, args)
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\commands\install.py", line 400, in run
requirement_set = resolver.resolve(
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\resolution\resolvelib\resolver.py", line 92, in resolve
result = self._result = resolver.resolve(
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_vendor\resolvelib\resolvers.py", line 481, in resolve
state = resolution.resolve(requirements, max_rounds=max_rounds)
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_vendor\resolvelib\resolvers.py", line 348, in resolve
self._add_to_criteria(self.state.criteria, r, parent=None)
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_vendor\resolvelib\resolvers.py", line 172, in _add_to_criteria
if not criterion.candidates:
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_vendor\resolvelib\structs.py", line 151, in bool
return bool(self._sequence)
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\resolution\resolvelib\found_candidates.py", line 155, in bool
return any(self)
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\resolution\resolvelib\found_candidates.py", line 143, in
return (c for c in iterator if id(c) not in self._incompatible_ids)
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\resolution\resolvelib\found_candidates.py", line 47, in _iter_built
candidate = func()
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\resolution\resolvelib\factory.py", line 206, in _make_candidate_from_link
self._link_candidate_cache[link] = LinkCandidate(
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\resolution\resolvelib\candidates.py", line 297, in init
super().init(
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\resolution\resolvelib\candidates.py", line 162, in init
self.dist = self._prepare()
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\resolution\resolvelib\candidates.py", line 231, in _prepare
dist = self._prepare_distribution()
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\resolution\resolvelib\candidates.py", line 308, in _prepare_distribution
return preparer.prepare_linked_requirement(self._ireq, parallel_builds=True)
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\operations\prepare.py", line 491, in prepare_linked_requirement
return self._prepare_linked_requirement(req, parallel_builds)
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\operations\prepare.py", line 536, in _prepare_linked_requirement
local_file = unpack_url(
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\operations\prepare.py", line 166, in unpack_url
file = get_http_url(
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\operations\prepare.py", line 107, in get_http_url
from_path, content_type = download(link, temp_dir.path)
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\network\download.py", line 147, in call
for chunk in chunks:
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\cli\progress_bars.py", line 53, in _rich_progress_bar
for chunk in iterable:
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_internal\network\utils.py", line 63, in response_chunks
for chunk in response.raw.stream(
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_vendor\urllib3\response.py", line 621, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_vendor\urllib3\response.py", line 559, in read
with self._error_catcher():
File "C:\Users\Gebruiker\AppData\Local\Programs\Python\Python310\lib\contextlib.py", line 153, in exit
self.gen.throw(typ, value, traceback)
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\pip_vendor\urllib3\response.py", line 442, in _error_catcher
raise ReadTimeoutError(self._pool, None, "Read timed out.")
pip._vendor.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='download.pytorch.org', port=443): Read timed out.
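The traceback above is simply a download timeout: the torch 1.12.1+cu116 wheel is about 2.4 GB, and at ~366 kB/s the connection dropped before it finished, so PyTorch never actually got installed. One possible workaround (a sketch, not part of the official install script) is to retry the download by hand with the venv's own interpreter, using a longer socket timeout and more retries:
E:\diffusion\lora\kohya_ss\venv\Scripts\python.exe -m pip install --timeout 1000 --retries 10 torch==1.12.1+cu116 torchvision==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
The torch pin is taken from the log above; the torchvision==0.13.1+cu116 pin is the usual companion release and is an assumption here, so check it against whatever the repo's install script actually requests.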
This is the error I get when trying to train:
Folder 100_pics: 54 images found
Folder 100_pics: 5400 steps
max_train_steps = 5400
stop_text_encoder_training = 0
lr_warmup_steps = 540
accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --enable_bucket --pretrained_model_name_or_path="E:/diffusion/nai/stable-diffusion-webui/models/Stable-diffusion/f222.ckpt" --train_data_dir="E:\diffusion\lora train\pics\pics" --resolution=512,512 --output_dir="E:\diffusion\lora train\pics\model" --logging_dir="E:\diffusion\lora train\pics\log" --network_alpha="1" --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=5e-5 --unet_lr=0.0001 --network_dim=8 --output_name="last" --lr_scheduler_num_cycles="1" --learning_rate="0.0001" --lr_scheduler="cosine" --lr_warmup_steps="540" --train_batch_size="1" --max_train_steps="5400" --save_every_n_epochs="1" --mixed_precision="fp16" --save_precision="fp16" --cache_latents --optimizer_type="AdamW8bit" --bucket_reso_steps=64 --xformers --bucket_no_upscale
Could not find module 'E:\diffusion\lora\kohya_ss\venv\Lib\site-packages\xformers\_C.pyd' (or one of its dependencies). Try using the full path with constructor syntax.
WARNING:root:WARNING: Could not find module 'E:\diffusion\lora\kohya_ss\venv\Lib\site-packages\xformers\_C.pyd' (or one of its dependencies). Try using the full path with constructor syntax.
Need to compile C++ extensions to get sparse attention suport. Please run python setup.py build develop
prepare tokenizer
Use DreamBooth method.
prepare images.
found directory E:\diffusion\lora train\pics\pics\100_pics contains 54 image files
5400 train images with repeating.
0 reg images.
no regularization images / 正則化画像が見つかりませんでした
[Dataset 0]
batch_size: 1
resolution: (512, 512)
enable_bucket: True
min_bucket_reso: 256
max_bucket_reso: 1024
bucket_reso_steps: 64
bucket_no_upscale: True
[Subset 0 of Dataset 0]
image_dir: "E:\diffusion\lora train\pics\pics\100_pics"
image_count: 54
num_repeats: 100
shuffle_caption: False
keep_tokens: 0
caption_dropout_rate: 0.0
caption_dropout_every_n_epoches: 0
caption_tag_dropout_rate: 0.0
color_aug: False
flip_aug: False
face_crop_aug_range: None
random_crop: False
is_reg: False
class_tokens: pics
caption_extension: .caption
[Dataset 0]
loading image sizes.
100%|██████████████████████████████████████████████████████████████████████████████████| 54/54 [00:00<00:00, 54.97it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (192, 128), count: 100
bucket 1: resolution (320, 448), count: 100
bucket 2: resolution (384, 320), count: 100
bucket 3: resolution (384, 512), count: 100
bucket 4: resolution (384, 576), count: 2000
bucket 5: resolution (384, 640), count: 100
bucket 6: resolution (448, 448), count: 100
bucket 7: resolution (576, 384), count: 2100
bucket 8: resolution (640, 384), count: 700
mean ar error (without repeats): 0.019370974714165847
prepare accelerator
Traceback (most recent call last):
File "E:\diffusion\lora\kohya_ss\train_network.py", line 659, in
train(args)
File "E:\diffusion\lora\kohya_ss\train_network.py", line 108, in train
accelerator, unwrap_model = train_util.prepare_accelerator(args)
File "E:\diffusion\lora\kohya_ss\library\train_util.py", line 1984, in prepare_accelerator
accelerator = Accelerator(gradient_accumulation_steps=args.gradient_accumulation_steps, mixed_precision=args.mixed_precision,
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\accelerate\accelerator.py", line 355, in init
raise ValueError(err.format(mode="fp16", requirement="a GPU"))
ValueError: fp16 mixed precision requires a GPU
Traceback (most recent call last):
File "C:\Users\Gebruiker\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Gebruiker\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in run_code
exec(code, run_globals)
File "E:\diffusion\lora\kohya_ss\venv\Scripts\accelerate.exe_main.py", line 7, in
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
args.func(args)
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
simple_launcher(args)
File "E:\diffusion\lora\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['E:\diffusion\lora\kohya_ss\venv\Scripts\python.exe', 'train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=E:/diffusion/nai/stable-diffusion-webui/models/Stable-diffusion/f222.ckpt', '--train_data_dir=E:\diffusion\lora train\pics\pics', '--resolution=512,512', '--output_dir=E:\diffusion\lora train\pics\model', '--logging_dir=E:\diffusion\lora train\pics\log', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=8', '--output_name=last', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=540', '--train_batch_size=1', '--max_train_steps=5400', '--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--cache_latents', '--optimizer_type=AdamW8bit', '--bucket_reso_steps=64', '--xformers', '--bucket_no_upscale']' returned non-zero exit status 1.
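The training failure looks like a follow-on from the broken install: "fp16 mixed precision requires a GPU" means Accelerate cannot see a CUDA device, and the missing xformers \_C.pyd points the same way, which is what you would expect if the torch download above never completed (or only a CPU build ended up in the venv). A quick diagnostic (a sketch, run with the venv's own interpreter) is:
E:\diffusion\lora\kohya_ss\venv\Scripts\python.exe -c "import torch; print(torch.__version__); print(torch.cuda.is_available())"
If the version does not end in +cu116, or the second line prints False (or the import fails entirely), reinstall torch/torchvision as above and rerun accelerate config before training again.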