Properly designate model state for actively training models when nodes crash or leave cluster #3577
| Job | Run time |
|---|---|
| | 4s |
| | 23m 11s |
| | 25m 22s |
| | 24m 45s |
| | 20m 0s |
| | 30m 43s |
| | 17m 48s |
| | 28m 57s |
| | 14m 10s |
| | 14m 13s |
| Total | 3h 19m 13s |