-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathclip_results_discussion.txt
186 lines (165 loc) · 7.38 KB
/
clip_results_discussion.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
Ex1)
Configurations
*****************************************************************************************
batch_size = 32 # batch size
vocab_threshold = 6 # minimum word count threshold
vocab_from_file = False # if True, load existing vocab file
embed_size = 512 # dimensionality of image and word embeddings
hidden_size = 512 # number of features in hidden state of the RNN decoder
num_epochs = 1 # number of training epochs (1 for testing)
save_every = 1 # determines frequency of saving model weights
print_every = 200 # determines window for printing average loss
log_file = 'training_log.txt' # name of file with saved training loss and perplexity
Epoch [1/1], Step [200/3143], Loss: 3.5503, Perplexity: 34.8221
Epoch [1/1], Step [400/3143], Loss: 3.8901, Perplexity: 48.9169
Epoch [1/1], Step [600/3143], Loss: 3.5742, Perplexity: 35.6675
Epoch [1/1], Step [800/3143], Loss: 3.2025, Perplexity: 24.5949
Epoch [1/1], Step [1000/3143], Loss: 3.6232, Perplexity: 37.4565
Epoch [1/1], Step [1200/3143], Loss: 3.0515, Perplexity: 21.1470
Epoch [1/1], Step [1400/3143], Loss: 3.1357, Perplexity: 23.0057
Epoch [1/1], Step [1600/3143], Loss: 2.9424, Perplexity: 18.9607
Epoch [1/1], Step [1800/3143], Loss: 2.9210, Perplexity: 18.5604
Epoch [1/1], Step [2000/3143], Loss: 3.2989, Perplexity: 27.0816
Epoch [1/1], Step [2200/3143], Loss: 3.0814, Perplexity: 21.7880
Epoch [1/1], Step [2400/3143], Loss: 2.6065, Perplexity: 13.5512
Epoch [1/1], Step [2600/3143], Loss: 2.7144, Perplexity: 15.0958
Epoch [1/1], Step [2800/3143], Loss: 3.1453, Perplexity: 23.2276
Epoch [1/1], Step [3000/3143], Loss: 2.7393, Perplexity: 15.4758
Epoch [1/1], Step [3143/3143], Loss: 2.6219, Perplexity: 13.7618Running Validation...
Validation Loss for epoch 1: 4.5145673751831055
Avg Training Loss: 3.1465
Exp2)
same config with linear+BN) https://stackoverflow.com/questions/37232782/nan-loss-when-training-regression-network?fbclid=IwAR3_ZVurStv20oZ-fvsUQj81PYsPSwGiBdjLJTLGRhLYJBK8s-ui1hfxrC0
nan values as paramas
Exp3)
same confing, num_epochs = 5
Training loss for epoch 1: 3.1383
Validation loss for epoch 1: 3.9793
Training loss for epoch 2: 2.5065
Validation loss for epoch 2: 4.3469
Training loss for epoch 3: 2.2631
Validation loss for epoch 3: 4.7330
Training loss for epoch 4: 2.0725
Validation loss for epoch 4: 4.6994
Training loss for epoch 5: 1.9564
Validation loss for epoch 5: 4.9221
Exp4)
1 epoch
Attention-based:
Epoch [1/1], Step [200/3143], Loss: 5.5314
Epoch [1/1], Step [400/3143], Loss: 5.3541
Epoch [1/1], Step [600/3143], Loss: 4.5658
Epoch [1/1], Step [800/3143], Loss: 4.6533
Epoch [1/1], Step [1000/3143], Loss: 4.7537
Epoch [1/1], Step [1200/3143], Loss: 4.7449
Epoch [1/1], Step [1400/3143], Loss: 4.3444
Epoch [1/1], Step [1600/3143], Loss: 4.4949
Epoch [1/1], Step [1800/3143], Loss: 3.9708
Epoch [1/1], Step [2000/3143], Loss: 4.1330
Epoch [1/1], Step [2200/3143], Loss: 3.9356
Epoch [1/1], Step [2400/3143], Loss: 3.7760
Epoch [1/1], Step [2600/3143], Loss: 4.0477
Epoch [1/1], Step [2800/3143], Loss: 4.3807
Epoch [1/1], Step [3000/3143], Loss: 4.0358
Epoch [1/1], Step [3143/3143], Loss: 4.7389
Training Loss for epoch 1: 3.871891498565674
Running Validation...
Validation Loss for epoch 1: 4.753614902496338
Exp5)
1:5 epochs
Same config
Epoch [1/5], Step [200/3143], Loss: 4.7409
Epoch [1/5], Step [400/3143], Loss: 4.8517
Epoch [1/5], Step [600/3143], Loss: 4.7153
Epoch [1/5], Step [800/3143], Loss: 4.3040
Epoch [1/5], Step [1000/3143], Loss: 4.8538
Epoch [1/5], Step [1200/3143], Loss: 4.0638
Epoch [1/5], Step [1400/3143], Loss: 4.7034
Epoch [1/5], Step [1600/3143], Loss: 4.2157
Epoch [1/5], Step [1800/3143], Loss: 4.6249
Epoch [1/5], Step [2000/3143], Loss: 4.4693
Epoch [1/5], Step [2200/3143], Loss: 4.1417
Epoch [1/5], Step [2400/3143], Loss: 4.0952
Epoch [1/5], Step [2600/3143], Loss: 4.2739
Epoch [1/5], Step [2800/3143], Loss: 4.3693
Epoch [1/5], Step [3000/3143], Loss: 3.9678
Epoch [1/5], Step [3143/3143], Loss: 4.1132
Training Loss for epoch 1: 3.8724825382232666
Running Validation...
Validation Loss for epoch 1: 4.651616096496582
Epoch [2/5], Step [200/3143], Loss: 3.7792
Epoch [2/5], Step [400/3143], Loss: 4.0902
Epoch [2/5], Step [600/3143], Loss: 3.6627
Epoch [2/5], Step [800/3143], Loss: 3.7513
Epoch [2/5], Step [1000/3143], Loss: 4.2731
Epoch [2/5], Step [1200/3143], Loss: 3.7292
Epoch [2/5], Step [1400/3143], Loss: 4.0366
Epoch [2/5], Step [1600/3143], Loss: 3.9708
Epoch [2/5], Step [1800/3143], Loss: 3.4851
Epoch [2/5], Step [2000/3143], Loss: 3.6514
Epoch [2/5], Step [2200/3143], Loss: 4.0710
Epoch [2/5], Step [2400/3143], Loss: 3.4409
Epoch [2/5], Step [2600/3143], Loss: 4.0037
Epoch [2/5], Step [2800/3143], Loss: 3.7322
Epoch [2/5], Step [3000/3143], Loss: 4.0506
Epoch [2/5], Step [3143/3143], Loss: 3.6916
Training Loss for epoch 2: 3.2557592391967773
Running Validation...
Validation Loss for epoch 2: 4.650144100189209
Epoch [3/5], Step [200/3143], Loss: 3.7858
Epoch [3/5], Step [400/3143], Loss: 3.6788
Epoch [3/5], Step [600/3143], Loss: 3.5691
Epoch [3/5], Step [800/3143], Loss: 3.8178
Epoch [3/5], Step [1000/3143], Loss: 3.7697
Epoch [3/5], Step [1200/3143], Loss: 3.8784
Epoch [3/5], Step [1400/3143], Loss: 3.3156
Epoch [3/5], Step [1600/3143], Loss: 3.6895
Epoch [3/5], Step [1800/3143], Loss: 3.5150
Epoch [3/5], Step [2000/3143], Loss: 3.5123
Epoch [3/5], Step [2200/3143], Loss: 3.9954
Epoch [3/5], Step [2400/3143], Loss: 3.9180
Epoch [3/5], Step [2600/3143], Loss: 3.5604
Epoch [3/5], Step [2800/3143], Loss: 3.6350
Epoch [3/5], Step [3000/3143], Loss: 3.6434
Epoch [3/5], Step [3143/3143], Loss: 3.5227
Training Loss for epoch 3: 3.0523741245269775
Running Validation...
Validation Loss for epoch 3: 4.408127784729004
Epoch [4/5], Step [200/3143], Loss: 3.5532
Epoch [4/5], Step [400/3143], Loss: 3.5206
Epoch [4/5], Step [600/3143], Loss: 3.4521
Epoch [4/5], Step [800/3143], Loss: 3.2871
Epoch [4/5], Step [1000/3143], Loss: 3.3661
Epoch [4/5], Step [1200/3143], Loss: 3.4460
Epoch [4/5], Step [1400/3143], Loss: 3.4313
Epoch [4/5], Step [1600/3143], Loss: 3.7750
Epoch [4/5], Step [1800/3143], Loss: 3.6470
Epoch [4/5], Step [2000/3143], Loss: 3.3601
Epoch [4/5], Step [2200/3143], Loss: 3.6185
Epoch [4/5], Step [2400/3143], Loss: 3.3269
Epoch [4/5], Step [2600/3143], Loss: 3.3619
Epoch [4/5], Step [2800/3143], Loss: 3.3262
Epoch [4/5], Step [3000/3143], Loss: 3.4191
Epoch [4/5], Step [3143/3143], Loss: 3.6353
Training Loss for epoch 4: 2.9163501262664795
Running Validation...
Validation Loss for epoch 4: 4.456430435180664
Epoch [5/5], Step [200/3143], Loss: 3.1417
Epoch [5/5], Step [400/3143], Loss: 3.8200
Epoch [5/5], Step [600/3143], Loss: 3.3572
Epoch [5/5], Step [800/3143], Loss: 3.5046
Epoch [5/5], Step [1000/3143], Loss: 3.5043
Epoch [5/5], Step [1200/3143], Loss: 3.2457
Epoch [5/5], Step [1400/3143], Loss: 3.3165
Epoch [5/5], Step [1600/3143], Loss: 3.4888
Epoch [5/5], Step [1800/3143], Loss: 3.2251
Epoch [5/5], Step [2000/3143], Loss: 3.2397
Epoch [5/5], Step [2200/3143], Loss: 3.5427
Epoch [5/5], Step [2400/3143], Loss: 3.0406
Epoch [5/5], Step [2600/3143], Loss: 3.5680
Epoch [5/5], Step [2800/3143], Loss: 3.4652
Epoch [5/5], Step [3000/3143], Loss: 3.0333
Epoch [5/5], Step [3143/3143], Loss: 3.4057
Training Loss for epoch 5: 2.8139476776123047
Running Validation...
Validation Loss for epoch 5: 4.51342248916626