<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Smile - Deep Learning</title>
<meta name="description" content="Statistical Machine Intelligence and Learning Engine">
<!-- prettify js and CSS -->
<script src="https://cdn.rawgit.com/google/code-prettify/master/loader/run_prettify.js?lang=scala&lang=kotlin&lang=clj"></script>
<style>
.prettyprint ol.linenums > li { list-style-type: decimal; }
</style>
<!-- Bootstrap core CSS -->
<link href="css/cerulean.min.css" rel="stylesheet">
<link href="css/custom.css" rel="stylesheet">
<script src="https://code.jquery.com/jquery.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/js/bootstrap.min.js"></script>
<!-- slider -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/owl-carousel/1.3.3/owl.carousel.min.js"></script>
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/owl-carousel/1.3.3/owl.carousel.css" type="text/css" />
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/owl-carousel/1.3.3/owl.transitions.css" type="text/css" />
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/owl-carousel/1.3.3/owl.theme.min.css" type="text/css" />
<!-- table of contents auto generator -->
<script src="js/toc.js" type="text/javascript"></script>
<!-- styles for pager and table of contents -->
<link rel="stylesheet" href="css/pager.css" type="text/css" />
<link rel="stylesheet" href="css/toc.css" type="text/css" />
<!-- Vega-Lite Embed -->
<script src="https://cdn.jsdelivr.net/npm/vega@5"></script>
<script src="https://cdn.jsdelivr.net/npm/vega-lite@5"></script>
<script src="https://cdn.jsdelivr.net/npm/vega-embed@6"></script>
<!-- Google tag (gtag.js) -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-57GD08QCML"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-57GD08QCML');
</script>
<!-- Sidebar and testimonial-slider -->
<script type="text/javascript">
$(document).ready(function(){
// scroll/follow sidebar
// #sidebar is defined in the content snippet
// This script has to be executed after the snippet loaded.
// $.getScript("js/follow-sidebar.js");
$("#testimonial-slider").owlCarousel({
items: 1,
singleItem: true,
pagination: true,
navigation: false,
loop: true,
autoPlay: 10000,
stopOnHover: true,
transitionStyle: "backSlide",
touchDrag: true
});
});
</script>
</head>
<body>
<div class="container" style="max-width: 1200px;">
<header>
<div class="masthead">
<p class="lead">
<a href="index.html">
<img src="images/smile.jpg" style="height:100px; width:auto; vertical-align: bottom; margin-top: 20px; margin-right: 20px;">
<span class="tagline">Smile — Statistical Machine Intelligence and Learning Engine</span>
</a>
</p>
</div>
<nav class="navbar navbar-default" role="navigation">
<!-- Brand and toggle get grouped for better mobile display -->
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#navbar-collapse">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
</div>
<!-- Collect the nav links, forms, and other content for toggling -->
<div class="collapse navbar-collapse" id="navbar-collapse">
<ul class="nav navbar-nav">
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Overview <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="quickstart.html">Quick Start</a></li>
<li><a href="overview.html">What's Machine Learning</a></li>
<li><a href="data.html">Data Processing</a></li>
<li><a href="visualization.html">Data Visualization</a></li>
<li><a href="vegalite.html">Declarative Visualization</a></li>
<li><a href="gallery.html">Gallery</a></li>
<li><a href="faq.html">FAQ</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Supervised Learning <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="classification.html">Classification</a></li>
<li><a href="regression.html">Regression</a></li>
<li><a href="deep-learning.html">Deep Learning</a></li>
<li><a href="feature.html">Feature Engineering</a></li>
<li><a href="validation.html">Model Validation</a></li>
<li><a href="missing-value-imputation.html">Missing Value Imputation</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Unsupervised Learning <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="clustering.html">Clustering</a></li>
<li><a href="vector-quantization.html">Vector Quantization</a></li>
<li><a href="association-rule.html">Association Rule Mining</a></li>
<li><a href="mds.html">Multi-Dimensional Scaling</a></li>
<li><a href="manifold.html">Manifold Learning</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">LLM & NLP <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="llm.html">Large Language Model (LLM)</a></li>
<li><a href="nlp.html">Natural Language Processing (NLP)</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Math <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="linear-algebra.html">Linear Algebra</a></li>
<li><a href="statistics.html">Statistics</a></li>
<li><a href="wavelet.html">Wavelet</a></li>
<li><a href="interpolation.html">Interpolation</a></li>
<li><a href="graph.html">Graph Data Structure</a></li>
</ul>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown">API <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="api/java/index.html" target="_blank">Java</a></li>
<li><a href="api/scala/index.html" target="_blank">Scala</a></li>
<li><a href="api/kotlin/index.html" target="_blank">Kotlin</a></li>
<li><a href="api/clojure/index.html" target="_blank">Clojure</a></li>
<li><a href="api/json/index.html" target="_blank">JSON</a></li>
</ul>
</li>
<li><a href="https://mybinder.org/v2/gh/haifengl/smile/notebook?urlpath=lab%2Ftree%2Fshell%2Fsrc%2Funiversal%2Fnotebooks%2Findex.ipynb" target="_blank">Try It Online</a></li>
</ul>
</div>
<!-- /.navbar-collapse -->
</nav>
</header>
<div id="content" class="row">
<div class="col-md-3 col-md-push-9 hidden-xs hidden-sm">
<div id="sidebar">
<div class="sidebar-toc" style="margin-bottom: 20px;">
<p class="toc-header">Contents</p>
<div id="toc"></div>
</div>
<div id="search">
<script>
(function() {
var cx = '010264411143030149390:ajvee_ckdzs';
var gcse = document.createElement('script');
gcse.type = 'text/javascript';
gcse.async = true;
gcse.src = (document.location.protocol == 'https:' ? 'https:' : 'http:') +
'//cse.google.com/cse.js?cx=' + cx;
var s = document.getElementsByTagName('script')[0];
s.parentNode.insertBefore(gcse, s);
})();
</script>
<gcse:searchbox-only></gcse:searchbox-only>
</div>
</div>
</div>
<div class="col-md-9 col-md-pull-3">
<h1 id="feature-top" class="title">Deep Learning</h1>
<p>Deep learning is based on artificial neural networks (ANNs)
with representation learning. The adjective "deep" refers to the use of
multiple layers in the network. Fundamentally, deep learning algorithms,
such as convolutional neural networks and transformers, use a hierarchy
of layers, each of which transforms its input into a slightly more abstract
and composite representation.</p>
<p>Importantly, a deep learning process can learn on its own which features
to place at which level. Prior to deep learning, machine learning
techniques often involved hand-crafted feature engineering to transform
the data into a more suitable representation for a classification algorithm
to operate upon. In the deep learning approach, features are not hand-crafted
and the model discovers useful feature representations from the data automatically.
This does not eliminate the need for hand-tuning; for example, varying the number
of layers and the layer sizes provides different degrees of abstraction.</p>
<p>While the smile-core module provides an MLP (multi-layer perceptron) for classification
and regression tasks on tabular data, the smile-deep module provides advanced
algorithms for computer vision and large language models (LLMs). Furthermore,
smile-deep supports GPU devices.</p>
<h2 id="mnist">A Gentle Example</h2>
<p>The code snippet below shows how to train a model on the MNIST dataset.
On line 5, we call <code>Device.preferredDevice()</code>, which
returns a GPU device if one is available and the default CPU device otherwise.
You can also create a <code>Device</code> object by calling its factory methods such as
<code>Device.GPU(0)</code>, <code>Device.MPS()</code>, or <code>Device.CPU()</code>.
On line 6, we set the returned device as the default compute device. Lines 5 and 6 are optional;
without them, the CPU is the default compute device.</p>
<p>On line 8, we define a deep learning model as a sequential block of layers.
For complicated models, it helps to print out the structure for verification,
as we do on line 14. On line 15, we move the model to the preferred compute
device.</p>
<ul class="nav nav-tabs">
<li class="active"><a href="#java_1" data-toggle="tab">Java</a></li>
</ul>
<div class="tab-content">
<div class="tab-pane active" id="java_1">
<div class="code" style="text-align: left;">
<pre class="prettyprint linenums lang-java">
<code>import smile.deep.layer.*;
import smile.deep.metric.*;
import smile.deep.tensor.*;

Device device = Device.preferredDevice();
device.setDefaultDevice();

Model net = new Model(new SequentialBlock(
        Layer.relu(784, 64, 0.5),
        Layer.relu(64, 32),
        Layer.logSoftmax(32, 10))
);

System.out.println(net);
net.to(device);

CSVFormat format = CSVFormat.Builder.create().setDelimiter(' ').build();
double[][] x = Read.csv("data/mnist/mnist2500_X.txt", format).toArray();
int[] y = Read.csv("data/mnist/mnist2500_labels.txt", format).column(0).toIntArray();
Dataset dataset = Dataset.of(x, y, 64);

Optimizer optimizer = Optimizer.SGD(net, 0.01);
Loss loss = Loss.nll();
net.train(100, optimizer, loss, dataset);

try (var guard = Tensor.noGradGuard()) {
    Map&lt;String, Double&gt; metrics = net.eval(dataset,
            new Accuracy(),
            new Precision(Averaging.Micro),
            new Precision(Averaging.Macro),
            new Precision(Averaging.Weighted),
            new Recall(Averaging.Micro),
            new Recall(Averaging.Macro),
            new Recall(Averaging.Weighted));
    for (var entry : metrics.entrySet()) {
        System.out.format("Training %s = %.2f%%\n", entry.getKey(), 100 * entry.getValue());
    }
}</code></pre>
</div>
</div>
</div>
<p>On lines 17 to 19, we load a sample of the MNIST data, just as we would with
smile-core. The data are read in as a plain <code>double[][]</code>. Then on line 20, we
create a <code>Dataset</code> object that wraps the data and the target labels.
The <code>Dataset</code> object implements the <code>Iterable</code> interface,
so looping through it yields mini-batches of 64 samples, the size given by the
third parameter.</p>
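<p>To make the mini-batch idea concrete, here is a minimal plain-Java sketch of an
<code>Iterable</code> that slices arrays into batches. The class <code>MiniBatches</code>
and its nested <code>Batch</code> type are hypothetical names for illustration and are not
part of Smile. The sketch also assumes that a final, shorter batch is kept rather than dropped.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a batching dataset; this is not Smile's Dataset class. */
final class MiniBatches implements Iterable<MiniBatches.Batch> {
    /** One mini-batch: a slice of the samples and the matching labels. */
    static final class Batch {
        final double[][] x;
        final int[] y;
        Batch(double[][] x, int[] y) { this.x = x; this.y = y; }
    }

    private final double[][] x;
    private final int[] y;
    private final int batchSize;

    MiniBatches(double[][] x, int[] y, int batchSize) {
        if (x.length != y.length) throw new IllegalArgumentException("x and y differ in length");
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
        this.x = x;
        this.y = y;
        this.batchSize = batchSize;
    }

    @Override
    public Iterator<Batch> iterator() {
        return new Iterator<Batch>() {
            private int start = 0;

            @Override
            public boolean hasNext() { return start < x.length; }

            @Override
            public Batch next() {
                if (!hasNext()) throw new NoSuchElementException();
                // Assumption: the last batch may be shorter than batchSize.
                int end = Math.min(start + batchSize, x.length);
                Batch batch = new Batch(Arrays.copyOfRange(x, start, end),
                                        Arrays.copyOfRange(y, start, end));
                start = end;
                return batch;
            }
        };
    }

    /** Collects the size of every batch produced for n dummy samples. */
    static List<Integer> batchSizes(int n, int batchSize) {
        MiniBatches data = new MiniBatches(new double[n][4], new int[n], batchSize);
        List<Integer> sizes = new ArrayList<>();
        for (Batch b : data) sizes.add(b.x.length);
        return sizes;
    }

    public static void main(String[] args) {
        List<Integer> sizes = batchSizes(2500, 64);
        System.out.println(sizes.size() + " batches, last one has "
                + sizes.get(sizes.size() - 1) + " samples");
    }
}
```

<p>With 2500 samples and a batch size of 64, this sketch yields 39 full batches and one
batch of 4 samples.</p>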
<p>On lines 22 to 24, we create an SGD (stochastic gradient descent) optimizer and
the negative log-likelihood (NLL) loss function, and train the model for 100 epochs.
The whole process should finish quickly (e.g. about 15 seconds on a CPU). Finally,
we evaluate the model with a variety of metrics on lines 26 to 38. Note that
we evaluate on the training data for demonstration purposes only. In practice,
it is better to evaluate on a hold-out test dataset. On line 26, we create a
no-grad guard in a try-with-resources statement to disable gradient computation. Inference
code should go inside this block, which reduces memory usage and avoids
a lot of unnecessary computation. The guard
object is released automatically when the block finishes.</p>
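<p>The model above ends in a log-softmax layer, which is what the NLL loss expects as input.
The following plain-Java sketch shows the arithmetic behind that pairing. It is not Smile's
implementation: the class <code>NllSketch</code> is a hypothetical name, and averaging the
loss over the batch is an assumption made here for illustration.</p>

```java
/** Plain-Java sketch of log-softmax followed by NLL loss; not Smile's implementation. */
final class NllSketch {
    /** Numerically stable log-softmax: z_i - (max + log(sum_j exp(z_j - max))). */
    static double[] logSoftmax(double[] logits) {
        double max = Double.NEGATIVE_INFINITY;
        for (double z : logits) max = Math.max(max, z);
        double sum = 0.0;
        for (double z : logits) sum += Math.exp(z - max);
        double logSum = max + Math.log(sum);
        double[] out = new double[logits.length];
        for (int i = 0; i < logits.length; i++) out[i] = logits[i] - logSum;
        return out;
    }

    /** Negative log-likelihood, averaged over the batch (the averaging is an assumption). */
    static double nll(double[][] logProbs, int[] labels) {
        double total = 0.0;
        for (int i = 0; i < labels.length; i++) {
            // Pick the log-probability assigned to the true class of sample i.
            total -= logProbs[i][labels[i]];
        }
        return total / labels.length;
    }

    public static void main(String[] args) {
        double[][] logits = { {2.0, 0.5, -1.0}, {0.0, 0.0, 0.0} };
        int[] labels = {0, 2};
        double[][] logProbs = new double[logits.length][];
        for (int i = 0; i < logits.length; i++) logProbs[i] = logSoftmax(logits[i]);
        System.out.printf("NLL = %.4f%n", nll(logProbs, labels));
    }
}
```

<p>A confident correct prediction contributes a loss near zero, while a uniform prediction
over <em>k</em> classes contributes log <em>k</em>.</p>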
<h2 id="efficient-net">EfficientNet</h2>
<p>In the previous section, we trained a model from scratch. In this section, we demonstrate
image classification with pretrained EfficientNetV2 models.
EfficientNetV2 is a family of convolutional networks with faster training
speed and better parameter efficiency than previous models.</p>
<p>On line 1, we create an instance of the EfficientNet V2_S (small) model, which loads
the pretrained weights from <code>model/EfficientNet/efficientnet_v2_s.pt</code>
in the working directory. You may download the weights from
<a href="https://smile-ai.org/model/EfficientNet/efficientnet_v2_s.pt">smile-ai.org</a>.</p>
<ul class="nav nav-tabs">
<li class="active"><a href="#java_2" data-toggle="tab">Java</a></li>
</ul>
<div class="tab-content">
<div class="tab-pane active" id="java_2">
<div class="code" style="text-align: left;">
<pre class="prettyprint linenums lang-java">
<code>var model = EfficientNet.V2S();
model.to(device);
model.eval();
var lenna = ImageIO.read(new File("data/image/Lenna.png"));
var panda = ImageIO.read(new File("data/image/panda.jpg"));
try (var guard = Tensor.noGradGuard()) {
    long startTime = System.nanoTime();
    var output = model.forward(panda);
    long endTime = System.nanoTime();
    long duration = (endTime - startTime) / 1000000; // divide by 1000000 to get milliseconds
    System.out.println("1st run elapsed time: " + duration + "ms");
    startTime = System.nanoTime();
    output = model.forward(lenna, panda);
    endTime = System.nanoTime();
    duration = (endTime - startTime) / 1000000;
    System.out.println("2nd run elapsed time: " + duration + "ms");
    var topk = output.topk(5);
    topk._2().to(Device.CPU());
    String[] images = {"Lenna", "Panda"};
    for (int i = 0; i &lt; 2; i++) {
        System.out.println("======== " + images[i] + " ========");
        for (int j = 0; j &lt; 5; j++) {
            System.out.println(ImageNet.labels[topk._2().getInt(i, j)]);
        }
    }
}</code></pre>
</div>
</div>
</div>
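<p>The last part of the example above takes the five highest-scoring classes for each image
and looks up their labels. As a rough illustration of what a top-k selection does, here is a
plain-Java sketch over a single array of scores. The class <code>TopK</code> is a hypothetical
helper and does not reflect how Smile's tensors implement <code>topk</code>.</p>

```java
import java.util.Arrays;

/** Plain-Java sketch of top-k index selection; not Smile's tensor implementation. */
final class TopK {
    /** Returns the indices of the k largest scores, highest score first. */
    static int[] topk(double[] scores, int k) {
        Integer[] idx = new Integer[scores.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        // Sort indices by descending score; the sort is stable, so ties keep index order.
        Arrays.sort(idx, (a, b) -> Double.compare(scores[b], scores[a]));
        int[] top = new int[Math.min(k, idx.length)];
        for (int j = 0; j < top.length; j++) top[j] = idx[j];
        return top;
    }

    public static void main(String[] args) {
        // Hypothetical class scores and labels, for illustration only.
        double[] scores = {0.1, 2.5, -0.3, 1.7, 0.9, 3.2};
        String[] labels = {"cat", "dog", "car", "panda", "tree", "hat"};
        for (int i : topk(scores, 3)) {
            System.out.println(labels[i] + " (" + scores[i] + ")");
        }
    }
}
```

<p>Sorting every index costs O(n log n), which is fine for a thousand classes; a bounded
heap would be the usual choice when n is much larger than k.</p>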
<p>Note that we run the inference twice for benchmarking.
The first inference is typically slow for several reasons.
The very first CUDA call (for example, creating a tensor)
creates the CUDA context, which loads the driver among other things.
The first inference also has to allocate new memory, which is then
reused through the CUDACachingAllocator. The initial
cudaMalloc calls are expensive compared with reusing memory that is
already allocated, so iterations stay slow
until the workload reaches its peak memory usage
and can reuse GPU memory. New cudaMalloc calls
can still happen later, for example if the input
size increases.</p>
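<p>Because of this warm-up cost, a benchmark usually reports the first run separately from
the later ones. The sketch below shows one way to do that in plain Java. The class
<code>WarmupTimer</code> and the CPU-only workload are hypothetical placeholders; in practice
the timed task would be the model's forward call.</p>

```java
import java.util.function.Supplier;

/** Hypothetical timing helper that separates warm-up runs from steady-state runs. */
final class WarmupTimer {
    /** Runs the task the given number of times and returns each elapsed time in milliseconds. */
    static long[] time(Supplier<?> task, int runs) {
        long[] millis = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.get();
            millis[i] = (System.nanoTime() - start) / 1_000_000;
        }
        return millis;
    }

    /** Mean elapsed time after skipping the first `warmup` measurements. */
    static double steadyStateMean(long[] millis, int warmup) {
        if (warmup < 0 || warmup >= millis.length) {
            throw new IllegalArgumentException("warmup must be in [0, number of runs)");
        }
        double sum = 0.0;
        for (int i = warmup; i < millis.length; i++) sum += millis[i];
        return sum / (millis.length - warmup);
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for an inference call.
        Supplier<Double> workload = () -> {
            double acc = 0.0;
            for (int i = 1; i <= 1_000_000; i++) acc += Math.sqrt(i);
            return acc;
        };
        long[] millis = time(workload, 5);
        System.out.println("1st run: " + millis[0] + "ms");
        System.out.println("mean of later runs: " + steadyStateMean(millis, 1) + "ms");
    }
}
```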
<div id="btnv">
<span class="btn-arrow-left">← </span>
<a class="btn-prev-text" href="regression.html" title="Previous Section: Regression"><span>Regression</span></a>
<a class="btn-next-text" href="validation.html" title="Next Section: Model Validation"><span>Model Validation</span></a>
<span class="btn-arrow-right"> →</span>
</div>
</div>
<script type="text/javascript">
$('#toc').toc({exclude: 'h1, h5, h6', context: '', autoId: true, numerate: false});
</script>
</div>
</div>
<a href=https://github.com/haifengl/smile><img style="position: fixed; top: 0; right: 0; border: 0" src=/images/forkme_right_orange.png alt="Fork me on GitHub"></a>
<!-- Place this tag right after the last button or just before your close body tag. -->
<script async defer id="github-bjs" src="https://buttons.github.io/buttons.js"></script>
</body>
</html>