<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.15.2: http://docutils.sourceforge.net/" />
<title>Performance</title>
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-1486404-4']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
<link rel="stylesheet" href="css/style.css" type="text/css" />
</head>
<body>
<div class="document" id="performance">
<div id="wrap">
<div id="wrap2">
<div id="header">
<h1 id="logo">Intel® Implicit SPMD Program Compiler</h1>
<div id="slogan">An open-source compiler for high-performance SIMD programming on
the CPU</div>
</div>
<div id="nav">
<div id="nbar">
<ul>
<li><a href="index.html">Overview</a></li>
<li><a href="features.html">Features</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="documentation.html">Documentation</a></li>
<li id="selected"><a href="perf.html">Performance</a></li>
<li><a href="contrib.html">Contributors</a></li>
</ul>
</div>
</div>
<div id="content-wrap">
<div id="sidebar">
<div class="widgetspace">
<h1>Resources</h1>
<ul class="menu">
<li><a href="http://github.com/ispc/ispc/">ispc page on github</a></li>
<li><a href="http://groups.google.com/group/ispc-users/">ispc
users mailing list</a></li>
<li><a href="http://groups.google.com/group/ispc-dev/">ispc
developers mailing list</a></li>
<li><a href="http://github.com/ispc/ispc/wiki/">Wiki</a></li>
<li><a href="http://github.com/ispc/ispc/issues/">Bug tracking</a></li>
</ul>
</div>
</div>
<h1 class="title">Performance</h1>
<div id="content">
<p>The SPMD programming model that <tt class="docutils literal">ispc</tt> provides makes it easy to harness the
computational power available in SIMD vector units on modern CPUs, while
its basis in C makes it easy for programmers to adopt and use
productively. This page summarizes the performance of <tt class="docutils literal">ispc</tt> with the
workloads in the <tt class="docutils literal">examples/</tt> directory of the <tt class="docutils literal">ispc</tt> distribution.</p>
<p>These results were measured on an Apple iMac with a 4-core 3.4GHz
Intel® Core-i7 processor using the Intel® AVX instruction set. The basis
for comparison is a reference C++ implementation compiled with gcc 4.2.1,
the version distributed with OS X 10.7.2. (The reference implementation is
also included in the <tt class="docutils literal">examples/</tt> directory.)</p>
<table border="1" class="docutils">
<caption>Performance of <tt class="docutils literal">ispc</tt> with a variety of the workloads
from the <tt class="docutils literal">examples/</tt> directory of the <tt class="docutils literal">ispc</tt> distribution, compared
to a reference C++ implementation compiled with gcc 4.2.1.</caption>
<colgroup>
<col width="33%" />
<col width="33%" />
<col width="33%" />
</colgroup>
<tbody valign="top">
<tr><td>Workload</td>
<td><tt class="docutils literal">ispc</tt> speedup, 1 core</td>
<td><tt class="docutils literal">ispc</tt> speedup, 4 cores</td>
</tr>
<tr><td><a class="reference external" href="https://github.com/ispc/ispc/tree/master/examples/aobench">AOBench</a> (512 x 512 resolution)</td>
<td>6.19x</td>
<td>28.06x</td>
</tr>
<tr><td><a class="reference external" href="https://github.com/ispc/ispc/tree/master/examples/options">Binomial Options</a> (128k options)</td>
<td>7.94x</td>
<td>33.43x</td>
</tr>
<tr><td><a class="reference external" href="https://github.com/ispc/ispc/tree/master/examples/options">Black-Scholes Options</a> (128k options)</td>
<td>8.45x</td>
<td>32.48x</td>
</tr>
<tr><td><a class="reference external" href="https://github.com/ispc/ispc/tree/master/examples/deferred">Deferred Shading</a> (1280p)</td>
<td>5.02x</td>
<td>23.06x</td>
</tr>
<tr><td><a class="reference external" href="https://github.com/ispc/ispc/tree/master/examples/mandelbrot_tasks">Mandelbrot Set</a></td>
<td>6.21x</td>
<td>20.28x</td>
</tr>
<tr><td><a class="reference external" href="https://github.com/ispc/ispc/tree/master/examples/noise">Perlin Noise Function</a></td>
<td>5.37x</td>
<td>n/a</td>
</tr>
<tr><td><a class="reference external" href="https://github.com/ispc/ispc/tree/master/examples/rt">Ray Tracer</a> (Sponza dataset)</td>
<td>4.31x</td>
<td>20.29x</td>
</tr>
<tr><td><a class="reference external" href="https://github.com/ispc/ispc/tree/master/examples/stencil">3D Stencil</a></td>
<td>4.05x</td>
<td>15.53x</td>
</tr>
<tr><td><a class="reference external" href="https://github.com/ispc/ispc/tree/master/examples/volume_rendering">Volume Rendering</a></td>
<td>3.60x</td>
<td>17.53x</td>
</tr>
</tbody>
</table>
<p>The following table shows speedups for a number of the examples on a
2.40GHz, 40-core Intel® Xeon E7-8870 system with the Intel® SSE4
instruction set, running Microsoft Windows Server 2008 Enterprise. Here,
the serial C/C++ baseline code was compiled with MSVC 2010.</p>
<table border="1" class="docutils">
<caption>Performance of <tt class="docutils literal">ispc</tt> with a variety of the workloads
from the <tt class="docutils literal">examples/</tt> directory of the <tt class="docutils literal">ispc</tt> distribution, on
a system with 40 CPU cores.</caption>
<colgroup>
<col width="50%" />
<col width="50%" />
</colgroup>
<tbody valign="top">
<tr><td>Workload</td>
<td><tt class="docutils literal">ispc</tt> speedup, 40 cores</td>
</tr>
<tr><td>AOBench (2048 x 2048 resolution)</td>
<td>182.36x</td>
</tr>
<tr><td>Binomial Options (2m options)</td>
<td>63.85x</td>
</tr>
<tr><td>Black-Scholes Options (2m options)</td>
<td>83.97x</td>
</tr>
<tr><td>Ray Tracer (Sponza dataset)</td>
<td>195.67x</td>
</tr>
<tr><td>Volume Rendering</td>
<td>243.18x</td>
</tr>
</tbody>
</table>
<div class="section" id="notices-disclaimers">
<h1>Notices &amp; Disclaimers</h1>
<p>Software and workloads used in performance tests may have been optimized for
performance only on Intel microprocessors.</p>
<p>Performance tests, such as SYSmark and MobileMark, are measured using specific
computer systems, components, software, operations and functions. Any change
to any of those factors may cause the results to vary. You should consult
other information and performance tests to assist you in fully evaluating your
contemplated purchases, including the performance of that product when combined
with other products. For more complete information visit
www.intel.com/benchmarks.</p>
<p>Performance results are based on testing as of dates shown in configurations and
may not reflect all publicly available updates. See backup for configuration
details. No product or component can be absolutely secure.</p>
<p>Your costs and results may vary.</p>
<p>Intel technologies may require enabled hardware, software or service activation.</p>
<p>© Intel Corporation. Intel, the Intel logo, and other Intel marks are
trademarks of Intel Corporation or its subsidiaries. Other names and brands may
be claimed as the property of others.</p>
</div>
<div class="section" id="optimization-notice">
<h1>Optimization Notice</h1>
<p>Intel's compilers may or may not optimize to the same degree for non-Intel
microprocessors for optimizations that are not unique to Intel microprocessors.
These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other
optimizations. Intel does not guarantee the availability, functionality, or
effectiveness of any optimization on microprocessors not manufactured by Intel.
Microprocessor-dependent optimizations in this product are intended for use with
Intel microprocessors. Certain optimizations not specific to Intel
microarchitecture are reserved for Intel microprocessors. Please refer to the
applicable product User and Reference Guides for more information regarding the
specific instruction sets covered by this notice.</p>
</div>
</div>
<div class="clearfix"></div>
<div id="footer"> © <strong>Intel Corporation</strong> | Valid <a href="http://validator.w3.org/check?uri=referer">XHTML</a> | <a href="http://jigsaw.w3.org/css-validator/check/referer">CSS</a> | ClearBlue by: <a href="http://www.themebin.com/">ThemeBin</a>
<!-- Please Do Not remove this link, thank u -->
</div>
</div>
</div>
</div>
</div>
</body>
</html>