From b54ec4b659dbad2a356d2fa4f93f5541e0f14128 Mon Sep 17 00:00:00 2001
From: Roopa Malavally <56051583+Rmalavally@users.noreply.github.com>
Date: Mon, 27 Jul 2020 12:25:45 -0700
Subject: [PATCH] Update GCN-ISA-Manuals.rst

---
 GCN_ISA_Manuals/GCN-ISA-Manuals.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/GCN_ISA_Manuals/GCN-ISA-Manuals.rst b/GCN_ISA_Manuals/GCN-ISA-Manuals.rst
index ea3c2d89..08d3673d 100644
--- a/GCN_ISA_Manuals/GCN-ISA-Manuals.rst
+++ b/GCN_ISA_Manuals/GCN-ISA-Manuals.rst
@@ -26,7 +26,7 @@ Inline GCN ISA Assembly Guide
 
 The Art of AMDGCN Assembly: How to Bend the Machine to Your Will
 ******************************************************************
-The ability to write code in assembly is essential to achieving the best performance for a GPU program. In a `previous blog `_ we described how to combine several languages in a single program using ROCm and Hsaco. This article explains how to produce Hsaco from assembly code and also takes a closer look at some new features of the GCN architecture. I'd like to thank Ilya Perminov of Luxsoft for co-authoring this blog post. Programs written for GPUs should achieve the highest performance possible. Even carefully written ones, however, won’t always employ 100% of the GPU’s capabilities. Some reasons are the following:
+The ability to write code in assembly is essential to achieving the best performance for a GPU program. We have previously described how to combine several languages in a single program using ROCm and Hsaco. This article explains how to produce Hsaco from assembly code and also takes a closer look at some new features of the GCN architecture. I'd like to thank Ilya Perminov of Luxsoft for co-authoring this blog post. Programs written for GPUs should achieve the highest performance possible. Even carefully written ones, however, won’t always employ 100% of the GPU’s capabilities. Some reasons are the following:
 
 * The program may be written in a high level language that does not expose all of the features available on the hardware.
 * The compiler is unable to produce optimal ISA code, either because the compiler needs to ‘play it safe’ while adhering to the semantics of a language or because the compiler itself is generating un-optimized code.