diff --git a/GCN_ISA_Manuals/GCN-ISA-Manuals.rst b/GCN_ISA_Manuals/GCN-ISA-Manuals.rst
index ea3c2d89..08d3673d 100644
--- a/GCN_ISA_Manuals/GCN-ISA-Manuals.rst
+++ b/GCN_ISA_Manuals/GCN-ISA-Manuals.rst
@@ -26,7 +26,7 @@ Inline GCN ISA Assembly Guide
 The Art of AMDGCN Assembly: How to Bend the Machine to Your Will
 ******************************************************************
 
-The ability to write code in assembly is essential to achieving the best performance for a GPU program. In a `previous blog `_ we described how to combine several languages in a single program using ROCm and Hsaco. This article explains how to produce Hsaco from assembly code and also takes a closer look at some new features of the GCN architecture. I'd like to thank Ilya Perminov of Luxsoft for co-authoring this blog post. Programs written for GPUs should achieve the highest performance possible. Even carefully written ones, however, won’t always employ 100% of the GPU’s capabilities. Some reasons are the following:
+The ability to write code in assembly is essential to achieving the best performance for a GPU program. We have previously described how to combine several languages in a single program using ROCm and Hsaco. This article explains how to produce Hsaco from assembly code and also takes a closer look at some new features of the GCN architecture. I'd like to thank Ilya Perminov of Luxsoft for co-authoring this blog post. Programs written for GPUs should achieve the highest performance possible. Even carefully written ones, however, won’t always employ 100% of the GPU’s capabilities. Some reasons are the following:
 
 * The program may be written in a high level language that does not expose all of the features available on the hardware.
 * The compiler is unable to produce optimal ISA code, either because the compiler needs to ‘play it safe’ while adhering to the semantics of a language or because the compiler itself is generating un-optimized code.