This repository has been archived by the owner on Jun 5, 2023. It is now read-only.

Commit

Revert "Add cleanup_text.sh script, and remove non-ASCII characters from RST code blocks"
Rmalavally authored Apr 8, 2020
1 parent d6188df commit 611e249
Showing 17 changed files with 119 additions and 315 deletions.
6 changes: 3 additions & 3 deletions Deep_learning/GCN-asm-tutorial.rst
@@ -71,9 +71,9 @@ The host program should also allocate memory for the in, index and out buffers.
out = AllocateBuffer(size);

// Fill Kernarg memory
Kernarg(in); // Add base pointer to "in" buffer
Kernarg(index); // Append base pointer to "index" buffer
Kernarg(out); // Append base pointer to "out" buffer
Kernarg(in); // Add base pointer to “in” buffer
Kernarg(index); // Append base pointer to index buffer
Kernarg(out); // Append base pointer to out buffer

Initial Wavefront and Register State To launch a kernel in real hardware, the run time needs information about the kernel, such as

6 changes: 3 additions & 3 deletions GCN_ISA_Manuals/GCN-ISA-Manuals.rst
@@ -90,9 +90,9 @@ The host program should also allocate memory for the in, index and out buffers.
out = AllocateBuffer(size);

// Fill Kernarg memory
Kernarg(in); // Add base pointer to "in" buffer
Kernarg(index); // Append base pointer to "index" buffer
Kernarg(out); // Append base pointer to "out" buffer
Kernarg(in); // Add base pointer to “in” buffer
Kernarg(index); // Append base pointer to index buffer
Kernarg(out); // Append base pointer to out buffer

Initial Wavefront and Register State To launch a kernel in real hardware, the run time needs information about the kernel, such as

6 changes: 3 additions & 3 deletions Installation_Guide/FAQ-on-Installation.rst
@@ -89,7 +89,7 @@ This problem can occur on Fedora installation if several previous kernels are cu
This is not an issue with the YUM repository; it is caused by the size of the /boot filesystem and the size of the kernels already installed on it. This issue can be fixed by uninstalling previous versions of the rocm Linux kernel:
::
sudo dnf remove rocm
rpm -qa | grep kfd | xargs sudo rpm -e
rpm -qa | grep kfd | xargs sudo rpm e
sudo dnf install rocm
Installing from an archived repository
@@ -104,7 +104,7 @@ Here is an Example:

cd /temp && wget http://repo.radeon.com/rocm/archive/apt_1.6.3.tar.bz2
tar -xvf apt_1.6.3.tar.bz2
sudo echo "deb [amd64] file://temp/apt_1.6.3 xenial main" > /etc/apt/sources.lists.d/rocm.local.list
sudo echo deb [amd64] file://temp/apt_1.6.3 xenial main > /etc/apt/sources.lists.d/rocm.local.list
sudo apt-get update && sudo apt-get install rocm

Users should make sure that no other list files contain another rocm repo configuration.
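An editorial aside on the apt setup shown in the hunk above (present in both the removed and restored lines): the `sudo echo "deb …" > /etc/apt/…` pattern does not actually write the file as root, because the `>` redirection is performed by the calling shell before `sudo` runs. A common workaround pipes into `sudo tee`. The sketch below illustrates the working pattern using a scratch directory instead of `/etc/apt/sources.list.d`, so it runs unprivileged; the paths and repo line are copied from the guide purely for illustration:

```shell
# Sketch: the `tee` alternative to `sudo cmd > file`.
# Uses a scratch directory instead of /etc/apt so it runs unprivileged.
dir="$(mktemp -d)"
repo_line="deb [amd64] file://temp/apt_1.6.3 xenial main"

# With sudo, `> file` would be opened by the *calling* shell, not by root.
# Piping into tee makes the file write happen inside the tee process
# (which would be the privileged one under `sudo tee`):
echo "$repo_line" | tee "$dir/rocm.local.list" > /dev/null

cat "$dir/rocm.local.list"   # prints the repo line written above
```

Under a real installation this would be `echo "$repo_line" | sudo tee /etc/apt/sources.list.d/rocm.local.list`.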
@@ -119,7 +119,7 @@ Add a /etc/yum.d/rocm.local.repo file with the following contents: ::
enabled=1
gpgcheck=0
cd /temp && wget http://repo.radeon.com/rocm/archive/yum_1.6.3.tar.bz2
tar -xvf yum_1.6.3.tar.bz2
tar xvf yum_1.6.3.tar.bz2

Then execute: ::

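One detail worth noting about the changes above: unlike `rpm e` (which drops a required option dash and fails), the restored `tar xvf` spelling is still valid — GNU tar accepts old-style bundled option letters without a leading dash. A quick self-contained check, using a scratch archive with illustrative file names:

```shell
# Check that old-style `tar xvf` (no dash) behaves like `tar -xvf`.
# Runs entirely in a temporary directory; file names are illustrative.
workdir="$(mktemp -d)"
cd "$workdir"
echo "hello" > sample.txt
tar cf sample.tar sample.txt   # old-style create, also without a dash
rm sample.txt
tar xvf sample.tar             # old-style extract; equivalent to -xvf
cat sample.txt                 # prints: hello
```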
2 changes: 1 addition & 1 deletion Installation_Guide/HCC-Compiler.rst
@@ -173,4 +173,4 @@ For applications compiled using hcc, ThinLTO could significantly improve link-ti
ThinLTO Phase 2 - Under development
**************************************

This ThinLTO implementation which will use llvm-lto LLVM tool to replace clamp-device bash script. It adds an optllc option into ThinLTOGenerator, which will perform in-program opt and codegen in parallel.
This ThinLTO implementation which will use llvm-lto LLVM tool to replace clamp-device bash script. It adds an optllc option into ThinLTOGenerator, which will perform in-program opt and codegen in parallel.
4 changes: 2 additions & 2 deletions Installation_Guide/Installation-Guide.rst
@@ -227,7 +227,7 @@ To install ROCm on your system, follow the instructions below:
enabled=1
gpgcheck=0

Note: The URL of the repository must point to the location of the repositories' repodata database.
Note: The URL of the repository must point to the location of the repositories repodata database.

3. Install ROCm components using the following command:

@@ -352,7 +352,7 @@ The following section tells you how to perform an install and uninstall ROCm on

::

sudo zypper clean -all
sudo zypper clean all
sudo zypper addrepo --no-gpgcheck http://repo.radeon.com/rocm/zyp/zypper/ rocm
sudo zypper ref
zypper install rocm-dkms
4 changes: 2 additions & 2 deletions Installation_Guide/Quick Start Installation Guide.rst
@@ -236,7 +236,7 @@ To install ROCm on your system, follow the instructions below:
enabled=1
gpgcheck=0

Note: The URL of the repository must point to the location of the repositories' repodata database.
Note: The URL of the repository must point to the location of the repositories repodata database.

3. Install ROCm components using the following command:

@@ -363,7 +363,7 @@ The following section tells you how to perform an install and uninstall ROCm on

::

sudo zypper clean -all
sudo zypper clean all
sudo zypper addrepo --no-gpgcheck http://repo.radeon.com/rocm/zyp/zypper/ rocm
sudo zypper ref
zypper install rocm-dkms
2 changes: 1 addition & 1 deletion Programming_Guides/HIP-GUIDE.rst
@@ -80,7 +80,7 @@ __global__ functions are often referred to as kernels, and calling one is termed
hipLaunchKernelGGL(MyKernel, dim3(gridDim), dim3(groupDim), 0/*dynamicShared*/, 0/*stream), a, b, c, n)


The hipLaunchKernelGGL macro always starts with the five parameters specified above, followed by the kernel arguments. The Hipify script automatically converts Cuda launch syntax to hipLaunchKernelGGL, including conversion of optional arguments in <<< >>> to the five required hipLaunchKernelGGL parameters. The :ref:`dim3` constructor accepts zero to three arguments and will by default initialize unspecified dimensions to 1. See dim3. The kernel uses the coordinate built-ins (hipThread*, hipBlock*, hipGrid*) to determine coordinate index and coordinate bounds of the work item that's currently executing.
The hipLaunchKernelGGL macro always starts with the five parameters specified above, followed by the kernel arguments. The Hipify script automatically converts Cuda launch syntax to hipLaunchKernelGGL, including conversion of optional arguments in <<< >>> to the five required hipLaunchKernelGGL parameters. The :ref:`dim3` constructor accepts zero to three arguments and will by default initialize unspecified dimensions to 1. See dim3. The kernel uses the coordinate built-ins (hipThread*, hipBlock*, hipGrid*) to determine coordinate index and coordinate bounds of the work item thats currently executing.

.. _Kernel:

2 changes: 1 addition & 1 deletion Programming_Guides/Kernel_language.rst
@@ -141,7 +141,7 @@ Calling __global__ Functions
hipLaunchKernelGGL(MyKernel, dim3(gridDim), dim3(groupDim), 0/*dynamicShared*/, 0/*stream), a, b, c, n);

The hipLaunchKernelGGL macro always starts with the five parameters specified above, followed by the kernel arguments. The Hipify script automatically converts Cuda launch syntax to hipLaunchKernelGGL, including conversion of optional arguments in <<< >>> to the five required hipLaunchKernelGGL parameters. The dim3 constructor accepts zero to three arguments and will by default initialize unspecified dimensions to 1. See `dim3 <https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_kernel_language.md#dim3>`_. The kernel uses the coordinate built-ins (hipThread*, hipBlock*, hipGrid*) to determine coordinate index and coordinate bounds of the work item that's currently executing. See `Coordinate Built-Ins <https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_kernel_language.md#coordinate-built-ins>`_.
The hipLaunchKernelGGL macro always starts with the five parameters specified above, followed by the kernel arguments. The Hipify script automatically converts Cuda launch syntax to hipLaunchKernelGGL, including conversion of optional arguments in <<< >>> to the five required hipLaunchKernelGGL parameters. The dim3 constructor accepts zero to three arguments and will by default initialize unspecified dimensions to 1. See `dim3 <https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_kernel_language.md#dim3>`_. The kernel uses the coordinate built-ins (hipThread*, hipBlock*, hipGrid*) to determine coordinate index and coordinate bounds of the work item thats currently executing. See `Coordinate Built-Ins <https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_kernel_language.md#coordinate-built-ins>`_.

.. _Kernel-Launch-Example:

6 changes: 3 additions & 3 deletions Programming_Guides/Opencl-optimization.rst
@@ -1543,7 +1543,7 @@ In the second block of code, the ``?:`` operator executes in the vector units, s
a[idx] = d[idx];
}

This is inefficient because the GPU compiler must know the base pointer that every load comes from and in this situation, the compiler cannot determine what aEUR~d' points to. So, both B and C are assigned to the same GPU resource, removing the ability to do certain optimizations.
This is inefficient because the GPU compiler must know the base pointer that every load comes from and in this situation, the compiler cannot determine what ‘d' points to. So, both B and C are assigned to the same GPU resource, removing the ability to do certain optimizations.

*If the algorithm allows changing the work-group size, it is possible to get better performance by using larger work-groups (more work-items in each work-group) because the workgroup creation overhead is reduced. On the other hand, the OpenCL CPU runtime uses a task-stealing algorithm at the work-group level, so when the kernel execution time differs because it contains conditions and/or loops of varying number of iterations, it might be better to increase the number of work-groups. This gives the runtime more flexibility in scheduling work-groups to idle CPU cores. Experimentation might be needed to reach optimal work-group size.
*Since the AMD OpenCL runtime supports only in-order queuing, using clFinish() on a queue and queuing a blocking command gives the same result. The latter saves the overhead of another API command.
@@ -2230,7 +2230,7 @@ The following are sample kernels with different coalescing patterns.
{
int gid = get_global_id(0);
if((gid & 0x1) == 0) {
gid = (gid & (Eoe63)) +62 - get_local_id(0);
gid = (gid & (˜63)) +62 - get_local_id(0);
}
output[gid] = input[gid];
return;
@@ -3242,7 +3242,7 @@ In the second block of code, the ``?:`` operator executes in an ALU clause, so n
}


This is inefficient because the GPU compiler must know the base pointer that every load comes from and in this situation, the compiler cannot determine what aEUR~d' points to. So, both B and C are assigned to the same GPU resource, removing the ability to do certain optimizations.
This is inefficient because the GPU compiler must know the base pointer that every load comes from and in this situation, the compiler cannot determine what ‘d' points to. So, both B and C are assigned to the same GPU resource, removing the ability to do certain optimizations.
* If the algorithm allows changing the work-group size, it is possible to get better performance by using larger work-groups (more work-items in each work-group) because the workgroup creation overhead is reduced. On the other hand, the OpenCL CPU runtime uses a task-stealing algorithm at the work-group level, so when the kernel execution time differs because it contains conditions and/or loops of varying number of iterations, it might be better to increase the number of work-groups. This gives the runtime more flexibility in scheduling work-groups to idle CPU cores. Experimentation might be needed to reach optimal work-group size.
* Since the AMD OpenCL runtime supports only in-order queuing, using ``clFinish`` () on a queue and queuing a blocking command gives the same result. The latter saves the overhead of another API command.
14 changes: 7 additions & 7 deletions Programming_Guides/Opencl-programming-guide.rst
@@ -1469,7 +1469,7 @@ Now, these headers can be passed as embedded headers along with the program obje
::

cl_program input_headers[2] = { foo_pg, myinc_pg };
char * input_header_names[2] = { "foo.h", "mydir/myinc.h" };
char * input_header_names[2] = { foo.h”, “mydir/myinc.h };

clCompileProgram(program_A, 0, NULL, // num_devices & device_list
NULL, // compile_options
@@ -1660,7 +1660,7 @@ A sample kernel definition is shown below.

kernel void sample_kernel( global const uchar *normalPtr, global uchar *svmPtr)
{
...
}
To create a kernel object for the above kernel, you must pass the program object corresponding to the kernel to the clCreateKernel function. Assuming that the program object containing the above kernel function has been created and built as program, a kernel object for the above kernel would be created as follows:
@@ -2139,7 +2139,7 @@ OpenCL Language types.
MyFunc ()
{
tempClass = new(Test);
... // Some OpenCL startup code - create context, queue, etc.
... // Some OpenCL startup code create context, queue, etc.
cl_mem classObj = clCreateBuffer(context, CL_MEM_USE_HOST_PTR, sizeof(Test), &tempClass, event);
clEnqueueMapBuffer(...,classObj,...);
tempClass.setX(10);
@@ -2393,9 +2393,9 @@ Generic example

In OpenCL 1.2, the developer needed to write three functions for a pointer p that can reference the local, private, or global address space::
void fooL (local int *p) { ... }
void fooP (private int *p) { ... }
void fooG (global int *p) { ... }
void fooL (local int *p) { }
void fooP (private int *p) { }
void fooG (global int *p) { }


@@ -2967,7 +2967,7 @@ There are special directives for the OpenCL compiler to enable or disable availa
#pragma OPENCL EXTENSION all: <behavior>


The <extension_name> is described in Section A.1, "Extension Name
The <extension_name> is described in Section A.1, Extension Name
Convention.”. The second form allows to address all extensions at once. The <behavior> token can be either:

* **enable** - the extension is enabled if it is supported, or the error is reported if the specified extension is not supported or token “all” is used.
4 changes: 2 additions & 2 deletions ROCm_API_References/HCC-API.rst
@@ -46,7 +46,7 @@ For example:

::

`` hcchcc-config -cxxflags -ldflagsfoo.cpp -o foo ``
`` hcchcc-config cxxflags ldflagsfoo.cpp -o foo ``

HCC built-in macros
********************
@@ -143,4 +143,4 @@ HC supports capturing memory pointer by a GPU kernel.

``` // allocate GPU memory through the HSA API int* gpu_pointer; hsa_memory_allocate(..., &gpu_pointer); ... parallel_for_each(ext, [=](index i) [[hc]] { gpu_pointer[i[0]]++; }
``` For HSA APUs that supports system wide shared virtual memory, a GPU kernel can directly access system memory allocated by the host: ``` int* cpu_memory = (int*) malloc(...); ... parallel_for_each(ext, [=](index i) [[hc]] { cpu_memory[i[0]]++; }); ```
``` For HSA APUs that supports system wide shared virtual memory, a GPU kernel can directly access system memory allocated by the host: ``` int* cpu_memory = (int*) malloc(...); ... parallel_for_each(ext, [=](index i) [[hc]] { cpu_memory[i[0]]++; }); ```
18 changes: 9 additions & 9 deletions ROCm_Compiler_SDK/ROCm-Native-ISA.rst
@@ -2265,7 +2265,7 @@ DS
ds_min_rtn_f64 v[8:9], v2, v[4:5]


For full list of supported instructions, refer to "LDS/GDS instructions" in ISA Manual.
For full list of supported instructions, refer to LDS/GDS instructions in ISA Manual.

.. _FLAT:

@@ -2280,7 +2280,7 @@ FLAT
flat_atomic_fmax_x2 v[1:2], v[3:4], v[5:6] glc


For full list of supported instructions, refer to "FLAT instructions" in ISA Manual.
For full list of supported instructions, refer to FLAT instructions in ISA Manual.


.. _MUBUF:
@@ -2295,7 +2295,7 @@ MUBUF
buffer_wbinvl1
buffer_atomic_inc v1, v2, s[8:11], s4 idxen offset:4 slc

For full list of supported instructions, refer to "MUBUF Instructions" in ISA Manual.
For full list of supported instructions, refer to MUBUF Instructions in ISA Manual.

.. _SMRD/SMEM:

@@ -2309,7 +2309,7 @@ SMRD/SMEM
s_dcache_inv_vol
s_memtime s[4:5]

For full list of supported instructions, refer to "Scalar Memory Operations" in ISA Manual.
For full list of supported instructions, refer to Scalar Memory Operations in ISA Manual.

.. _SOP1:

@@ -2325,7 +2325,7 @@ SOP1
s_swappc_b64 s[2:3], s[4:5]
s_cbranch_join s[4:5]

For full list of supported instructions, refer to "SOP1 Instructions" in ISA Manual.
For full list of supported instructions, refer to SOP1 Instructions in ISA Manual.

.. _SOP2:

@@ -2343,7 +2343,7 @@ SOP2
s_bfe_i64 s[2:3], s[4:5], s6
s_cbranch_g_fork s[4:5], s[6:7]

For full list of supported instructions, refer to "SOP2 Instructions" in ISA Manual.
For full list of supported instructions, refer to SOP2 Instructions in ISA Manual.

.. _SOPC:

@@ -2356,7 +2356,7 @@ SOPC
s_bitcmp0_b64 s[2:3], s4
s_setvskip s3, s5

For full list of supported instructions, refer to "SOPC Instructions" in ISA Manual.
For full list of supported instructions, refer to SOPC Instructions in ISA Manual.

.. _SOPP:

@@ -2376,7 +2376,7 @@ SOPP
s_sendmsg sendmsg(MSG_INTERRUPT)
s_trap 1

For full list of supported instructions, refer to "SOPP Instructions" in ISA Manual.
For full list of supported instructions, refer to SOPP Instructions in ISA Manual.

Unless otherwise mentioned, little verification is performed on the operands of SOPP Instructions, so it is up to the programmer to be familiar with the range or acceptable values.

@@ -2434,7 +2434,7 @@ VOP_SDWA examples
v_fract_f32 v0, |v0| dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_1
v_cmpx_le_u32 vcc, v1, v2 src0_sel:BYTE_2 src1_sel:WORD_0

For full list of supported instructions, refer to "Vector ALU instructions".
For full list of supported instructions, refer to Vector ALU instructions.


.. _Code Object V2 Predefined Symbols (-mattr=-code-object-v3):
