From bcb8fd400b526833c1a1b214e8a6a4e06e5e9754 Mon Sep 17 00:00:00 2001 From: MatMatt Date: Sun, 22 Dec 2024 14:27:24 +0000 Subject: [PATCH] deploy: 7dc84febd16cf65bb134203c044022294eae94b6 --- CheatSheet.html | 392 ++++++++++++++++++++++++++++++++++++++---------- README.html | 2 +- guidelines.html | 183 +++++++++++----------- sitemap.xml | 16 +- 4 files changed, 417 insertions(+), 176 deletions(-) diff --git a/CheatSheet.html b/CheatSheet.html index 6a04fd3..3efd9eb 100644 --- a/CheatSheet.html +++ b/CheatSheet.html @@ -2,12 +2,12 @@ - + - + A Cheatsheet for Developing Standards for Generative AI Training and Web Crawlers @@ -25,7 +25,7 @@ } /* CSS for syntax highlighting */ pre > code.sourceCode { white-space: pre; position: relative; } -pre > code.sourceCode > span { display: inline-block; line-height: 1.25; } +pre > code.sourceCode > span { line-height: 1.25; } pre > code.sourceCode > span:empty { height: 1.2em; } .sourceCode { overflow: visible; } code.sourceCode > span { color: inherit; text-decoration: inherit; } @@ -36,7 +36,7 @@ } @media print { pre > code.sourceCode { white-space: pre-wrap; } -pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; } +pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; } } pre.numberSource code { counter-reset: source-line 0; } @@ -101,12 +101,17 @@

Contents

  • 2.3. PDF Structuring for AI Integration
  • 2.4. HTML Structuring for AI Integration
  • -
  • 3. Importance of Sitemap Indexing in HTML Documents for Easy Web Crawling and Generative AI Training
  • +
  • 3. Importance of Sitemap Indexing in HTML Documents
  • 4. Best Practices for Information Formatting
  • -
  • 5. Automation with GitHub Deployment
  • +
  • 5. Quarto Markdown Editors +
  • +
  • 6. Automation with GitHub Deployment
  • 6. Conclusion
  • -

    Other Formats

    +
    @@ -130,7 +135,7 @@

    A Cheatsheet for Developing Standards for Generative AI Traini
    Published
    -

    September 30, 2024

    +

    October 29, 2024

    @@ -138,10 +143,17 @@

    A Cheatsheet for Developing Standards for Generative AI Traini +
    +
    +
    Keywords
    +

    AI standards, web crawlers, AI training, content formatting

    +
    +
    + -

    ::: {style=“font-family: ‘Times New Roman’, serif; text-align: justify;”}

    -

    This document serve as a quick reference guide to ensure content follows structured formats essential for web crawlers and AI systems. Utilizing Quarto Markdown in HTMLs and generating sitemaps are critical for efficient crawling, helping search engines and AI models quickly index and retrieve well-structured content.

    + +

    This document serves as a quick reference guide for structuring content so that web crawlers and AI systems can process it. Writing content in Quarto Markdown, rendering it to HTML, and generating sitemaps support efficient crawling, helping search engines and AI models quickly index and retrieve well-structured content.

    1. Introduction

    @@ -189,24 +201,28 @@

    YAML Example for

    2.2. HTML Structuring for Web Crawlers

    Semantic HTML5 elements, such as <article>, <section>, and <header>, help web crawlers index and understand the content more efficiently.

    -
    <article>
    -  <header>
    -    <h1>Understanding Web Crawlers</h1>
    -    <meta name="description" content="Overview of web crawlers and their role in AI training." />
    -  </header>
    -  <section>
    -    <h2>How Web Crawlers Index Content</h2>
    -    <p>Web crawlers use links and metadata to index the web.</p>
    -  </section>
    -</article>
    +
    ---
    +<article>
    +  <header>
    +    <h1>Understanding Web Crawlers</h1>
    +    <meta name="description" content="Overview of web crawlers and their role in AI training." />
    +  </header>
    +  <section>
    +    <h2>How Web Crawlers Index Content</h2>
    +    <p>Web crawlers use links and metadata to index the web.</p>
    +  </section>
    +</article>
    +---

    2.2.1. Microdata for Structured Content

    -
    <article itemscope itemtype="https://schema.org/Article">
    -  <header>
    -    <h1 itemprop="headline">AI and Web Crawling</h1>
    -    <meta itemprop="description" content="Overview of AI training using web crawlers." />
    -  </header>
    -</article>
    +
    ---
    +<article itemscope itemtype="https://schema.org/Article">
    +  <header>
    +    <h1 itemprop="headline">AI and Web Crawling</h1>
    +    <meta itemprop="description" content="Overview of AI training using web crawlers." />
    +  </header>
    +</article>
    +---

    @@ -223,33 +239,35 @@

    2.3. PD

    2.4. HTML Structuring for AI Integration

    To optimize content for AI integration, HTML documents should include semantic elements, structured data formats like JSON-LD, and relevant metadata. This helps AI systems process and train on the content efficiently.

    -
    <article itemscope itemtype="https://schema.org/Article">
    -  <header>
    -    <h1 itemprop="headline">AI Training Data and Web Crawlers</h1>
    -    <meta name="description" content="How to structure content for AI training and web crawling." />
    -  </header>
    -  <section>
    -    <h2>AI Model Training</h2>
    -    <p>Semantic structure is essential for AI to understand content.</p>
    -    <script type="application/ld+json">
    -    {
    -      "@context": "https://schema.org",
    -      "@type": "Dataset",
    -      "name": "AI Training Data",
    -      "description": "Dataset structured for AI and web crawlers.",
    -      "creator": {
    -        "@type": "Organization",
    -        "name": "Your Organization"
    -      }
    -    }
    -    </script>
    -  </section>
    -</article>
    +
    ---
    +<article itemscope itemtype="https://schema.org/Article">
    +  <header>
    +    <h1 itemprop="headline">AI Training Data and Web Crawlers</h1>
    +    <meta name="description" content="How to structure content for AI training and web crawling." />
    +  </header>
    +  <section>
    +    <h2>AI Model Training</h2>
    +    <p>Semantic structure is essential for AI to understand content.</p>
    +    <script type="application/ld+json">
    +    {
    +      "@context": "https://schema.org",
    +      "@type": "Dataset",
    +      "name": "AI Training Data",
    +      "description": "Dataset structured for AI and web crawlers.",
    +      "creator": {
    +        "@type": "Organization",
    +        "name": "Your Organization"
    +      }
    +    }
    +    </script>
    +  </section>
    +</article>
    +---
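The JSON-LD block in the example above can also be generated rather than written by hand. The following is a minimal sketch in Python, not part of the original cheatsheet; the function name `dataset_jsonld_script` and its parameters are hypothetical and chosen only for illustration.

```python
import json


def dataset_jsonld_script(name, description, organization):
    """Return a <script type="application/ld+json"> element describing a Dataset."""
    data = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "creator": {"@type": "Organization", "name": organization},
    }
    payload = json.dumps(data, indent=2)
    # Escape "</" so that text inside the JSON cannot close the script element early.
    # "\/" is a valid JSON escape, so parsers still read the original value.
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


print(dataset_jsonld_script(
    "AI Training Data",
    "Dataset structured for AI and web crawlers.",
    "Your Organization",
))
```

Building the structure as a dictionary and serializing it with `json.dumps` avoids hand-written quoting mistakes, which would otherwise make the structured data unreadable to crawlers.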

    -
    -

    3. Importance of Sitemap Indexing in HTML Documents for Easy Web Crawling and Generative AI Training

    +
    +

    3. Importance of Sitemap Indexing in HTML Documents

    Sitemaps are essential for enhancing the discoverability and accessibility of web content for both web crawlers and AI systems. As an XML file, a sitemap provides a structured roadmap of a website, listing URLs, metadata, and details like last modified dates and update frequency. This helps crawlers efficiently index content and enables generative AI models to train on well-structured data, improving processing and retrieval accuracy. Key Benefits of Sitemap Indexing for Web Crawling and AI Training are:

    • Improved Discoverability: Sitemaps enable web crawlers to find all relevant resources on a site, especially for deep or hard-to-reach pages.

    • @@ -257,13 +275,16 @@

      3. Importance of Sitemap Indexing in HTML Documents for Easy Web Crawling an
    • Structured Data for AI Training: Well-indexed documents help generative AI models understand relationships between content, improving relevance and accuracy in AI-generated responses.

    • Faster Content Retrieval: Sitemaps help crawlers index pages sooner, which can make new or updated content available to search engines and AI models more quickly.

    -
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    -   <url>
    -      <loc>http://example.com/ai-training</loc>
    -      <lastmod>2024-09-30</lastmod>
    -      <changefreq>monthly</changefreq>
    -   </url>
    -</urlset>
    +
    ---
    +<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    +<url>
    +    <loc>https://<your-username>.github.io/<your-repo-name>/index.html</loc>
    +    <lastmod>2024-10-08T12:24:05Z</lastmod>
    +    <changefreq>monthly</changefreq>
    +    <priority>0.8</priority>
    +</url>
    +</urlset>
    +---

    Submit your sitemap to search engines via tools like Google Search Console to ensure your content is indexed properly. This improves the discoverability of AI training datasets and documents by web crawlers and AI models.
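A sitemap with the structure shown above can be produced with a few lines of code. The sketch below is an illustration in Python using only the standard library; it is not the script used by this repository's deployment workflow. The function name `build_sitemap` and the `example.org` URLs are hypothetical. It assumes plain page paths without query strings and percent-encodes spaces in file names, since URLs in a sitemap should not contain raw spaces.

```python
from urllib.parse import quote
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls, lastmod, changefreq="monthly", priority="0.8"):
    """Return the text of a sitemap.xml listing the given page URLs."""
    # Writing xmlns as a plain attribute yields <urlset xmlns="..."> on output.
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for page_url in page_urls:
        url = ET.SubElement(urlset, "url")
        # Percent-encode characters such as spaces; keep ":", "/" and existing "%" escapes.
        ET.SubElement(url, "loc").text = quote(page_url, safe=":/%")
        ET.SubElement(url, "lastmod").text = lastmod
        ET.SubElement(url, "changefreq").text = changefreq
        ET.SubElement(url, "priority").text = priority
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages, for illustration only.
print(build_sitemap(
    ["https://example.org/index.html", "https://example.org/CheatSheet.html"],
    lastmod="2024-10-08T12:24:05Z",
))
```

The `changefreq` and `priority` defaults mirror the values in the example entry above; adjust them per page if some content changes more often than the rest.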


    @@ -277,13 +298,44 @@

    4. Best Practices for Information Formatting


    +
    +

    5. Quarto Markdown Editors

    +

    Several editors integrate well with Quarto and can create and render Quarto Markdown (.qmd) files. Popular options include VS Code (Visual Studio Code), RStudio, JupyterLab with Quarto integration, and Atom with a Quarto plugin.

    +

    RStudio is lightweight and easy to use. It integrates with Quarto and provides tools for rendering, previewing, and managing .qmd documents.

    +
    +

    Steps to Set It Up

    +
      +
    1. Install RStudio: Download from RStudio.
    2. +
    3. Install Quarto: Follow Quarto installation instructions to install Quarto.
    4. +
    5. Create a New Quarto Document: +
        +
      • In RStudio, go to File > New File > Quarto Document.
      • +
      • Choose the type of document you want (e.g., HTML, PDF, Word).
      • +
      • A .qmd file will be created automatically.
      • +
    6. +
    7. Automatically Render .qmd: +
        +
      • After editing your document, you can preview it using Render or export it to various formats.
      • +
    8. +
    +
    +
    +

    Benefits

    +
      +
    • Full support for Quarto with an integrated environment.
    • +
    • Provides tools for live preview and exporting.
    • +
    • Ideal for users familiar with R or data science workflows.
    • +
    +
    +
    +
    -

    5. Automation with GitHub Deployment

    +

    6. Automation with GitHub Deployment

    Automation is crucial for ensuring efficiency and consistency in the deployment of content structured for AI integration and web crawlers. By automating the rendering of Quarto Markdown, Markdown, and Jupyter Notebook files into HTML, generating a sitemap, and deploying the output to GitHub Pages, the process becomes seamless and repeatable with minimal human intervention. This ensures that any changes to content are instantly reflected on the website, keeping the content discoverable and up-to-date for web crawlers and AI systems. Steps in the Automation Pipeline are:

    1. Trigger on Push or Pull Requests:
        -
      • The workflow is triggered whenever .qmd, .md, or .ipynb files are modified or included in a pull request, ensuring content is updated automatically.
      • +
      • The workflow is triggered whenever .qmd files are modified or included in a pull request, ensuring content is updated automatically.
    2. Checkout Repository:
        @@ -303,7 +355,7 @@

        5. Automation with GitHub Deployment

    3. Generate Sitemap:
        -
      • Automatically creates a sitemap.xml that helps search engines and web crawlers discover all available content on the website.
      • +
      • Automatically creates a sitemap.xml that follows the Google sitemap structure, helping search engines and web crawlers discover all available content on the website.
    4. Deploy to GitHub Pages:
        @@ -355,18 +407,7 @@

        6. Conclusion

        } return false; } - const clipboard = new window.ClipboardJS('.code-copy-button', { - text: function(trigger) { - const codeEl = trigger.previousElementSibling.cloneNode(true); - for (const childEl of codeEl.children) { - if (isCodeAnnotation(childEl)) { - childEl.remove(); - } - } - return codeEl.innerText; - } - }); - clipboard.on('success', function(e) { + const onCopySuccess = function(e) { // button target const button = e.trigger; // don't keep focus @@ -398,11 +439,50 @@

        6. Conclusion

        }, 1000); // clear code selection e.clearSelection(); + } + const getTextToCopy = function(trigger) { + const codeEl = trigger.previousElementSibling.cloneNode(true); + for (const childEl of codeEl.children) { + if (isCodeAnnotation(childEl)) { + childEl.remove(); + } + } + return codeEl.innerText; + } + const clipboard = new window.ClipboardJS('.code-copy-button:not([data-in-quarto-modal])', { + text: getTextToCopy }); - function tippyHover(el, contentFn) { + clipboard.on('success', onCopySuccess); + if (window.document.getElementById('quarto-embedded-source-code-modal')) { + // For code content inside modals, clipBoardJS needs to be initialized with a container option + // TODO: Check when it could be a function (https://github.com/zenorocha/clipboard.js/issues/860) + const clipboardModal = new window.ClipboardJS('.code-copy-button[data-in-quarto-modal]', { + text: getTextToCopy, + container: window.document.getElementById('quarto-embedded-source-code-modal') + }); + clipboardModal.on('success', onCopySuccess); + } + var localhostRegex = new RegExp(/^(?:http|https):\/\/localhost\:?[0-9]*\//); + var mailtoRegex = new RegExp(/^mailto:/); + var filterRegex = new RegExp('/' + window.location.host + '/'); + var isInternal = (href) => { + return filterRegex.test(href) || localhostRegex.test(href) || mailtoRegex.test(href); + } + // Inspect non-navigation links and adorn them if external + var links = window.document.querySelectorAll('a[href]:not(.nav-link):not(.navbar-brand):not(.toc-action):not(.sidebar-link):not(.sidebar-item-toggle):not(.pagination-link):not(.no-external):not([aria-hidden]):not(.dropdown-item):not(.quarto-navigation-tool):not(.about-link)'); + for (var i=0; i6. Conclusion

    interactive: true, interactiveBorder: 10, theme: 'quarto', - placement: 'bottom-start' + placement: 'bottom-start', }; + if (contentFn) { + config.content = contentFn; + } + if (onTriggerFn) { + config.onTrigger = onTriggerFn; + } + if (onUntriggerFn) { + config.onUntrigger = onUntriggerFn; + } window.tippy(el, config); } const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]'); @@ -425,7 +514,130 @@

    6. Conclusion

    try { href = new URL(href).hash; } catch {} const id = href.replace(/^#\/?/, ""); const note = window.document.getElementById(id); - return note.innerHTML; + if (note) { + return note.innerHTML; + } else { + return ""; + } + }); + } + const xrefs = window.document.querySelectorAll('a.quarto-xref'); + const processXRef = (id, note) => { + // Strip column container classes + const stripColumnClz = (el) => { + el.classList.remove("page-full", "page-columns"); + if (el.children) { + for (const child of el.children) { + stripColumnClz(child); + } + } + } + stripColumnClz(note) + if (id === null || id.startsWith('sec-')) { + // Special case sections, only their first couple elements + const container = document.createElement("div"); + if (note.children && note.children.length > 2) { + container.appendChild(note.children[0].cloneNode(true)); + for (let i = 1; i < note.children.length; i++) { + const child = note.children[i]; + if (child.tagName === "P" && child.innerText === "") { + continue; + } else { + container.appendChild(child.cloneNode(true)); + break; + } + } + if (window.Quarto?.typesetMath) { + window.Quarto.typesetMath(container); + } + return container.innerHTML + } else { + if (window.Quarto?.typesetMath) { + window.Quarto.typesetMath(note); + } + return note.innerHTML; + } + } else { + // Remove any anchor links if they are present + const anchorLink = note.querySelector('a.anchorjs-link'); + if (anchorLink) { + anchorLink.remove(); + } + if (window.Quarto?.typesetMath) { + window.Quarto.typesetMath(note); + } + // TODO in 1.5, we should make sure this works without a callout special case + if (note.classList.contains("callout")) { + return note.outerHTML; + } else { + return note.innerHTML; + } + } + } + for (var i=0; i res.text()) + .then(html => { + const parser = new DOMParser(); + const htmlDoc = parser.parseFromString(html, "text/html"); + const note = htmlDoc.getElementById(id); + if (note !== null) { + const html = processXRef(id, note); + 
instance.setContent(html); + } + }).finally(() => { + instance.enable(); + instance.show(); + }); + } + } else { + // See if we can fetch a full url (with no hash to target) + // This is a special case and we should probably do some content thinning / targeting + fetch(url) + .then(res => res.text()) + .then(html => { + const parser = new DOMParser(); + const htmlDoc = parser.parseFromString(html, "text/html"); + const note = htmlDoc.querySelector('main.content'); + if (note !== null) { + // This should only happen for chapter cross references + // (since there is no id in the URL) + // remove the first header + if (note.children.length > 0 && note.children[0].tagName === "HEADER") { + note.children[0].remove(); + } + const html = processXRef(null, note); + instance.setContent(html); + } + }).finally(() => { + instance.enable(); + instance.show(); + }); + } + }, function(instance) { }); } let selectedAnnoteEl; @@ -469,6 +681,7 @@

    6. Conclusion

    } div.style.top = top - 2 + "px"; div.style.height = height + 4 + "px"; + div.style.left = 0; let gutterDiv = window.document.getElementById("code-annotation-line-highlight-gutter"); if (gutterDiv === null) { gutterDiv = window.document.createElement("div"); @@ -494,6 +707,32 @@

    6. Conclusion

    }); selectedAnnoteEl = undefined; }; + // Handle positioning of the toggle + window.addEventListener( + "resize", + throttle(() => { + elRect = undefined; + if (selectedAnnoteEl) { + selectCodeLines(selectedAnnoteEl); + } + }, 10) + ); + function throttle(fn, ms) { + let throttle = false; + let timer; + return (...args) => { + if(!throttle) { // first call gets through + fn.apply(this, args); + throttle = true; + } else { // all the others get throttled + if(timer) clearTimeout(timer); // cancel #2 + timer = setTimeout(() => { + fn.apply(this, args); + timer = throttle = false; + }, ms); + } + }; + } // Attach click handler to the DT const annoteDls = window.document.querySelectorAll('dt[data-target-cell]'); for (const annoteDlNode of annoteDls) { @@ -557,4 +796,5 @@

    6. Conclusion

    + \ No newline at end of file diff --git a/README.html b/README.html index 8d0fede..120e283 100644 --- a/README.html +++ b/README.html @@ -47,7 +47,7 @@

    Copernicus Land Monitoring Service (CLMS)

    -

    This repository contains technical documents for the CLMS, such as ATBD’s, PUM’s, or nomenclature guidelines.

    +

    Repository for the maintenance and generation of reference guidelines and technical product documents for the CLMS.

    diff --git a/guidelines.html b/guidelines.html index 6fbe370..e67481b 100644 --- a/guidelines.html +++ b/guidelines.html @@ -80,46 +80,46 @@
    @@ -167,28 +167,29 @@

    Guidelines for Using Quarto Markdown

    -
    -

    Quarto Markdown Configuration for Multiple Output Formats

    +
    +

    1 Quarto Markdown Configuration for Multiple Output Formats

    Add the following YAML configuration to the top of the .qmd file to enable multiple output formats, such as html, pdf, and docx:

    ---
     title: "Guidelines for Using Quarto Markdown in HTML for Web Crawling"
     author: "Your Name"
     date: "2024-10-08"
    -format:
    -  html: 
    -    toc: true              # Include a Table of Contents
    -    toc-title: "Contents"
    -    toc-depth: 3
    -  pdf:
    -    toc: true
    -    toc-depth: 3
    -  docx:
    -    toc: true
    -    toc-depth: 3
    -sitemap: true               # Enable sitemap generation for web crawlers
    -keywords: ["SEO", "web crawling", "Quarto Markdown", "HTML"]
    -description: "This document provides guidelines for using Quarto Markdown in HTML for web crawling."
    ----
    +number-sections: true +format: + html: + toc: true # Include a Table of Contents + toc-title: "Index" + toc-depth: 3 + pdf: + toc: true + toc-depth: 3 + docx: + toc: true + toc-depth: 3 +sitemap: true # Enable sitemap generation for web crawlers +keywords: ["SEO", "web crawling", "Quarto Markdown", "HTML"] +description: "This document provides guidelines for using Quarto Markdown in HTML for web crawling." +---

    The toc option enables a Table of Contents for each specified output type, making navigation easier for longer documents. The toc-title option sets a custom title for the Table of Contents, which is especially useful for HTML output. The toc-depth option controls how many heading levels the Table of Contents includes, so you can decide how detailed the outline should be based on the document's heading hierarchy.

    The YAML header includes metadata that is critical for Search Engine Optimization (SEO) and web crawling.

    ---
    @@ -204,18 +205,18 @@ 

    Quarto Markdown Configuration for Multiple Output Formats

    description: "This document provides guidelines for using Quarto Markdown in HTML for web crawling." ---
    -
    -

    Quarto Markdown in HTML for Web Crawling

    +
    +

    2 Quarto Markdown in HTML for Web Crawling

    Purpose: This document provides guidelines on using Quarto Markdown to create HTML files optimized for web crawling. These steps and syntaxes will help you structure content, enhance SEO, and improve discoverability of your pages.

    -
    -

    Prerequisites

    +
    +

    2.1 Prerequisites

    1. Install RStudio: Download and install RStudio from RStudio Download.
    2. Install Quarto: Follow Quarto installation to install the Quarto CLI.
    -
    -

    Basic Setup in RStudio

    +
    +

    2.2 Basic Setup in RStudio

    1. Create a New Quarto Document:
        @@ -233,18 +234,18 @@

        Basic Setup in RStu

    -
    -

    Essential HTML Structure

    -
    -

    Title and Meta Description

    +
    +

    3 Essential HTML Structure

    +
    +

    3.1 Title and Meta Description

    Define an appropriate title and meta description in the YAML header, as these are essential for search engines.

    ---
     title: "Guide to Quarto Markdown for SEO"
     description: "Learn how to use Quarto Markdown to create SEO-optimized HTML content for web crawling."
     ---
    -
    -

    Headings and Subheadings

    +
    +

    3.2 Headings and Subheadings

    Organize content using structured headings (#, ##, ###) to create a hierarchy. This helps crawlers understand the structure and prioritize content.

    ---
     # Main Heading
    @@ -258,22 +259,22 @@ 

    Headings and Subh More content here. ---

    -
    -

    Linking Structure

    +
    +

    3.3 Linking Structure

    Use descriptive anchor text for links and ensure that internal links are present to improve navigation.

    ---
     For more details, refer to the [Introduction to SEO](#introduction-to-seo).
     ---
    -
    -

    Image Alt Text and Descriptions

    +
    +

    3.4 Image Alt Text and Descriptions

    Add meaningful alt text to images to improve accessibility and indexing by search engines.

    ---
     ![SEO Process](images/seo_process.png){alt="Diagram showing the process of SEO optimization"}
     ---
    -
    -

    Tables in Quarto Markdown

    +
    +

    3.5 Tables in Quarto Markdown

    @@ -346,8 +347,8 @@

    Tables in Quarto

    Table 1: Simple Table Example
    -
    -

    Complex Table with Row and Column Spans

    +
    +

    3.6 Complex Table with Row and Column Spans

    <table>
       <tr>
         <th rowspan="2">Column 1</th>
    @@ -360,11 +361,11 @@ 

    Co </table>

    -
    -

    HTML Sitemap Generation for Web Crawling

    +
    +

    4 HTML Sitemap Generation for Web Crawling

    Enabling the sitemap option in the YAML header creates a sitemap automatically. This sitemap file helps web crawlers discover and index all relevant pages.

    -
    -

    Sample Sitemap Configuration

    +
    +

    4.1 Sample Sitemap Configuration

    The automatically generated sitemap.xml file might contain entries like the following:

    ---
     <url>
    @@ -376,12 +377,12 @@ 

    Sample Sitema ---

    -
    -

    Customizing Sitemap

    +
    +

    4.2 Customizing Sitemap

    To further customize, use the sitemap: attribute directly in the YAML header to control which pages are included or to add specific pages manually.

    -
    -

    Additional Metadata for Social Media and Crawlers

    +
    +

    4.3 Additional Metadata for Social Media and Crawlers

    Add Open Graph (og:) and Twitter metadata tags for better social media sharing and visibility.

    ---
     meta:
    @@ -398,10 +399,10 @@ 

    ---

    -
    -

    Quarto Syntax for Key SEO Components

    -
    -

    Structured Data with JSON-LD

    +
    +

    5 Quarto Syntax for Key SEO Components

    +
    +

    5.1 Structured Data with JSON-LD

    Use structured data like JSON-LD to help search engines understand the context of your content.

    ---
     {
    @@ -417,8 +418,8 @@ 

    Structured Da } ---

    -
    -

    Linking External Stylesheets and JavaScript

    +
    +

    5.2 Linking External Stylesheets and JavaScript

    For advanced functionality, link to external CSS and JS files. This enhances the user experience without compromising SEO.

    ---
     <link rel="stylesheet" href="https://your-stylesheet-url.css">
    @@ -426,11 +427,11 @@ 

    ---

    -
    -

    Rendering the Quarto Document in HTML

    +
    +

    6 Rendering the Quarto Document in HTML

    Once you have created the .qmd file, render it to HTML:

    -
    -

    Render in RStudio

    +
    +

    6.1 Render in RStudio

    Go to the RStudio Terminal or Console and run:

    quarto render yourfile.qmd

    HTML output: quarto render yourfile.qmd --to html
    PDF output: quarto render yourfile.qmd --to pdf
    DOCX output: quarto render yourfile.qmd --to docx
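When several output formats are needed, the render commands listed above can be run in a loop. This is a minimal sketch in Python, assuming the Quarto CLI is installed and available on the PATH; the helper names `render_commands` and `render_all` are hypothetical.

```python
import subprocess


def render_commands(qmd_path, formats=("html", "pdf", "docx")):
    """Build one `quarto render` command (as an argument list) per output format."""
    return [["quarto", "render", qmd_path, "--to", fmt] for fmt in formats]


def render_all(qmd_path, formats=("html", "pdf", "docx")):
    """Run each render command in turn, stopping at the first failure."""
    for command in render_commands(qmd_path, formats):
        # check=True raises CalledProcessError if quarto exits with an error.
        subprocess.run(command, check=True)


# Show the commands without running them.
for command in render_commands("yourfile.qmd"):
    print(" ".join(command))
```

Calling `render_all("yourfile.qmd")` would then produce all three outputs; PDF rendering additionally needs a LaTeX installation.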

    @@ -439,16 +440,16 @@

    Render in RStudio

    tinytex::install_tinytex()

    We can customize DOCX output with a reference DOCX file by adding reference-doc in the docx configuration.

    -
    -

    Preview in Browser

    +
    +

    6.2 Preview in Browser

    Open the generated HTML file in a browser to ensure the content is well-structured for web crawling and SEO.

    -
    -

    Best Practices Checklist for Accessible and SEO-Optimized Documents

    +
    +

    7 Best Practices Checklist for Accessible and SEO-Optimized Documents

    This checklist ensures that your Quarto Markdown documents and tables are optimized for accessibility, SEO, and readability across multiple formats (HTML, PDF, DOCX).

    -
    -

    General Document Best Practices

    +
    +

    7.1 General Document Best Practices

    • @@ -459,8 +460,8 @@

      General Do
    -
    -

    Accessible Tables Best Practices

    +
    +

    7.2 Accessible Tables Best Practices

    • @@ -471,8 +472,8 @@

      Accessibl

      Following these guidelines will ensure Quarto Markdown documents and tables are accessible, SEO-optimized, and suitable for multiple output formats.

    -
    -

    Conclusion

    +
    +

    8 Conclusion

    This Quarto Markdown document can be saved with a .qmd extension, edited in RStudio, and rendered to HTML to ensure it follows best practices for web crawling.

    diff --git a/sitemap.xml b/sitemap.xml index 576e0cc..385166e 100644 --- a/sitemap.xml +++ b/sitemap.xml @@ -2,49 +2,49 @@ https://.github.io//ANNEX IV IT Principles.html - 2024-12-22T14:03:22Z + 2024-12-22T14:27:24Z monthly 0.8 https://.github.io//CLMS_doc_example.html - 2024-12-22T14:03:22Z + 2024-12-22T14:27:24Z monthly 0.8 https://.github.io//CLMS_filenamingconvention.html - 2024-12-22T14:03:22Z + 2024-12-22T14:27:24Z monthly 0.8 https://.github.io//CheatSheet.html - 2024-12-22T14:03:22Z + 2024-12-22T14:27:24Z monthly 0.8 https://.github.io//README.html - 2024-12-22T14:03:22Z + 2024-12-22T14:27:24Z monthly 0.8 https://.github.io//clms.html - 2024-12-22T14:03:22Z + 2024-12-22T14:27:24Z monthly 0.8 https://.github.io//guidelines.html - 2024-12-22T14:03:22Z + 2024-12-22T14:27:24Z monthly 0.8 https://.github.io//test.html - 2024-12-22T14:03:22Z + 2024-12-22T14:27:24Z monthly 0.8