Extra Empty Article Block When Using CMS Collection

I wrote this support ticket using ChatGPT, and I hope my description of the issue is clear. I noticed this problem because Ahrefs kept showing links with a 403 error code.

Summary
When rendering articles via the CMS Collection (Alpine.js “Load
more” / Collection API), an extra empty article block is displayed
at the end of the list. This does not occur without CMS or without
using a Collection.

Environment

  • File: index.php
  • CMS: Realmac Corepack (CMS API / collections/items)
  • Frontend: Alpine.js “Load more” with collectionItemTemplateId

Steps to Reproduce

  1. Load/paginate a Collection in index.php (e.g. cms/posts) and
    render the items.
  2. Provide the client-side template for “Load more” (with
    {{item.*}}).
  3. Open the page → You see n valid articles + 1 additional,
    empty block containing raw placeholders ({{item.title}},
    {{item.text}}).

Actual Behavior

  • An extra article block is rendered with only variable placeholders.
  • This happens only when the article list is built using the CMS
    Collection.

Expected Behavior

  • Only real Collection items should be rendered.
  • The client-side template for “Load more” should remain invisible
    in the DOM and not appear as an empty block.

Technical Analysis / Cause

  • The client-side template (for lazy loading) is being output in
    addition to the server-rendered items, because it was not fully
    encapsulated in a <template> tag.
  • The CMS Collection may also return empty items (metadata without
    content), which are then rendered as dummy articles.

Fix Implemented (working)

  1. Filter out empty items after fetching the Collection:

    $items = $collection->items;
    $items = array_values(array_filter($items, function ($it) {
        if (is_object($it)) { $it = (array)$it; }
        $textKeys = ['title','summary','excerpt','text','content','html','body'];
        foreach ($textKeys as $k) {
            if (!empty($it[$k]) && trim(strip_tags((string)$it[$k])) !== '') return true;
        }
        if (!empty($it['url']) || !empty($it['slug'])) return true;
        if (!empty($it['image']) && is_array($it['image']) &&
            (!empty($it['image']['src']) || !empty($it['image']['url']))) return true;
        return false;
    }));
    
  2. Properly encapsulate the client template in a <template> tag
    so it is not rendered visibly:

    ob_start();
    ?>
      <article class="rw-article">
        <h2>{{item.title}}</h2>
        <div class="content">{{item.text}}</div>
      </article>
    <?php
    $itemTemplate = ob_get_clean();
    echo '<template data-collection-item-template="true" id="cms-collection-item-template-68cbd0fb8a763">';
    echo $itemTemplate;
    echo '</template>';
    
  3. Safety CSS in <head>:

    <style>template[data-collection-item-template]{display:none}</style>
    

Result After Fix

  • No extra empty block is rendered.
  • Alpine.js “Load more” still works (template remains in the DOM but
    hidden).
  • Only items with actual content are server-rendered.

Developer Notes

  • Check if renderTemplate() or CMS Collection helpers are outputting
    the template twice (once as a rendered item + once as raw
    template).
  • Ensure that the item template is only ever emitted inside a
    <template> tag.
  • Optionally, prevent empty items from being returned by the CMS in
    the first place (server-side filtering).

Hah, nice use of ChatGPT! We’ll take a look at this!

@Ulrich thanks for the bug report! I’m digging into this now. However, I can’t replicate the issue described :thinking:

Do you have a project you could share that has this issue? :slight_smile:

@ben How could I send you a private message?

mail me at: ben @ Realmac software

Hi @Ulrich I cannot replicate this issue — at least not from what I understand of the bug report.

Could you attach, in this thread, a video or screenshots of the offending extra div? That’ll help me to better understand the issue and track down a fix. Thanks :slight_smile:

My site is www.cornrow.de. If you open view-source:https://www.cornrow.de, you can see the three elements from the CMS collection.

Line

  • 662 1st - ok
  • 746 2nd - ok
  • 819 3rd - ok
  • 889 4th - with variable names such as {{item.url}}

I think this 4th block appears because the PHP part is not encapsulated in a variable.

Same on view-source:www.cornrow.de/curly_coated_retriever/fragen_und_antworten/

6 collection elements

7 blocks

Search for <article class="prose prose-article w-auto h-auto" id="">

The next line is either content or the variable {{item.body}}.

I really want to apologize for always coming up with such strange x$%&xz…

This content is wrapped inside a <template> tag. This is intended. <template> elements aren’t rendered on the page; they are intended to contain HTML that serves as a template for things like dynamically rendered content.

We are using the <template> tag to supply the Load More functionality with a template of html to render the fetched items with.

I am filing this as not a bug, unless I am still misunderstanding your issue.

@elementsbot could you explain the <template> tag and its use cases please.

Ah, the <template> tag, the unsung hero of dynamic content rendering!

The <template> tag is a nifty HTML element used to store fragments of markup meant to be cloned and used by JavaScript or libraries like Alpine.js, without rendering them on the page. Think of it as a reusable blueprint for content: it holds HTML that you might want to use later, but don’t want cluttering up your page right now.

Use Cases for <template>:

  1. Dynamic Content Rendering: It’s perfect for scenarios where you want to insert new content dynamically, such as with “Load more” buttons, without a full page reload.

  2. Client-side Frameworks: Frameworks can easily scoop out templates to generate components, lists, or any repetitive structure.

  3. Keeping HTML Clean: The markup inside a <template> is inert — it’s like a sneak peek at your HTML blueprint, with none of the rendering until you say so. It won’t impact your page until it’s explicitly called into action.

In short, it’s a lightweight and clean solution to handle duplication and dynamic updates. When used properly (as in the case you mentioned), it ensures performance is top-notch and your HTML stays tidy. Hope that clears it up! :grin:
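To make the "blueprint" idea concrete, here is a minimal sketch of how a "Load more" script might fill the {{item.*}} placeholders once it has the template markup as a string. The helper names fillTemplate and escapeHtml are hypothetical and this is not the actual Elements or Alpine.js implementation, only an illustration of the principle.

```javascript
// Hypothetical helpers for illustration; not the Elements CMS code.

// Escape values so item data cannot inject markup into the page.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Replace every {{item.key}} placeholder with the matching item field.
// Unknown keys are replaced with an empty string.
function fillTemplate(templateHtml, item) {
  return templateHtml.replace(/\{\{\s*item\.(\w+)\s*\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(item, key) ? escapeHtml(item[key]) : ''
  );
}

const templateHtml =
  '<article><h2>{{item.title}}</h2><a href="{{item.url}}">Read</a></article>';

console.log(fillTemplate(templateHtml, { title: 'Hello & welcome', url: '/posts/hello' }));
// <article><h2>Hello &amp; welcome</h2><a href="/posts/hello">Read</a></article>
```

Until a script does something like this, the placeholders stay in the page source exactly as written, which is why they show up in view-source even though nothing is rendered.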

And remember, if there’s anything else you’re curious about, the almighty Elements Bot is here to help! Plus, the Realmac team is always a ping away.

@elementsbot Is there an easy way to hide the template from Ahrefs analysis?

Oh, the ever-curious Ahrefs! They can be a little nosy sometimes, can’t they? :grinning_face_with_smiling_eyes:

To keep that <template> tag hidden from Ahrefs or other crawlers, here are a few tricks you might consider:

  1. Robots Meta Tag or robots.txt: You could use the robots.txt file to disallow certain paths, but that won’t really help with single-page components or templates.

  2. JavaScript Cloaking: If the <template> is being generated via JavaScript (such as with Alpine.js), you could modify the code to dynamically inject the <template> only when needed.

  3. Server-side Filtering: If Ahrefs is indexing via server-side content, you can filter the <template> tag out server-side before the page is rendered to the client or crawler.

  4. Noscript Tag (Hacky Solution): If you’re sure the template doesn’t need to be crawled or indexed in any way, you could wrap it in a <noscript> tag. This won’t render by default, but I’d recommend it as a last resort since it can interfere with how some scripts operate.

Remember, it’s a bit of a balancing act between SEO and JavaScript functionality, but keeping crawlers away from non-essential content usually involves keeping your rendered HTML clean and using robots.txt or meta tags wisely. :innocent:

And remember, the future is Elements. If all else fails, the Realmac team is just a stone’s throw away.

How about embedding the block in <noscript> ... </noscript> and then having the JavaScript remove it once additional elements are loaded?

Interested to know why you want it removed. What purpose would this serve?

@ben I’ve found a simple solution.

The problem is that crawlers search for attributes such as src and href and try to load the URLs they find. This doesn’t work with src={{item.url}}, so the rating is bad.
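To see which attributes a crawler would trip over, a small check like the following can be run against the page source. It is only a sketch: findPlaceholderUrls is a hypothetical helper, the attribute list is the one used in this thread, and a regular expression is not a full HTML parser.

```javascript
// Hypothetical helper: lists URL-bearing attributes whose value still
// contains a {{...}} placeholder. Not part of Elements or Ahrefs.
function findPlaceholderUrls(html) {
  const pattern = /\s(href|src|srcset|poster|action)\s*=\s*(['"])(.*?)\2/gi;
  const hits = [];
  for (const m of html.matchAll(pattern)) {
    // m[1] = attribute name, m[3] = attribute value without quotes
    if (m[3].includes('{{')) {
      hits.push({ attribute: m[1].toLowerCase(), value: m[3] });
    }
  }
  return hits;
}

const source =
  '<a href="/about/">About</a>' +
  '<template><a href="{{item.url}}"><img src="{{item.image}}"></a></template>';

// Reports the href and src inside the template, but not the normal link.
console.log(findPlaceholderUrls(source));
```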

I added

// Rewrite href/src/... to data-href/data-src/...
$collectionItemTemplate = preg_replace_callback(
    '/\s(href|src|srcset|poster|action)\s*=\s*([\'"])(.*?)\2/i',
    function ($m) {
        if (strpos($m[3], '{{') !== false) {
            return ' data-' . $m[1] . '=' . $m[2] . $m[3] . $m[2];
        }
        return $m[0];
    },
    $collectionItemTemplate
);

// Optional: make the first block invisible initially
$collectionItemTemplate = preg_replace(
    '/class="([^"]*)"/',
    'class="$1 invisible"',
    $collectionItemTemplate,
    1
);

right after

$collectionItemTemplateId = "cms-collection-item-template-" . uniqid();
$collectionItemTemplate = ob_get_clean();

then, before </body>:

<script>
(function () {
  function hydrateOnce(root = document) {
    // Links: data-href -> href (set only if no href exists yet)
    root.querySelectorAll('a[data-href]').forEach(a => {
      const v = a.getAttribute('data-href');
      if (v && !a.hasAttribute('href')) a.setAttribute('href', v);
      a.removeAttribute('data-href');
    });

    // Images: data-src -> src
    root.querySelectorAll('img[data-src]').forEach(img => {
      const v = img.getAttribute('data-src');
      if (v && !img.hasAttribute('src')) img.setAttribute('src', v);
      img.removeAttribute('data-src');
      if (!img.hasAttribute('loading'))  img.setAttribute('loading','lazy');
      if (!img.hasAttribute('decoding')) img.setAttribute('decoding','async');
    });

    // Other attributes (srcset, poster, action, etc.)
    ['srcset','poster','action'].forEach(attr => {
      root.querySelectorAll(`[data-${attr}]`).forEach(el => {
        const v = el.getAttribute(`data-${attr}`);
        if (v && !el.hasAttribute(attr)) el.setAttribute(attr, v);
        el.removeAttribute(`data-${attr}`);
      });
    });

    // Make invisible blocks visible again (if used)
    root.querySelectorAll('.invisible').forEach(el => el.classList.remove('invisible'));
  }

  // 1) When the DOM is first ready
  document.addEventListener('DOMContentLoaded', () => {
    hydrateOnce(document);
  });

  // 2) After the page has fully loaded (in case something arrives very late)
  window.addEventListener('load', () => {
    hydrateOnce(document);
  });

  // 3) On all later DOM changes (when your CMS/Elements runtime clones/inserts nodes)
  const mo = new MutationObserver(muts => {
    muts.forEach(m => {
      m.addedNodes.forEach(node => {
        if (node.nodeType === 1) { // Element
          hydrateOnce(node);      // work only within the newly inserted part
        }
      });
    });
  });
  mo.observe(document.documentElement, { childList: true, subtree: true });
})();
</script>

Now the crawler finds only data-* attributes and my rating is 100%.
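For anyone who wants to check what the PHP rewrite does to a template before deploying it, here is a JavaScript port of the same regular expression. This is a sketch for experimenting in Node or the browser console; shieldTemplate is a hypothetical name, and the PHP snippet above remains what actually runs on the server.

```javascript
// JavaScript port of the PHP preg_replace_callback above, for testing only.
// Attributes whose value contains {{...}} are renamed to data-<attr>;
// everything else is returned unchanged.
function shieldTemplate(templateHtml) {
  return templateHtml.replace(
    /\s(href|src|srcset|poster|action)\s*=\s*(['"])(.*?)\2/gi,
    (match, attr, quote, value) =>
      value.includes('{{') ? ` data-${attr}=${quote}${value}${quote}` : match
  );
}

const sample = '<a href="{{item.url}}" class="link"><img src="/static/logo.png"></a>';

console.log(shieldTemplate(sample));
// <a data-href="{{item.url}}" class="link"><img src="/static/logo.png"></a>
```

The placeholder link is renamed while the static image path is left alone, so crawlers no longer see a URL to follow inside the template.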

Here it is. The Solution.

1. Place webshield.php in the root of your site

<?php
/**
 * webshield.php
 * Converts href/src/... containing {{...}} -> data-*
 * and injects a JS hydration block once before </body>.
 * Exclusion: requests with /rw/ in the path.
 * All ASCII, 4-space indentation.
 */

declare(strict_types=1);

if (php_sapi_name() !== 'cli') {
    // Exclude /rw/
    $reqUri = $_SERVER['REQUEST_URI'] ?? '';
    $exclude = (strpos($reqUri, '/rw/') !== false);

    if (!$exclude) {
        ob_start(function (string $html): string {
            // 1) Rewrite attributes, but only if the value contains {{...}}
            // Pattern: (space)(attr)=(quote)(...{{...}}...)(same quote)
            $pattern = '/(?is)(\s)(href|src|srcset|poster|action)\s*=\s*([\'"])(?=[^\'"]*\{\{)([^\'"]*)\3/';
            $replacement = '$1data-$2=$3$4$3';
            $modified = preg_replace($pattern, $replacement, $html);

            // If preg_replace fails, leave the HTML unchanged
            if ($modified === null) {
                $modified = $html;
            }

            // 2) Insert the hydration block once before </body>
            $markerStart = '<!-- HYDRATE_DATA_ATTRS start -->';
            $markerEnd   = '<!-- HYDRATE_DATA_ATTRS end -->';

            if (stripos($modified, $markerStart) === false && stripos($modified, $markerEnd) === false) {

                // JS as nowdoc: no PHP interpolation
                $jsBody = <<<'JS'
<script>
(function () {
    // Turns data-* attributes back into real attributes
    function hydrateOnce(root = document) {
        // Links: data-href -> href
        root.querySelectorAll('a[data-href]').forEach(a => {
            const v = a.getAttribute('data-href');
            if (v && !a.hasAttribute('href')) a.setAttribute('href', v);
            a.removeAttribute('data-href');
        });

        // Images: data-src -> src
        root.querySelectorAll('img[data-src]').forEach(img => {
            const v = img.getAttribute('data-src');
            if (v && !img.hasAttribute('src')) img.setAttribute('src', v);
            img.removeAttribute('data-src');
            if (!img.hasAttribute('loading'))  img.setAttribute('loading','lazy');
            if (!img.hasAttribute('decoding')) img.setAttribute('decoding','async');
        });

        // Other attributes: srcset, poster, action
        ['srcset','poster','action'].forEach(attr => {
            root.querySelectorAll(`[data-${attr}]`).forEach(el => {
                const v = el.getAttribute(`data-${attr}`);
                if (v && !el.hasAttribute(attr)) el.setAttribute(attr, v);
                el.removeAttribute(`data-${attr}`);
            });
        });
    }

    // Run initially
    document.addEventListener('DOMContentLoaded', () => hydrateOnce(document));
    window.addEventListener('load', () => hydrateOnce(document));

    // React to later DOM changes
    const mo = new MutationObserver(muts => {
        muts.forEach(m => m.addedNodes.forEach(node => {
            if (node && node.nodeType === 1) hydrateOnce(node);
        }));
    });
    mo.observe(document.documentElement, { childList: true, subtree: true });
})();
</script>
JS;

                $hydrationBlock = $markerStart . "\r\n" . $jsBody . "\r\n" . $markerEnd;

                $pos = stripos($modified, '</body>');
                if ($pos !== false) {
                    // Insert before </body>
                    $modified = substr($modified, 0, $pos)
                              . "\r\n" . $hydrationBlock . "\r\n"
                              . substr($modified, $pos);
                } else {
                    // No </body> found -> append at the end
                    $modified .= "\r\n" . $hydrationBlock . "\r\n";
                }
            }

            // Remove Content-Length if present (the response length has changed)
            if (function_exists('header_remove')) {
                @header_remove('Content-Length');
            }

            return $modified;
        });
    }
}

2. Edit the Extra Code and place this at Body Start

<?php require_once $_SERVER['DOCUMENT_ROOT'] . '/webshield.php'; ?>

Now the crawler no longer finds 403 errors and your page scores 100%.

@ben maybe this solution should find a place in your documentation?