The Excerpt Reloaded: The Root Cause and Fix For Creating Validated XHTML
Download:
the-excerpt-reloaded.zip
Update 7/17/2007: New version R1.4: Fixed the plugin URI and author URI so that the links on the plugin administration page will work.
Update 7/10/2007: I incorporated a fix that Hillary Melville described in a comment (see comment section below) into the downloadable plugin, which is now at version R1.3.
Update 6/6/2007: I found an additional cause of unclosed tags and found a fix. See this article for details. This second fix is incorporated in the plugin you can download from this page.
A WordPress plugin called the-excerpt-reloaded generates excerpts from the first N words of a WordPress post. The problem with this plugin is that it can create XHTML that doesn’t validate (some tags are never closed). I determined the root cause of this problem and found a fix. You may download the updated plugin here.
The Problem
The invalid XHTML is caused when a <p> tag is inserted at the start of the excerpt, but there is no closing </p> tag. I found that many others were having this same issue (see comments here). The plugin has a parameter $fix_tags that is supposed to fix issues like this, but even with this parameter set to “true”, the problem still occurs.
Why It Happens
The problem is introduced when the function get_the_excerpt_reloaded() calls the apply_filters() function. apply_filters() performs several “filter” operations on the excerpt text in order to generate the final XHTML from a block of text. I believe that this is mainly for processing WordPress posts that are created with the WYSIWYG editor, to create things like smilies or paragraph tags. One of the filter functions that it calls for the “the_excerpt” filter is named wpautop(). This is the filter that adds the paragraph tags, and it is where the problem originates.
After the part of wpautop() where <p> and </p> pairs are inserted around text paragraphs, it performs a few more search-and-replace passes on the text before it returns the final output. Here are two of them:
- If any tag in a set of tags (call this set $allblocks) is directly after a <p> tag, with only whitespace character(s) (and no non-whitespace characters) between <p> and that opening tag, then the <p> is removed.
- If any tag in the $allblocks set of tags is directly before a </p> tag, with only whitespace character(s) (and no non-whitespace characters) between that tag and </p>, then the </p> is removed.
One of the tags in the $allblocks set is <div> (or </div>). By default, the_excerpt_reloaded puts a <div class="more-link"> and </div> at the end of the excerpt containing the $more_link_text (the text that you designate that says something like “Continue Reading…”). Presumably, this is done so that you can define special formatting for the “more-link” class in your CSS.
The </div> will appear at the end of the excerpt. After the intermediate step where wpautop() places the </p> at the end of your post, it performs check #2, finds a </div> tag right before a </p> tag, and it removes the </p>. wpautop doesn’t remove the opening <p> because it is not directly next to another tag, and is most likely next to the first word of your post. This is why an opening <p> remains, while the </p> is removed!
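To make the two checks concrete, here is a minimal sketch that reproduces them outside of WordPress. It is not the wpautop() source: the function name strip_p_next_to_blocks() is made up for this example, and the $allblocks list is shortened to a handful of block-level tags as an assumption.

```php
<?php
// Simplified stand-in for the tail end of wpautop(); NOT the WordPress source.
// The function name is hypothetical and $allblocks is shortened for illustration.
function strip_p_next_to_blocks(string $html): string
{
    $allblocks = '(?:div|ul|ol|li|pre|blockquote|table|h[1-6])';
    // Check #1: remove a <p> that is followed (ignoring whitespace) by a block tag.
    $html = preg_replace('!<p>\s*(</?' . $allblocks . '[^>]*>)!', '$1', $html);
    // Check #2: remove a </p> that is preceded (ignoring whitespace) by a block tag.
    $html = preg_replace('!(</?' . $allblocks . '[^>]*>)\s*</p>!', '$1', $html);
    return $html;
}

// An excerpt as it might look after the <p>...</p> wrapping step,
// with the more-link <div> already appended at the end.
$wrapped = '<p>First words of the post <div class="more-link">Continue Reading</div></p>';

$result = strip_p_next_to_blocks($wrapped);
echo $result, "\n";
// The opening <p> sits next to plain text and survives; the closing </p> is removed.
```

In this sketch the opening <p> is left alone because the first thing after it is a word, while the </p> directly after </div> is stripped, which matches the unbalanced output described above.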
The Fix
To fix this problem, we move the call to apply_filters() to be before adding the final <div></div> pair that encloses your $more_link_text.
First, at the very end of the function, cut the apply_filters() call from here:
$output = apply_filters($filter_type, $output);
return $output;
Next, paste the apply_filters() call at the end of this text:
$output = rtrim($output, "\s\n\t\r\0\x0B");
$output = ($fix_tags) ? $output : balanceTags($output);
$output .= ($showdots && $ellipsis) ? '…' : '';
$output = apply_filters($filter_type, $output);
This fix is included in the updated the_excerpt_reloaded.php file attached to this post.
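The effect of the reordering can be sketched without WordPress. In the example below, fake_excerpt_filter() is a hypothetical stand-in for apply_filters() with the “the_excerpt” filter: it only wraps the text in a paragraph and then applies check #2, which is an assumption made to keep the example short.

```php
<?php
// Hypothetical stand-in for apply_filters('the_excerpt', ...): wraps the text
// in <p>...</p>, then drops a </p> that directly follows a </div> (check #2).
function fake_excerpt_filter(string $text): string
{
    $html = '<p>' . $text . '</p>';
    return preg_replace('!(</div>)\s*</p>!', '$1', $html);
}

$excerpt   = 'First words of the post';
$more_link = '<div class="more-link">Continue Reading</div>';

// Original order: append the more-link, then filter.
$before_fix = fake_excerpt_filter($excerpt . ' ' . $more_link);

// Fixed order: filter first, then append the more-link.
$after_fix = fake_excerpt_filter($excerpt) . ' ' . $more_link;

echo $before_fix, "\n"; // <p> is opened but never closed
echo $after_fix, "\n";  // <p>...</p> is balanced, and the <div> follows it
```

With the filter applied first, the </p> is no longer adjacent to the </div> at the moment the check runs, so the paragraph stays closed.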
The $fix_tags bug
There was another bug that I (and others) noticed: the $fix_tags parameter is not passed from the_excerpt_reloaded() to get_the_excerpt_reloaded(). The fix for this is also included in the updated the_excerpt_reloaded.php file attached to this post. To fix this, insert $fix_tags in this line:
echo get_the_excerpt_reloaded($excerpt_length, $allowedtags, $filter_type, $use_more_link, $more_link_text, $force_more, $fakeit, $no_more, $more_tag, $more_link_title, $showdots);
to make it look like this:
echo get_the_excerpt_reloaded($excerpt_length, $allowedtags, $filter_type, $use_more_link, $more_link_text, $force_more, $fakeit, $fix_tags, $no_more, $more_tag, $more_link_title, $showdots);
Also, change this line:
function get_the_excerpt_reloaded($excerpt_length, $allowedtags, $filter_type, $use_more_link, $more_link_text, $force_more, $fakeit, $no_more, $more_tag, $more_link_title, $showdots) {
to this:
function get_the_excerpt_reloaded($excerpt_length, $allowedtags, $filter_type, $use_more_link, $more_link_text, $force_more, $fakeit, $fix_tags, $no_more, $more_tag, $more_link_title, $showdots) {
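Why the missing parameter matters can be shown with a miniature example. The function names and the shortened parameter lists below are invented for illustration and are not the plugin's own; the point is only that in PHP a variable that is not passed into a function does not exist inside it.

```php
<?php
// Hypothetical miniature of the two functions; names and parameters are
// invented for this example and are much shorter than the plugin's own.
function inner_without_param(string $text): string
{
    // $fix_tags was never passed in, so it does not exist in this scope.
    return isset($fix_tags) ? 'fix_tags is set' : 'fix_tags is undefined';
}

function inner_with_param(string $text, bool $fix_tags): string
{
    return $fix_tags ? 'tags will be fixed' : 'tags left alone';
}

function outer(string $text, bool $fix_tags): array
{
    return [
        inner_without_param($text),         // the caller's value is lost
        inner_with_param($text, $fix_tags), // the caller's value arrives
    ];
}

print_r(outer('some excerpt', true));
```

Both changes are therefore needed together: the call has to pass $fix_tags, and the function definition has to accept it in the same position.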
Possible Other Issues and Other Features
Even with this fix, I can envision some other problems that the_excerpt_reloaded() may cause.
Even with $fix_tags set to true, opened tags that are closed prematurely because the excerpt is cut off will not have all of the intended text between the opening and closing tag. This may or may not be an issue. It may be a good idea to cut off the opening tag (as well as all text after it) to make this cleaner.
Another issue is that when counting N words, the plugin will erase any kind of whitespace (including newlines) between each word and just create a “space”. I’m not sure if this is always an issue. It may cause paragraphs to get combined.
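The whitespace issue can be illustrated with a small word-limiting helper. This is a sketch under the assumption that words are split on any run of whitespace and rejoined with single spaces; first_n_words() is a hypothetical name and the plugin's own code may differ.

```php
<?php
// Hypothetical word-limiting helper; the plugin's own code may differ.
function first_n_words(string $text, int $n): string
{
    // Splitting on any run of whitespace discards newlines as well as spaces.
    $words = preg_split('/\s+/', trim($text));
    // Rejoining with a single space loses the original paragraph break.
    return implode(' ', array_slice($words, 0, $n));
}

$post = "First paragraph ends here.\n\nSecond paragraph starts here.";
$excerpt = first_n_words($post, 6);
echo $excerpt, "\n";
```

Because the blank line between the two paragraphs is gone before any paragraph filter runs, the filter has nothing left to turn into separate <p> blocks.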
One feature that I would like to add is to make the $more_link_text appear as part of the last paragraph (at the end) instead of separated as its own paragraph.
When I have some time, I’ll see if I can make these improvements to the_excerpt_reloaded(). It was fun doing the detective work to determine the cause of the invalid XHTML problem. I hope that this fix is useful for you.

Thanks for the update!
[…] June 5, 2007: Rob from Rob’s Notebook posted a comment, which you can see below, about his mod of the Excerpt Reloaded plugin. There is a problem with the original plugin where very often the closing paragraph tag </p> […]
Cheers for the update Rob.
[…] –more– pseudo-tag - the body of the code comes courtesy of the-excerpt-reloaded with modifications - for which, many […]
Good work! I wish I had found this about 3 hours earlier.
You missed this one:
if('all' != $allowed_tags) {
$output = strip_tags($output, $allowedtags);
}
should be:
if('all' != $allowedtags) {
$output = strip_tags($output, $allowedtags);
}
Good find Hillary! Thank you for pointing it out.
I updated the downloadable plugin to have this fix. It is now at version R1.3.
Thank you so much for this fix! I spent quite a long time trying to figure out why it wasn’t XHTML valid!
[…] for the automatic excerpts (“continue reading”), I ended up picking “the_except_reloaded” (already modified by someone else) so that I could cut by characters rather than by words, respecting the […]
[…] The Excerpt Reloaded This plugin does what I always wanted to be done: instead of your post being truncated without any option and shown plain (and boring) by the original “the_excerpt()” function of Wordpress, this plugin - that was originally written by Kaf Oseo - gives you control over the length and the format of your excerpt. While truncating the post, this plugin tries to prevent tags from not being closed - it works in most cases. Sometimes, properly closing the tags does not work, which will break XHTML validity - e.g. if a link is created at the very position where the post is truncated. Whatever, it works in most cases, and I am sure they find a way of further improving the plugin. […]
Thank you for this interesting article! It has really helped me building my own website!
[…] that much is understood, and the tip to use the plugin the excerpt Reloaded leads me to the download page with the necessary instructions. Now it is on to […]
Using this plugin (your version) with WP 2.3.2 - is there a way to make it take the excerpt from the most recent Post, but not Pages? I use this in the sidebar and it works fine on my homepage (showing excerpt from most recent post), but when I go to a Page, this section of the sidebar now shows an excerpt from that particular Page, which is not what I want. Is there a modification I can make to stop it from including Pages? Thanks for any advice you can offer….
I am using the latest plugin - thank you. I still get a 404 error when I click continue to view the entire post. Please direct me to the right place to get an answer. I’ve been working for hours on this :[
I wanted to ask if that fix goes into the plugin itself or the where the file is applied into the loop?
I downloaded the fixed plugin and uploaded it to my blog, and still the paragraphs in my main page ignore the . what can still be wrong in my version? Could someone help me?
This is what I currently have in my code:
‘, ‘none’, true, ‘Keep Reading…’, false, 1,0,”,”,1); ?>
It still doesnt add the paragraph breaks.
I upgraded my sandbox to WordPress 2.5.rc2 (release candidate 2) and the_excerpt_reloaded R1.4 appears to function fine without any glitches.
Just wanted to let you know!!
[…] little plugin for Wordpress that allows me to format my excerpts all I like. It’s called The Excerpt Reloaded, and it lets you set a ton of options like excerpt length, the name of your “read […]
Hei!
Thanks for your work, but it doesn’t work for me. I use your fixed plugin and this code:
the_excerpt_reloaded(150, ‘
What’s wrong? Can you help me?
Thanks a lot!
Greetings.
M
I’m sorry, but my code is not right. The right code is here: http://www.mein.meerblickzimmer.de/theexcerpt.txt
Maybe you can help me.
Thanks a lot! M
Hello
1.
I am using this MOD, GREATTT! The blank admin page and the “Wordpress requires Cookies but your browser does not support them or they are blocked” issue are both solved by this MOD. I hope others can benefit from this, since I had been wondering about it and the issue turned out to be Excerpt_reloaded rather than anything else.
2. I have another problem - when I type an excerpt, it does not show the typed excerpt. The earlier version showed the typed excerpt but did not give me the more link.
3. So you need to provide solution for, showing the typed excerpt, with the customized more link.
PLEASEEEE
Does this work with Wordpress 2.5?
[…] the_excerpt Reloaded (more customisable than WP’s default excerpt function, it allows you to exclude elements such as images, links, etc.) […]
Nice plugin! I have been using a slightly older version for a while and have finally just updated to the current version. I would seriously make a donation for the feature that makes the $more_link_text appear as part of the last paragraph. That has been my one peeve since I installed it in early 2007.
To answer NSpeaks’ question: yes, it works in WP 2.5, as I have recently upgraded.
I noticed several people having issues recently getting it to work correctly, so here is what I did: after uploading the excerpt reloaded to my plugin directory, I edited the index.php for my theme to have the following code, which works for me:
‘, ‘content’, TRUE, ‘Read more »’, FALSE, 2); ?>
Forgot to escape it:
<div class="entry">
<?php if (is_search()) { ?>
<?php the_excerpt() ?>
<?php } else { ?>
<?php the_excerpt_reloaded(100, '<img><p><strong>', 'content', TRUE, 'Read more »', FALSE, 2); ?>
<?php } ?>
</div>
Dave or anyone else: I’d be glad to take a donation (for the right amount) to add features.
[…] someone - who wrote that Wordpress’s core developers are still in the world of 2003 or so. ] http://robsnotebook.com/the-excerpt-reloaded […]