XML parsing failed error when sitemap.xml is called via browser

In certain customer cases, we have been forced to embed JavaScript calls within a given menu item from the site menu. This largely has to do with third-party integration of certain functions.

We were previously using the Google XML Sitemaps plugin, which seemed to handle this very well: it would parse the JavaScript and display it (without all of the link info).

As you are probably aware, the Google XML Sitemaps plugin is not yet compatible with Multi-Site (There is a version working through beta testing right now but it has its own problems.)

Since that plugin doesn't work with Multi-Site, we decided to give Simple Sitemaps a try.

The problem we are having is that Simple Sitemaps is trying to display all of the JavaScript details and is generating the XML parsing error as a result.

Here is an example of the error:

XML parsing failed

XML parsing failed: syntax error (Line: 129, Character: 76)

Reparse document as HTML
Error:
well-formedness constraint: entity declared
Specification:
http://www.w3.org/TR/REC-xml/#wf-entdeclared
126: <priority>1.0</priority>
127: </url>
128: <url>
129: <loc>javascript:SFLNewWin('http://www.domain.com/newsletters/?QID=223&tokenid=ae0bc3d83dfd1e061dd1e06c624b232b85')</loc>
130: <lastmod>2012-01-07T19:52:33+00:00</lastmod>
131: <priority>1.0</priority>
132: </url>

I'm wondering if there's any suggested work-around to handle this?

Thanks!

  • Timothy
    • Chief Pigeon

    Hey there! :slight_smile:

    Just checking in to see how things are going. :slight_smile:

    We haven't heard from you on this in a while, so I'm going to presume you're all fixed up now and don't need any further assistance.

    However if you have more questions or need some more help then please feel free to respond in this thread or create a new one and we will be more than happy to offer assistance. :slight_smile:

    Take care.

  • Vladislav
    • Dead Eye Dev

    Hello,

    The parsing error you're seeing seems to be related to an unescaped ampersand ("...&tokenid=..."). Ampersands have special meaning in XML, as they introduce XML entities (e.g. &gt; or &lt;). Could you please try to rewrite the link using "&amp;" instead of the plain "&"?
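
    As a rough illustration of the escaping suggested above, here is a minimal Python sketch (Python is used only for illustration; the plugin itself is PHP). It takes the URL from the error output, shows that the raw ampersand is rejected by an XML parser, and that the escaped form parses back to the original URL. The helper names loc_element and is_well_formed are hypothetical, not part of any plugin.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# URL taken from line 129 of the error output above.
raw_url = ("http://www.domain.com/newsletters/"
           "?QID=223&tokenid=ae0bc3d83dfd1e061dd1e06c624b232b85")


def loc_element(url: str) -> str:
    """Build a <loc> element, escaping &, < and > in the URL."""
    return f"<loc>{escape(url)}</loc>"


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# The raw ampersand is what the browser complains about.
unescaped_ok = is_well_formed(f"<loc>{raw_url}</loc>")

# With "&amp;" the element parses, and the parser hands back the original URL.
escaped = loc_element(raw_url)
escaped_ok = is_well_formed(escaped)
round_trip = ET.fromstring(escaped).text
```

    The same idea applies wherever the sitemap is generated: the URL is escaped once, at the moment it is written into the XML.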

  • Ovidiu
    • Code Wrangler

    got a very similar problem here:

    http://zice.ro/sitemap.xml or here
    http://sarcasticu.zice.ro/sitemap.xml

    open in browser gives error:

    This page contains the following errors:

    error on line 2 at column 6: XML declaration allowed only at the start of the document
    Below is a rendering of the page up to the first error.

    feed validator check here: http://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Fsarcasticu.zice.ro%2Fsitemap.xml

    it seems the first line is empty and the second line contains the XML declaration.

    any fixes?

  • Vladislav
    • Dead Eye Dev

    Hi,

    Actually, Simple Sitemaps creates a file in your blogs.dir directory, under the files subdirectory. If it doesn't exist yet (i.e. on the first run), it will print out the dynamically generated content, but it will also cache the results in the file.

    Having said that, it is very possible that a piece of code (a plugin or a theme) is echoing some whitespace characters (a line break) before the cached file (or dynamic output) is rendered by Simple Sitemaps. This would typically be caused by a trailing line break in a PHP file after the closing PHP tag.

  • Ovidiu
    • Code Wrangler

    ok, help me out a bit please:

    I deleted the sitemap file for the main blog, opened zice.ro/sitemap.xml in my browser and all I get is: plugin missing? the sitemap.xml file isn't being generated again.

    network deactivated the plugin, network activated it again, and still the same: no sitemap is being generated again...

    what does it mean by plugin missing?

  • Vladislav
    • Dead Eye Dev

    Hi,

    It means that, after attempting to load the WordPress environment, the sitemap output/generation handler (sitemap.php) is unable to find the main class. Just a quick check, do you have the sitemap.php file copied to your wp-content directory? If you do, are you perhaps using a different file structure than usual (i.e. moved root directory)?

  • Ovidiu
    • Code Wrangler

    "Having said that, it is very possible that a piece of code (a plugin or a theme) is echoing some whitespace characters (a line break) before the cached file (or dynamic output) is rendered by Simple Sitemaps. This would typically be caused by a trailing line break in a PHP file after the closing PHP tag."

    any hint as to what type of plugin could be interfering? I don't run any other plugins that deal with sitemaps...

    I just checked my .htaccess file and found this line:

    RewriteRule ^(.*/)?sitemap.xml wp-content/sitemap.php [L]

    is that still necessary for this plugin? it might be a leftover from previous versions...

  • Vladislav
    • Dead Eye Dev

    I'm sorry, I haven't explained very well what I believe is happening. By the time the sitemap is about to be sent to your browser, a newline was already sent and thus, the XML declaration will come as a second line, not the first. This scenario also fully explains your screenshot and the discrepancy between the file on the server and what you get in your browser.
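
    To make the effect concrete, here is a small Python sketch (for illustration only; the sample sitemap and the helper names parse_error and starts_with_declaration are made up). It shows that the same sitemap parses when the XML declaration is the very first thing in the response and fails once a single newline is sent ahead of it.

```python
import xml.etree.ElementTree as ET

# A tiny stand-in for the cached sitemap file (hypothetical content).
SITEMAP = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b'<url><loc>http://example.com/</loc></url>'
    b'</urlset>'
)


def parse_error(payload: bytes):
    """Return the parser's error message, or None if the payload is well-formed."""
    try:
        ET.fromstring(payload)
        return None
    except ET.ParseError as exc:
        return str(exc)


def starts_with_declaration(payload: bytes) -> bool:
    """Quick check that nothing precedes the XML declaration."""
    return payload.startswith(b"<?xml")


# What the file on the server looks like.
clean_error = parse_error(SITEMAP)

# What the browser receives if something echoed a newline first.
polluted = b"\n" + SITEMAP
polluted_error = parse_error(polluted)
```

    Running the same kind of check against the raw bytes of the response (rather than the file on disk) is one way to confirm that the stray newline is added at request time.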

    Now, about the origin of the newline - this is most likely a trailing newline character (after the closing PHP tag) in one of your PHP files, very likely a plugin. If a PHP file has anything in it after the closing tag, that content will be sent to the browser just by including the file (it's actually a bit more complicated than that, but basically that's what happens if we rule out buffering). The best approach would be deactivating plugins one by one and testing whether the newline has disappeared from the output. If this is not possible, you can also check the files yourself and see if there are any newlines after the closing PHP tags in them.

  • Ovidiu
    • Code Wrangler

    ah, I do understand now.
    unfortunately there are too many plugins to deactivate one by one and it is a live site.

    thinking of alternate methods of detecting the faulty plugin, shouldn't a plugin with an empty line behind the closing php tag throw some warning/error/notice into my logs? can't find anything there :slight_frown:

    gonna read up on some linux manuals to see if there is a command line method of finding files with an empty line after the closing php tag, if anyone has any knowledge about this I'd be very grateful :slight_smile:

  • Vladislav
    • Dead Eye Dev

    Just saw your previous post, I missed it earlier as I was typing my response :slight_smile: The .htaccess rule is still needed for Simple Sitemaps, and it's a good thing you already have it in place. Also, any character outside the opening and closing PHP tags is perfectly valid, so it wouldn't appear in error logs. For finding the file that causes the newline output, you may want to look into the grep tool but, depending on how you craft your regular expression for the search, you'll likely see quite a few false positives.
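
    As an alternative to a hand-crafted grep pattern, the search could be sketched in Python along these lines. This is only a rough sketch under stated assumptions: the function names are hypothetical, it only looks at files ending in .php, and it treats two cases as suspicious: whitespace or a UTF-8 byte order mark before the first "<?php", and whitespace left over after the last "?>". PHP itself swallows one newline directly after a closing tag, so a file ending in "?>" plus a single newline is not reported. False positives and misses are still possible (short open tags, or "?>" inside strings, for example).

```python
import os

BOM = b"\xef\xbb\xbf"  # UTF-8 byte order mark


def leading_output(data: bytes) -> bytes:
    """Whitespace/BOM bytes that would be emitted before the first <?php tag."""
    idx = data.find(b"<?php")
    if idx <= 0:
        return b""
    head = data[:idx]
    # Only report pure whitespace/BOM; real markup before the tag is likely a template.
    return head if head.replace(BOM, b"").strip() == b"" else b""


def trailing_output(data: bytes) -> bytes:
    """Whitespace bytes that would be emitted after the last ?> tag."""
    idx = data.rfind(b"?>")
    if idx == -1:
        return b""
    tail = data[idx + 2:]
    if tail.strip() != b"":
        return b""  # real content follows the tag; not the case we are hunting
    # PHP swallows a single newline directly after the closing tag.
    if tail.startswith(b"\r\n"):
        tail = tail[2:]
    elif tail.startswith(b"\n"):
        tail = tail[1:]
    return tail


def scan(root: str):
    """Yield (path, leading, trailing) for suspicious .php files under root, recursively."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(".php"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                data = fh.read()
            lead, trail = leading_output(data), trailing_output(data)
            if lead or trail:
                yield path, lead, trail
```

    Calling scan("wp-content") from the site root and printing each result would list the candidate files, including those in plugin subfolders and mu-plugins.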

  • Ovidiu
    • Code Wrangler

    no luck.

    just manually edited all plugin files that were network activated and inside mu-plugins and removed countless empty lines after closing php statements. didn't go into any subfolders though but the problem still persists :slight_frown:

    not sure what to do now.

    does the newer SEO plugin you guys offer here use the same technology, or could replacing Simple Sitemaps for Multisite with Infinite SEO fix this?

    I never wanted to use Infinite SEO because the Simple Sitemaps for Multisite is all I need but if that will help, I will have to replace it.
