Simple Site Maps Tricks + 1 bug

First off, this has been a pain in my .... well, you guys know where I am going.

1) The trick: edit simple-sitemaps.php

Look for line 39 and change it to the number of posts you want to see in your sitemap. I have changed mine to 500 and tested it with Google Webmaster Tools: it will submit 500 URLs to them. SWEET, that is what I wanted :slight_smile:

@WPMU DEV please let me know if you guys could make this a setting for us in the next version of this cool plugin.

2) Are sitemaps "made on the fly"? I have tested this and retested it, and I say yes. What do you guys say?

I have looked in these locations and I don't see any sitemap files, which tells me the files are made on the fly :slight_smile:

/home/blogline/public_html/wp-content/blogs.dir/
/home/blogline/public_html/wp-content/blogs.dir/assets
/home/blogline/public_html/wp-content/blogs.dir/assets/00
/home/blogline/public_html/wp-content/blogs.dir/assets/01

PS: also take note of our new file system. @Mustafa Uysal thank you for setting this up. I will still do a post for you guys about file limits in Linux.

3) The bug: BlogLines.co.za/sitemap.xml works, but I need it filtered. It must only show posts from BlogLines.co.za, not from any other domain I have listed on BlogLines.co.za.

This is from Google Webmaster Tools
This url is not allowed for a Sitemap at this location.
http://wakeboardsforsale.clubspa.com/arnette-heist-rectangular-sunglasses/
http://swimmingpooldivingboard.clubspa.com/s-r-smith-66-209-598s20t-frontier-iii-replacement-diving-board-8-feet-pewter-gray/

This makes sense to me, as these URLs are not part of BlogLines.co.za, and that is why Google has a problem with them.

I am using http://ocaoimh.ie/wordpress-mu-sitewide-tags/ to pull all content from all blogs on BlogLines.co.za, so that is why I am seeing the clubspa URLs.

Is there an easy way of adding a filter to Simple Sitemaps to only allow URLs from the main domain name? :slight_smile:

I hope you guys can help out here, as this is one of those things I just can't do myself. That is why I am asking :slight_smile:

  • stevewest15
    • Flash Drive

    I just downloaded the latest release of this plugin (v 1.0.5) and it seems like it still contains the same bug as I posted on the forum months ago:

    - If you have WP MU Network Admin Settings -> Domain Mapping set with the "Redirect administration pages to site's original domain" option selected, then all of the site maps will contain incorrect URLs instead of the mapped domain name and Google will penalize you as the domain doesn't match the URLs in your site map.

    Also, there is no way to make a setting for the # of total posts this plugin will include in the sitemap.xml without editing the actual file which means at the next update, it will have to be edited once again.

    WP MU team we really need a real site map plugin instead of this amateurish attempt at a site map plugin.

  • Mark de Scande
    • Syntax Hero

    @stevewest15 Dude, that is just low, super low. No need for harsh words like "WP MU team we really need a real site map plugin instead of this amateurish attempt at a site map plugin."

    By the way, they do have a system called Infinite SEO: https://premium.wpmudev.org/project/wpmu-dev-seo/

    Simple Sitemaps is for guys like me. I run http://www.bloglines.co.za and we have 81,765 sites and 78,506 users.

    Infinite SEO is for guys like you that need all the SEO stuff.

    So next time please don't call WPMU DEV amateurish, that's just wrong.

  • aecnu
    • WP Unicorn

    Greetings Mark, Alan, and stevewest15,

    @Mark and Alan, thank you guys as always for your positive input, it is greatly appreciated.

    @stevewest15 slamming the lead developer will certainly not be productive and is usually counterproductive.

    @Mark, Alan, and stevewest15

    Here is an update with a potential fix for the issue.

    Please try the attached update and please advise if it helps.

    Please make sure that you also update the sitemap.php file in your wp-content directory.

    Cheers, Joe

  • Mark de Scande
    • Syntax Hero

    Thank you for the upgrade

    1) Changed line 39 of simple-sitemaps.php to 1000 (so now I have 1000 items in the sitemap).
    2) Tested it on BlogLines.co.za/sitemap.xml: it still has links from the other domains we have.

    I have to ask, what was the fix for?

    Don't forget I do have the WordPress MU Sitewide Tags Pages plugin from http://ocaoimh.ie/wordpress-mu-sitewide-tags/

    I use it to pull in all the content from all the blogs.

    Next bug from my side: ms-default-constants.php. We had to modify this file because we hit the Linux ext3 filesystem limit. I do see the error below in the logs, but if I go to the site I can see the sitemap.

    [25-Jun-2012 06:51:22 UTC] PHP Warning: file_put_contents(/home/blogline/public_html/wp-content/blogs.dir/83023/files//sitemap.xml) [<a href='function.file-put-contents'>function.file-put-contents</a>]: failed to open stream: No such file or directory in /home/blogline/public_html/wp-content/plugins/simple-sitemaps/simple-sitemaps.php on line 163

    Here is the code we used to fix it so that users can upload their images, and it works perfectly:

    function ms_upload_constants() {
    	global $wpdb;

    	if ( ! defined( 'UPLOADBLOGSDIR' ) )
    		define( 'UPLOADBLOGSDIR', 'wp-content/blogs.dir' );

    	if ( $wpdb->blogid < 81925 ) {
    		// Older blogs: keep the default upload path.
    		define( 'UPLOADS', UPLOADBLOGSDIR . "/{$wpdb->blogid}/files/" );
    		if ( 'wp-content/blogs.dir' == UPLOADBLOGSDIR )
    			define( 'BLOGUPLOADDIR', WP_CONTENT_DIR . "/blogs.dir/{$wpdb->blogid}/files/" );
    	}
    	else if ( ( $wpdb->blogid > 81925 ) && ( $wpdb->blogid < 113000 ) ) {
    		// Newer blogs: change the upload path to deal with the ext3 32k limitation.
    		define( 'UPLOADS', UPLOADBLOGSDIR . "/assets/00/{$wpdb->blogid}/files/" );
    		if ( 'wp-content/blogs.dir' == UPLOADBLOGSDIR )
    			define( 'BLOGUPLOADDIR', WP_CONTENT_DIR . "/blogs.dir/assets/00/{$wpdb->blogid}/files/" );
    	}
    	else if ( ( $wpdb->blogid >= 113000 ) && ( $wpdb->blogid < 145000 ) ) {
    		// Sector 2.
    		define( 'UPLOADS', UPLOADBLOGSDIR . "/assets/01/{$wpdb->blogid}/files/" );
    		if ( 'wp-content/blogs.dir' == UPLOADBLOGSDIR )
    			define( 'BLOGUPLOADDIR', WP_CONTENT_DIR . "/blogs.dir/assets/01/{$wpdb->blogid}/files/" );
    	}
    }

    Please have a look for us to see if we can get a workaround for the ext3 filesystem bug.

    Thank you guys again for helping out :slight_smile:

  • aecnu
    • WP Unicorn

    Greetings Mark,

    Thank you for your feedback and input.

    "I have to ask what was the fix for?"

    This new version that I found buried in the back was dealing with site maps and domain mapping as indicated here:
    https://premium.wpmudev.org/forums/topic/minor-bug-sitemap-domain-mapping

    And though the lead developer asked for feedback he did not get a word.

    Since the fix it contains has not come out in an official release, I brought it here so that it does not get away from us and so we can build upon it.

    "Next BUG from my side ms-default-constants.php we had to mod this as we got to our linux file limit ext3 filesystem i do see the err below but if i go to the site i can see the site map"

    What exactly is this bug addressing? I know it is 32,000 files, but specifically for what?

    I ask this because I know you have more than this many blogs, if memory serves me correctly.

    "Here is the code we used to get it fixed that the users can upload there images and it works perfectly"

    Upload their images for what, Mark? What is this code specifically addressing, my friend?

    Please advise.

    Cheers, Joe

  • Mark de Scande
    • Syntax Hero

    1) Domain mapping: it works perfectly, see http://gymequipmentforsale.co.za/ sitemap.xml. #TIP: if it doesn't work for you, just quick-edit one of your posts and click Update; this will resubmit the sitemap with the correct URLs.

    2) ms-default-constants.php: the code was how we modified the file system.

    The problem is the error: the sitemap plugin is trying to put the files in the incorrect directory.

    Looking at the code in sitemap.php, line 9:
    $cachefile = dirname(__FILE__) . '/blogs.dir/' . $wpdb->blogid . '/files/sitemap.xml';

    It builds the path starting from blogs.dir, but in fact my file system has changed to this:

    /home/blogline/public_html/wp-content/blogs.dir/
    /home/blogline/public_html/wp-content/blogs.dir/assets
    /home/blogline/public_html/wp-content/blogs.dir/assets/00
    /home/blogline/public_html/wp-content/blogs.dir/assets/01

    So if we can get the first part fixed, it will work out of the box.

    Don't forget the sitemap works if you go to the new blogs; it is just the error in the logs.
    I think the first time the sitemap is made it wants it to be in the correct place, but as soon as you pull up the site in a browser it works, as it has now picked up the new location of the sitemap.

  • Mustafa
    • Syntax Hero

    Hiya folks,

    @Mark you need to edit the sitemap.php file.

    example:

    if ( $wpdb->blogid < 81925 ) {
    	$cachefile = dirname(__FILE__) . '/blogs.dir/' . $wpdb->blogid . '/files/sitemap.xml';
    }
    else if ( $wpdb->blogid > 81925 && $wpdb->blogid < 113000 ) {
    	$cachefile = dirname(__FILE__) . '/blogs.dir/assets/00/' . $wpdb->blogid . '/files/sitemap.xml';
    }
    else if ( $wpdb->blogid >= 113000 && $wpdb->blogid < 145000 ) {
    	$cachefile = dirname(__FILE__) . '/blogs.dir/assets/01/' . $wpdb->blogid . '/files/sitemap.xml';
    }

    and you need to edit the simple-sitemaps.php file

    example:

    function DeleteSitemap() {
    	global $wpdb;

    	if ( $wpdb->blogid < 81925 ) {
    		@unlink( ABSPATH . 'wp-content/blogs.dir/' . $wpdb->blogid . '/files/sitemap.xml' );
    	}
    	else if ( $wpdb->blogid > 81925 && $wpdb->blogid < 113000 ) {
    		@unlink( ABSPATH . 'wp-content/blogs.dir/assets/00/' . $wpdb->blogid . '/files/sitemap.xml' );
    	}
    	else if ( $wpdb->blogid >= 113000 && $wpdb->blogid < 145000 ) {
    		@unlink( ABSPATH . 'wp-content/blogs.dir/assets/01/' . $wpdb->blogid . '/files/sitemap.xml' );
    	}
    }

    bla bla bla....

    Cheers,

    Note: WPMU DEV plugins, themes etc. are tested with the default WordPress structure. So don't forget that custom modifications need custom development :wink:
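
The same blog ID ranges are repeated in every snippet in this thread, so one way to keep them in sync is a single lookup helper. The following is only a minimal sketch in plain PHP: the function names are hypothetical and not part of Simple Sitemaps or WordPress. It assumes that blog ID 81925 itself (which the snippets above leave unmatched) belongs to the first assets range, and that IDs outside the listed ranges use the default layout. Adjust both if your file system differs.

```php
<?php
// Hypothetical helper, not part of the plugin: maps a blog ID to its
// directory below blogs.dir, using the ranges quoted in this thread.
function bloglines_blog_subdir( $blog_id ) {
	$blog_id = (int) $blog_id;

	// Each entry: lower bound (inclusive), upper bound (exclusive), prefix.
	// ASSUMPTION: 81925 itself is treated as part of the assets/00 range.
	$ranges = array(
		array( 81925, 113000, 'assets/00/' ),
		array( 113000, 145000, 'assets/01/' ),
	);

	foreach ( $ranges as $range ) {
		list( $min, $max, $prefix ) = $range;
		if ( $blog_id >= $min && $blog_id < $max ) {
			return $prefix . $blog_id . '/files/';
		}
	}

	// ASSUMPTION: every other ID keeps the default blogs.dir/<id>/files/ layout.
	return $blog_id . '/files/';
}

// Builds the full sitemap cache path with exactly one slash between parts.
function bloglines_sitemap_path( $blogs_dir, $blog_id ) {
	return rtrim( $blogs_dir, '/' ) . '/' . bloglines_blog_subdir( $blog_id ) . 'sitemap.xml';
}
```

With something like this, sitemap.php and DeleteSitemap() could both call bloglines_sitemap_path() with the blogs.dir location and $wpdb->blogid, and a new range would only need to be added in one place.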

  • aecnu
    • WP Unicorn

    Greetings Mark,

    Thank you for the additional feedback, it is certainly appreciated as you well know.

    I have already flagged the lead developer in here and hopefully he will have the chance to respond soon and also look over this code snippet.

    Thanks as always.

    Cheers, Joe

  • Mark de Scande
    • Syntax Hero

    Hey guys was doing some testing with the new code

    1) I picked blog 81534, http://watches767.besthometreadmill.co.za/, and loaded its sitemap, http://watches767.besthometreadmill.co.za/sitemap.xml. I can see the sitemap load perfectly. I then looked at my logs and got the error below:

    [25-Jun-2012 11:38:33 UTC] PHP Warning: file_put_contents(/home/blogline/public_html/wp-content/blogs.dir/81534/files//sitemap.xml) [<a href='function.file-put-contents'>function.file-put-contents</a>]: failed to open stream: No such file or directory in /home/blogline/public_html/wp-content/plugins/simple-sitemaps/simple-sitemaps.php on line 172

    @Mustafa or the dev, please have a look and see how we can get rid of this error :slight_smile:

    It is just strange that the sitemap works but it throws an error to the logs.

  • aecnu
    • WP Unicorn

    Greetings Mark,

    Thank you Mark as always for the additional feedback, it is greatly appreciated.

    I will go ahead and ping the lead developer again so that he can possibly revisit this ticket taking note of our progress and testing.

    I also noticed something in the error you submitted above. This kind of error usually indicates a permissions or path problem because the file cannot be accessed, but note that on line 3 above, right before sitemap.xml, there is a double forward slash (i.e. //sitemap.xml) that may be related to the error.

    Cheers, Joe
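
For what it's worth, on Linux a doubled slash inside a path is normally treated as a single slash, so the "No such file or directory" warning more likely means that the target directory (.../blogs.dir/81534/files/) does not exist at that location when the plugin tries to write the cache file. A minimal sketch of a defensive write in plain PHP follows; the function name is hypothetical and this is not the plugin's actual code.

```php
<?php
// Hypothetical helper, not the plugin's code: writes a cache file and
// creates any missing parent directories first.
function write_sitemap_cache( $path, $xml ) {
	$dir = dirname( $path );

	// mkdir() with the recursive flag creates the whole missing chain.
	// The second is_dir() check covers another request creating it first.
	if ( ! is_dir( $dir ) && ! mkdir( $dir, 0755, true ) && ! is_dir( $dir ) ) {
		return false;
	}

	return file_put_contents( $path, $xml ) !== false;
}
```

If the directory cannot be created (for example because of permissions), the function returns false instead of leaving a half-written cache, and the sitemap can still be served dynamically.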

  • Vladislav
    • Dead Eye Dev

    Hello,

    I was working on the plugin to allow for easily modifying all these aspects, without modifying the plugin sources (so the updates apply cleanly in the future). The new approach in the latest plugin release (v1.1, just released) could likely solve the doubled-up slashes issue as well.

    The new system relies on a couple of defines and hooks, and I'm also attaching a file with some simple example code to explain and illustrate their behavior. Also, here is a short list of new behavior-modifying means:

    SIMPLE_SITEMAPS_POST_SOFT_LIMIT - A define that regulates overall soft limit for latest post/page in a sitemap. "Soft" limit, because it's passed as a default value to soft limit and per-type filters, which can change that just in time for query fetching.

    SIMPLE_SITEMAPS_USE_CACHE - A define that regulates the use of sitemap caching. Defaults to "true", set this to false to dynamically generate sitemaps on demand.

    simple_sitemaps-totals_soft_limit - A filter that regulates the overall limit for latest post/page in a sitemap.

    simple_sitemaps-pages_count_override - A filter that regulates the final total pages count fetched for sitemaps.

    simple_sitemaps-posts_count_override - A filter that regulates the final total posts count fetched for sitemaps.

    simple_sitemaps-include_post - A filter that's being called for each post in the sitemap, and includes it in the sitemap or skips it, depending on the result.

    simple_sitemaps-generated_urlset - A filter that enables you to post-process the generated urlset.

    simple_sitemaps-use_cache - A filter that finally regulates the file caching usage.

    simple_sitemaps-sitemap_location - A filter that sets the sitemap cache location.

    For example, in the particular case for this thread, to accommodate the /assets/ directory structure, something like this would work as a mu-plugin (based on the already existing code):

    function bloglines_directory_path ($file) {
    	global $wpdb;
    	if ($wpdb->blogid < 81925) {
    		return ABSPATH . 'wp-content/blogs.dir/' . $wpdb->blogid . '/files/sitemap.xml';
    	} else if ($wpdb->blogid > 81925 && $wpdb->blogid < 113000) {
    		return ABSPATH . 'wp-content/blogs.dir/assets/00/' . $wpdb->blogid . '/files/sitemap.xml';
    	} else if ($wpdb->blogid >= 113000 && $wpdb->blogid < 145000) {
    		return ABSPATH . 'wp-content/blogs.dir/assets/01/' . $wpdb->blogid . '/files/sitemap.xml';
    	}
    	return $file;
    }
    add_filter('simple_sitemaps-sitemap_location', 'bloglines_directory_path');

    I believe this is the best way forward, which enables us the flexibility to support specific cases, as well as backwards compatibility and simplicity for others.

  • Mark de Scande
    • Syntax Hero

    @VeBailovity Well done, you put a lot of work into this.

    Report

    The new system out of the box, running on http://www.bloglines.co.za: Simple Sitemaps for Multisite Version: 1.1

    Added these to wp-config.php:
    define('SIMPLE_SITEMAPS_POST_SOFT_LIMIT', 500);
    define('SIMPLE_SITEMAPS_USE_CACHE', false);

    1) So if I have it right, sitemaps are now made on the fly and no longer need to be in blogs.dir.

    The good news is I don't see errors in /wp-content.

    2) The pain in my ... is http://bloglines.co.za/sitemap.xml: it contains URLs from mapped domains of BlogLines sites. Big problem, as Google only allows URLs from the main domain.

    Now how can we add a filter to wp-config.php to only have URLs from bloglines.co.za?

    Please let me know on points 1 and 2. I think we're getting there.

    @VeBailovity thank you for helping out. I know we have taken this plugin to the next level so that it is easy to set up and get running, and that is the way things should be. I am proud to be a member @wpmuDEV

  • aecnu
    • WP Unicorn

    Greetings Mark,

    Thank you my friend for your continued feedback and input.

    I have no problem with you wanting to not have other domains within your sitemap, but I do not agree that Google does not follow them and ignores them.

    "as google only allow urls from the main domain"

    If this were indeed true, then rel="nofollow" would be an irrelevant bot command, and Matt Cutts from Google would not recommend that people use rel="nofollow" in their links to keep Googlebot from following them. And that is all a sitemap is: an organized list of links.

    As a matter of fact, it may indeed help those sites down the line by leaking rank to them, and also help the main site, if those sites are relevant to it, by giving it authority.

    As I mentioned, I understand and have no problem with not wanting these extra links, but I do not agree that Google does not acknowledge them. Like any other link, Google will follow them too unless the bot sees rel="nofollow".

    Please advise.

    Cheers, Joe

  • aecnu
    • WP Unicorn

    Greetings Mark,

    Indeed what a great find that is.

    When I tried to dig into this statement a little more by Matt, it appears he was referring to links in the web site itself and not site maps.

    I could not find the exact video where he was referring to this, it may take literally hours of going through videos to get the clip.

    Therefore, though my question was valid, you are both right and I was misapplying the information to sitemaps rather than web site links.

    Cheers, Joe

  • Mark de Scande
    • Syntax Hero

    Matt the most hated man on the internet :slight_smile:

    Sitemap URLs must only be from your site.

    Website URLs can point almost anywhere.

    I think with sitemaps they don't want you to give other websites credit, but on your site you're allowed to.

    So yes, I think we were both right and wrong at the same time.

    So :slight_smile: https://premium.wpmudev.org/forums/topic/simple-site-maps-tricks-1-bug#post-239024 we're back to where we were :slight_smile: @VeBailovity please let us know if you can make a filter to only allow URLs of the main domain to be displayed.

    I did see this: "simple_sitemaps-generated_urlset - A filter that enables you to post-process the generated urlset." Will this work for us, and if yes, how do we add it to the sitemap? :slight_smile:

  • aecnu
    • WP Unicorn

    Greetings Mark,

    I will go ahead and ping Ve in here again on this issue because he may not be following this thread anymore believing that this ticket is resolved once he made his update. But it would be good to catch him while the code is still fresh in his mind :slight_smile:

    Cheers, Joe

  • Vladislav
    • Dead Eye Dev

    Hello,

    Thank you very much @Mark de Scande for your continuous help in making our plugins better, and for all the awesome feedback! As for your point #1, those two defines should work exactly as you intended - the limit for your pages and posts included in the sitemap should be increased to 500, and the sitemap should be served dynamically, bypassing any cache (this should also, as a consequence, resolve any domain mapping admin/public pages differences).

    Now, as for issue #2 - the wrong domain articles being included in the results, as reported by Google tools. Indeed, only one domain should be mentioned in the sitemap urlset. Do you perhaps have the plugin running on your site at the moment? Because, if so, I'm a bit confused - I just checked your sitemap (on http://bloglines.co.za/sitemap.xml), and haven't been able to find any URLs pointing to different domains. Is it perhaps possible that this was a caching issue, either on Google end or on bloglines.co.za end?

    Anyway, if it turns out it wasn't as simple as caching issue, I think the "simple_sitemaps-include_post" filter might be our best bet here. This filter is called for each post/page before it actually gets included in the sitemap urlset. If the processing function hooked to the filter returns false, the post/page won't be included in the sitemap - so we can put it to use and filter out the posts with offending URLs, should we need to.

    However, it's probably better to try simple approach first and see if that helps. Can you please check your cache and, if that is all good, check if your sitemap still includes the wrong URLs?
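
If the wrong URLs do turn out to be real, the "simple_sitemaps-include_post" idea could be sketched roughly as below, for example as a mu-plugin. This is an illustration under stated assumptions, not the plugin's documented API: the thread does not show which arguments the filter passes, so the callback assumes it receives the current include flag and the post, and the function names are made up for this example. Check the example file shipped with the plugin before relying on it.

```php
<?php
// Pure helper: true when the URL's host equals $host, ignoring case
// and a leading "www.". Returns false for anything without a host.
function url_is_on_host( $url, $host ) {
	$url_host = parse_url( $url, PHP_URL_HOST );
	if ( ! is_string( $url_host ) ) {
		return false;
	}
	$normalize = function ( $h ) {
		return preg_replace( '/^www\./', '', strtolower( $h ) );
	};
	return $normalize( $url_host ) === $normalize( $host );
}

// ASSUMPTION: the filter calls this with ( $include, $post ).
// The actual signature is not shown in this thread.
function bloglines_only_main_domain( $include, $post = null ) {
	if ( ! $include || ! $post ) {
		return $include;
	}
	// get_permalink() is a WordPress core function.
	return url_is_on_host( get_permalink( $post ), 'bloglines.co.za' );
}

// Register the hook only when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'simple_sitemaps-include_post', 'bloglines_only_main_domain', 10, 2 );
}
```

Note that this compares whole host names, so posts whose permalink points at a subdomain or a mapped domain would be skipped along with the clubspa.com ones.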

  • aecnu
    • WP Unicorn

    Greetings Mark,

    What you may want to consider in a case like this in the future, my friend, is to have the entire DB extracted on your computer and then upload just the tables that need to be replaced via SSH, i.e. the .MYI and .MYD files, or just that DB's entire folder.

    I would think this would be much faster to accomplish.

    Cheers, Joe

  • Mark de Scande
    • Syntax Hero

    Ouch, rub salt in my .. The problem is the DB is 65 GB.

    In South Africa 2 GB = R400 = $50, so let's work this out:

    65 x 25 = $1625 just to download a copy of the DB.

    PS: BlogLines is back up but I cannot add new blogs. I get this error:

    Already Installed
    You appear to have already installed WordPress. To reinstall please clear your old database tables first.

    Sucks big time

  • aecnu
    • WP Unicorn

    Greetings Mark,

    I am truly sorry to hear that you are having so much trouble, as we discussed in our off-forum conversation.

    Perhaps decompressing the database backup file on the server and then copying the files on the server to the applicable directory would do the job?

    Please advise.

    Cheers, Joe

  • Mark de Scande
    • Syntax Hero

    @VeBailovity Hey buddy, the server DB is working up to a point :slight_smile:

    Please let me know if you made a filter to remove posts that are not from the same domain name.

    The Google Sitemaps error (a screenshot was posted):
    URL not allowed // This url is not allowed for a Sitemap at this location. //

    See if we can get this plugin to do this for us as well; then I think it will be compatible with any setup :slight_smile: even mine.

  • aecnu
    • WP Unicorn

    Greetings Mark de Scande,

    Glad to see that things are moving forward for you and hope that we can get some resolution to this marathon ticket soon.

    @Kimberly of course his sites are up and running, he is working with my servers now :slight_smile: Now if we can just get Ve in here ....

    Cheers, Joe

  • aecnu
    • WP Unicorn

    Greetings Mark de Scande,

    At this point, though the lead developer has been pinged several times on this ticket, it is, as of your last post, to be reclassified as a feature request.

    "I am still waiting for some type of filter to only allow urls from the main domain :slight_smile:"

    Not an easy task from my understanding, but hopefully the lead developer will consider adding it to the plugin in a future release.

    Cheers, Joe
