SmartCrawl sitemap gives 404 error

My SmartCrawl sitemap gives a 404 error. I've tried adding the following rule to the nginx config file:

rewrite ^/(.*/)?sitemap.xml /wp-content/uploads/sitemap.xml last;

I've also tried disabling all other plugins and using the default WordPress .htaccess, but it doesn't work.

  • Ash

    Hello Jasper

    I can see the sitemap is not created at all. Katya mentioned in live chat that she can see a sitemap created for subsite ID 2, but when I access the file directly, that doesn't work either. You will see a 404 for this URL as well: https://kwalitei******door.nl/wp-content/uploads/sites/2/sitemap.xml (that is subsite ID 2).

    Would you please confirm: 1. Does PHP have write permission to your uploads folder? 2. Do you have the correct nginx rules for multisite? Would you please post the rules you used for the multisite here?
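
    For reference, here is one quick way to check the write permission from WordPress itself. This is just a rough sketch using core helpers; the file name is only an example, and you can run it as a temporary mu-plugin or with WP-CLI's 'wp eval-file' and delete it afterwards:

    <?php
    // check-uploads-writable.php - temporary helper, safe to delete after testing.
    // Reports whether PHP can write to the uploads folder of the current site.
    $uploads = wp_upload_dir(); // per-site uploads info (.../uploads or .../uploads/sites/N)
    if ( wp_is_writable( $uploads['basedir'] ) ) {
        error_log( 'Uploads dir IS writable: ' . $uploads['basedir'] );
    } else {
        error_log( 'Uploads dir is NOT writable: ' . $uploads['basedir'] );
    }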

    Have a nice day!

    Cheers,
    Ash

  • Jasper

    Hi Ash,

    PHP has write permissions and runs on the FPM distro. Below is the content of the nginx file for bouwexpert.online (and its subsites):

    # Auto generated nginx config file by DirectAdmin version 1.52.1
    # Modifying this file is not recommended as any changes you make will be
    # overwritten when the user makes any changes to their website

    # For global config changes that affect all Users, see this guide:
    # http://help.directadmin.com/item.php?id=558
    # For local config changes that only affect one User, see this guide:
    # http://help.directadmin.com/item.php?id=3

    server
    {
    try_files $uri $uri/ /index.php?$args;
    rewrite ^/(.*/)?sitemap.xml /home/bouwonline/domains/bouwexpert.online/public_html/wp-content/uploads/sitemap.xml last;
    # Expire rules for static content
    # cache.appcache, your document html and data
    location ~* \.(?:manifest|appcache|html?|xml|json)$ {
    expires -1;
    # access_log logs/static.log; # I don't usually include a static log
    }
    # Media: images, icons, video, audio, HTC
    location ~* \.(?:jpg|jpeg|gif|png|ico|cur|gz|svg|svgz|mp4|ogg|ogv|webm|htc)$ {
    expires 1M;
    access_log off;
    add_header Cache-Control "public";
    }
    # CSS and Javascript
    location ~* \.(?:css|js)$ {
    expires 1d;
    access_log off;
    add_header Cache-Control "public";
    }
    location ~* \.(eot|otf|ttf|woff|woff2)$ {
    add_header Access-Control-Allow-Origin *;
    }
    location ~ ^/(status|ping)$ {
    access_log off;
    allow 127.0.0.1;
    allow 37.97.235.106;
    include /etc/nginx/fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass unix:/usr/local/php70/sockets/famme.sock;
    }
    listen 37.247.40.62:80;
    server_name bouwexpert.online www.bouwexpert.online kwaliteitsstukadoor.nl www.kwaliteitsstukadoor.nl;
    access_log /var/log/nginx/domains/bouwexpert.online.log;
    access_log /var/log/nginx/domains/bouwexpert.online.bytes bytes;
    error_log /var/log/nginx/domains/bouwexpert.online.error.log;
    root /home/bouwonline/domains/bouwexpert.online/public_html;
    index index.php index.html index.htm;
    include /usr/local/directadmin/data/users/bouwonline/nginx_php.conf;
    include /etc/nginx/webapps.conf;
    }

    server
    {
    try_files $uri $uri/ /index.php?$args;
    rewrite ^/(.*/)?sitemap.xml /home/bouwonline/domains/bouwexpert.online/public_html/wp-content/uploads/sitemap.xml last;
    # Expire rules for static content
    # cache.appcache, your document html and data
    location ~* \.(?:manifest|appcache|html?|xml|json)$ {
    expires -1;
    # access_log logs/static.log; # I don't usually include a static log
    }
    # Media: images, icons, video, audio, HTC
    location ~* \.(?:jpg|jpeg|gif|png|ico|cur|gz|svg|svgz|mp4|ogg|ogv|webm|htc)$ {
    expires 1M;
    access_log off;
    add_header Cache-Control "public";
    }
    # CSS and Javascript
    location ~* \.(?:css|js)$ {
    expires 1d;
    access_log off;
    add_header Cache-Control "public";
    }
    location ~* \.(eot|otf|ttf|woff|woff2)$ {
    add_header Access-Control-Allow-Origin *;
    }
    location ~ ^/(status|ping)$ {
    access_log off;
    allow 127.0.0.1;
    allow 37.97.235.106;
    include /etc/nginx/fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass unix:/usr/local/php70/sockets/famme.sock;
    }
    listen 37.247.40.62:443 ssl http2;
    server_name bouwexpert.online www.bouwexpert.online kwaliteitsstukadoor.nl www.kwaliteitsstukadoor.nl;
    access_log /var/log/nginx/domains/bouwexpert.online.log;
    access_log /var/log/nginx/domains/bouwexpert.online.bytes bytes;
    error_log /var/log/nginx/domains/bouwexpert.online.error.log;
    root /home/bouwonline/domains/bouwexpert.online/private_html;
    index index.php index.html index.htm;
    ssl on;
    ssl_certificate /usr/local/directadmin/data/users/bouwonline/domains/bouwexpert.online.cert.combined;
    ssl_certificate_key /usr/local/directadmin/data/users/bouwonline/domains/bouwexpert.online.key;
    include /usr/local/directadmin/data/users/bouwonline/nginx_php.conf;
    include /etc/nginx/webapps.ssl.conf;
    }

    While you are at it, can you also check the WP Defender settings :), because I think the following pieces of code need to be added here as well, right?

    ## WP Defender - Prevent information disclosure ### Turn off directory indexing
    autoindex off;

    # Deny access to htaccess and other hidden files
    location ~ /\. {
    deny all;
    }

    # Deny access to wp-config.php file
    location = /wp-config.php {
    deny all;
    }

    # Deny access to revealing or potentially dangerous files in the /wp-content/ directory (including sub-folders)
    location ~* ^/wp-content/.*\.(txt|md|exe|sh|bak|inc|pot|po|mo|log|sql)$ {
    deny all;
    }
    ## WP Defender - End ##

    ## WP Defender - Prevent PHP Execution ##
    # Stop php access except to needed files in wp-includes
    location ~* ^/wp-includes/.*(?<!(js/tinymce/wp-tinymce))\.php$ {
    internal; #internal allows ms-files.php rewrite in multisite to work
    }

    # Specifically locks down upload directories in case full wp-content rule below is skipped
    location ~* /(?:uploads|files)/.*\.php$ {
    deny all;
    }

    # Deny direct access to .php files in the /wp-content/ directory (including sub-folders).
    # Note this can break some poorly coded plugins/themes, replace the plugin or remove this block if it causes trouble
    location ~* ^/wp-content/.*\.php$ {
    deny all;
    }

    ## WP Defender - End ##

  • Jasper

    Hi James,

    As soon as I try to save settings I get the following message: "Sorry, you are not allowed to access this page."

    Afterwards it is removed from the subsites and only accessible again via the Network Admin. Then I activate it via the Network Admin again, and the same thing happens, etc., etc.

    The following lines are added to the config file:

    define( 'DISALLOW_FILE_EDIT', true );
    define('WP_DEBUG', true);
    define('WP_DEBUG_LOG', true );
    define('WP_DEBUG_DISPLAY', false );
    @ini_set('display_errors', 0 );
    define('WP_MEMORY_LIMIT', '96M');
    define('SCRIPT_DEBUG', false);
    define('MULTISITE', true);
    define('SUBDOMAIN_INSTALL', true);
    define('DOMAIN_CURRENT_SITE', 'bouwexpert.online');
    define('PATH_CURRENT_SITE', '/');
    define('SITE_ID_CURRENT_SITE', 1);
    define('BLOG_ID_CURRENT_SITE', 1);
    define('SUNRISE', 'on');
    define('FORCE_SSL_ADMIN', true);
    define('WP_CACHE', true); // Added by WP Hummingbird
    define( 'WDS_SITEWIDE', false ); // Because of WP Smartcrawl

  • Ash

    Hello Jasper

    I have found something unusual too. When I logged in on your mapped domain site, it kicked me out of the main site. I had to add myself as a user of the main site again.

    Also, when I tried to visit your subsite, I got a 404. Would you please confirm that too?

    About the error, would you please disable all other plugins, add the required code to the nginx conf to make the subsite work, and then check the sitemap issue again? You can find the details here: https://premium.wpmudev.org/blog/wordpress-multisite-wordpress-nginx/

    Let us know how it goes.

    Have a nice day!

    Cheers,
    Ash

  • Jasper

    Hi Ash,

    I was thinking: I recently also had a sort of loop where the sign-in page reloaded multiple times before automatically signing me in when I switched between the sites. When I look at 'Domain Mapping' and 'Multi-Domains', it looks like there is some kind of overkill, because the mapped domain is already the domain we want. Maybe this could have something to do with it?

    Another thing I noticed is that I see the SmartCrawl plugin at both the site level and the Network Admin level (I believe that is since the update).

    Can you please notify me when you are going to investigate? I would like to try some things myself as well.

  • Ash

    Hi there

    I can visit both http://www.bouwexpert.online and http://www.kwaliteitsstukadoor.nl, but you probably mean the 'basis' site, right?

    Yes, I was referring to the basis subsite, and it's still not accessible.

    I have also seen that you didn't have any pages or posts, so I created two test pages and one test portfolio, and then I noticed your sitemap file was created. I can see the file sitemap.xml inside the wp-content/uploads folder. But still, when I visit https://domain.com/sitemap.xml I get a 404 error. Even when I try to access https://domain.online/wp-content/uploads/sitemap.xml directly, I still get a 404.

    Then I did some more testing. I created a file in the /wp-content/uploads/ folder called a.txt. Then I checked https://domain.online/wp-content/uploads/a.txt and it worked. Then I renamed sitemap.xml to a.xml, checked https://domain.online/wp-content/uploads/a.xml, and that worked too. But when I visit https://domain.online/wp-content/uploads/sitemap.xml I get a 404 error.

    That being said, would you please remove the rule:

    rewrite ^/(.*/)?sitemap.xml /wp-content/uploads/sitemap.xml last;

    And then restart the server and check again?

    First check https://domain.online/wp-content/uploads/sitemap.xml and then check https://domain.online/sitemap.xml and let me know if any of these works.

    Is it possible to open a chat when you look into the issue?

    I am sorry, it is not possible to notify you beforehand or initiate a chat, as we process tickets based on a queue, so we don't know exactly when we will check your ticket next. Thank you for your understanding.

    Have a nice day!

    Cheers,
    Ash

  • Ash

    Hello Jasper

    That's an improvement at least :)

    I have checked your site again and found something else unusual. The kwaliteitsstukadoor.nl domain is added as a subsite domain, and then you have Multi-Domains, which includes the kwaliteitsstukadoor.nl domain again. So the same domain is pointing to both a subsite and the main site. Besides, you also have the Domain Mapping plugin, which is not needed if you use the mapped domain as the site name.

    So, there is a hiccup in your setup. Would you please disable Domain Mapping and Multi-Domains, create a normal, functional subsite and then try? If it works, then we can go forward and add additional domains to it.

    Let us know how it goes.

    Have a nice day!

    Cheers,
    Ash

  • Jasper

    Hi Ash,

    Hope you've had a nice Christmas, I did! Anyway, to get back to business...

    I removed Multi-Domains, but disabling Domain Mapping made the website unreachable. So I commented out the "define('SUNRISE', 'on');" line in the config file and removed the Domain Mapping plugin from the plugin folder. This doesn't seem to affect the site (so I guess I can delete the plugin entirely, right?). However, the sitemap of the subsite is still unreachable, and a crawl of the subsite says it has zero issues and only finds one URL.

    Afterwards I tried to create a new sub-site, but that gives some issues as well.

    - I can create a new site via the Network Admin,
    - and that does create tables in the database,
    - but it doesn't create a folder on the server,
    - and both the back-end and front-end are unreachable.
    - I've unsuccessfully tried mapping via the Admin area and via the DB.

    Not sure how to fix these issues at this point. Any suggestions?

    Best, Jasper

  • Ash

    Hello Jasper

    Let me explain your issues:

    - I can create a new site via the Network Admin,

    -- Good thing :)

    - and that does create tables in the database,

    -- Positive news :)

    - but it doesn't create a folder on the server,

    That's correct. A subsite doesn't create any new folder; it is served from the same WordPress installation using the same database. No physical folder is created in the root at all.

    - and both the back-end and front-end are unreachable.

    You don't have a wildcard subdomain created. Please create a wildcard subdomain; it must point to the same directory as the main site, and this is an important part.

    - I've unsuccessfully tried mapping via the Admin area and via the DB.

    Don't map via the DB; it might get messed up. Once your subsite is up after doing the step above, we will do the mapping using Domain Mapping :)

    Have a nice day!

    Cheers,
    Ash

  • Jasper

    Hi Ash,

    The wildcard is created, and the new site is accessible (front and back), but the sitemap issue remains.

    - When I network activate SmartCrawl it shows as menu item in the subsites, but it also remains visible in the Network Admin area.
    - I've had some instances where I wanted to change settings at the subsite level and got 'not allowed' errors, although I have trouble recreating those errors.
    - After cloning a site I also have trouble saving changes to pages.

    Could it have something to do with sunrise.php, which is not needed/active at the moment?

    It seems to have something to do with permissions, but I'm not sure how to test or fix this. Thoughts?

    Best, Jasper

  • Ash

    Hello Jasper

    I assume it's something with your nginx config. If you check the subsite URLs:
    http://kwalite********erwerk.nl/sitemap.xml
    https://kwalit********adoor.nl/sitemap.xml

    You will see a white screen. If the URLs are wrong, you should see a 404 error, like:
    http://kwalite********erwerk.nl/sitemap2.xml
    https://kwalit********adoor.nl/sitemap2.xml

    I changed the URL to sitemap2.xml and it shows a 404 error. So when you use sitemap.xml, the server finds the file, but for some reason it is not rendered in the browser.

    Do you remember what edits you made in the nginx config file?

    Cheers,
    Ash

  • Jasper

    Hi Ash,

    Below are the contents of the nginx.conf file. The rewrite rule from before was removed at an earlier stage. All the other contents were put there by my server administrator.

    # Auto generated nginx config file by DirectAdmin version 1.52.1
    # Modifying this file is not recommended as any changes you make will be
    # overwritten when the user makes any changes to their website

    # For global config changes that affect all Users, see this guide:
    # http://help.directadmin.com/item.php?id=558
    # For local config changes that only affect one User, see this guide:
    # http://help.directadmin.com/item.php?id=3

    server
    {
    try_files $uri $uri/ /index.php?$args;
    # Expire rules for static content
    # cache.appcache, your document html and data
    location ~* \.(?:manifest|appcache|html?|xml|json)$ {
    expires -1;
    # access_log logs/static.log; # I don't usually include a static log
    }
    # Media: images, icons, video, audio, HTC
    location ~* \.(?:jpg|jpeg|gif|png|ico|cur|gz|svg|svgz|mp4|ogg|ogv|webm|htc)$ {
    expires 1M;
    access_log off;
    add_header Cache-Control "public";
    }
    # CSS and Javascript
    location ~* \.(?:css|js)$ {
    expires 1d;
    access_log off;
    add_header Cache-Control "public";
    }
    location ~* \.(eot|otf|ttf|woff|woff2)$ {
    add_header Access-Control-Allow-Origin *;
    }
    location ~ ^/(status|ping)$ {
    access_log off;
    allow 127.0.0.1;
    allow 37.97.235.106;
    include /etc/nginx/fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass unix:/usr/local/php70/sockets/famme.sock;
    }
    listen 37.247.40.62:80;
    server_name bouwexpert.online www.bouwexpert.online kwaliteitsschilderwerk.nl www.kwaliteitsschilderwerk.nl kwaliteitsstukadoor.nl www.kwaliteitsstukadoor.nl;
    access_log /var/log/nginx/domains/bouwexpert.online.log;
    access_log /var/log/nginx/domains/bouwexpert.online.bytes bytes;
    error_log /var/log/nginx/domains/bouwexpert.online.error.log;
    root /home/bouwonline/domains/bouwexpert.online/public_html;
    index index.php index.html index.htm;
    include /usr/local/directadmin/data/users/bouwonline/nginx_php.conf;
    location /
    {
    # access_log off;
    proxy_buffering off;
    proxy_pass http://37.247.40.62:8080;
    proxy_set_header X-Client-IP $remote_addr;
    proxy_set_header X-Accel-Internal /nginx_static_files;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_hide_header Upgrade;
    }
    location /nginx_static_files/
    {
    # access_log /var/log/nginx/access_log_proxy;
    alias /home/bouwonline/domains/bouwexpert.online/public_html/;
    internal;
    }
    include /etc/nginx/webapps.conf;
    }

    server
    {
    try_files $uri $uri/ /index.php?$args;
    # Expire rules for static content
    # cache.appcache, your document html and data
    location ~* \.(?:manifest|appcache|html?|xml|json)$ {
    expires -1;
    # access_log logs/static.log; # I don't usually include a static log
    }
    # Media: images, icons, video, audio, HTC
    location ~* \.(?:jpg|jpeg|gif|png|ico|cur|gz|svg|svgz|mp4|ogg|ogv|webm|htc)$ {
    expires 1M;
    access_log off;
    add_header Cache-Control "public";
    }
    # CSS and Javascript
    location ~* \.(?:css|js)$ {
    expires 1d;
    access_log off;
    add_header Cache-Control "public";
    }
    location ~* \.(eot|otf|ttf|woff|woff2)$ {
    add_header Access-Control-Allow-Origin *;
    }
    location ~ ^/(status|ping)$ {
    access_log off;
    allow 127.0.0.1;
    allow 37.97.235.106;
    include /etc/nginx/fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass unix:/usr/local/php70/sockets/famme.sock;
    }
    listen 37.247.40.62:443 ssl http2;
    server_name bouwexpert.online www.bouwexpert.online kwaliteitsschilderwerk.nl www.kwaliteitsschilderwerk.nl kwaliteitsstukadoor.nl www.kwaliteitsstukadoor.nl;
    access_log /var/log/nginx/domains/bouwexpert.online.log;
    access_log /var/log/nginx/domains/bouwexpert.online.bytes bytes;
    error_log /var/log/nginx/domains/bouwexpert.online.error.log;
    root /home/bouwonline/domains/bouwexpert.online/private_html;
    index index.php index.html index.htm;
    ssl on;
    ssl_certificate /usr/local/directadmin/data/users/bouwonline/domains/bouwexpert.online.cert.combined;
    ssl_certificate_key /usr/local/directadmin/data/users/bouwonline/domains/bouwexpert.online.key;
    include /usr/local/directadmin/data/users/bouwonline/nginx_php.conf;
    location /
    {
    # access_log off;
    proxy_buffering off;
    proxy_pass https://37.247.40.62:8081;
    proxy_set_header X-Client-IP $remote_addr;
    proxy_set_header X-Accel-Internal /nginx_static_files;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_hide_header Upgrade;
    }
    location /nginx_static_files/
    {
    # access_log /var/log/nginx/access_log_proxy;
    alias /home/bouwonline/domains/bouwexpert.online/private_html/;
    internal;
    }
    include /etc/nginx/webapps.ssl.conf;
    }

  • Jasper

    Hi Ash,

    Below are the (earlier) replies from my server administrator:

    "I toggled your server from nginx to apache so that your htaccess rules work, that has no further influence on the operation of the site."

    "The reason I switched is because I have read that your multi site implementation works better with an Apache core, to still use the Nginx-speed-gain I use Nginx in a reverse proxy."

    "The adaptation that I have made makes it possible to use .htaccess files which make the implementation of the xml sitemap easier (since 90% of the Wordpress plugins assume that you are using Apache). Ask the supplier of the plug-in for the htaccess code that you need to use for the xml site map and implement it in the .htaccess file which is located in the document root of your account. This should solve the problem (if the rewrite rules are the problem, because I had previously done the nginx implementation for you as they passed it to you and unfortunately this did not work)."

    --- .htaccess file ---

    # BEGIN WordPress
    <IfModule mod_rewrite.c>

    RewriteEngine On
    RewriteBase /
    RewriteRule ^index\.php$ - [L]

    # Force HTTPS
    RewriteCond %{HTTPS} off
    RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

    # add a trailing slash to /wp-admin
    RewriteRule ^wp-admin$ wp-admin/ [R=301,L]

    RewriteCond %{REQUEST_FILENAME} -f [OR]
    RewriteCond %{REQUEST_FILENAME} -d
    RewriteRule ^ - [L]
    RewriteRule ^(wp-(content|admin|includes).*) $1 [L]
    RewriteRule ^(.*\.php)$ $1 [L]
    RewriteRule . index.php [L]

    </IfModule>

    # END WordPress

  • Jasper

    Hi Ash,

    Sorry for all the replies, but since we cannot chat and there's a day between every reply, I need to give as much info as possible; otherwise we keep running in circles. I just did a recrawl after deleting some pages from bouwexpert.online, and:

    1. It doesn't change the sitemap.
    2. When I visit the sitemap it shows CSS styling, which I turned off in the settings.
    3. The crawls on the other sites say they only see 1 URL, which is wrong.

    So there's more to it than just rendering.

    Thanks, Jasper

  • Coastal Data

    Hello,

    I have also run into this problem, and I think there is a conflict with the sitemap generated by Jetpack... In my case, turning off the Jetpack sitemap was the trick that fixed the problem. Also, if you use another SEO plugin like Yoast, you might have a conflict there as well.

    To disable the Jetpack sitemap, go to WP admin, then Jetpack -> Settings, and click on the Traffic tab. Now scroll down to the Sitemaps section and turn off the switch labeled "Generate XML sitemaps".

    The problem may or may not be resolved right away... If not, go back to the main Dashboard, where you should have the Sitemaps widget... click "Update sitemap now", and it should then be resolved.

    Let me know if this helps!

    --Jon

  • Lindeni Mahlalela

    Hello Jasper

    I hope you are doing great today. I am sorry for the delayed response from our side and thank you for your patience while we were looking into this.

    I have checked your websites and the settings relating to SmartCrawl and compared them with those of my own setup; everything seems to be in order with the plugin itself. When accessing the sitemap.xml for a mapped domain (one of the subsites), I see that it loads like this:

    Checking the Web Console in the browser I see the following error:

    Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at https://multisite-domain.online/wp-content/plugins/wpmu-dev-seo/includes/admin/templates/xsl/xml-sitemap.xsl. (Reason: CORS header 'Access-Control-Allow-Origin' missing).

    So it seems the sitemap is failing to load due to a violation of the Cross-Origin Resource Sharing (CORS) policies set on your website. I then checked your .htaccess file and noticed that you have set your website to allow CORS for specific file types, but the type of the stylesheet requested (.xsl) is not among the types allowed to be shared across domains.

    Could you please try to include the .xsl extension in the list of file types to be shared across domains, so that this type gets the header:

    <FilesMatch "\.(xsl)$">
        Header set Access-Control-Allow-Origin "*"
    </FilesMatch>

    Please verify and use the correct syntax; you may add another directive or edit an existing one. It's up to you or your webmaster.

    Once you have done this, please test and let us know if you are able to access the sitemap or not. Please let us know if you still need any further assistance.

    Have a nice day.
    Mahlamusa

    • Jasper

      Hi Mahlamusa,

      Thanks for your answer, I was getting worried you had given up :). Anyway, I've tried your solution, but it doesn't change anything.

      The sitemaps don't show in Firefox, but in Chrome they do (the page is white, but you can see it in the source), and the generated sitemaps are not right, so I think SmartCrawl is not working as it's supposed to:

      1. All the sitemaps are the same, so crawling is not working right (they all show an old version of the bouwexpert.online site (main site) with pages that don't exist).
      2. I've unchecked 'add css stylesheet' everywhere, but it is still showing one, so that doesn't seem to work either.

      FYI: initially we used your Domain Mapping tool, but we removed it after we ran into some issues and found out WP offers that functionality itself already. Perhaps that has something to do with it?

      --- (part of) wp-config ---

      /* Security stuff */
      define('DISALLOW_FILE_EDIT', true );
      define('FORCE_SSL_ADMIN', true);

      /* Multisite stuff */
      define('MULTISITE', true);
      define('SUBDOMAIN_INSTALL', true);
      define('DOMAIN_CURRENT_SITE', 'bouwexpert.online');
      define('PATH_CURRENT_SITE', '/');
      define('SITE_ID_CURRENT_SITE', 1);
      define('BLOG_ID_CURRENT_SITE', 1);
      define('COOKIE_DOMAIN', $_SERVER['HTTP_HOST']);
      //define('SUNRISE', 'on');

      /* Plugin stuff */
      define('WDS_SITEWIDE', false); // Because of WP Smartcrawl

      ---

      P.S.: if we can schedule a chat-session or something that would be great, because this is not really moving forward. A lot of time is wasted.
      P.S.2: could it have something to do with the settings at Admin level (see image)?

  • Lindeni Mahlalela

    Hello Jasper

    I hope you are doing great today. Thank you for your feedback, and I am sorry that this is taking so long to resolve; I hope we get it sorted soon.

    The sitemaps don't show in Firefox, but in Chrome they do (the page is white, but you can see it in the source), and the generated sitemaps are not right, so I think SmartCrawl is not working as it's supposed to

    I confirm that this is right; it doesn't work as it should. The question is why it is not working. I have done further investigation and found that crawling the website times out. I ran a check on the website and also scheduled a daily crawl to see how it goes. According to my findings, it does not finish crawling, and as a result it does not update the sitemaps. The one error I found was related to a timeout.

    The scheduled crawl has already run twice, if I am not mistaken. Please check your emails to see if you get any reports about completed crawls from SmartCrawl. If you haven't received any, then that is the reason why the sitemaps are not being updated; if you have received the emails, then it might be a bug in SmartCrawl that we will have to investigate from our side.

    I also confirmed that the sitemap's source code can be seen in Chrome. The issue I am seeing is that it still tries to load the stylesheet and can't find it; I believe this is the reason it does not render in Chrome either. Chrome does not output anything to the console, but I am positive this is related to the CORS issue I mentioned earlier.

    I've unchecked 'add css stylesheet' everywhere, but it is still showing one, so that doesn't seem to work either.

    We are still looking into this. I am not sure why it behaves this way, but I think it may clear up once it successfully runs a complete crawl and updates everything.

    Did this work fine before removing Domain Mapping? What happens if you temporarily enable Domain Mapping?

    P.S.: if we can schedule a chat-session or something that would be great, because this is not really moving forward. A lot of time is wasted.

    I am not on the Live Chat team; they escalated this because it is too complex to deal with in Live Chat.

    Could you perhaps share cPanel access with us so we can try to solve the CORS issue and see if it has anything to do with this? I strongly believe that something will start working once the CORS issue has been solved. I say we need cPanel access because FTP is not enough for the things I want to try, and it will be much easier for me to just try something out than to go back and forth trying to guide you.

    You may share your cPanel login details via our secure contact form. In the contact form, choose "I have a different question", write "Attn: Lindeni Mahlalela" in the subject, and in the message box include:

    - cPanel login info (login url, username, password)
    - Any other relevant information (optional)
    - Link back to this thread for reference

    Once I have that, I will run some tests and see if we can resolve this soon enough. I am sorry that this is taking so long, and I can't imagine how frustrating it is for you, but I assure you that we are doing our best to resolve this issue as soon as possible. We need your cooperation as well, so please send the cPanel login info if you can and I will take it from there.

    Have a nice day.
    Mahlamusa

    • Jasper

      Hi Mahlamusa,

      Thanks for your reply. I'll send an email soon; note that we don't use cPanel but DirectAdmin.

      "Did this work fine before removing Domain Mapping? What happens if you temporarily enable Domain Mapping?"

      Domain Mapping is already completely deleted, but the sitemaps have never worked properly. Although I must say they never had my full attention, because there were more important things to be done :).

      "please check your emails to see if you get any reports about completed crawls from Smart Crawl"

      We do get 'SEO Audit Crawl completed' emails, but it looks like they are related to the Network Admin area; at least, that's where the link sends us. I believe that is actually the problem: it crawls the overall website/admin and not the subsites individually. Shouldn't the SmartCrawl tab be removed from the Network Admin area if you make it available for each subsite?

      With regard to the sitemap in the source code: it is the wrong sitemap. It also states the last checkup was 19/12/2017 11:21. This strengthens my belief above.

      I'm sending an email as well. Looking forward to your reply.

      Jasper

  • Lindeni Mahlalela

    Hello Jasper

    I hope you are doing great today. Thank you for your patience while we were working on this.

    I have accessed your website control panel and tried some changes to the files to try to solve the issues on the website. The sitemap shows that it has not been updated since 2017-12-21, so I checked the website to see what might cause this and found two potential culprits.

    I found that the max_execution_time variable was set to 1, which means the maximum time a script (including SmartCrawl) can run is one second. To solve this I added the following lines to the file '.user.ini' located in the public folder of your website:

    max_execution_time = 500
    memory_limit = 512M

    The first line raises the allowed execution time to 500 seconds and the second one allows scripts to use up to 512M of memory. I have also noticed in the error_log file that memory was an issue at some point for other scripts, which is likely to be the case for SmartCrawl as well.

    The reason why the sitemaps are not updated could be that SmartCrawl reaches the maximum execution time or runs out of memory before it can update the sitemaps or finish execution. So raising these limits might help.
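
    If you want to double-check that PHP-FPM actually picked up the new values from '.user.ini', a tiny throwaway script works. This is just a rough sketch; the file name is only an example, and you can delete it right after testing. Note that PHP caches '.user.ini' for about five minutes by default (user_ini.cache_ttl), so the change can take a moment to show up:

    <?php
    // check-limits.php - temporary helper; place it next to wp-config.php and open it in the browser.
    // .user.ini values are applied per directory by PHP-FPM, so this file must live under the same folder.
    header( 'Content-Type: text/plain' );
    echo 'max_execution_time: ' . ini_get( 'max_execution_time' ) . "\n";
    echo 'memory_limit: ' . ini_get( 'memory_limit' ) . "\n";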

    Unfortunately, the CORS issue remains unsolved. I tried to add some directives to the .htaccess file to enable CORS, but all my attempts were unsuccessful. The .htaccess file on your website seems to have the directives already in place, but they don't appear to take effect. So, with your permission, I can temporarily override all these directives and try something that already works on one of my test sites for the same issue. If that does not work, I would suggest that you contact your host about this and see if they can help with the CORS issue.

    I have a strong belief that there is something wrong in the .htaccess or the server's configuration. We can confirm this by removing all the custom directives in the .htaccess file and replacing them with the default WordPress directives; if that does not help, then the issue can only be solved by your host.

    I hope this helps, please let us know if you have any questions or feedback.

    Have a nice day.
    Mahlamusa

    • Jasper

      Hi Mahlamusa,

      Thanks, it does now say the last crawl was recent, but the sitemap is still wrong, and the URL crawler still gives an error: "timed out due to an unknown error".

      So with your permission I can temporarily override all these directives and try something that already works on one of my test sites for the same issue.

      You have my permission. Please communicate when you will try this, so I can make sure I'm not working on anything at that moment.

      We can confirm this by removing all the custom directives in the .htaccess file and replacing them with the default WordPress directives; if that does not help, then the issue can only be solved by your host.

      Let's try your solution first. Afterwards we can check whether it works with the default WP directives. If that doesn't work I'll contact my server administrator, because then we'll know a lot more :).

      Best, Jasper

  • Lindeni Mahlalela

    Hello Jasper

    I hope you are doing great today. I am sorry for any delays and inconvenience caused. I was working on your website and tried various configurations to see if anything would work.

    Firstly, I have verified that the sitemap.xml in the root of the WordPress install is the incorrect one, as SmartCrawl does not save the sitemap in that location for multisite. This makes me think that this site must previously have been a single-site install, which caused SmartCrawl to save the sitemap in the root of the 'public_html' folder, or some URL rewrites caused SmartCrawl to save the sitemap in that folder.

    Right now in Multisite Mode, SmartCrawl will save the sitemaps in the folders as follows:

    Main Site (ID=1): wp-content/uploads/sitemap.xml
    Subsite ID=2: wp-content/uploads/sites/2/sitemap.xml
    Subsite ID=3: wp-content/uploads/sites/3/sitemap.xml
    ...
    Subsite ID=N: wp-content/uploads/sites/N/sitemap.xml

    I have double-checked and found that the sitemaps are being generated in those locations, but somehow the rewrite rules are not working properly to point to the correct sitemap for each of the domains. If I type the address for a site directly, like http://example.com/wp-content/uploads/sites/2/sitemap.xml, then I get the correct and updated sitemap for that particular subsite.
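
    If you want to verify this on your end, the expected location for each site can be listed from WordPress itself, since the sitemaps appear to be written into each site's uploads base directory as listed above. This is only a rough sketch under that assumption; the file name is an example, and you can run it as a temporary mu-plugin or via WP-CLI and then remove it:

    <?php
    // list-sitemap-paths.php - temporary helper for a multisite network.
    // Prints the expected sitemap.xml path and whether the file exists, for every site.
    foreach ( get_sites() as $site ) {
        switch_to_blog( (int) $site->blog_id );      // switch context to this site
        $uploads = wp_upload_dir();                  // .../uploads or .../uploads/sites/N
        $path    = trailingslashit( $uploads['basedir'] ) . 'sitemap.xml';
        error_log( sprintf( '%s => %s (%s)', get_home_url(), $path, file_exists( $path ) ? 'exists' : 'missing' ) );
        restore_current_blog();                      // switch back to the original site
    }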

    I have also created a code snippet to forcefully generate the sitemap when a page is published or deleted; this snippet was taken from SmartCrawl. SmartCrawl already has the option to regenerate the sitemaps when a page is published or deleted, but I used the code to double-check whether that part of the code is actually executed. By logging the data returned by SmartCrawl when the sitemap is generated, I confirmed that it is indeed generated and saved in the locations mentioned above.

    The code, if you need it, is as follows:

    //generate sitemap when post is published or deleted
    add_action('delete_post',   'wds_update_xml_sitemap');
    add_action('publish_post',  'wds_update_xml_sitemap');
    
    //generate sitemap when page is published or deleted
    add_action('delete_page',   'wds_update_xml_sitemap');
    add_action('publish_page',  'wds_update_xml_sitemap');
    
    function wds_update_xml_sitemap () {
        if( class_exists('WDS_XML_Sitemap') ) {
            $sitemap = new WDS_XML_Sitemap;
        }
    }

    I have uploaded this to your website for testing, at:

    wp-content/mu-plugins/xml-sitemap-regenerate.php

    but I renamed it to 'xml-sitemap-regenerate.php.backup' so that it will not run every time, because SmartCrawl already has this functionality; if you do wish to use it at some point, you can rename it back to 'xml-sitemap-regenerate.php'. This is an mu-plugin (Must Use Plugin), which means that as long as it is a .php file it will be automatically loaded and executed by WordPress when WordPress initializes. To make sure it is not executed, just delete it or rename it to another extension such as .backup, as I did.

    After confirming that the sitemaps are generated, only one issue remains: the rewrite rules do not seem to work properly. To be honest, I am not sure what causes the rewrite rules not to work, but I have put together the following Nginx config that should be added to your nginx config file. I have masked the domains (I will send the full directive via email):

    First, list the domains separated by spaces in the server_name directive, without any http:// or https:// scheme, like so:

    server_name domain1.com www.domain1.com domain2.com www.domain2.com;

    Then, just below that, add if conditions that match the domain and rewrite to the proper sitemap.xml. Add a condition for each domain:

    location /sitemap.xml {
        if ($host ~* ^(www\.)?domain1\.com$) { rewrite ^ $scheme://$host/wp-content/uploads/sitemap.xml; }
        if ($host ~* ^(www\.)?domain2\.com$) { rewrite ^ $scheme://$host/wp-content/uploads/sites/2/sitemap.xml; }
    }

    The final server directive should look like this:

    server {
        server_name example.com www.example.com example2.online example3.nl www.example3.nl example4.nl www.examplewerk.nl;
        location /sitemap.xml {
            if ($host ~* ^(www\.)?example\.online$) { rewrite ^ $scheme://$host/wp-content/uploads/sitemap.xml; }
            if ($host ~* ^(www\.)?example3\.nl$) { rewrite ^ $scheme://$host/wp-content/uploads/sites/2/sitemap.xml; }
            if ($host ~* ^(www\.)?examplewerk\.nl$) { rewrite ^ $scheme://$host/wp-content/uploads/sites/5/sitemap.xml; }
            if ($host ~* ^(www\.)?example\.nl$) { rewrite ^ $scheme://$host/wp-content/uploads/sites/6/sitemap.xml; }
        }
    }

    After that, restart or reload Nginx and test the sitemaps by accessing http://domain.com/sitemap.xml to see whether it loads the correct or redirected sitemap. These rewrite rules can be modified to suit your website's needs and should be scrutinized by your server administrator before being added.

    For the sitemap that is currently loaded, I have removed the XML stylesheet that was causing the CORS issues I mentioned earlier; this ensures that at least that sitemap loads properly, even though it is not the correct sitemap generated by SmartCrawl.

    I hope this helps, please let us know if you need any further assistance with regards to this.

    Have a nice day.
    Mahlamusa

  • Lindeni Mahlalela

    Hello Jasper

    I hope you are doing great today. Thank you for your response via email.

    AH00526: Syntax error on line 23 of /USER_DIRECTORY/***line/httpd.conf:
    Invalid command 'location', perhaps misspelled or defined by a module not included in the server configuration

    I am sorry, but it seems you have added this to your Apache virtual host; it should go into your nginx.conf instead of httpd.conf, or else your server admin should convert it to match the Apache configuration. The issue reported by the error message is that 'location' is an invalid directive in Apache's httpd.conf file; these rules must be added to Nginx's config.

    If it does not work in the Nginx config, then those rules must be converted into Apache directives. Unfortunately, that is server related and out of the scope of this support forum. What I would suggest is that you turn off the Nginx proxy for a while and test the sitemaps without it; maybe it will work without the proxy. Also, as I mentioned before, my attempts to add rules to the .htaccess were unsuccessful, so I suggest you temporarily turn off the proxy if possible and adjust the rewrite rules so they can be used with Apache. I know these rules were written for Nginx, but if your administrator wants to use them in Apache, he must rewrite them and replace the Nginx directives with Apache ones.

    And my server administrator believes it has nothing to do with nginx, because that is only used as a reverse proxy. According to him it should be fixed somewhere in the .htaccess file, but he has no idea what to do. Could the code you suggested be added to the .htaccess file somehow?

    That code is written to be added to the nginx.conf file you shared earlier in this thread, not .htaccess and not Apache's httpd.conf file. Please ask your server administrator to try it in nginx.conf; he must double-check the rules before adding them to the server's config.

    It is also possible to achieve the same thing in .htaccess, but that would require some custom rewrites and is beyond the scope of this support forum.

    I hope this helps, please let us know if you have any further questions and we will help in any way possible.

    Have a nice day.
    Mahlamusa
