[SmartCrawl] Sitemap URL not including www as it should

Both WordPress Address (URL) and Site Address (URL) in Settings > General include the www in the URL, yet the Sitemap link in SmartCrawl > Sitemap is linking to a URL without the www.

We've also tried adding the correct sitemap address to robots.txt, with no change. The redirects in .htaccess correctly capture all other http or non-www URLs and redirect them to https://www.
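
For readers who want to reason about the redirect behaviour described above, here is a minimal Python sketch of the intended mapping (any http or non-www URL ends up on https://www). It is not the site's actual .htaccess logic: the function names `canonical_url` and `needs_redirect` and the `example.com` domain are hypothetical, and ports are ignored on the assumption that the site uses the default ones.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the https://www form of a URL (hypothetical helper).

    Assumption: default ports only, so any explicit port is dropped.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.startswith("www."):
        host = "www." + host
    # Force https and the www host; keep path, query and fragment unchanged.
    return urlunsplit(("https", host, parts.path, parts.query, parts.fragment))


def needs_redirect(url: str) -> bool:
    """True when the URL is not already in its canonical https://www form."""
    return canonical_url(url) != url


if __name__ == "__main__":
    for candidate in (
        "http://example.com/sitemap.xml",
        "https://example.com/sitemap.xml",
        "https://www.example.com/sitemap.xml",
    ):
        print(candidate, "->", canonical_url(candidate), needs_redirect(candidate))
```

A sitemap link that `needs_redirect` flags is one that would only resolve through a redirect, which is the symptom reported here.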

  • Adam Czajczyk
    • Support Gorilla

    Hello Liz

    I hope you're well today!

    I just checked the sitemap of your site and it seems that all the URLs now contain the "www" prefix, so I believe you managed to deal with it in the meantime. But if you still need assistance or I'm missing something, please let me know.

    Kind regards,
    Adam

  • Liz
    • New Recruit

    Hi, the problem seems to still be there when I do a site crawl. Several issues.
    1. There are still 161 redirect ("301") errors, which are because of switching from http to https. When I redirect to the https, it gets moved to the "ignored" list. Please fix this, or tell me what to do.
    2. While on the site crawl dashboard, in the area in green where it says, "Your sitemap is available at..." the link does NOT go to http://www.sitemap.xml. It goes to a mystery place.
    3. How do I fix the "404" errors?

  • Adam Czajczyk
    • Support Gorilla

    Hello Liz

    Thank you for your response!

    Since you have granted support access to the site, I checked and there are some things that have to be done:

    1. The site is running over SSL but is not fully configured for SSL. I see that you have some redirects to https, but the first and most important step is to set the "WordPress Address (URL)" and "Site Address (URL)" options on the "Settings -> General" page in the site's back-end so that they start with the "https://" prefix instead of the current "http://".

    This is a basic and fundamental setting that affects a lot of things: the "mixed content" issues, the way various assets are loaded, the way internal links of the site are generated, the URLs that are included in sitemaps etc.

    I'd suggest starting with this.

    2. Apparently there's sitemap generation enabled in Jetpack currently. That's why that sitemap link in SmartCrawl leads to a "mystery place". I'm not quite sure if it was enabled before as I remember seeing SmartCrawl sitemap when I was checking that. Anyway, this feature of Jetpack is conflicting with SmartCrawl and it should be disabled. You can disable it on "Jetpack -> Settings" page in "Traffic" tab in "Sitemaps" section.

    3. There are some "404" links listed in the "ignored" section of the SmartCrawl crawler, but these links are indeed 404 links: they lead to "Page not found", so it seems the pages do not exist. If a given page doesn't exist and the site returns "404 Page not found" instead of the requested page, then there's nothing to fix in SmartCrawl (or any other SEO-related plugin). The options are to keep those links ignored in the sitemap (as they currently are) or:

    - either finding where these links are used on the site and removing them from there so there are no "dead links" on the site
    - or adding back those missing pages.

    As for now, I'd suggest starting with these 3 things:

    1. make sure about that SSL setting (see point 1 above)
    2. disable sitemaps in Jetpack
    3. run SmartCrawl sitemap crawl (SmartCrawl -> Sitemap -> URL Crawler) again to re-scan the site and re-create sitemap

    After that we could check if there's anything else that could/should be done.

    Kind regards,
    Adam
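
Once the sitemap has been regenerated, the result of the three steps above can be spot-checked with a short script. The following is only a sketch in Python, not part of SmartCrawl or Jetpack: the function name `urls_missing_prefix`, the `example.com` domain and the sample XML are hypothetical, and it assumes the sitemap uses the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET
from typing import List

# Standard namespace for sitemap files (sitemaps.org protocol 0.9).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_missing_prefix(sitemap_xml: str, prefix: str) -> List[str]:
    """Return every <loc> URL in the sitemap that does not start with prefix."""
    root = ET.fromstring(sitemap_xml)
    locs = [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in locs if not url.startswith(prefix)]


# Hypothetical sample; in practice, paste or download the real sitemap contents.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>http://example.com/about/</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in urls_missing_prefix(SAMPLE, "https://www.example.com/"):
        print("Unexpected URL form:", url)
```

An empty result means every listed URL already uses the expected https://www form; anything printed points to a URL still generated from the old settings.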

  • Adam Czajczyk
    • Support Gorilla

    Hi Liz

    I just took a look at your site's sitemap and it looks fine - that's definitely a SmartCrawl map, properly validating and including the "www" prefix as expected! So, I guess it's fine now, right? :)

    I assume we can consider the case closed then, but it would be great if you could confirm, just so I'm sure :) And if you have any follow-up questions, I'll be happy to assist you further.

    Best regards,
    Adam
