No Index WP Content Folder

Hello,

I have noticed within Google Search Console that the number of pages which have been indexed but not submitted in sitemap has gone up no end.

It all seems to be the wp-content folder, which has been indexed when it wasn't before.

For example, this: https://solentforts.com/wp-content/uploads/cache/2016/05/MG_0847/?SD and if you right click and view the source, I need it to say no index somewhere.

I have tried Options -Indexes, but this created a 403 Forbidden error, which I didn't want. I just want no index on that specific folder without disallowing it.

Hopefully you can help

Regards

  • Adam Czajczyk
    • Support Gorilla

    Hello allanlove

    I hope you're well today and thank you for your question!

    I understand that you want to make sure that certain URLs/folders are not indexed by the Google search engine, correct? Just making sure that we're on the same page.

    If so, "Options -Indexes" is not the way to go, as it's meant for a completely different thing: it controls whether the server shows a listing of a directory's contents in the browser, not whether search engines are allowed to index it.

    Furthermore, you cannot add any meta to the image source (that "page source" that you see is not a "real" page source, it's just a "wrapper" added by the browser) and the file itself looks like a cached file.

    A way to handle that is a robots.txt file that tells search engines whether to access some URLs/paths or not. In the case of WordPress, the robots.txt file is "virtual", so it doesn't physically exist unless that's been changed. But you can use a real one as well.

    You'd want to create a file named "robots.txt" in the root folder of your WordPress install (the same folder where the "wp-config.php" file of the site is located) - or edit it if there already is one.

    To keep search engines from crawling the entire /wp-content folder, simply add these lines to it (the rule needs a "User-agent" line above it to apply):

    User-agent: *
    Disallow: /wp-content/*
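
    If you'd like to test a rule like this before uploading it, here is a minimal sketch using Python's standard-library robots.txt parser. The rules text and the helper name is_crawl_allowed are only assumptions for illustration. As far as I know, that parser does plain prefix matching and doesn't understand wildcards, so the sketch writes the path without the trailing "*" (for a prefix rule it means the same thing).

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, written without the trailing "*"
# because this parser matches paths by simple prefix.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-content/
"""


def is_crawl_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The URL from the original question and the site's home page.
cached_url = "https://solentforts.com/wp-content/uploads/cache/2016/05/MG_0847/?SD"
home_url = "https://solentforts.com/"

print("cached URL crawlable:", is_crawl_allowed(ROBOTS_TXT, cached_url))
print("home page crawlable:", is_crawl_allowed(ROBOTS_TXT, home_url))
```

    This only tells you how this particular parser reads the rules; Google's own crawler may treat edge cases differently.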

    Best regards,
    Adam

  • Adam Czajczyk
    • Support Gorilla

    Hi allanlove

    If it's already indexed, it will eventually get "de-indexed" but you'd probably also want to try to "speed that up" manually. There's a way to ask Google to remove indexed URLs:

    https://support.google.com/webmasters/answer/1663419?hl=en

    Be aware, please, that this does take time and there's no "100% reliable" way to speed the process up.

    Getting back to "Options -Indexes" then. I initially wrote that it's about "whether or not content of the directory can be displayed in browser". That's an oversimplification, I admit, but I assumed that by "indexing" you meant "including in search engine search results". And I'm slightly confused now about what you want to achieve.

    By "indexing" do you mean that:

    - the Google crawler must not be allowed to include /wp-content (and sub-content) in search results
    - or it should not even be allowed to visit it?

    In both cases, for Google (they do respect robots.txt) the robots.txt rule should do the trick, but you might additionally need to use the aforementioned URL removal tool. "Options -Indexes", on the other hand, only turns off automatic directory listings: any request (from Google or anyone else) for a folder that has no index file gets a 403 Forbidden status, which is exactly what it's supposed to do. Files inside the folder stay reachable by their direct URLs, so it's not a way to control indexing.

    So there isn't really anything that would "say" something like "please Google do not index these files" if you cannot control HTTP headers. There is a different thing, though, that you can try in addition to the above:

    - make sure that your server (since we're talking about .htaccess, it's Apache) supports the "mod_headers" module and has it enabled

    - if yes, you can try adding an additional .htaccess file to each folder that you wish to be "no index" (it also applies to that folder's sub-folders) and put this line in that file:

    Header set X-Robots-Tag "noindex, nofollow"

    For any direct request to any file in a folder with such an .htaccess, the server would then automatically send the "noindex, nofollow" header, so that should do the trick.
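
    To confirm that the header is really being sent, you could check the response from outside the server. Below is a rough sketch in Python using only the standard library; the function names has_noindex and fetch_x_robots_tag are made up for this example. It assumes a single X-Robots-Tag header with plain comma-separated directives, so it ignores the optional form where a directive is prefixed with a crawler name.

```python
from urllib.request import Request, urlopen


def has_noindex(header_value):
    """Return True if an X-Robots-Tag value contains the noindex directive."""
    if not header_value:
        return False
    directives = [part.strip().lower() for part in header_value.split(",")]
    return "noindex" in directives


def fetch_x_robots_tag(url):
    """Send a HEAD request and return the X-Robots-Tag header, or None if absent.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses (a 403, say),
    so a failed check shows up as an exception rather than a value.
    """
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        return response.headers.get("X-Robots-Tag")


# Offline check of the parsing logic with the value from the .htaccess line.
print(has_noindex("noindex, nofollow"))

# Against a live site (needs network access), something like:
#   has_noindex(fetch_x_robots_tag("https://example.com/wp-content/uploads/file.jpg"))
```

    The example.com URL in the comment is just a placeholder; you would put one of your own file URLs there.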

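    Still, regardless of whether you do just that, use it together with robots.txt, or even use Google's URL removal tool, it might take a significant amount of time for Google to remove already included URLs from its index. Keep in mind, too, that Google only sees this header on URLs it is allowed to crawl, so a robots.txt Disallow for the same path would keep it from reading the "noindex".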

    Kind regards,
    Adam
