403 error from Google

Around a month ago a site I manage, rainbowfx.co.uk, started blocking Google with a 403 error. It has only just been spotted because the site has dropped out of the rankings. I can't seem to find a way to fix this: the robots.txt file is fine, and I have checked with the hosting company to see if it is anything they have done or can spot.
I have the site running through CloudFlare, and I have tried deactivating that with no luck.
The next step was deactivating all plugins, and Google is still being blocked.

To add to the mystery, the site had vanished from my Webmaster Tools account too.

  • Dimitris

    Hey there Scott,

    Hope you're doing well, and thanks for reaching out to us!

    I tried a Google search for "site:rainbowfx.co.uk" and clicking the links to your website worked fine on my end. Is this what's broken on your end, or am I missing something else here? Please advise!

    In case this is happening only on your end, try the workarounds in the following links:
    https://www.techwalla.com/articles/fix-google-forbidden-403-error
    https://support.google.com/chrome/answer/95647?source=gsearch&hl=en&vid=0-1307323382462-1491562030819

    Also, you should be able to reconnect the website in Webmaster Tools as long as you have access to its server; just go ahead and re-add it.
    I really can't tell how it disappeared, though.

    Warm regards,
    Dimitris

  • Scott

    Hi Dimitris, thanks for getting back to me so quickly!

    The 403 errors are from the actual Googlebot. The site has been live and running fine for a few years now, but something in February started blocking the bots.

    I set Webmaster Tools back up yesterday and tried to submit the sitemap, a Fetch, and a Fetch and Render, all of which returned a 403 error.
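
    A quick way to double-check this from the command line (just a sketch, assuming curl is available) is to compare a request sent with a spoofed Googlebot user agent against a plain one:

    # Request with a Googlebot user-agent string
    curl -I -A "Googlebot/2.1 (+http://www.google.com/bot.html)" http://rainbowfx.co.uk/
    # Same request with curl's default user agent
    curl -I http://rainbowfx.co.uk/

    If only the first request comes back 403, whatever is doing the blocking is keying on the user-agent string.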

    The guy who owns the website was on the phone to Google yesterday too, as it would not let him run AdWords due to the 403; they just asked him to get the site to stop blocking Googlebot.

  • Adam Czajczyk

    Hello Scott,

    I checked your site and I can see that both Yoast (inside your site) and external tools (the bot simulator at http://botsimulator.com and Google's mobile-friendly test at https://www.google.com/webmasters/tools/mobile-friendly/) report a 403 Forbidden status on the site.

    I reviewed the site's settings but I don't see anything there that could be causing this. You mentioned that "the robots.txt file is fine". Does that mean the "robots.txt" file physically exists on your server? If so, could you please share its contents with me? I'd also like to see the contents of the ".htaccess" file.

    Please also check this article and follow the "General guidelines" part to make sure there are no blocks on the CloudFlare side:

    https://support.cloudflare.com/hc/en-us/articles/200169806-I-m-getting-Google-Crawler-Errors-What-should-I-do-

    I know you already tried disabling CloudFlare, but even so it would be good to make sure nothing is blocking bots on their end, as CloudFlare may trigger some caching (on the browser end) that takes effect for a while; that should not affect bots, but it's better to be sure.

    Finally, could you please double-check in cPanel (or whichever hosting panel you use) that there are no limits set on bot access there? I think it's worth a shot, and if you don't find anything it would be good to ask your host whether they have put any limits in place on the server side. Would you please get in touch with them and ask?
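
    If you have SSH access, the server's raw access log can also show how Googlebot's requests are being answered; this is just a sketch, assuming a typical Apache log location (your host's actual path may differ):

    # List the most recent Googlebot requests along with their HTTP status codes
    grep -i googlebot /var/log/apache2/access.log | tail -n 20

    If those entries are logged with a 403 by the server itself, the block is happening on the server rather than at CloudFlare.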

    Please keep me informed.

    Kind regards,
    Adam

  • Scott

    Hi Adam, thanks for the reply.

    The robots.txt file is blank; these are the contents of the .htaccess:

    RewriteCond %{HTTP_USER_AGENT} bot [NC]
    RewriteRule . - [F,L]

    # Use PHP 5.3 as default
    # AddHandler application/x-httpd-php53 .php

    ## EXPIRES CACHING ##
    <IfModule mod_expires.c>
    ExpiresActive On
    ExpiresByType image/jpg "access plus 1 year"
    ExpiresByType image/jpeg "access plus 1 year"
    ExpiresByType image/gif "access plus 1 year"
    ExpiresByType image/png "access plus 1 year"
    ExpiresByType text/css "access plus 1 month"
    ExpiresByType application/pdf "access plus 1 month"
    ExpiresByType text/x-javascript "access plus 1 month"
    ExpiresByType application/x-shockwave-flash "access plus 1 month"
    ExpiresByType image/x-icon "access plus 1 year"
    ExpiresDefault "access plus 2 days"
    </IfModule>
    ## EXPIRES CACHING ##
    # compress text, html, javascript, css, xml:
    AddOutputFilterByType DEFLATE text/plain
    AddOutputFilterByType DEFLATE text/html
    AddOutputFilterByType DEFLATE text/xml
    AddOutputFilterByType DEFLATE text/css
    AddOutputFilterByType DEFLATE application/xml
    AddOutputFilterByType DEFLATE application/xhtml+xml
    AddOutputFilterByType DEFLATE application/rss+xml
    AddOutputFilterByType DEFLATE application/javascript
    AddOutputFilterByType DEFLATE application/x-javascript
    AddType x-font/otf .otf
    AddType x-font/ttf .ttf
    AddType x-font/eot .eot
    AddType x-font/woff .woff
    AddType image/x-icon .ico
    AddType image/png .png
    # BEGIN WordPress
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteRule ^index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>

    # END WordPress

    # BEGIN WP-HUMMINGBIRD-CACHING

    # END WP-HUMMINGBIRD-CACHING
    ## WP Defender - Prevent information disclosure ##
    Options -Indexes
    <FilesMatch "\.(txt|md|exe|sh|bak|inc|pot|po|mo|log|sql)$">
    Order allow,deny
    Deny from all
    </FilesMatch>
    <Files robots.txt>
    Allow from all
    </Files>
    ## WP Defender - End ##
    # BEGIN WP-HUMMINGBIRD-GZIP

    # END WP-HUMMINGBIRD-GZIP

    I'm just having a look at CloudFlare now to see if there is anything there. I spoke with my host earlier, who couldn't see anything on the account that would be blocking the bots.

  • Adam Czajczyk

    Hello Scott,

    Thank you for your reply!

    An empty "robots.txt" file should not result in blocking any bots, as it's pretty much the same as if there were no robots.txt at all. Since it's empty, you may safely remove it.
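
    For reference, an explicitly permissive robots.txt would be just the two lines below (an empty Disallow line means nothing is disallowed), which is equivalent in effect to the blank file you have now:

    User-agent: *
    Disallow: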

    The .htaccess file doesn't seem to block crawlers either, except for some files that shouldn't be crawled anyway, so that also seems fine.

    Once you've checked Cloudflare, please let me know if you find anything there.

    Best regards,
    Adam

  • Scott

    Hi there, I'm still having this issue.

    My hosting company has been looking into this too; they have not found any issues with the robots.txt file, the .htaccess file, or the plugins on the site, and there are no firewalls or anything else blocking the bots on the server. I have deactivated the CDN to rule that out.

    We modified the robots.txt file to:
    User-agent: Mediapartners-Google
    Allow: /

    and then I changed it to:
    User-agent: *
    Allow: /

    I thought this bit of my .htaccess file looked like a possible culprit:
    ## WP Defender - Prevent information disclosure ##
    Options -Indexes
    <FilesMatch "\.(txt|md|exe|sh|bak|inc|pot|po|mo|log|sql)$">
    Order allow,deny
    Deny from all
    </FilesMatch>
    <Files robots.txt>
    Allow from all
    </Files>
    ## WP Defender - End ##

    So I have removed it.

    My client has also spoken to Google on the phone:
    "within Webmaster Tools you would be able to see which pages were getting blocked"
    ... not much help there.

    Other things I have checked:
    Whether any other sites on my server are blocked = no
    Reading settings in WordPress = all fine
    Adding both the www. and non-www. versions to Webmaster Tools = both return a 403 for Googlebot

  • Dimitris

    Hey there Scott,

    Hope you're doing well today!

    Could you please try removing the following lines from the .htaccess file? The condition matches any user agent that contains "bot" (the [NC] flag makes it case-insensitive), and the rule then answers every matching request with a 403 Forbidden (the [F] flag), so it catches Googlebot.

    RewriteCond %{HTTP_USER_AGENT} bot [NC]
    RewriteRule . - [F,L]
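
    If the intention behind that rule was to keep out particular scrapers rather than every crawler, a narrower pattern could be used instead. This is just a sketch, with "BadBot" and "EvilScraper" as placeholder names for whatever you actually want to block:

    # Block only the named bots ("BadBot" and "EvilScraper" are placeholders);
    # Googlebot and other legitimate crawlers are unaffected
    RewriteCond %{HTTP_USER_AGENT} (BadBot|EvilScraper) [NC]
    RewriteRule . - [F,L]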

    Let us know if that makes any difference at all!
    Warm regards,
    Dimitris

    PS. When you post code snippets here in our forums, please use the "code" tag for better formatting. It will make things much easier for us.