How To Prevent Soft 404 Errors In Quick Cache

Posted on January 6, 2013 in Tips

This post explains how to stop Quick Cache from returning an incorrect 200 header response for a 404 (Page Not Found) error page, a problem known as a Soft 404 error. Quick Cache is an excellent and easy-to-use caching solution for WordPress (see our beginner’s guide), but it can return the wrong response code for a page that is not found.

The same problem may occur with other WordPress caching plugins such as WP Super Cache and W3 Total Cache. The exact fix will differ, but the principle is the same.

Brief Overview Of 404s – a 404 (Not Found) error page is what a visitor sees if they follow a broken or dead link to a page on your site that does not exist. It may be a standard 404 page or a custom page you created to offer better navigation options.

Behind the scenes, it should send a standard 404 HTTP response code which is what search engine crawlers expect – the 404 code tells them the page does not exist so that they do not index it and can move on to crawling the next page.

By contrast, 200 (OK) is the standard HTTP response code for a page that does exist (a normal page on your site) – if a search engine crawler receives this response it expects to find a valid page of content.

404s After Deleting Pages – apart from incorrect links, a common source of 404 errors is old pages that you delete from your site without redirecting them. Any subsequent visits to those pages (e.g. from Google searches) are sent to your 404 Page Not Found page.

Such 404s are of little concern if a page was intentionally removed. Eventually Google will remove those pages from its index, but this can take months. In the meantime you can see the 404 errors reported in Google Webmaster Tools \ Crawl Errors \ URL errors \ Not Found.

How Caching Can Cause 404 Errors – Quick Cache (and other caching plugins) may cause a problem in such cases. Let’s say you delete a page called OldPost – Quick Cache will flush the page from your cache and the next visitor to OldPost will be directed to your 404 page with a header response of 404 – correct. Note that the URL will still be OldPost (even though it displays your Page Not Found page) – so the user can see which URL it is that has not been found.

The problem occurs because, in the background, Quick Cache has just built a new cached copy of the OldPost URL, and that copy contains your Page Not Found page. The next user to visit OldPost will therefore be served the cached page and, crucially, because this cached page does exist, the header response will be 200 and not 404. This is known as a Soft 404 error, and search engines hate it. Understandably so: if a page does not exist, the server should return a 404 error code. Returning a code of 200 tells search engines that there’s a real page at that URL, even though there isn’t.

Soft 404s are problematic and could affect how well search engines crawl your site – see Google’s guide to why they are so bad. Note: Soft 404 is a technical error to do with the header response used by search engines – it is invisible to visitors who will see the same Page Not Found page regardless of whether it is cached or not.
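The difference between the two kinds of 404 can be put in code form. The sketch below is purely illustrative and is not part of Quick Cache or WordPress: the function name classify_response and its labels are hypothetical. It takes the status code the server sent and whether the requested page really exists, and names the outcome.

```php
<?php
/**
 * Hypothetical helper (not a Quick Cache or WordPress function).
 * Labels a response from the status code sent and whether the page exists.
 */
function classify_response(int $statusCode, bool $pageExists): string
{
    if ($pageExists) {
        // A real page should answer 200 (OK).
        return $statusCode === 200 ? 'ok' : 'other';
    }
    if ($statusCode === 404) {
        // Missing page, correct code: what crawlers expect.
        return 'hard 404';
    }
    if ($statusCode === 200) {
        // Missing page, but the server claims it is fine.
        return 'soft 404';
    }
    return 'other';
}

echo classify_response(404, false), "\n"; // first visit to OldPost after deletion
echo classify_response(200, false), "\n"; // later visit served from the cache
```

The first call corresponds to the uncached visit described above, the second to the cached one.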

How To Prevent Soft 404s in Quick Cache – the solution is either to stop Quick Cache from caching 404 error pages or to let it cache them but automatically insert the correct 404 header response. Unfortunately there’s no way to achieve this within the Quick Cache options, so the solution is a bit ugly and requires editing a PHP file.

Quick Cache maintains a file called advanced-cache.php, which is stored in the wp-content directory. We can edit it to fix the problem (via cPanel’s File Manager, if available).

Warning: the file cautions that it should not be edited (because it is rebuilt dynamically) but, unless the developer adds a menu option in Quick Cache, we see no other way to fix these Soft 404s. The main issue is that after editing this file, your changes will be lost if you subsequently alter any options in the Quick Cache menu settings (Options) i.e. you would have to reapply the fix.

However, most people won’t change Options very often and, fortunately, the manual ‘Clear Cache’ does not rebuild the PHP file, so your changes will be retained. It is still a good idea to back up advanced-cache.php first, though.
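If you prefer to script the backup rather than download the file by hand, something like the following would do. This is a minimal sketch under stated assumptions: backup_file is a hypothetical helper name, and the path in the commented usage line assumes the script runs from the WordPress root directory.

```php
<?php
/**
 * Hypothetical helper: copies a file to "<name>.bak-YYYYmmdd-HHMMSS"
 * next to the original and returns the path of the backup.
 */
function backup_file(string $path): string
{
    if (!is_file($path)) {
        throw new RuntimeException("File not found: $path");
    }
    $backupPath = $path . '.bak-' . date('Ymd-His');
    if (!copy($path, $backupPath)) {
        throw new RuntimeException("Could not copy $path to $backupPath");
    }
    return $backupPath;
}

// Assumed location relative to the WordPress root; adjust to your install.
// $backup = backup_file('wp-content/advanced-cache.php');
```

Because the copy does not end in .php, a web server may serve it as plain text, so consider moving it out of the public directory once it has been created.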

Fixing the Problem with a Code Snippet – we will edit advanced-cache.php by inserting one of the two code snippets below directly under “function ws_plugin__qcache_builder ($buffer = FALSE)”, inside the function body:

To stop Quick Cache from caching 404 error pages, insert this code:

if (is_404())
{
    return $buffer; /* Do NOT cache: output the 404 page unchanged. */
}

To let Quick Cache cache 404 error pages but automatically insert the correct 404 header response, insert this code instead:

if (is_404())
{
    header("Status: 404 Not Found"); /* Send a 404 status with the page. */
}

You will need to manually ‘Clear Cache’ for these changes to take effect on existing Soft 404 error pages. Both solutions ensure that pages that are not found return a 404 header response, not a 200, even if they are cached.
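To confirm that the fix has taken effect, you can check the response code of a deleted URL yourself. The sketch below is one possible way to do that, not something Quick Cache provides: status_code_from_line is a hypothetical helper, and the URL in the commented usage lines is a placeholder. It relies on PHP's get_headers(), whose first array element is the HTTP status line.

```php
<?php
/**
 * Hypothetical helper: extracts the numeric code from an HTTP status line
 * such as "HTTP/1.1 404 Not Found". Returns null if the line does not match.
 */
function status_code_from_line(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Usage with a placeholder URL. Request it twice, so that the second
// response is the one served from the cache:
// $headers = get_headers('http://example.com/oldpost/');
// if ($headers !== false) {
//     echo status_code_from_line($headers[0]), "\n"; // should be 404 after the fix
// }
```

If the second request still reports 200, the page is most likely an old cached copy, so clear the cache and try again.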

Personally, we prefer the second solution – to cache the page but insert the correct 404 code. The reason is that serving up cached 404 pages instead of building them dynamically may save significant time and server resources.

Conclusion

Caching of pages that do not exist can be problematic. The above code snippets are not an elegant solution but, in the absence of a simple menu option in Quick Cache, they resolve Soft 404 errors that could adversely impact the crawling of a site by search engines.
