ATOMSEO

Understanding Soft 404 Errors: Causes, Scenarios, SEO Impact, and Fixes

Soft 404 errors are a common issue that web admins encounter, and understanding them is crucial for maintaining a healthy website and good SEO performance. This article will explore what soft 404 errors are, how they differ from standard 404 errors, their impact on SEO, and how to fix them.

1. What is a Soft 404?

A soft 404 error occurs when a webpage returns a "200 OK" HTTP status code, indicating that the page has loaded correctly, but the content suggests that the page does not exist. This can happen if a page has little to no content or if the page displays a message like "Page not found" but still returns a "200 OK" status instead of a "404 Not Found" status.

2. Soft 404 vs. Standard 404

  • Standard 404 Error: A 404 error occurs when a webpage returns a "404 Not Found" status code. This indicates to users and search engines that the requested page does not exist.

  • Soft 404 Error: In contrast, a soft 404 error occurs when a page that should return a "404 Not Found" status instead returns a "200 OK" status. Search engines like Google treat these pages as errors because they essentially mislead the user and the search engine by pretending to be valid pages.
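
The distinction above can be sketched as a small classifier. This is an illustration only: the phrase list, the word-count threshold, and the function name `classify_response` are assumptions made for the example, not rules that any search engine publishes.

```python
import re

# Phrases that often appear on error pages; this list is an illustrative assumption.
NOT_FOUND_PHRASES = ("page not found", "nothing found", "no longer available")

# Below this many words a page is treated as "thin"; the value is an arbitrary assumption.
MIN_WORDS = 50


def classify_response(status_code: int, body_text: str) -> str:
    """Label a response as 'ok', 'hard_404', 'soft_404_candidate', or 'other'."""
    if status_code in (404, 410):
        # The server is honest about the missing page.
        return "hard_404"
    if status_code != 200:
        return "other"
    text = body_text.lower()
    word_count = len(re.findall(r"\w+", text))
    # A 200 response whose content looks like an error page or is nearly empty.
    if any(phrase in text for phrase in NOT_FOUND_PHRASES) or word_count < MIN_WORDS:
        return "soft_404_candidate"
    return "ok"
```

A result of "soft_404_candidate" is only a hint that the page deserves a manual look; short but legitimate pages will also trip the word-count rule.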

3. Common Causes of Soft 404 Errors

Soft 404 errors can arise from various website architecture and configuration issues. Here are some of the most common causes:

1. CMS and Routing Configuration Issues

Problems at the Content Management System (CMS) level and routing settings often lead to soft 404s. This happens when users land on a custom 404 error page, but the server incorrectly returns a "200 OK" status instead of a "404 Not Found." These issues typically stem from technical errors within the CMS's internal architecture.

2. Server Configuration Errors

Server misconfigurations can also lead to soft 404 errors. They occur when the server cannot locate the requested file but, instead of returning a "404 Not Found" status, serves a fallback page with a "200 OK" status. These errors often arise from incorrect server settings or missing files.

3. Improper Handling of Deleted Pages

After deleting a page, failing to return an appropriate "410 Gone" or "404 Not Found" status code and instead setting up a permanent (301) or temporary (302) redirect to an unrelated page can cause soft 404 errors. This method is sometimes used to preserve link equity or to mitigate poor metrics. However, if the search engine does not see any content relevance between the pages, it will not transfer the link value and may treat the redirect as a soft 404 error.

A soft 404 page might be intentionally created for marketing purposes. For instance, when a product is out of stock, the seller may not want to lose potential leads. Instead of returning a proper 404 or 410 (Gone) status, they might offer a page with alternative products, a product search, or set up a permanent redirect to another page, such as the homepage.

4. Incomplete Page Rendering

Technical issues, such as missing content blocks or fragments, can prevent an entire page from being rendered. For example, an empty product category page that remains published despite having no products to display can cause a soft 404. It would be more logical to display the product listing with notes like "Out of Stock" or "Notify Me When Available" to avoid this issue.

When a key part of the page fails to load, the resulting soft 404 can originate on the server or CMS side, or on the client side when the browser cannot render the content.

5. Maintenance Notices

Previously indexed URLs might display messages like "Under Maintenance" or "Site Undergoing Maintenance" instead of the expected content during maintenance. While the URL is accessible, the lack of content can trigger a soft 404 error.

6. Server Overload

An overloaded server might deliver a simplified or stripped-down version of a page, lacking full content. These issues are sporadic and difficult to replicate, occurring under specific conditions when the server is under heavy load.

7. Problematic Canonical URLs

Issues with canonical URLs can also lead to soft 404 errors. A canonical tag acts like a soft 301 redirect, informing search engines to consider another target page. If configured correctly, non-canonical pages will pass their link value to the canonical page. However, if the content is too different or the canonical page is inaccessible, it can result in a soft 404 error.
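
One way to check the "canonical page is inaccessible" case is to read the canonical link from the HTML and compare it with the status code already observed for that target. The sketch below uses only the Python standard library; the function names and the idea of passing in a previously fetched `target_status` are assumptions made for the example.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalParser(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if there is none."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


def canonical_problem(page_url: str, html: str, target_status: Optional[int]) -> Optional[str]:
    """Describe a canonical issue, or return None if nothing looks wrong.

    target_status is the HTTP status already observed for the canonical URL
    (None if it has not been fetched).
    """
    canonical = find_canonical(html)
    if canonical is None or canonical == page_url:
        return None
    if target_status is not None and target_status != 200:
        return f"canonical target {canonical} returned {target_status}"
    return None
```

This covers only the accessibility half of the problem; judging whether two pages are "too different" needs a content comparison.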

4. How Does Soft 404 Affect SEO?

The first question when discussing Soft 404 is: "Are Soft 404 errors bad for SEO?" The answer is a definite yes. Soft 404 can negatively impact SEO in several ways:

1. Wasted Crawl Budget: Search engines allocate a limited number of pages they will crawl on your site during each visit. Soft 404 errors waste this budget by causing search engines to crawl pages that do not provide value.

2. User Experience: These errors can confuse users who land on these pages expecting valid content but find nothing useful. This can lead to higher bounce rates and lower user engagement.

3. Diluted Link Equity: If external or internal links point to soft 404 pages, the link equity is wasted, which could otherwise have been passed to valuable pages.

5. Signs That Your Site Needs a Soft 404 Error Check

Here are some key indicators that suggest it is time to check your site for soft 404 errors:

1. Increase in Bounce Rate

A sudden rise in your website's bounce rate is often attributed to bot traffic. However, it is important not to jump to conclusions. The increase might be due to real users and potential customers visiting your site but not finding the expected content. This can happen if they land on pages that appear valid but are soft 404.

2. Fluctuations in Rankings and Traffic

Significant variations in your site's search rankings and traffic can be a sign of soft 404 errors. Search engine bots that repeatedly encounter soft 404 errors might judge the pages as low quality. If the volume is high, the entire site could be considered substandard. Search engines prefer to rank high-quality resources, so persistent soft 404 errors can cause your overall search engine performance to decline.

3. Alerts from Webmaster Tools

Another indicator is receiving alerts or warnings from search engine webmaster tools (like Google Search Console or Bing Webmaster Tools) that flag pages as duplicates even though they are meant to be distinct. Search engines, including Google and Yandex, might incorrectly identify and merge URLs that serve broken or irrelevant pages.

6. How to Find Soft 404 Errors

Detecting soft 404 involves several steps:

1. Identify Soft 404 Errors Using Audit Services

The first step to identifying the issue is to use any crawler designed for technical analysis, such as Screaming Frog SEO Spider, Atomseo Broken Links Checker, SiteAnalyzer, or Google Search Console, and check the response codes of the affected landing pages. Given that entire nodes of the web graph are now evaluated rather than individual landing pages, it's better to assess the whole site. There may be issues with supporting nodes that also need to be understood.

JavaScript, CSS, or include files can pose a problem in this regard. Suppose the search engine bot temporarily cannot access these files, which are directly related to rendering the content. In that case, the page will be received only in a truncated, incomplete form. If this issue occurs periodically, you will get a soft 404 error.
It is highly recommended that you assess the page's current state using historical data. You may have deleted some content, or it might have become inaccessible for other reasons.
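
Alongside a full crawl, a quick response-code check is to request a URL that cannot exist and see what the site answers. The sketch below is one hedged way to do that with the Python standard library; the probe path, the function names, and the injectable `fetch` parameter are assumptions made for the example.

```python
import urllib.error
import urllib.request
import uuid
from typing import Callable
from urllib.parse import urljoin


def fetch_status(url: str) -> int:
    """Return the final HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is still the answer we want.
        return error.code


def site_returns_200_for_missing_pages(
    base_url: str, fetch: Callable[[str], int] = fetch_status
) -> bool:
    """Request a random path that cannot exist and report whether it answers 200."""
    probe_url = urljoin(base_url, f"/soft-404-probe-{uuid.uuid4().hex}")
    return fetch(probe_url) == 200
```

Because `urlopen` follows redirects, a site that redirects every unknown URL to the homepage also yields 200 here, which matches the redirect scenario described earlier. A `True` result suggests the error template is served with the wrong status across the whole site.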

An excellent way to identify soft 404 errors is by evaluating near-duplicates or exact duplicates on your site. If a page has dozens or hundreds of duplicates, assess the content of these pages. Sometimes, these are very similar product pages that differ only in name and price. This issue can be resolved with microdata and adding unique content specific to the site. However, other times, these are pages where the main content area displays a "Nothing found" message. At the same time, the rest consists of similar products, cross-site reviews, items from the same manufacturer, and the menu – essentially, just repetitive template elements.
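
The near-duplicate check can be approximated with a pairwise text comparison. The following sketch uses `difflib.SequenceMatcher` from the standard library; the 0.9 threshold and the assumption that the main content area has already been extracted into plain text are choices made for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict, List, Tuple

# Similarity at or above this ratio counts as a near-duplicate; the value is an assumption.
SIMILARITY_THRESHOLD = 0.9


def near_duplicates(pages: Dict[str, str]) -> List[Tuple[str, str, float]]:
    """Return (url_a, url_b, ratio) for page pairs whose main text is nearly identical.

    pages maps each URL to the text of its main content area.
    """
    matches = []
    # Sorting makes the output order stable regardless of dict insertion order.
    for (url_a, text_a), (url_b, text_b) in combinations(sorted(pages.items()), 2):
        ratio = SequenceMatcher(None, text_a, text_b).ratio()
        if ratio >= SIMILARITY_THRESHOLD:
            matches.append((url_a, url_b, round(ratio, 3)))
    return matches
```

The comparison is quadratic in the number of pages, so it suits a sample of suspicious URLs rather than a full site; large clusters of matches around a "Nothing found" message are the ones worth inspecting first.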

It is also essential to evaluate duplicates based on titles and meta tags. A common issue arises when titles and meta tags are updated via JavaScript: if the templating engine fails to generate the necessary content, many pages end up with identical values. This indicates problems at the CMS level, so evaluate the actual content on the page. Often, placeholder text or nothing at all is displayed in place of the intended content.

2. Analyzing Server Logs

A deeper and more precise check involves analyzing server logs, specifically the access_log, which records data about all incoming traffic events. Working with logs can be challenging: on a high-traffic site, the total volume of logs can amount to gigabytes of data. To work with these logs, you will need skills in Python or a specialized log analysis tool, such as Screaming Frog Log File Analyzer.
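
As a rough idea of what such a log analysis might look like in Python, the sketch below scans access log lines for bot requests that received "200 OK" with a very small response body. It assumes the common "combined" log format; the 2048-byte threshold, the `Googlebot` marker, and the function name are assumptions made for the example.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the widely used "combined" access log format (an assumption about the server setup).
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Responses smaller than this are treated as suspiciously thin; the value is an assumption.
SMALL_BODY_BYTES = 2048


def suspicious_bot_hits(lines: Iterable[str], bot_marker: str = "Googlebot") -> Counter:
    """Count bot requests per path that got 200 OK with a very small body."""
    hits: Counter = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not follow the expected format
        if bot_marker not in match["agent"] or match["status"] != "200":
            continue
        size = 0 if match["size"] == "-" else int(match["size"])
        if size < SMALL_BODY_BYTES:
            hits[match["path"]] += 1
    return hits
```

Passing an open file object as `lines` lets the function stream a multi-gigabyte log without loading it into memory. The paths it returns are candidates to review, not confirmed soft 404 pages.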

3. Webmaster Tools from Search Engines

Utilize data directly from search engines. Google's native tool, Search Console, is indispensable in this regard.

Open the "Indexing – Pages" section and review the report in the error block. Here, you can typically find the following errors:

  • The submitted URL is a soft 404: The URL is not yet marked as a soft 404, and the issue might be temporary.
  • Soft 404: These are direct candidates for de-indexing if they haven't already been removed from the index.
  • Not found (404): Pages of this type are de-indexed but remain within the Googlebot's attention.

Another native tool is the URL inspection tool. Enter the problematic URL and request a re-crawl. Once completed, you can assess what the Googlebot received – the server's HTTP response, content, specific resource requests, and more.

7. How Do I Fix a Soft 404 Error?

The approach to resolving soft 404 errors depends on the type of website, the nature of the error, and the desired outcome (such as de-indexing a non-existent URL, merging a "broken" URL with a current one, or fixing a purely technical issue). Adjust your strategy accordingly to address your site's specific needs and the errors encountered.

1. Improve Content Quality

If a page has thin content, improve the quality of the content to make it more valuable and relevant. This can involve adding more detailed information, images, videos, or other engaging content.
Assess the number of requests the page sends to the server and check their availability. The issue might be related to the overall optimization of the page template: the bot may not be able to retrieve the system files responsible for rendering the page promptly. Address this with a technical specialist or server administrator. This includes resolving web server, application, or database configuration issues that lead to errors.

2. Redirect to Relevant Pages

For pages that have been removed but have relevant alternatives, set up 301 redirects to direct users and search engines to the most appropriate existing page. This helps retain link equity and provides a better user experience.
If a page has been deleted and you have set up a redirect to an unrelated page to preserve "link equity," it might be better to either remove the redirect or set it to redirect to the most relevant page on your site. Alternatively, consider restoring the deleted page. In all other cases, returning a 404 (Not Found) or 410 (Gone) response is better.
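
The decision described above can be written down as a small lookup. The sketch below is a minimal illustration: the URL paths, the `REPLACEMENTS` and `GONE` tables, and the function name are all hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical removed URLs that have a clearly relevant replacement.
REPLACEMENTS: Dict[str, str] = {
    "/products/blue-widget-v1": "/products/blue-widget-v2",
}

# Hypothetical URLs that were removed on purpose and have no equivalent.
GONE: Set[str] = {"/promo/summer-2019"}


def response_for_removed_url(
    path: str,
    replacements: Dict[str, str] = REPLACEMENTS,
    gone: Set[str] = GONE,
) -> Tuple[int, Optional[str]]:
    """Return (status_code, redirect_target) for a URL that no longer has a page."""
    if path in replacements:
        # A relevant alternative exists: permanent redirect.
        return 301, replacements[path]
    if path in gone:
        # Deliberately removed with no equivalent: say so explicitly.
        return 410, None
    # Anything else is simply not found; never fall back to the homepage.
    return 404, None
```

Keeping the redirect table explicit makes it harder for a catch-all rule to send every unknown URL to the homepage.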

3. Correct HTTP Status Codes

Ensure that any page that does not exist returns a proper "404 Not Found" or "410 Gone" status code. You can do this by modifying the server configuration or using appropriate response headers in your CMS or web application.
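
How this is done depends on the server or CMS in use. As a framework-neutral illustration, the sketch below uses Python's built-in `http.server` to serve a custom error page together with the correct status code; the `PAGES` dictionary, the HTML snippets, and the helper names are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple

# Hypothetical site content, keyed by URL path.
PAGES = {"/": "<h1>Home</h1>", "/about": "<h1>About us</h1>"}
NOT_FOUND_HTML = "<h1>Page not found</h1><p>Try the <a href='/'>homepage</a>.</p>"


def route(path: str) -> Tuple[int, str]:
    """Return (status_code, html) for a request path, ignoring any query string."""
    clean_path = path.split("?", 1)[0]
    if clean_path in PAGES:
        return 200, PAGES[clean_path]
    # The friendly error page is still shown, but with a real 404 status.
    return 404, NOT_FOUND_HTML


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, html = route(self.path)
        body = html.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port: int = 8000) -> None:
    """Serve the example site on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), SiteHandler).serve_forever()
```

Calling `run()` starts the example locally. The key point is that the error template and the status code are chosen together in `route`, so a "Page not found" message cannot go out with "200 OK".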

4. Regular Monitoring

Monitor your website regularly using tools like Google Search Console, SEMrush, Atomseo Broken Links Checker, or Ahrefs to prevent new soft 404 errors from appearing. Consistent monitoring helps you catch and fix these issues promptly.

8. Google's Handling of Soft 404 Errors

Google has sophisticated mechanisms for detecting and reporting soft 404. These errors occur when a page returns a "200 OK" status but does not provide helpful content, misleading users and search engines. Here is an overview of how Google handles these errors:

Detection

Googlebot, Google's web crawler, detects soft 404 errors during its regular crawl of websites. When Googlebot encounters a page that lacks substantial content or displays messages like "Page not found" while returning a "200 OK" status, it flags the page as a potential soft 404 error. Google uses algorithms to determine if the content is insufficient or if the page fails to serve its intended purpose.

Reporting

Once detected, soft 404 errors are reported in Google Search Console in the "Indexing – Pages" report (formerly the "Coverage" report). Web admins can find these errors listed with details about the affected URLs. This tool provides insights into how Googlebot interprets and categorizes a website's pages.

Updates and Changes

Google continuously updates its algorithms to improve the accuracy of soft 404 detection. Recent changes may include more advanced page content and layout assessments, ensuring that even subtle issues are identified. Google also adapts its methods based on new web standards and common practices, which means web admins should stay informed about updates in Google Search Console and related documentation.

Recommendations

To address soft 404, Google recommends that non-existent pages return the correct "404 Not Found" or "410 Gone" status. Improving content quality to meet user expectations and search engine requirements is crucial for pages that should exist. Regularly monitoring Google Search Console for new soft 404 reports and promptly addressing them can help maintain a site's SEO health and user experience.

By understanding Google's approach to handling soft 404 errors, web admins can better manage their websites and ensure that all pages provide valuable content to users and search engines.

9. Specific Cases of Soft 404 Errors

Soft 404 errors occur when a webpage displays a message indicating that the page does not exist but still returns a "200 OK" HTTP status code, misleading search engines and users. Here are some specific instances and examples of soft 404 errors triggered by particular elements or actions:

1. Empty Product Categories

In e-commerce sites, product category pages that display "No products available" or "Out of stock" messages while returning a "200 OK" status can be considered soft 404 errors. These pages should either be improved with relevant content or set to return a "404 Not Found" status if they are no longer helpful.

2. Maintenance Pages

Some pages might show "Site Under Maintenance" messages when a website is under maintenance. If these pages return a "200 OK" status instead of a "503 Service Unavailable" status, they can trigger soft 404 errors. It's better to use the correct status code during maintenance periods.
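
A maintenance response with the correct status might look like the WSGI sketch below. The `MAINTENANCE_MODE` flag and the one-hour value are hypothetical; the `Retry-After` header is an addition not mentioned above, included because it is a standard HTTP header for telling clients when to try again.

```python
from typing import Callable, Iterable

MAINTENANCE_MODE = True      # hypothetical flag; a real site would read this from configuration
RETRY_AFTER_SECONDS = 3600   # assumed length of the maintenance window


def maintenance_app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """WSGI app that answers 503 during maintenance instead of 200."""
    if MAINTENANCE_MODE:
        body = b"<h1>Site under maintenance</h1><p>Please check back soon.</p>"
        start_response(
            "503 Service Unavailable",
            [
                ("Content-Type", "text/html; charset=utf-8"),
                ("Content-Length", str(len(body))),
                # Tells clients and crawlers when it is reasonable to come back.
                ("Retry-After", str(RETRY_AFTER_SECONDS)),
            ],
        )
        return [body]
    body = b"<h1>Welcome</h1>"
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

The app can be tried locally with `wsgiref.simple_server.make_server("", 8000, maintenance_app).serve_forever()` from the standard library.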

3. Internal Search Results

Internal search result pages that yield no results and display messages like "No items found" can be flagged as soft 404 errors if they return a "200 OK" status. Providing practical alternatives or returning a "404 Not Found" status when no results are available can prevent this issue.

4. Placeholder Content

Pages with placeholder content such as "Coming soon" or "Under construction" can trigger soft 404 errors if they return a "200 OK" status. These pages should either be updated with actual content or set to return a "404 Not Found" status until they are ready.

5. JavaScript Rendering Issues

Pages that rely heavily on JavaScript to load content can also cause soft 404 errors if the scripts fail to execute correctly, resulting in incomplete content delivery. Ensuring that essential content is accessible without JavaScript can help mitigate this problem.
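
To get a rough sense of what a page contains before any script runs, the visible text in the raw HTML can be counted. The sketch below uses the standard library HTML parser; the 50-word threshold and the helper names are assumptions made for the example.

```python
from html.parser import HTMLParser

# Fewer visible words than this suggests the page depends on JavaScript; the value is an assumption.
MIN_VISIBLE_WORDS = 50


class VisibleTextParser(HTMLParser):
    """Collects words outside of script, style, and template elements."""

    SKIPPED_TAGS = {"script", "style", "template"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.words.extend(data.split())


def visible_word_count(raw_html: str) -> int:
    """Count words a reader would see in the HTML as delivered, before scripts run."""
    parser = VisibleTextParser()
    parser.feed(raw_html)
    parser.close()
    return len(parser.words)


def looks_empty_without_js(raw_html: str) -> bool:
    """Report whether the raw HTML has too little visible text on its own."""
    return visible_word_count(raw_html) < MIN_VISIBLE_WORDS
```

A page that passes in a browser but fails this check is delivering its essential content only through scripts, which is the situation described above.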

6. Incorrect Redirects

Redirecting a deleted page to an unrelated page, such as the homepage, instead of returning a "404 Not Found" or "410 Gone" status, can result in a soft 404 error. Redirects should lead to the most relevant page or return the correct status code for permanently removed content.

Understanding and fixing soft 404 errors is vital for maintaining a healthy website and ensuring optimal SEO performance. By ensuring that non-existent pages return the correct status codes, improving content quality, and setting up proper redirects, you can prevent soft 404 errors from affecting your site's visibility and user experience. Regular monitoring and adherence to best practices will help keep your website in top shape and free of soft 404 issues.

Use Atomseo Broken Links Checker to quickly and efficiently find all errors, including soft 404 errors. This tool allows you to check up to 1,500 links per day for free.
