Info from 2007 on "Why is my website tanking". Still good today!
I made a copy of this very excellent posting just so it doesn't get lost!
There are many reasons why a site could suddenly drop in ranking in Google’s index. In most cases, these changes are due to natural causes - most commonly because the value of the inbound links has dropped, leaving the site without enough value to merit the old indexing and ranking. Sometimes, however, there’s more to it than just that.
This list is a community effort - feel free to add to it in the comments. We’ll move items up into the list as required. Links will be provided to debugging tools and to references (official or just wild guesses). This list should contain only items that have been known to trigger indexing/ranking issues - the rest is in The unofficial crawling, ranking, indexing speculation post and The confirmed non-issues with websites.
General items to look out for:
Links on the website to “shady” sites, paid links, link-exchanges, all outbound links for affiliates, etc.
Technical issues causing crawlers to abort crawling
Manipulations in the on-page content, like hidden text, hidden links, too many links, all internal links with “nofollow”
Things you will need:
Firefox - the browser: http://www.mozilla.com/
Live HTTP Headers Firefox plugin: https://addons.mozilla.org/de/firefox/addon/3829
Xenu’s link sleuth (for Windows): http://home.snafu.de/tilman/xenulink.html
Optional: Web Developer Firefox plugin: http://chrispederick.com/work/web-developer/
Optional: SearchStatus Firefox plugin: http://www.quirk.biz/searchstatus/
Confirm that there is a problem. Search for the domain name, search for the brand name, search for some of the keywords. Check the number of indexed pages with a “site:-query” on Google and compare that to the other search engines.
Using the Live HTTP Headers plugin, open the site with a referrer from a google.com URL. If the site has some sort of hidden redirect, you will usually see it this way first.
Live HTTP headers http://www.google.com/url?sa=D&q=http://domain.com
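For those who prefer a script to the browser plugin, the referrer check can be approximated as below. This is only a sketch under assumptions: it sets the Referer header directly instead of clicking through the google.com/url link, it only sees HTTP-level redirects (a JavaScript or meta-refresh redirect would need a look at the page source), and the names NoRedirect, build_request, fetch_status and referrer_changes_response are made up for this example.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so they can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means "do not follow"; urllib then raises HTTPError.
        return None


def build_request(url, referrer=None):
    """Build a GET request, optionally pretending to arrive from a search result."""
    req = urllib.request.Request(url)
    if referrer:
        req.add_header("Referer", referrer)
    return req


def fetch_status(url, referrer=None, timeout=10):
    """Return (status code, Location header) without following redirects."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(build_request(url, referrer), timeout=timeout) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        # 3xx, 4xx and 5xx responses all end up here.
        return err.code, err.headers.get("Location")


def referrer_changes_response(plain, with_referrer):
    """True when the (status, location) pair differs once a referrer is sent."""
    return plain != with_referrer
```

Calling fetch_status once without a referrer and once with a google.com referrer, then comparing the two results with referrer_changes_response, gives a first hint. A difference is a reason to look closer, not proof of a sneaky redirect.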
Check the page in the browser. Use “Control-A” to select all of the text on the page. Confirm that there is no hidden text and no hidden links. Use “View Source” to confirm that the page source is not significantly manipulated (hidden text and/or links). Check the source code for “nofollow” links or use a browser plugin to see these - internal links should not be marked with “nofollow”.
Firefox browser plugin: SearchStatus (“nofollow” highlighting must be enabled)
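The “nofollow” part of this check can also be scripted on a saved copy of the page source. The sketch below uses Python's standard html.parser; the class and function names are invented for illustration, and it makes the simplifying assumption that a link is internal only when its hostname matches the page's hostname exactly (so www and non-www count as different hosts).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class NofollowFinder(HTMLParser):
    """Collect internal links that carry rel="nofollow"."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.host = urlparse(page_url).netloc.lower()
        self.internal_nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        rel_values = (attr.get("rel") or "").lower().split()
        if not href or "nofollow" not in rel_values:
            return
        # Resolve relative links against the page URL before comparing hosts.
        target = urljoin(self.page_url, href)
        if urlparse(target).netloc.lower() == self.host:
            self.internal_nofollow.append(target)


def internal_nofollow_links(html, page_url):
    """Return the absolute URLs of internal links marked nofollow."""
    finder = NofollowFinder(page_url)
    finder.feed(html)
    finder.close()
    return finder.internal_nofollow
```

An empty result is what you want to see. It does not replace the Control-A and View Source checks for hidden text.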
Get server response headers for homepage - it should be a server response code 200; for a missing page it should be a 404.
Web-based tool: http://oy-oy.eu/page/headers/
Firefox browser based: Webdeveloper toolbar
Firefox browser based: Live HTTP headers
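If none of these tools is at hand, the same status-code check can be sketched with Python's standard library. The function names get_status and check_status_codes are hypothetical, and the request deliberately does not follow redirects, so a redirecting homepage shows up as 301 or 302 rather than as the 200 of its target.

```python
import http.client
from urllib.parse import urlparse


def get_status(url, timeout=10):
    """Request url once, without following redirects, and return the status code."""
    parts = urlparse(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def check_status_codes(home_status, missing_status):
    """Compare the observed codes with the expected 200 and 404."""
    problems = []
    if home_status != 200:
        problems.append(f"homepage returned {home_status}, expected 200")
    if missing_status != 404:
        problems.append(f"missing page returned {missing_status}, expected 404")
    return problems
```

For the missing page, request a URL that certainly does not exist on the site (a long random filename works) and pass both codes to check_status_codes.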
Confirm that sufficient value based on good, strong inbound links is available. Many times the inbound links will just not be enough to merit having a website rank highly or be fully indexed. Checking the inbound links across multiple search engines helps to confirm any findings. Access to the site’s verified Google Webmaster Console would be great, but is often not available.
http://search.msn.com/results.aspx?q=linkdomain%3Adomain.com (sometimes not available)
Run a spider simulator for the homepage and some others - if not enough text is found, then there’s no content to index. It should not be a full page frame pointing to another domain. It should also not be purely flash- or AJAX-based. There should be enough textual content and links to other pages. The spider tool should display the same content and links as is visible in the browser.
Web-based spider tool: http://oy-oy.eu/page/spider/
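A very rough stand-in for a spider simulator can be put together with html.parser. This is a sketch, not how any search engine actually parses pages: it counts words outside script and style blocks, lists link targets, and reports frames. The names SpiderView and spider_summary are invented for this example.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Collect the text and links a simple text-only crawler would see."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.words = []
        self.links = []
        self.frames = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "a" and attr.get("href"):
            self.links.append(attr["href"])
        elif tag in ("frame", "iframe") and attr.get("src"):
            self.frames.append(attr["src"])

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore anything inside script or style blocks.
        if not self._skip_depth:
            self.words.extend(data.split())


def spider_summary(html):
    """Return word count, link targets and frame sources found in html."""
    view = SpiderView()
    view.feed(html)
    view.close()
    return {"word_count": len(view.words), "links": view.links, "frames": view.frames}
```

A page with a near-zero word count, no links, or a single frame pointing at another domain matches the warning signs listed above.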
Check for canonical domain issues (www vs non-www - make a decision). Check both versions with a server response header tool (above) and set up a 301 redirect as required. Additionally, set the preferred domain in the Google Webmaster Console. This is usually not a dominant issue with regards to a penalty.
Setting up a 301 redirect for various webservers
Setting up a 301 canonical redirect on IIS with ASP
Setting up a 301 canonical redirect on IIS with console access
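Once the response headers for both hostnames have been collected with any header tool, the comparison itself is simple enough to script. The helper below is a hypothetical sketch: it takes the observed (status, Location) pairs as input rather than fetching anything, and it assumes the desired setup is a 200 on the preferred host and a 301 to that host everywhere else.

```python
from urllib.parse import urlparse


def canonical_problems(preferred_host, responses):
    """Check observed root-URL responses for each hostname variant.

    responses maps a hostname to the (status, location) pair seen for it.
    """
    problems = []
    for host, (status, location) in responses.items():
        if host == preferred_host:
            if status != 200:
                problems.append(f"{host}: preferred host returned {status}, expected 200")
            continue
        target = urlparse(location or "").netloc
        if status != 301:
            problems.append(f"{host}: returned {status}, expected a 301 redirect")
        elif target != preferred_host:
            problems.append(f"{host}: 301 points to {target or 'nothing'}, expected {preferred_host}")
    return problems
```

For example, passing "www.domain.com" as the preferred host together with {"www.domain.com": (200, None), "domain.com": (301, "http://www.domain.com/")} returns an empty list, while a 302 or a plain 200 on domain.com gets flagged.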
Check the page in the Google cache and compare it to the current version. If there is a significant difference, it is likely that either something has changed or that the server is giving Google a special version (cloaking). Confirm using “view source” in the browser. Alternately, use the translate-function to check a live version of the page as downloaded by Google.
Check the domain age. The age should match the age as specified by the webmaster. If the domain is newer than a few months, inclusion in the search engines may be difficult.
DomainTools whois service: http://whois.domaintools.com/domain.com
Check several pages with a web-page history tool. There should be no significant difference to the current version (at least with regards to links and/or hidden text). Many times a webmaster will fix the things that cause a penalty but “forget” to tell you about that. If no history is available but the domain is old enough, this might be seen as a signal that something “sneaky” was happening (eg cloaking, hidden content, previously banned domain, etc.).
http://web.archive.org/web/*/www.domain.com/* (does not include last 6 months)
Check Google (and also the other engines) for possible sneaky content and links. This can be done with a site:-query combined with keywords.
http://www.google.com/search?q=site%3Adomain.com+(mortgage|debt|loans|ringtones|diet|credit%20card)
http://search.yahoo.com/search?p=site%3Adomain.com+%28mortgage+OR+debt+OR+loans+OR+ringtones+OR+diet+OR+credit+card%29
http://search.msn.com/results.aspx?q=site%3Adomain.com+%28mortgage+OR+debt+OR+loans+OR+ringtones+OR+diet+OR+credit+card%29
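Typing these URLs by hand for every domain gets tedious, so here is a small sketch that builds them. The endpoints and parameter names are taken from the URLs above; the function names are invented, the script uses OR for all engines, and putting quotes around multi-word terms such as "credit card" is my own choice rather than something the original URLs do.

```python
from urllib.parse import urlencode

# Keyword list copied from the example URLs; extend as needed.
SPAM_TERMS = ["mortgage", "debt", "loans", "ringtones", "diet", "credit card"]

# (base URL, query parameter name) as they appear in the example URLs.
ENGINES = {
    "google": ("http://www.google.com/search", "q"),
    "yahoo": ("http://search.yahoo.com/search", "p"),
    "msn": ("http://search.msn.com/results.aspx", "q"),
}


def site_query(domain, terms=SPAM_TERMS):
    """Return a query such as: site:domain.com (mortgage OR debt)."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return f"site:{domain} ({' OR '.join(quoted)})"


def search_urls(domain, terms=SPAM_TERMS):
    """Return one URL-encoded search URL per engine."""
    query = site_query(domain, terms)
    return {
        name: f"{base}?{urlencode({param: query})}"
        for name, (base, param) in ENGINES.items()
    }
```

The script only builds the URLs; open them in the browser and read the results yourself.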
Check the outbound links from the site as recognized by MSN/Live. This can be done with the linkfromdomain:-query (only available on MSN/live).
Xenu’s link sleuth (for Windows)
Run a web page spam detector. In general, these issues should be visible in a manual review of the page (step 1), but it is good to confirm this.
Motoricerca Spam Detector
Validate the (X)HTML code of the homepage and others - including doctype and charset. Badly broken code, especially at the block level, can prevent bots from crawling.
Confirm uniqueness of the content. Choose several lower-level pages and copy+paste “snippets” of those pages and search for them in the major search engines. Choose text pieces which appear to be unique to the page in question. The text should only be found on the site itself and possibly on sites that are either referring to it or copying it (scrapers). It should not be found on other sites of equal or higher value.
Copyscape
Simple web-based tool for checking in the Yahoo index: http://oy-oy.eu/yahoo/unique/
Check for multiple domains on the same server with significantly similar content. This can be done with a reverse-IP lookup tool.
DomainTools reverse-IP-Tool (requires paid account)
SEOlogs reverse IP tool (does not have as many domains as DomainTools but is free)
Search for the domain name to check the text-based results for potentially telling links to the domain (or to any previous content on that domain).
Confirm semantically correct usage of page elements such as h1-h3 headings. This can be done by viewing the page’s source in the browser.
Web-based tool to display titles and headings: http://oy-oy.eu/page/titles/
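The same kind of title-and-headings listing can be produced locally from a saved page source. A minimal sketch with html.parser follows; the names are hypothetical and it only lists what it finds, leaving the judgment of what counts as semantically correct to the reader.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect the title and h1-h3 headings in document order."""

    TAGS = {"title", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.outline = []  # list of (tag, text) pairs
        self._current = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.TAGS:
            self._current = tag
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse runs of whitespace inside the heading text.
            text = " ".join("".join(self._buffer).split())
            self.outline.append((tag, text))
            self._current = None

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)


def heading_outline(html):
    """Return [(tag, text), ...] for the title and h1-h3 elements in html."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline
```

Empty headings, headings stuffed with keywords, or a page with no headings at all stand out quickly in the resulting list.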
Check which of the indexed pages are only in the supplemental index.
http://www.google.com/search?q=site%3Adomain.com (displays all indexed URLs)
Check the pagerank of the root URL and several of the lower level ones. The pagerank should be 1 or higher for a site that has been indexed for some time and has sufficient links. Pagerank does not show much, but if the site does not have a pagerank assigned, it is possible that it has problems with Google.
Check the robots.txt for any blocks. If access to the verified site in the Google Webmaster Tools is available, check for manually removed URLs.
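The robots.txt part of this check can be scripted with urllib.robotparser from Python's standard library. The function name blocked_paths is made up for this sketch, and the standard-library parser does not necessarily interpret every rule the way Googlebot does (wildcards in particular), so treat the result as a first pass only.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, user_agent="Googlebot"):
    """Return the paths that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines; nothing is fetched here.
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(user_agent, path)]
```

Paste the content of the site's robots.txt in as a string and pass the homepage path plus a handful of important pages. A blocked "/" would explain a lot.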
Check crawl frequency in the Google Webmaster Tools, if access is available. A normal crawl frequency signals that Google has access to the website. A normal crawl frequency together with no listing in the index can signal that a penalty is being served.
Check server availability from multiple locations.
Check DNS settings.
DNS Report (with all sorts of tests)
DNS timing
DNS traversal (check if all DNS servers return the same results)
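The idea behind the traversal check, that every DNS server should return the same results, can be sketched as below. Python's standard library only asks the locally configured resolver, so resolve_ipv4 covers just that case; the per-nameserver answers fed into inconsistent_answers are assumed to have been collected some other way (for instance by hand with a DNS lookup tool). Both function names are hypothetical.

```python
import socket
from collections import Counter


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def inconsistent_answers(answers):
    """Return the nameservers whose answer differs from the most common one.

    answers maps a nameserver label to the list of addresses it returned.
    """
    normalized = {ns: tuple(sorted(addrs)) for ns, addrs in answers.items()}
    if not normalized:
        return {}
    majority, _ = Counter(normalized.values()).most_common(1)[0]
    return {ns: list(addrs) for ns, addrs in normalized.items() if addrs != majority}
```

An empty result means all nameservers agree. Anything else is worth checking against the DNS report tools listed above.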
Check the geographic location of the server (necessary for debugging country-specific search issues).
Check at MaxMind
– Thanks go out to:
webado - for the initial inspiration from Troubleshooting Search Engine Problems