How to Optimize a Database-Driven Website for the Search Engines
Problems and solutions in optimizing a database-driven website for the search engines
Optimizing a website for the search engines poses special problems when the pages of that site are assembled from pieces pulled from a database. If you use PHP, CGI, Cold Fusion, Microsoft ASP, or various proprietary shopping carts like xCart, these produce web pages whose URLs (as you see them in your browser's address bar) contain question marks ("?"), equal signs ("=") and other symbols. The links within this kind of website don't point to existing HTML pages. Instead, when you click on a link, the page is put together for you instantly "on the fly" from information and HTML code stored in a database on the server.
Basically, search engine spiders aren't smart enough to figure out how to interact with a database to create those pages, so sometimes they never make it past the first page of the site. While indexing your site and trying to follow the links from your main page, if a search engine spider finds a question mark in the URL that you are linking to, especially if it references a "sessionID", the spider may disregard that link and move on. Session IDs are particularly troublesome to search engines because every time a spider visits the website, it sees a unique URL. It will index the home page (which carries a fresh session ID as part of its URL) as if it were a new page, store it away, and so end up with another copy of the home page in its index every time it views the site.
That is another good reason to have a canonical link tag in the head of every web page, so Google knows which page it has just crawled, no matter what its URL says it is. It is especially important for a database-driven website to include a canonical tag on every page it serves up.
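As a minimal sketch, a canonical tag is a single line in the page's head section. The domain and file name below are hypothetical examples:

```html
<!DOCTYPE html>
<html>
<head>
  <!-- Tells search engines the one preferred URL for this page,
       no matter what session IDs or query strings appear in the
       address bar. Domain and path are hypothetical examples. -->
  <link rel="canonical" href="https://www.example.com/widgets/blue-widget.html">
  <title>Blue Widgets</title>
</head>
<body>
  ...
</body>
</html>
```

With this in place, even if the spider arrives via a URL full of session IDs, it is told which single URL to credit with the page's content.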
If your whole site is built using question marks, session IDs and a string of variable IDs, then the busy search engine spider may well leave your site without indexing anything.
This is not a new problem! It's been around since web developers first started making sites using databases.
None of the search engines can interface directly to your database and read what is in it -- they just aren't set up to do that.
If you have a database-driven site, then special actions need to be taken to handle the behavior of the search engine spiders. We do not cover them in detail on this page; this is just an outline of some possible solutions:
Making Static Category Pages
You can set it up so the main (home) page of the site links to several static pages (which will need to be created if they don't already exist) which then interact with the database. It helps to have about a dozen of these static (non-changing) pages: these are .html pages which are NOT created by the database "on the fly". Typically they would describe the main categories of what you are selling. The links FROM these "category" pages can create other pages based on information pulled from the database, but these static category pages themselves should not be created newly every time someone clicks a link from the main page of the site.
Each of these static category HTML pages should then be optimized using the appropriate key word phrases for that category. These static HTML pages will then be picked up and indexed by the search engines. You stand a much better chance of getting good search engine rankings for your website by creating and optimizing some static pages like this.
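As a rough sketch, a static category page might look like the following. The file name, titles, key word phrases, and product URLs are all hypothetical examples:

```html
<!-- widgets.html: a static category page saved as a real file on the
     server, NOT generated from the database on the fly.
     All names and URLs here are hypothetical examples. -->
<!DOCTYPE html>
<html>
<head>
  <title>Blue Widgets - Acme Widget Store</title>
  <meta name="description" content="Our full line of blue widgets, optimized around the category's key word phrase.">
</head>
<body>
  <h1>Blue Widgets</h1>
  <p>Category description text, written around the phrase "blue widgets".</p>
  <!-- Links FROM this page may point to dynamic, database-generated pages -->
  <ul>
    <li><a href="product.php?id=101">Small Blue Widget</a></li>
    <li><a href="product.php?id=102">Large Blue Widget</a></li>
  </ul>
</body>
</html>
```

The spider can index this page like any ordinary HTML file, even if the product links beyond it lead into the database-driven part of the site.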
Using ModRewrite to Make Search Engine Friendly URLs
If you can't set up static pages which then link to your database, don't fear. You can still optimize the site using ModRewrite in Apache. See my late-2010 blog post, which goes into detail on how to use the .htaccess file to make Search Engine Friendly URLs. It gives some examples and you may be able to implement this easily.
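A minimal sketch of the idea, assuming an Apache server with mod_rewrite enabled; the script name and URL pattern are hypothetical examples:

```apache
# .htaccess -- rewrite a search engine friendly URL such as
#   /products/blue-widget-101.html
# to the real dynamic URL the site actually uses:
#   /product.php?id=101
# The script name and URL pattern are hypothetical examples.
RewriteEngine On
RewriteRule ^products/[a-z-]+-([0-9]+)\.html$ product.php?id=$1 [L,QSA]
```

The spider sees only clean .html URLs with no question marks, while the server quietly translates each request back into the query-string form the database script expects.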
The other methods one can use are described in this article at the Web Developer's Journal and depend upon whether one is using ColdFusion, CGI scripts, ASP, or some other method of displaying the information from your database on your web pages.
For another view on this, see Jill Whalen's article about optimizing dynamic sites.
The workaround outlined above using category pages works well to get your site listed and ranking better in the search engines. Unfortunately, it involves some re-design of the way the site works. Your webmaster may resist this because it means a lot of work -- but it's the only way we know of to get good placement in the search engines for a website built around a database of products and information about them.
If you are trying to optimize a database-driven site, you can contact us for an inspection and suggestions about what else can and should be done. We've seen some beautifully designed websites--very pretty, interactive and doing a good job of selling products--which were invisible to the search engines. The search engine spiders are simply not designed to crawl through any website and index all the pages that can be created from a database in response to clicks from visitors. You have to HELP the search engines get to this content properly.