Written by David Bayer on April 8, 2008 9:23 am EST
I’ve spent a tremendous amount of time scouring the web for information on how orphaned pages are created, their effect on non-orphaned pages within a site, how they can drag your entire site into the supplemental index, and how to recover or eliminate them. Jim Boykin provides great insight into the causes and effects of “supplemental hell”, one of which can be the inadvertent creation of orphaned pages. Jim also provides great advice on how to ‘re-adopt’ orphaned pages by linking to them, or how to eliminate them. While this works well when a webmaster has only a few orphaned pages, larger sites that may orphan hundreds if not thousands of pages as modifications are made across the site face a much more daunting task. I will share my thoughts on removal of orphaned pages in a future blog. Here I’d like to spend a few moments on why orphans can drag your entire site into supplemental and on the algorithm that creates this scenario.
How is a Page Orphaned?
A page becomes orphaned when other pages no longer link to it. This can happen as a webmaster modifies the architecture of his/her site, as nofollow is applied to links or pages, or as URL naming structures change. For example, if www.sitexyz.com/page1 is indexed and later renamed to www.sitexyz.com/page-1, the original page at the original URL will continue to be indexed unless the old URL returns a 301 redirect or a 404 status code. Two problems were created by this modification: duplicate content and an orphan. We will focus on the orphan.
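To make the definition concrete, here is a minimal sketch of how orphans could be spotted once you have a map of which pages link to which. The `find_orphans` helper and the sample `site` dictionary are hypothetical, invented for illustration; the sketch treats every listed link as followed and does not model nofollow.

```python
def find_orphans(links, home):
    """Return pages that no other page links to (the home page is exempt)."""
    linked_to = set()
    for source, targets in links.items():
        for target in targets:
            if target != source:  # a self-link does not rescue a page
                linked_to.add(target)
    # Every known page: link sources plus anything mentioned as a target.
    all_pages = set(links) | linked_to
    return sorted(all_pages - linked_to - {home})


# Hypothetical site after /page1 was renamed to /page-1 without a redirect.
site = {
    "/": ["/page-1", "/about"],
    "/page-1": ["/"],
    "/about": ["/"],
    "/page1": ["/"],  # old URL is still live, but nothing links to it
}

print(find_orphans(site, home="/"))  # ['/page1']
```

On a real site the `links` map would have to come from a crawl or from server-side data, since a crawler that only follows links will never reach an orphan in the first place.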
Do orphaned pages get reindexed?
Search bots can revisit a web page in one of two ways. Either they revisit a page via inbound links to that page as they spider across the web, or they revisit that page, from time to time, directly from the primary index. Depending on how many inbound links a page has, a bot can revisit a page every month, week, or day. What I’ve experienced is that an orphaned page, that has no inbound links mind you, is revisited directly from the index every 60 to 90 days.
Do orphaned pages affect non-orphaned pages or rankings?
I had a conversation with Jim about how supplementals were affecting the rankings of one of our larger sites. He indicated that he didn’t believe that supplementals should affect rankings, simply that the supplemental pages wouldn’t rank above pages in the primary index. He’s right. The problem is that orphans affect your rankings because they disrupt an effective linking architecture and become drains on the page rank of other pages within the same domain.
The PR Threshold Equation
Some of what I offer here is speculation, but it is based on experience. Before Google removed the ability to view supplemental pages, I witnessed a direct correlation between the number of orphans on our site, the spread of supplemental labeling to non-orphaned pages within our site, and a drop in longtail SERPs. (Why only longtail SERPs is also the subject of a future blog.) My belief is that there is a ratio between the total number of pages within a domain and a minimum page rank requirement for pages to be listed in the primary index. The ratio follows a simple logic such that:
$$\text{Minimum PR} = \frac{\text{Total Pages} - \text{RF}}{\sum \text{Inherent Page Value}}$$
where Inherent Page Value = 1 for each linked page (so the denominator is the total PR of the site) & the RF is a numeric requirement factor.
ILLUSTRATIVE EXAMPLE 1 – 10 pages, maximum PR distribution, no orphans
A site has 10 pages and a simple structure where page 1 links to all other nine pages, and the other nine all link to page 1 and only page 1. (Feel free to follow along using Web Workshop’s Page Rank Calculator. I do believe there are some problems with various calculations using this tool, but it should suit our purposes for this example.)

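Since this is a standard structure, I imagine that all pages will make the cut into the primary index (i.e. have enough page rank to stay out of supplemental). The nine pages that link only to page 1 carry the lowest PR, roughly 0.59 each with the usual 0.85 damping factor, so the minimum PR can be no higher than that. Thus:
$$\frac{10 - \text{RF}}{10} \le 0.59$$
must hold to allow for inclusion in the primary index (barring any other issues with the page). Thus RF must be a numeric value greater than 4.1.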
ILLUSTRATIVE EXAMPLE 2 – 10 pages, less than ideal PR distribution, no orphans
A structure that produces the lowest possible value for an individual page on a 10-page site would look something like this:

where all pages link to all other pages, with the exception of one page (Page A), which only has one inbound link. In this example, Page A ends up with a PR of roughly 0.25, so
$$\frac{10 - \text{RF}}{10} \le 0.25$$
must hold if Page A is going to make the grade into the primary index. If it does, that indicates that the RF must be greater than 7.5. I won’t speculate on what the actual RF is. It should in fact decrease as the number of pages of a site increases.
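The page values behind Examples 1 and 2 can be checked with a few lines of Python. This is only a sketch under stated assumptions: it uses the classic formula PR(p) = (1 - d) + d × Σ PR(q)/outlinks(q) with d = 0.85, which may not match the calculator mentioned above exactly, and the Example 2 graph is one reading of the description (nine fully interlinked pages, with Page A linking out to all nine and receiving a single link from `p1`). The function and variable names are made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / outlinks(q)) over pages q linking to p."""
    pr = {page: 1.0 for page in links}  # every page starts with a value of 1
    for _ in range(iterations):
        new_pr = {page: 1.0 - damping for page in links}
        for page, targets in links.items():
            share = damping * pr[page] / len(targets)  # split evenly over outlinks
            for target in targets:
                new_pr[target] += share
        pr = new_pr
    return pr


# Example 1: page 1 links to the other nine; each of them links back to page 1 only.
pages = [f"p{i}" for i in range(1, 11)]
example1 = {"p1": pages[1:]}
example1.update({p: ["p1"] for p in pages[1:]})

# Example 2 (assumed reading): nine pages link to each other, Page A links to
# all nine, and only p1 links to Page A.
core = [f"p{i}" for i in range(1, 10)]
example2 = {p: [q for q in core if q != p] for p in core}
example2["p1"] = example2["p1"] + ["A"]
example2["A"] = list(core)

pr1 = pagerank(example1)
pr2 = pagerank(example2)
print(round(min(pr1.values()), 2))  # 0.59, the weakest pages in Example 1
print(round(pr2["A"], 2))           # 0.25, Page A in Example 2
```

In both graphs the page values sum to 10, one unit per page, which is consistent with an Inherent Page Value of 1.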
ILLUSTRATIVE EXAMPLE 3 – same as EXAMPLE 2 with 10 orphans
So how do orphans play into all of this? Imagine the above example with 10 orphans floating around a core of ten pages that are linked together. We will propose that we’ve established the RF to be 7.5. The PR values of the 10 pages and of the site as a whole would remain the same, but the site now has 20 pages. While I’ve seen evidence that orphaned pages do represent some value, they do not represent the same Inherent Page Value as non-orphaned pages that benefit from inbound & outbound linking. For the purpose of this example, orphaned pages will not make any contribution to the overall PR of the site, although the actual value may be somewhere between 0 and 1. Thus the equation now looks like this:
$$\text{Minimum PR} = \frac{20 - 7.5}{10} = 1.25$$
If the RF was in fact 7.5, the required PR becomes 1.25, thus driving all of the non-orphaned pages of the site into the supplemental index, since none of them has a PR that high. Clearly these are extreme examples used to illustrate a point, but I hope they provide insight into how the supplemental PR threshold equation might work, and into the negative effect that orphans have on all pages of a domain. Keep in mind that increasing the PR of individual pages, or of the site as a whole, through inbound links has a significant effect on the overall equation, and may keep pages out of the supplemental index that would otherwise end up there on a site completely dependent on internal linking value.
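The arithmetic across the three examples can be sketched as follows. This assumes the threshold takes the form (total pages - RF) / total site PR, which is a form consistent with the figures 4.1, 7.5 and 1.25 quoted above; it remains speculation rather than a documented Google rule. The function names are hypothetical, and the 0.59 and 0.25 inputs are approximate PR values for the weakest page in Examples 1 and 2.

```python
def required_pr(total_pages, site_pr, rf):
    """Minimum PR a page needs to stay in the primary index (assumed form)."""
    return (total_pages - rf) / site_pr


def minimum_rf(total_pages, site_pr, lowest_page_pr):
    """Smallest RF at which the weakest page still meets the threshold."""
    return total_pages - lowest_page_pr * site_pr


# Examples 1 and 2: 10 linked pages, total site PR of 10.
print(round(minimum_rf(10, 10, 0.59), 2))  # 4.1
print(round(minimum_rf(10, 10, 0.25), 2))  # 7.5

# Example 3: 10 orphans are added. They count as pages but are assumed to add
# nothing to the site's PR, so the page count doubles while site PR stays at 10.
print(required_pr(10, 10, 7.5))  # 0.25 without orphans
print(required_pr(20, 10, 7.5))  # 1.25 with 10 orphans
```

Under these assumptions, each orphan raises the bar by 1 / site PR, which is why a handful of orphans on a small site with little external PR hurts far more than the same number on a site with strong inbound links.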
Please share thoughts and comments as appropriate. Hope this helps!