Tips, tricks, and pokes, just WebTrends Analytics

Reasons for seeing your own site in Referrer reports … and what you can do about it

On the all-important Referrers reports (Referring Domain, Referring Site, Referring Domain Type, and Referring Page), there are always two puzzling entries – the Direct Traffic item (see this post about Direct Traffic) and the inevitable listing for your own site. 

How can you be seeing visits referred from your own site?  More importantly, what should you do about it?

The reason is easy to understand once you “get” what a visit really is in the analytics world. 

Empathize with the visitor for a second. 

When, exactly, is the visitor really done with your site? 

  • When they click on one of your links that goes off-site?  What if they come back right away? 
  • When they type something else in the browser window?  What if they come back a couple minutes later?
  • When they close the browser window?  What if they have your site open in more than one window? 
  • When they back-button to the search engine window that brought them there in the first place?  What if they come back to your site with another click on your listing?
  • When they leave your site open and go to a meeting?  What if they start clicking again when they return to their desk?

Get the picture?

For all of these, my question to you is:  do you think the visitor is done with your site when they “leave”?  Is the visit really over? 

In many, many cases of the above behaviors, the visitor comes back to your site later.  The reality is that visitors are distractable and Internet browsing is circuitous.  It may not be realistic at all to assume the visitor is mentally done with your site when they do the above actions. 

The other problem with these definitions is that, except for the first one, WebTrends has no data on them anyway.  Neither WebTrends nor any other analytics package can know that the browser window was closed or that the visitor switched to another site (other than by offsite links on your site).  Browsers follow protocols set up long ago by a central group of wise people (the W3C) and, for privacy reasons, that information does not get transmitted by browsers and never has.

Empathize with the WebTrends program for a second. 

It’s chunking along, assigning each hit to a visit (using a cookie, for example), and keeping track of all these open visits while they’re happening.  It (the WebTrends program) has no way of knowing when a visitor is done with their visit.  None!   It cannot tell when a different site fills the browser window or when the browser is closed by the visitor. 

WebTrends can only ASSUME a visit is over when it sees a long period of nothing happening.

Consequently, WebTrends has a time-out setting.  If x number of minutes go by without any file requests, WebTrends marks the visit closed and adds that visit’s various stats to various internal tables.

Out of the box, WebTrends’ visit timeout is 30 minutes of inactivity.   So if somebody leaves a page sitting on their screen for 31 minutes, then comes back to the page and clicks on something, that new click will be bucketed as the first click of a new visit.

And in this particular instance, the referrer, i.e. the page of origin of that click is …. the site itself!
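To make the mechanism concrete, here is a minimal sketch of timeout-based sessionizing. It is not WebTrends' actual algorithm; the function names, the sample URLs, and the choice to split only when the gap is strictly longer than the timeout are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

def sessionize(hits, timeout_minutes=30):
    """Group one visitor's hits into visits, splitting on inactivity gaps.

    hits: list of (timestamp, url, referrer) tuples, sorted by time.
    Assumption: a gap strictly longer than the timeout starts a new visit.
    """
    timeout = timedelta(minutes=timeout_minutes)
    visits = []
    for hit in hits:
        # New visit if this is the first hit, or the previous hit is too old.
        if not visits or hit[0] - visits[-1][-1][0] > timeout:
            visits.append([])
        visits[-1].append(hit)
    return visits

def visit_referrer(visit):
    """A visit's referrer is taken from its first hit."""
    return visit[0][2]

# Hypothetical clickstream for one visitor on www.example.com
start = datetime(2009, 1, 12, 9, 0)
hits = [
    (start, "/landing", "http://www.google.com/search?q=widgets"),
    (start + timedelta(minutes=2), "/products", "http://www.example.com/landing"),
    # 31 minutes of nothing, then a click from the page left open
    (start + timedelta(minutes=33), "/checkout", "http://www.example.com/products"),
]

for minutes in (30, 90):
    visits = sessionize(hits, timeout_minutes=minutes)
    print(minutes, "minute timeout:", [visit_referrer(v) for v in visits])
```

With the 30-minute timeout the same three clicks become two visits, and the second one is "referred" by the site itself. With a 90-minute timeout they stay together as one visit referred by the search engine.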

That’s the explanation of what’s happening.  Now, what should you do about it?  Here are possibilities.  (The first is a necessary mindset and the last one is IMHO the best action to take.)

  1. Adjust your mental labeling.  It helps to think of the self-referred visits with a special name.  I call them “visit second halves.”  “Self-referred visits” is not a good label because it implies that they really are visits.  To me, they’re not.  They’re just visit pieces.
  2. Adjust your stats.  You get a “visits” number from the Overview Dashboard and various places, but is it really the number you want to use?  A much more accurate visit count is the total visits number MINUS the second halves.  In other words, [number of visits (from Overview) minus visits referred by your own site (from Referrers)].  Yes, if you do this arithmetic, your total visits number goes depressingly down.  But at the same time, the size of a typical visit goes up!  And ratios like “percent of visits with a conversion” go up also.  (Note that you’ll have to do these adjustments to your stats outside of WebTrends, for example in your Excel exports that are your final deliverable to your end users.)
  3. Realize and accept that your campaign attributes will underestimate conversion because of this.  The reason: if the campaign attribution happens in the first visit (the landing page), and the conversion happens after the visitor’s pause in activity, the conversion can’t be connected by WebTrends to the campaign. 
    • Well, there’s one way — set up your campaign reports to use Most Recent Campaign.  Unfortunately this will connect some conversions to old campaigns, and if you’re interested only in campaigns that your visitors experienced very recently, you’ll need to look at short timeframes, like a day or two.
  4. Change WebTrends settings to reduce the number of second halves, i.e. merge as many second halves as possible with their first halves.  You need the software to do this, not On Demand. The setting that does this is the timeout and, as far as I know, WebTrends software is the only program anywhere that allows you to do this.  The longer the timeout, the fewer visits will be split in pieces because of the timeout. 
    • The timeout is configurable by you for a given Session Tracking choice — in the Session Tracking area (Web Analysis >> Options >> Session Tracking >> General tab).  You can set up several different Session Tracking definitions, each having different timeouts, because different profiles may need different timeouts.  For example, if you really really need to have your WT stats on the same basis as other analytics programs, you’ll want at least one profile that uses 30 minutes. 
    • I, personally, set my timeouts to be 90 minutes, simply because my mental model of a visit allows the visitor to take a lunch break or have an hour-long meeting. 
    • Some analysts prefer 2-4 hours.
    • There’s one analytics package out there that uses 24 hours, or at least they did when I was using that package.
    • WebTrends has no maximum, but the higher you set it, the more likely it is that WebTrends will break because it has to keep more and more visits in memory at one time.
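The arithmetic in item 2 can be sketched in a few lines. The function name and every number below are made up for illustration; the sketch also assumes each conversion belongs to exactly one whole visit, so the conversion count does not change when the second halves are merged back in.

```python
def adjusted_stats(total_visits, self_referred_visits, page_views, conversions):
    """Recompute visit-based stats after subtracting the "second halves".

    total_visits:          visits from the Overview Dashboard
    self_referred_visits:  visits whose referrer is your own site (Referrers report)
    """
    adjusted_visits = total_visits - self_referred_visits
    return {
        "adjusted_visits": adjusted_visits,
        "pages_per_visit_raw": page_views / total_visits,
        "pages_per_visit_adjusted": page_views / adjusted_visits,
        "conversion_rate_raw": conversions / total_visits,
        "conversion_rate_adjusted": conversions / adjusted_visits,
    }

# Illustrative numbers only, not taken from any real report
stats = adjusted_stats(
    total_visits=10_000,
    self_referred_visits=2_000,
    page_views=40_000,
    conversions=200,
)
for name, value in stats.items():
    print(name, value)
```

In this made-up case the visit count drops from 10,000 to 8,000, while pages per visit rises from 4 to 5 and the conversion rate from 2% to 2.5%.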

Postscript:

There is another reason for seeing your own site as referrer if you use SDC tags for data collection, and it’s worth mentioning here (thanks for the nudge Michael M!).  What if a visitor lands on a page that doesn’t have an SDC tag?  Any clicks going from that page to the rest of your site will be recorded as originating on that un-tagged page.  The visitor’s SECOND page view will be mistakenly shown as the entry page, and the referrer will be your own site.   One way to check on this is to look at a report on “Referring Pages” (not Referring Sites or Domains) — look for any of your own site’s pages that are listed as referrers but don’t show up in the Pages report.
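The check described above is a set difference, which is easy to run on exported report data. This is a rough sketch that assumes you have exported the Referring Pages report as a list of URLs and the Pages report as a list of paths; the function name, host name, and sample rows are hypothetical.

```python
from urllib.parse import urlsplit

def untagged_referrers(referring_pages, tagged_pages, own_hosts):
    """Return own-site referring pages that never appear in the Pages report.

    referring_pages: URLs from the Referring Pages report
    tagged_pages:    URLs or paths from the Pages report (pages that sent hits)
    own_hosts:       set of lowercase host names that count as "your own site"
    """
    tagged_paths = {urlsplit(page).path or "/" for page in tagged_pages}
    suspects = []
    for ref in referring_pages:
        parts = urlsplit(ref)
        path = parts.path or "/"
        # Only own-site referrers matter; external sites are ordinary referrers.
        if parts.hostname in own_hosts and path not in tagged_paths:
            suspects.append(ref)
    return suspects

# Hypothetical report exports
referring_pages = [
    "http://www.example.com/promo/old-landing.html",
    "http://www.example.com/products",
    "http://www.google.com/search",
]
pages_report = ["/", "/products", "/checkout"]

print(untagged_referrers(referring_pages, pages_report, {"www.example.com"}))
```

Any page this returns is a candidate for a missing SDC tag. The comparison here is by path only, so query strings are ignored.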

Finally: don’t solve the problem by filtering out visits that are referred by your own site!  It will appear that you’ve tidied up the referrers report and fixed your total visits number, but any KPIs that happen during second halves will be lost.

 

 

 



    16 comments

    1 Jacques Warren { 01.12.09 at 12:44 pm }

    A few years ago, I was paranoid about the self-referring visits and would use an exclude visit filter to get rid of them. Then I realized that for some sites, I was too harsh, because of what you say in your “Things to NOT do”. So, I don’t do it anymore, unless I am facing a high percentage of visits that are self-referred (40 – 50% +), which is not “normal”.

    As another cause: if you sessionize on IP addresses (and there still ARE many out there who do it), consider ISP IP switching as a cause of self-reference. I’ve seen cases where that would be very messy (50% + of visits were self-referred).

    2 rocky { 01.12.09 at 8:28 pm }

    Good point on the IP switching.

    I wish I had more facts. I wonder how prevalent this is, nowadays. AOL was the main user of this method, but AOL has dropped to about 5% of all traffic (on sites I’m looking at). Does anybody know whether other ISPs besides AOL are doing this?

    A quick way to get an idea of the extent of this practice would be to run the same data with two different sessionizing methods. I’ll get back here tomorrow with some more info.

    3 Marco { 01.13.09 at 4:44 am }

    Very interesting article!

    Maybe an additional solution might be to implement some sort of keep alive (e.g. firing a MultiTrack every 25 minutes after the page is opened). This would stitch visit parts together in a scenario where the page is left open without activity for >30 minutes.

    4 rocky { 01.13.09 at 6:27 am }

    Marco – nice idea! Very interesting. And, it would be possible to pull these visits out as a segment, and try to figure out what they were doing.

    5 Michael Notté { 01.14.09 at 4:57 am }

    Great article & clear explanation! I stopped counting how many times I had to answer such questions from business key users. The self-referrer problem brings quite some confusion whenever one discovers the “referring site” report for the first time.

    I used to call these visits “interrupted visits” :-)

    The idea of increasing the timeout value is interesting – do you think that increasing to 90 minutes as you do really impacts the performances? We had our share of troubles with our software installation due to performance problems. Now that everything has been fixed – I don’t want to ask IT guys to do something that could break down again :-)

    Regards,

    Michaël
    (www.kaizen-analytics.com)

    6 rocky { 01.14.09 at 6:46 am }

    Hi Michael -

    I honestly don’t know. We’ve never seen WebTrends actually choke on anything, but we’ve definitely seen a line of profiles waiting in the queue, temporarily, when slow ones are being processed. And I would guess that increasing the timeout will slow down profiles because WebTrends might have to use disk scratch space more than usual.

    You could try a couple of new profiles, analyzing a day of data each, with a different timeout in their sessionizing method. Run them consecutively rather than concurrently. You can check the various timings by looking at the processing details in the Scheduling part of the admin.

    7 rocky { 01.14.09 at 6:47 am }

    p.s. “interrupted visits” is possibly a better term, for end users, than what we were using!

    8 Marco { 01.14.09 at 9:44 am }

    @rocky: The only drawback with my solution (see above) is that this keep-alive adds page views and counts against licence.
    Maybe using a .gif extension on the MultiTrack would be a solution, but then I don’t know if images are good enough hits to keep a visit alive…
    Any idea about these thoughts?

    9 rocky { 01.15.09 at 6:18 am }

    Good question about the gif extension being enough for WT to consider the visit still alive. I don’t know.

    The keep-alive would also artificially inflate the size of those visits, in page view count terms. So there’s the additional problem of … if you filter out the keep-alives to keep the size of those visits realistic, you’ve lost the ability to tie the parts of the visit together. Hmm.

    It sounds like we’re back to increasing the timeout.

    10 Michael M { 01.19.09 at 7:56 pm }

    Another reason for seeing your own site as a referrer is domain-name-only links, e.g. http://www.yoursite.com, which send back to the browser an index.html, which in many cases does a meta refresh to go to the home page. By the time the WebTrends code in the home page is executed, the referrer at that point is the index.html (your site). In order to capture the true referrer, the WebTrends code must be included in the index.html.

    11 rocky { 01.21.09 at 7:38 am }

    Good point. I’m not sure why so many sites whose home page is NOT index.html (for example, it is index.jsp) still allow the server to send name-only links to index.html. The default file name can be changed in the server OS to index.jsp. I’m sure there’s a good reason because people I respect (well, one person I respect) set up large sites in this odd way.

    Anyway, you are right about the meta-refresh page needing an SDC tag. But this results in a problem situation – some visits will be inflated by one page view, which the visitor never actually sees. You can’t filter them out because they contain the referrer info. You can fix some aspects of the situation by telling WebTrends that the home page is both index.html and index.jsp; in that case both will be converted to just a / for purposes of reports.

    I think the best solution is the first one, to direct the server to skip index.html entirely. But, as I said, there might be good reasons to leave it that way. Any ideas on that, Michael?

    12 Mini_Cooper_Boy { 03.03.09 at 4:27 am }

    I’m a little confused by the last few comments. Are we saying that if someone puts a link to http://www.mysite.co.uk on their website then any visits which come via that link will get counted as being referred by my own site????
    Is this still the case if dcsuri (dcsref) is being populated with the URL of the referring site??

    Also, another question which is loosely related –
    We currently track direct links into a distinct section of our site by setting the links up as a campaign which we can then track through the campaign ID’s report (pretty much as discussed here http://www.webtrendsoutsider.com/2008/cool-custom-report-small-scale-but-interesting-referrers/). These visits will not hit our homepage at all.
    I’ve done a query through Marketing > Referrers > Referring site to see how many visitors have been referred from a page which has both a link to our homepage (in the format http://www.mysite.co.uk) and a link to another section of our site. I want to check will these figures exclude or include the visits which are set up as a campaign?

    13 rocky { 03.03.09 at 10:50 am }

    Heya MCB, for now, ignore those last few comments. The bottom line is that if somebody has a link to your site, THEIR site will be the referrer. That is, if the referrer gets collected at all. You may be getting confused by a situation where the referrer is not being collected. See if this post helps you at all — http://www.webtrendsoutsider.com/2008/reasons-for-direct-traffic-in-referrers-reports/

    14 rocky { 03.03.09 at 10:55 am }

    Also, MCB, if the other site has a relationship with you, your best solution is to forget about the referrer aspect of tracking, which is unreliable. Ask them to put a marker parameter into their link. If you use the parameter WT.mc_id, WebTrends will consider it a campaign and will store that fact in the Visitor History table, which is neat because you can look at the “latent” effects of that affiliate site.

    So, ask them to change the link to the home page to be: http://www.mysite.co.uk/?WT.mc_id=thatothersitename

    And the link to the subsection of your site could be changed to http://www.mysite.co.uk/subsection/index.asp?WT.mc_id=thatothersitename.

    You can use any value you want for the parameter WT.mc_id, including have different ones for each landing page.

    Does that help?

    15 Mini_Cooper_Boy { 03.04.09 at 3:33 am }

    Rocky, That helps massively :) thanks a lot. I had a bit of a panic moment when i read those last few comments ;) I’m just about to have a bit of an investigation regarding referring sites using the post you mentioned above.

    I’ll also have to have a natter with the web dev guys to talk about possibly getting things changed to use WT.mc_id. Just one question, we have to use third party cookies will this cause any issues with WT.mc_id?

    I’m pretty new to Web Stats, and its definitely an interesting learning experience :)

    Top site BTW :D

    16 rocky { 03.04.09 at 8:23 am }

    Thanks for the compliment!

    WT.mc_id and Visitor History have to connect visits based on a persistent cookie that WebTrends can read. It all depends on how you have set up your Session Tracking (what cookie it uses). I’m not sure about the third party aspect.

    However, WT.mc_id still works fine for all the Campaign ID pre-configured reports that do NOT depend on visitor history, i.e. reports that only pay attention to the Campaign ID (the WT.mc_id value) of the present visit.
