Tips, tricks, and pokes, just WebTrends Analytics

Tracking “Page Not Found” 404’s with SDC tags

JavaScript analytics tags like WebTrends’ SDC tags generally don’t track error pages, i.e. 404s (page not found), 500s (server errors), and so forth. Server log files do. Most people accept the absence of error tracking as one of the tradeoffs of using tagging, outweighed by tagging’s many other advantages. Other people do separate analyses of server logs just to get the error stats.

It’s actually easy to track error pages even if you tag your pages.  It’s done by getting your site’s server to return a tagged page whenever one of these client or server errors happens.  There’s really no trick to it because both IIS and Apache servers make it quite easy.  Your job is to create those page files and put them in the right location on the server.

And, the cool part is that if you do it this way, the original URL (the one sought by the visitor) shows up in your SDC logs as the URL of the page, right where it normally is.  All you have to do is keep this particular collection of pages separate.  The method described below provides a way to filter them in or out, with a parameter that we call “WT.errorcode” (you can name it whatever you want).

Here’s the “right location on the server” part.

  • For IIS,  go to Manage Your Server >> Application Server >> Internet Information Services (IIS) Manager >> Web Sites >> your sitename.  (Depending on your version of Windows Server, your path to IIS Manager might be different.)  Right click on your sitename and open Properties.  In the tab called “Custom Errors” you’ll see how and where to put your custom error files. 
  • For Apache (UNIX/Linux), find the .htaccess file for your web site and add ErrorDocument directives like these, one per error code:

    ErrorDocument 404 /404.htm
    ErrorDocument 403 /403.htm

Much easier than you thought, yes?

What do these special error pages have on them?

Here is our preferred way of creating those substitute error pages.  You have to make one error page per error code that you want to collect.

  • Put an SDC tag on the page (duh)
  • Set a <meta> tag in the <head> with the name WT.errorcode and the error code itself (404, 502…) as the content, for example <meta name="WT.errorcode" content="404">.
  • Note that we are not using the special SDC parameter “DCS.dcssta.”  If you know SDC parameters well, you might be surprised.  DCS.dcssta will set the hit to be a true 404 or 502 in the SDC logs.   If we used it we could use the out-of-the-box WebTrends error reports.  However, the out-of-the-box error reports don’t show any detail, except for the 404 report which does nicely display all the URLs that got 404s.  (You may prefer using DCS.dcssta; just set it in the <meta> area with the value being the string that you want SDC to put into the status field of the SDC log.)
  • Of course, you want to be sure that the page actually returns a 404 (etc.) status code – and isn’t seen as a successful, 200-status display of a page called “404.html”! Google, for one, is not friendly to sites that have 200-status 404 pages. Check this using Fiddler or some other program that displays the real status code.
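If you'd rather script that last check than eyeball it in Fiddler, here is a minimal Python sketch using only the standard library. The function names are made up for this example, and the URL you pass in should be one you know doesn't exist on your site.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status code the server sends for `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # 2xx responses (and redirects that end in one) land here.
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the exception carries the code.
        return err.code


def serves_true_error(url: str, expected: int = 404) -> bool:
    """True if a request for a missing page really comes back as `expected`."""
    return fetch_status(url) == expected
```

Calling serves_true_error("http://www.example.com/no-such-page-12345") against your own host should come back True. Because urlopen follows redirects, a server that redirects missing pages to a 200-status "404.htm" will show up as 200 here, which is exactly the case you want to catch.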

What about the WebTrends report setup? 

If you’ve done the above, for every error hit, you’ll now have a hit in the logs. 

You probably don’t want to count them in your regular general reports. This means you’ll have to create a hit filter that removes them from the analysis and apply it to almost all of your profiles. Sorry. It’s the price you pay for excellence. The filter would be based on URL, with Page = * and Query Parameter = WT.errorcode, value = *.
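To make the filter rule concrete, here is a small Python sketch of the matching logic. It is not WebTrends configuration and it doesn't read real SDC logs; it assumes, purely for illustration, that each hit has already been reduced to a (url, query string) pair, and it treats any hit carrying a WT.errorcode parameter as a match, whatever the page and whatever the value.

```python
from urllib.parse import parse_qs

ERROR_PARAM = "WT.errorcode"  # the parameter name used in this post


def is_error_hit(query_string: str) -> bool:
    """Page = *, Query Parameter = WT.errorcode, value = *: any value matches."""
    params = parse_qs(query_string.lstrip("?"), keep_blank_values=True)
    return ERROR_PARAM in params


def split_hits(hits):
    """Split (url, query_string) pairs into (regular, errors) lists."""
    regular, errors = [], []
    for url, query in hits:
        (errors if is_error_hit(query) else regular).append((url, query))
    return regular, errors
```

The "regular" list is what your general profiles would analyze with the exclude filter applied; the "errors" list is what the filter takes out.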

And here’s the excellent part – you can now make a profile that gives you stats and trends for all the error codes you’ve set up special error pages for.  Create a profile without the hit filter we just described.  In that profile, put a custom report  in which WT.errorcode is the first dimension.  Depending on how the error code log line appears, the second dimension should be either URL or the page just before the error, i.e. “Referring Page (per hit).”  Be sure to have those dimensions set to include interval data so you’ll have trend graphs.  With errors, it can be really important to know exactly when they happened.
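As a rough picture of what that two-dimension report tabulates, here is a Python sketch that counts hits by error code, URL, and day. The (day, url, query string) tuples are an assumed stand-in for parsed log lines, not the actual SDC log layout, and the function name is invented for the example.

```python
from collections import Counter
from urllib.parse import parse_qs


def tally_errors(hits):
    """Count error hits keyed by (error code, URL, day)."""
    counts = Counter()
    for day, url, query in hits:
        codes = parse_qs(query.lstrip("?")).get("WT.errorcode")
        if codes:
            # First dimension: error code. Second dimension: the URL sought.
            # The day stands in for interval data, so trends stay visible.
            counts[(codes[0], url, day)] += 1
    return counts
```

Hits without a WT.errorcode parameter are simply skipped, which mirrors a profile that reports only on the error pages.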

If you have set up more than a couple of error codes and you expect a lot of data, you should instead create a separate one-dimension custom report for each error code, though you can still put all these custom reports in the same profile. The dimension for the 1D report version would be URL, filtered to include specific values of the WT.errorcode parameter.

 

Here’s a link to a related June Dershewitz post over at SEMphonic:   http://www.typepad.com/t/trackback/2600861/31886334


    9 comments

    1 Bryan { 02.24.09 at 10:11 am }

    Is there any harm in using both WT.errorcode and WT.dcssta on the same custom error page?

    2 rocky { 02.24.09 at 10:53 am }

    Bryan – WT.errorcode isn’t an out-of-the-box parameter, as far as I know. Is this proprietary to your site?

    At any rate, guessing at what it is, doing both shouldn’t be a problem in terms of funky interactions. But if you use WT.dcssta to set a page as a true 404, your WT.errorcode info won’t be available in custom reports because 404’s only get reported on in the out-of-the-box error reports.

    3 Bryan { 02.24.09 at 11:27 am }

    I referenced WT.errorcode from the bulleted list in the article. :)

    But, your reply makes sense. I would need a 200 return code in the SDC log in order to do any custom reporting. Thanks!

    4 rocky { 02.24.09 at 1:47 pm }

    Oops. So you did. My answer still is correct though.

    5 Adam { 04.23.09 at 2:37 am }

    Could you clarify a couple of things please:

    1) I couldn’t get the custom SDC reporting working in WT 8.0d using the custom dimension – could this be a version incompatibility or could I be doing it wrong?
    2) Shouldn’t the ‘WT.errorcode’ in your article actually be ‘DCSext.errorcode’, or are they the same thing?
    3) I had to end up using ‘DCS.dcssta’ instead of the ‘WT.dcssta’ specified in your article for the standard reports to work.
    4) When using the standard ‘File Not Found Errors’ report it displays all of my URLs with all the WT tags appended to the query string, even though I’ve set WT to remove them in other reports. Is there a way of filtering them?

    Thanks for a great site!

    Adam

    6 rocky { 04.23.09 at 6:49 am }

    Wow, those are astute questions.

    #1 – my guess is either you’re doing it wrong or I made a mistake somewhere. Can you provide more info? One hint you gave is that you are using DCS.dcssta. If you’re doing that in the same profile as this custom report, the custom report will not work — because you’ve turned all those hits into 404’s (in the SDC logs), and custom reports won’t report on 404’s. That is, they won’t work UNLESS YOU USE THE TRICK WE DESCRIBE. For the custom report to work, those hits have to remain as 200’s (successful), but with a parameter indicating that they are, underneath it all, 404 error hits. Does that help?
    #2 – No. You can do it either way. SDC wants either a WT or DCS prefix in order for it to recognize that a given parameter needs to be collected by the tag. The difference is … WT.whatever stays as “WT.whatever” in the resulting SDC logs. DCS.dcsext=whatever will get turned into, in the resulting SDC logs, a parameter called “whatever.” Either one will be perfectly usable in custom reports as long as you set them up for the parameter name as it ends up in the SDC logs. For the first, WT.whatever is the parameter name. For the second, whatever is the parameter name.
    #3 – Yes. I’ve changed the original text. Thank you!
    #4 – No. WebTrends’ File Not Found report is a special creature, ancient and living according to its own rules. It predates the WT.parameters and its original design was to show all parameters without being affected by the rest of the configurations, and there you are. I think changing this is definitely worth a feature request!

    7 LJX { 04.29.09 at 12:57 pm }

    This is a pretty brilliant post.

    Now here is the next part of it, is there a way to get the 404 error code pages AND also get it to populate a meta tag with the page that the person was trying to reach? While it is nice to see that 65,000 customers are getting a 404 error can we find that 65,000 people are getting an error code while trying to access a specific page? By using the raw apache log files you can do this and thus track broken links or other issues people are having with navigation.

    With SDC we can see there are all of the error codes but not what is going on, which is a vital part to making sure a site runs correctly. We wouldn’t want a huge link on a site to be broken and everyone clicking on things that don’t work, if there is a method to do this then everything is solved and we can abandon the expensive to maintain apache logs for good.

    8 rocky { 04.29.09 at 3:43 pm }

    As I re-read the post, I realize I didn’t make it clear. The SDC tag will capture the URL. It will NOT capture “404.htm” as the URL.

    So, if you follow the instructions, you’ll end up with a list of the pages that the people were trying to reach. And you won’t have to populate a meta tag with that page.

    I’ll go back and change the text to make this much more obvious.

    9 Mateo { 03.23.10 at 3:53 am }

    What about just setting up a measure for the WT.errorcode = 404 and using it in a report with 1. dim URL and 2. dim “Referring Page (per hit)”? We did it like that and it looks good
