Version 8.5 Analytics software released

Applies to:  Software

In case you didn’t know, WebTrends 8.5 Analytics software is going to be released any minute now.  If you’ve missed the hype, here are the main reasons to consider upgrading:

1)  An updated GeoTrends database, hurrah!  This GeoTrends news is absent from the general WT announcements about 8.5 but we Outsiders think it’s worth top billing.  We hope to see fixes for the problems of Chantilly, Virginia (source of all AOL hits) and Marina Del Rey, California (source of many hits that really start in Asia), but we’re not holding our breath.  Even without those fixes, we WANT it.  The new database is not compatible with previous versions and must be used with 8.5.  More later.

2) Calculated Measures.  For measures that are already in a report, you can easily create new columns consisting of math manipulations of the other measures in the table.  For example, you can divide Views by Visits to get a Views Per Visit calculated measure in a Pages report, or a Pages Viewed Per Content Group calculated measure in a Content Groups report.  You can also easily move columns around.  Slickly done and useful.  (But even with this we still can’t figure out a way to make Exit Ratios, boo) 

The calculated measures can be created two ways:

  • built in to the definition of a custom report
  • created on the fly whenever you have an existing report open.  And, once you’ve created on-the-fly measures, you can save them in a bookmarked version of the report.  This last is a stroke of genius on somebody’s part at WT. 

Note:  at this time Dynamic Unique Visitors is the only existing measure that cannot be part of a Calculated Measure.
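
For anyone who wants to see the arithmetic behind a calculated measure spelled out, here is a minimal sketch in Python.  It is not how WebTrends builds the column (that all happens in the report UI); the function name, the column names and the sample rows are made up for illustration.

```python
def add_calculated_measure(rows, name, numerator, denominator):
    """Add a column `name` = numerator / denominator to every row of a report table."""
    for row in rows:
        divisor = row[denominator]
        # Leave the cell empty rather than divide by zero
        row[name] = row[numerator] / divisor if divisor else None
    return rows


# Made-up sample rows standing in for a Pages report
pages_report = [
    {"page": "/index.html", "views": 1200, "visits": 400},
    {"page": "/faq.html", "views": 90, "visits": 60},
]

for row in add_calculated_measure(pages_report, "views_per_visit", "views", "visits"):
    print(row["page"], row["views_per_visit"])
```

The same division works for any two measures already in the table, which is all a Views Per Visit column really is.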

3) Much more efficient User rights administration, consisting of two big changes.

  • When you edit a User’s rights you can quickly select which Profiles and Templates they have access to, by going through lists of profiles and templates right there in the user Edit function.  No more tedious opening of individual Profiles and Templates in order to give a new user access!
  • You can create “Roles” consisting of collections of privileges and rights, then assign these to individual Users.  No more lengthy re-creation of combinations of privileges for every new user.  (But, if you were on top of things, you would have already created something like Roles using dummy User accounts, copying them when you add a new person.  You did do this, didn’t you?)

4) For SDC (and for server logs for clever people), a set of new WT.* parameters, dimensions, and out-of-the-box custom reports designed to deal with Web 2.0 information:

  • RSS feeds
  • media clips
  • RIA button events such as “play” or “zoom”
  • consumer generated content events such as posts and comments. 

There’s no magic in these, and it’s usually up to you to do the programming that gets these events into SDC.  But at least they’re in the WT.* lexicon with some structure around them.  Watch for a WT-sponsored Lunch-and-Learn next month that will hopefully give more information on how to do that programming.

Oh, by the way.  We’ve heard indirectly that some RIA players are working with WT to automatically send some of these parameters.   This is the kind of partnering we need more of! 

5) Much better support for Japanese and Simplified Chinese

Seeing the whole Table of Contents for a profile

The Table of Contents is that stuff in the left nav – i.e. what reports are available to the WebTrends end user for that profile.  You see the T-of-C every day.  It’s made up of all the reports attached to the profile (for that template), organized as an expandable hierarchy.  It’s sorta the same as the template but different — it’s the template when populated only with the reports that are attached to that particular profile.

It’s easy enough to use, but you really can only see a little chunk of it at a time. 

And if you’re like us, you’ve rearranged it and added lots of custom reports to various chapters, so it’s definitely not the T-of-C that came out of the box.

Wouldn’t it be nice to see the Table of Contents all at once in document form?  So you could, maybe, re-organize it?  Check for completeness?  Annotate it?  Give it to the end user with annotations as part of orientation?

There is a way.  You get it through the Scheduler, believe it or not.  The entire Table of Contents (minus the dashboards and auto-populated folders, sadly) is available for copying and pasting when you set up a scheduled export. 

Scheduler >> Scheduled Jobs >> New Job >> Scheduled Report.  Pick the profile and so forth and continue the process with dummy names (sfdaklas is my favorite) until you get to the Reports window.  At this point WebTrends shows you the entire Table of Contents, intending you to choose which reports to export.  Instead, you click, drag, copy, paste into a Word document or whatever.  Pasting as Unformatted Text gives nice indented results.

Having plundered Scheduler for what you want, cancel out.

Final step:  submit a feature request to WebTrends to get this more easily in future products.  See http://www.webtrendsoutsider.com/2008/how-to-tell-webtrends-about-a-bug-or-a-change-youd-like/

There are probably other ways to get a print of the Table of Contents, but none come to mind.  Let us know if you have one.

Postscript:  The main deficiency here is that the auto-populated folders don’t show up, as I said before.  If you have lots of custom reports that you’ve tucked into various chapters in your template, you’ll want to know if there are any that have been assigned to the profile but not assigned to a spot in the template.  I know only one way to check, which is to edit the profile, go to the Summary screen, and grab the list of custom reports assigned to the profile.  Cross-check it by hand with the Table of Contents.
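
If the hand cross-check gets tedious, something like the following sketch could do the comparison.  It assumes you have pasted the Table of Contents as unformatted text (one report per line, indented) and typed or pasted the custom report names from the Summary screen into a list.  The function name and the sample report names are hypothetical, and the matching is a plain case-insensitive comparison of names, nothing WebTrends-specific.

```python
def reports_missing_from_toc(toc_text, assigned_reports):
    """Return the assigned report names that do not appear as a line in the pasted ToC."""
    # Indentation only reflects the hierarchy, so strip it before comparing
    toc_entries = {
        line.strip().casefold() for line in toc_text.splitlines() if line.strip()
    }
    return [
        name for name in assigned_reports
        if name.strip().casefold() not in toc_entries
    ]


# Made-up example: a pasted ToC fragment and a list of assigned custom reports
pasted_toc = """
Site Design
    Pages
    Campaign Landing Pages
Marketing
    Referrers
"""
assigned = ["Campaign Landing Pages", "Newsletter Clicks"]

print(reports_missing_from_toc(pasted_toc, assigned))
```

Anything the function returns is a report assigned to the profile that has no spot in the template.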

 

An epitaph for server logs? AVG Linkscanner

Applies to:  server log analytics, SDC analytics

The epitaph should be:  “Here lies The Server Log.  A pioneer and solid citizen for a very long time.  It  finally met its match.”

If you keep up with web analytics news, you may or may not know about the recent fuss over the behavior of a free, widely-used virus-protection program from the company AVG.  In the last few weeks it has been noticed dumping tons of false hits (all with empty referrer fields) into server logs whenever your site comes up on search results pages.  Its method of virus protection consists of following the links on search results pages, to check all of them for malware — before the human searcher clicks on anything.   As a result, every link appearing on a viewed search results page gets a non-human visit recorded in its server logs.  (But not in its SDC or other tag-based logs, phew!  The bot does not request images, so the SDC tag doesn’t get triggered.  At this time.)

At this time, this bot leaves a very subtle calling card that can be (at this time, did we say that already?) the basis for an exclude hit filter.  You’ll have to re-analyze your older data to get it out of your stats.  We started seeing it in large quantities about May 1 although we could find it in April in lower numbers.  The extent of false visits and hits depends on how much search activity your site gets.  On one of our sites, on May 21, it accounted for:

  • 1% of all hits
  • 3% of all visits
  • 9% of all “direct traffic” visits
  • 12% of all visits identifiable as Pay-per-click by markers in the landing page

The hit filter is based on Browser and this string:

;1813

That’s semicolon one eight one three.  That’s the only way to recognize it.  And this method may not last long.  Remember, the malware people want to recognize it too, and it won’t be hard for them to adjust their code to look for this marker in any file request and return a benign file, while still serving malicious stuff to browsers not so marked.  So, it’s in the best interests of the anti-virus program to be absolutely indistinguishable from a human visitor.  When they (AVG) completely anonymize the browser string, our ability to filter them out will be gone.
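
If you would rather see how much of this traffic is in a log before building the filter, here is a rough sketch that splits log lines on that marker.  It assumes a W3C extended format log (space-separated fields, a "#Fields:" header naming a cs(User-Agent) column); the function name and the sample lines are made up for illustration, and it is no substitute for the exclude filter inside WebTrends.

```python
def split_marked_hits(lines, marker=";1813"):
    """Split W3C extended log lines into (kept, dropped) by a user-agent marker."""
    ua_index = None
    kept, dropped = [], []
    for line in lines:
        if line.startswith("#"):
            # The #Fields directive tells us which column holds the user agent
            if line.startswith("#Fields:"):
                fields = line.split()[1:]
                ua_index = (
                    fields.index("cs(User-Agent)")
                    if "cs(User-Agent)" in fields else None
                )
            kept.append(line)
            continue
        parts = line.split()
        if ua_index is not None and ua_index < len(parts) and marker in parts[ua_index]:
            dropped.append(line)
        else:
            kept.append(line)
    return kept, dropped


# Made-up sample log lines; the user-agent values are illustrative only
sample_log = [
    "#Fields: date time cs-uri-stem cs(User-Agent) cs(Referer)",
    "2008-05-21 10:00:01 /index.html Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;1813) -",
    "2008-05-21 10:00:05 /index.html Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+5.1) http://www.example.com/",
]

kept, dropped = split_marked_hits(sample_log)
print(len(dropped), "marked hits out of", len(sample_log) - 1, "hits")
```

Dividing the number of dropped lines by the total number of hit lines gives a share of hits comparable to the percentages above.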

[June 30 update – Well, they’ve anonymized their UA string.  They’re now using one common for IE 6.  If you filter this one out, you’ve filtered out many of your IE6 human visitors.  But for the record, the string is:  Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1)  ]

[July 6 update – as of July 9, AVG Linkscanner will no longer scan all results on a search page, instead scanning just those clicked on.  Existing copies of  the program should update themselves over the following few days]

If there ever was a good reason to change from server logs to SDC tags, this could be it.  Until, of course, programs like AVG start executing javascript (instead of just reading it) and requesting gifs.  At that point, we may as well start ignoring single-page-visits entirely, although in the server logs we looked at the average individual had 3 of these hits.  Of course, the AVG program may further morph to follow links off the landing page.  At that point we don’t have a lot of options.  Yes, this has been called a doomsday scenario for analytics.  Some of us however are confident this is a blip.  What do you think?

[June 30 update:  for a short time, the program requested the no-javascript SDC gif.]  

If you want to know more, WebMasterWorld (http://www.webmasterworld.com) and AVG Watch (http://www.avg-watch.org) have good continuing discussions by savvy people as well as entertainment from a lot of newbie panickers.  They’re good places for the latest, best, and most amusing info.

While we wait for it all to unfold, or perhaps the word is unravel, let’s go into the differences between server logs and SDC logs.  You server log people should be aware of what you get if you switch and what you lose.  You SDC people, probably pretty smug right now, should think about what you’ve already lost.

This part of this topic covers two questions about server logs and SDC logs –

  • what are the good and bad differences?
  • are there workarounds for the bad differences?

First, what are the differences?  Here are all the ones we can think of, listed as things each kind of log doesn’t do that the other does:

A.       Server logs don’t:

  1. Show page views resulting from using the back button (i.e. the cache in the visitor’s browser and not requested from the server)
  2. Show page views resulting from any other kind of caching — AOL’s cache of home pages, corporate proxy caches, and other local caches that display a saved page rather than make a fresh request to the server
  3. Show page views resulting from pages that have been copied to someone’s hard disk, then viewed directly from the hard disk.  (SDC logs also show the drive and path on which the page has been stored.)
  4. Show page views resulting from page code that has been, um, copied and repurposed on someone else’s site without removing the tag.  (SDC logs also show the domain that is hosting those repurposed files.)
  5. Capture clicks that jump to an anchor point in the same page (i.e. URLs such as /faq.html#item5)
  6. Collect information from the browser regarding the following aspects of the visitor’s computer:  screen resolution; size of window; enablement of (or version of) javascript, java, flash; local time zone of the visitor’s computer, …..

B.       SDC logs (using the ordinary SDC tag which is similar to the Google Analytics tag and many others) don’t:

  1. Show downloads of untaggable files such as pdfs, docs, swfs, etc that you might consider to be important
  2. Show requests for other kinds of files that most analysts don’t care about anyway – jpg, gif, css, js, etc
  3. Capture traffic from most spiders and bots
  4. Show 404 (Page Not Found) events
  5. Show 500 (Server Error) events
  6. Show time-to-serve (a standard field in server logs)
  7. Show KB sent or received as a result of the request (a standard field in server logs)
  8. Show POST events
  9. Show HEAD events
  10. Record page views where the visitor clicked away from the page immediately, before the SDC tag had time to load
  11. Capture virtual redirects

The Advanced SDC tag that comes out of the Tag Builder pretty much takes care of #1 and tracks other things, like form button submit clicks, that neither server nor tag logs normally get.

 http://www.webtrendsoutsider.com/2008/customized-sdc-tag-builder/

In future posts we’ll talk about a few other work-arounds. 

The Advanced SDC Tag Builder – an end to DCS Multitrack on individual links

Just released:  http://tagbuilder.webtrends.com

TagMagic.  Extremely cool.

Allows you to create an SDC tag that has advanced features appropriate to your situation.  Read the help screen, click on all the little question mark icons, and in general be sure you know what you’re doing.

  • Customized tracking of multiple domains and subdomains
  • Quickly create tags for specific tracking requirements such as form navigation and Dynamic Search conversions
  • Track on-page clicks to links, off-site links and other page elements.  If you have already implemented dcsMultiTrack for any type of web site tracking, consult with WebTrends Support or Services before you use the Tag Builder functions found on the Click Event Tracking page.
  • Track pdfs, .docs, .wmvs, and other non-taggable files
  • Modify an existing tag without the risk of inconsistency based on manual changes
  • Custom query parameter tracking i.e. parameter mapping
  • Customized cookie tracking including OMB-compliant cookie tracking
  • Dynamic Search conversion tracking

You may have been one of the very few customers who have been using an older, non-public version of the Advanced Tag.  You should know that almost all the previous Advanced Tag functionality is there and more, with new and presumably better javascript code, and only one Include file instead of two. 

If you use WebTrends Analytics software, you do NOT have to upgrade to 8.5  to use the Tag Builder. 

Note that if you’ve already marked up your off-site links or download links (etc) with DCS Multitrack, you’ll have to revert those links to normal ones, because the advanced tag’s automatic detection of those links will result in double-counting.

WebTrends internal backups – how and why

The Why should be obvious.  It makes a difference between re-analyzing your entire history and re-analyzing only since the last backup.  It’s especially useful when you’ve made a boneheaded configuration mistake and want to cover it up ASAP. 

WebTrends has a built-in backup program that works well.  Out of the box, it’s not turned on, so consider this a heads-up to do it now. 

Your main decisions are:

  • Where will the backups be stored?  They take up a lot of room, so you might want to specify a different computer with more space than your analysis server.  Specify the location in System Management >> Storage Locations.
  • Do you want backups for all your profiles, or just a few critical ones?  Many people don’t back up experimental or rarely-viewed profiles.  If you want all future from-scratch profiles to be automatically backed up, there’s a global setting for this in System Management >> Backup Options (Note:  you’ll still have to individually turn on backups for any profiles that already exist.  See below). 
  • How far apart should you space the archived copies?  Do you really want to have 365 individual copies of your analysis by the end of the year?  Or can you live with one per week or one per month?  Many people run backups every day but keep only the most recent four to seven dailies, then keep older copies at increasingly spaced-out intervals.   A generous general setting for somebody who thinks they can notice most problems within 4 days would be:  save 4 dailies, 8 weeklies, 8 monthlies, one yearly, for a maximum of 21 backup files.  These global settings are in System Management >> Backup Options.  The same archive schedule is in effect for every profile that is set to get backups.
  • Do you need to back up your configuration settings too?  Many people assume their current configurations are the best ones and don’t archive old configurations at all.  If you do, the settings are in System Management >> Backup Options.
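
To get a feel for what a spaced-out archive schedule keeps, here is a small sketch of the 4 dailies / 8 weeklies / 8 monthlies / 1 yearly idea.  It is only an illustration of tiered retention, not a description of how WebTrends rotates its backup files internally.  The function name and the rule "keep the newest backup in each week, month and year" are assumptions made for the example.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, dailies=4, weeklies=8, monthlies=8, yearlies=1):
    """Return the set of backup dates a tiered retention scheme would keep."""
    newest_first = sorted(set(backup_dates), reverse=True)

    def newest_per_bucket(bucket_key, limit):
        # Walk from newest to oldest, keeping the newest date in each bucket
        kept = {}
        for d in newest_first:
            key = bucket_key(d)
            if key not in kept:
                if len(kept) == limit:
                    break
                kept[key] = d
        return set(kept.values())

    keep = set(newest_first[:dailies])
    keep |= newest_per_bucket(lambda d: tuple(d.isocalendar())[:2], weeklies)  # ISO year + week
    keep |= newest_per_bucket(lambda d: (d.year, d.month), monthlies)
    keep |= newest_per_bucket(lambda d: d.year, yearlies)
    return keep


# A year of made-up daily backups
all_backups = [date(2008, 1, 1) + timedelta(days=n) for n in range(366)]
kept_backups = backups_to_keep(all_backups)
print(len(kept_backups), "of", len(all_backups), "backups kept")
```

Because the newest daily is usually also the newest weekly, monthly and yearly copy, the number kept in this sketch comes out below the 21-file maximum.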

Here’s how to turn on backup for one profile at a time. 

  1. Scheduler >> Scheduled Jobs >> New Job
  2. Choose Backup
  3. Choose the profile you want to be backed up
  4. If you have the processing time available, set it for Daily, at a convenient time

A few more details:

  • All previous backups get wiped out if you ever run “re-analyze” on a profile.  (This is a good time to promise yourself to never touch the re-analyze button, which destroys everything except your configurations.  Try to use copies of profiles instead, then delete the originals when the copies are correct.  Or at least hit “re-analyze” only with a cool head and a few deep breaths.)
  • If you’re backing up only certain profiles, it’s a good idea to change their names to show their backup status, for example by adding (B) to the profile name.
  • Backup settings carry over to copies of a profile.
  • Changes to backup setting cycles don’t take effect until after the next backup has run.
  • As said above, backups can take up a lot of disk space.  You might want to review all the backup jobs running from time to time and eliminate those you don’t care about.  Also, you might want to store your backups somewhere where there’s room.  You get to choose the location in Administration >> System Management >> Storage Locations.  Be sure to use UNC notation for the locations, NOT mapped drive letters.
  • Backup files and folders can be manually deleted if you need to do some emergency hard disk space-making. 
  • How to restore a backup (roll back a report).  Say you’ve just noticed that one of your log files arrived late and you have a hole in your reports four days ago.  You need to roll back that profile to its state as of more than 4 days ago, then analyze the logs (including the log that was missing at the time).  It’s a simple matter of going to Administration >> System Management >> Backup/Restore >> Restore Backup, choosing the backup copy that is old enough to precede your problem time point, and clicking on Restore.  Then, back in the Web Analysis list of profiles, click on “analyze” (NOT re-analyze!!!)
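
The rule for choosing which backup to restore (the newest copy that still precedes the problem) can be sketched like this.  The function name and the dates are made up, and in practice you simply pick the copy from the list on the Restore Backup screen.

```python
from datetime import date


def pick_restore_point(backup_dates, problem_date):
    """Return the newest backup taken strictly before the problem date, or None."""
    candidates = [d for d in backup_dates if d < problem_date]
    return max(candidates) if candidates else None


# Made-up example: four daily backups, with the hole in the reports on June 7
available = [date(2008, 6, 5), date(2008, 6, 6), date(2008, 6, 7), date(2008, 6, 8)]
print(pick_restore_point(available, date(2008, 6, 7)))
```

If nothing older than the problem survives in your archive schedule, the function returns None, which is the situation where you are stuck with a full re-analysis.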

You should read the Administration Users Guide for full details, especially if you use Parent-Child profiles, where the situation is a little different.