Use URL Search/Replace to Undo Hard-Coded Content Groups

URL Search & Replace – using it to remove hard-coded content groups from your reports.

This post is about a specific use for the Webtrends “URL Search and Replace” functionality.  We wrote about URL S&R in a general way in this post.

You should know about URL S&R because once in a while it’s very helpful.  Irreplaceable, in fact (haha).

Basically, what URL Search/Replace does is this:

The first task the Webtrends processing engine performs is to look at the URL of the hit it’s about to process and to check whether any “Search and Replace” rule matches that URL.  If yes, it makes the specified change then sends the altered hit to be processed as usual.  If no, it sends the original hit, unchanged, to be processed as usual.  That’s it.  The important thing that makes it so useful is that Webtrends does this before absolutely any other processing of that hit.

I don’t know of any other web analytics tool that allows this, but I could be wrong.
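To make the ordering concrete, here is a minimal Python sketch of "rewrite first, then process as usual."  It is not Webtrends code: the function names, the redirect rule, and the page-count dictionary are all made up for illustration, and I am assuming that every matching rule gets applied in turn.

```python
def apply_rules(url, rules):
    """Apply every (search, replace) rule whose search text occurs in the URL."""
    for search, replace in rules:
        if search in url:
            url = url.replace(search, replace, 1)
    return url


def process_hit(url, rules, page_counts):
    """Rewrite the URL first, then do the 'usual processing' (here: count the page)."""
    rewritten = apply_rules(url, rules)
    page_counts[rewritten] = page_counts.get(rewritten, 0) + 1
    return rewritten


# Hypothetical rule: drop a redirect prefix so the target page is what gets counted.
rules = [("redir.jsp?url=", "")]
page_counts = {}
process_hit("redir.jsp?url=othersite.com/whateverpage.asp", rules, page_counts)
process_hit("index.html", rules, page_counts)
print(page_counts)
```

The point of the sketch is that the counting step never sees the original URL, only the rewritten one.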

Examples of uses:

  • Take a dedicated landing page URL and add the WT.mc_id parameter that you should’ve put there in the first place, but forgot to do, in order to get the traffic to show in campaign reports that depend on seeing WT.mc_id.
  • Change “redir.jsp?url=othersite.com/whateverpage.asp” into “othersite.com/whateverpage.asp” so you can see redirects in pages reports in a less confusing way.
  • Remove the parameter “sessionID=whatever” from all URLs in case you have those kinds of archaic things happening.
  • (If you process server logs rather than SDC data.)  Change an important image into a page file, e.g. change “importantimage.jpg” into “importantimage.html”.

And, the subject of today’s post …

  • Make Webtrends completely ignore any hard-coded content groups (WT.cg_n) and only use the UI-defined content groups you have turned on for that profile.

Why?  If you have hard-coded content groups, they will show up everywhere  – in content group reports and also in content group path reports.  If you want to look at back-and-forth travel among a few select content groups that you defined in the UI, those hard-coded groups mess up everything.  (I know some of you out there have discovered Content Group paths, so this post is for you!)

The answer to the mess is to devote a profile to those select UI-defined content groups and, in that profile, make Webtrends blind to the hard coded ones.

Here’s how:

Since hard-coded content groups contain the text “WT.cg_n=<something>&” you can “remove” them all with this configuration in the S&R interface:

    • Replace from
    • Start of first
    • WT.cg_n=
    • Up to
    • End of next
    • &
    • with
    • <empty field> (i.e. nothin’)
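If it helps to see the rule as string handling, here is a rough Python equivalent.  This is a sketch only: the function name and the sample URLs are invented, and leaving the URL untouched when no “&” follows is my assumption, not documented Webtrends behavior.

```python
def remove_first_span(url, start_text, end_text):
    """Delete from the start of the first start_text up to the END of the next end_text."""
    start = url.find(start_text)
    if start == -1:
        return url  # nothing to remove
    end = url.find(end_text, start + len(start_text))
    if end == -1:
        return url  # assumption: with no closing text, leave the URL alone
    return url[:start] + url[end + len(end_text):]


# Hypothetical logged URL with a hard-coded content group and subgroup.
url = "/products/shoes.asp?id=7&WT.cg_n=Products&WT.cg_s=Shoes&WT.sys=none"
print(remove_first_span(url, "WT.cg_n=", "&"))
# The WT.cg_n parameter is gone; WT.cg_s and everything after it remain.
```

One thing the sketch makes visible: if WT.cg_n happens to be the very last parameter, there is no trailing “&” to match, so it is worth checking a real log line to see how your URLs are ordered.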

Note that this will leave any content subgroups (WT.cg_s) in place, which is not a big deal: these don’t show in Content Group reports, the Content Group dimension, Content Group paths, or anything else.  If you really want to suppress the subgroups as well, use the specification below, which relies on the fact that the WT.sys parameter pretty much always follows WT.cg_n.  (You might want to check with a debugger or in an actual log file to be absolutely sure.)

    • Replace from
    • Start of first
    • WT.cg_n=
    • Up to
    • Start of next
    • WT.sys=
    • with
    • <empty field>
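Again as a rough Python equivalent (function name and sample URL invented, and the "leave it alone if WT.sys is missing" behavior is my assumption):

```python
def remove_up_to_start_of(url, start_text, stop_text):
    """Delete from the start of the first start_text up to the START of the next stop_text."""
    start = url.find(start_text)
    if start == -1:
        return url  # nothing to remove
    stop = url.find(stop_text, start + len(start_text))
    if stop == -1:
        return url  # assumption: with no stop_text after it, leave the URL alone
    return url[:start] + url[stop:]


# Hypothetical logged URL where WT.sys follows the content group parameters.
url = "/products/shoes.asp?id=7&WT.cg_n=Products&WT.cg_s=Shoes&WT.sys=none"
print(remove_up_to_start_of(url, "WT.cg_n=", "WT.sys="))
# Both WT.cg_n and WT.cg_s are gone; WT.sys itself is kept.
```

Because the cut stops at the start of WT.sys rather than at the next “&”, anything sitting between WT.cg_n and WT.sys disappears too, which is exactly why the parameter order matters here.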

That’s it.  Once you have made the S&R rule, just turn it on for the selected profile.  Make sure only the important UI-defined content groups are active in that profile.

If you have any other outrageous examples of using URL S&R, let us know!

Postscript:

I realize that Webtrends probably prefers that we only use hard-coded content groups and that they (Webtrends) are trying to lead us in that direction.  It’s true that UI-created content groups use processing time and may not make it easy to architect some functions and reports.  But I think that’s a bit wrong-headed, because the UI-based ones are so much more versatile.  Google Analytics’ recent addition of content groups to their UI is, I think, validation of this.  First of all, UI-defined content groups can be created really fast.  Second, they can be turned on and off as needed, just by assigning/unassigning them to a profile, individually.  Agree?  Disagree?  Feel free to write to us.

Google Analytics’ Content Groups compared to Webtrends’ Content Groups

Google Analytics has added Content Groups! Since I rely heavily on Content Groups in Webtrends, I wanted to see whether GA did it differently.

Let it be known that I really like Webtrends’ Content Groups.  I use them constantly.  The Content Groups report, rather than the Pages report, is my go-to report.  And the Content Group Paths from Entry report is important enough to sacrifice an entire profile slot to.

So I was excited when I found out Google Analytics has implemented its version of content groups a few weeks ago.  I took a look and saw a few differences I’d like to call out.

Can I assign a page to more than one group?

In Google Analytics, a given page can be in a maximum of five content groups.  That is, a page can be in only one group per grouping and there are five possible groupings.  Within a grouping, Google Analytics will assign a page to the first content group criterion it matches.

In Webtrends, a page can be in any number of content groups.  I can have any number of schemes going at once, in a given profile.
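The difference between the two models is easy to show in a few lines of Python.  This is only an illustration of "first match wins" versus "every match counts": the rule names and patterns are invented, and neither function is how either product is actually implemented.

```python
import re

# Hypothetical rules: (content group name, regex tested against the URL).
RULES = [
    ("Checkout, Step 1", r"^/checkout/step1"),
    ("Page Type: Form", r"\.form$"),
    ("Checkout (all steps)", r"^/checkout/"),
]


def first_match(url, rules):
    """One grouping, GA style: the page lands in the first group whose rule matches."""
    for name, pattern in rules:
        if re.search(pattern, url):
            return name
    return None


def all_matches(url, rules):
    """Webtrends style: the page lands in every group whose rule matches."""
    return [name for name, pattern in rules if re.search(pattern, url)]


url = "/checkout/step1.form"
print(first_match(url, RULES))
print(all_matches(url, RULES))
```

With first-match logic, the order of the rules decides which single group the page ends up in; with all-match logic, order doesn't matter.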

This can make the resulting Content Groups report a bit of a beautiful mess because of the intermingling of schemes, but it’s a minor drawback since I can filter the Content Groups dimension, bookmark the resulting set, and end up with permanently available individual reports for each of my schemes.

(Tip: Put the scheme name into content group names, for example “Checkout, Step 1” and  “Checkout, Step 2,” or “Page Type:  Form” and “Page Type: FAQ”.)

Comment:  I do like that Google Analytics allows me to activate different Content Group schemes one by one.  I don’t like limiting a given page to participation in only five content groups.  I use many content group schemes and, in Google Analytics, I’d have to start proliferating profiles.

Is there Content Group pathing?

In Google Analytics, I can see the previous and the next content group.  I cannot see content group paths (anything longer than one step).  Chaining “nexts” isn’t the same as paths, btw!

In Webtrends, I can see the previous and the next content group only for those content groups for which I have set up a one-step path report.  I can, however, see content group paths — longer paths for individual groups (specially set up) and the all-important (to me) Content Group Paths From Entry.

Comment:  Webtrends’ Content Group Paths from Entry is a mainstay of my analytics practice.  Google Analytics’ pathing, which involves chaining of “nexts” rather than paths that correspond to real visitors’ extended paths (the WT method), is inferior.
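Here is a toy example of why chained “nexts” are not paths.  The visit data is invented and the "chaining" is my own simplification (follow the most common next step each time), not a description of how Google Analytics builds its reports.

```python
from collections import Counter

# Hypothetical visits, each one the sequence of content groups a visitor saw.
visits = (
    [["Home", "Products", "Support"]] * 2
    + [["Search", "Products", "Cart"]] * 3
)

# "Next content group" counts: every one-step transition, regardless of visit.
next_counts = Counter()
for visit in visits:
    for current, following in zip(visit, visit[1:]):
        next_counts[(current, following)] += 1

# Real paths: whole sequences, counted per visit.
path_counts = Counter(" > ".join(visit) for visit in visits)


def top_next(group):
    """Most common next group after the given group."""
    candidates = {b: n for (a, b), n in next_counts.items() if a == group}
    return max(candidates, key=candidates.get)


# Chain two "nexts" starting from Home.
chained = ["Home"]
for _ in range(2):
    chained.append(top_next(chained[-1]))

print("Chained nexts:", " > ".join(chained))
print("Visits that actually took that path:", path_counts[" > ".join(chained)])
print("Real paths:", dict(path_counts))
```

In this made-up data the chain suggests Home > Products > Cart, a path no visitor took: everyone who started at Home went on to Support, and the Cart traffic all entered via Search.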

Can I assign pages to content groups using page code or tag management?

In Google Analytics, yes.  But each page can be assigned to only one Content Grouping in the tag code.

In Webtrends, yes, and you can hard-code the same page into as many content groups as you want.  One drawback is that hard-coded content groups can’t be eliminated from a profile instantly, which can be done in Google Analytics by switching groupings.  (On the other hand, de-activating hard-coded content groups in a Webtrends profile is fairly easily done with a URL Search & Replace operation.)

Comments:  I put individual pages into several content groups all the time.  Being limited to one just won’t work out.

Can I assign pages to content groups by extracting part of the URL or page title?

In Google Analytics, yes.   It allows either simple matching or regex for turning part of a URL into the name of a content group.  This includes parameters – I can turn the value of a parameter into a content group.  For example, if I have a parameter “color=” then this can be a Grouping and each color will be a content group.   This will in all likelihood use up one of my five groupings.  (In Google Analytics, this was previously sort of available by filtering on the Content – All Pages report, then saving the resulting report as a shortcut.)

In Webtrends, this extraction method sounds familiar because Webtrends does this exact same thing.  But Webtrends doesn’t call it “content groups.”  In Webtrends, this is called “defining a custom dimension”.   In Webtrends, there is no limit to the number of custom dimensions.
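For anyone who hasn't set this up in either tool, the idea looks like this in Python.  It is a sketch of the concept only; the function names, the sample URL, and the regex are mine.

```python
import re
from urllib.parse import urlparse, parse_qs


def group_from_param(url, param):
    """Use the value of a query parameter as the group (or dimension) name."""
    values = parse_qs(urlparse(url).query).get(param)
    return values[0] if values else None


def group_from_regex(url, pattern):
    """Use the first capture group of a regex as the group (or dimension) name."""
    match = re.search(pattern, url)
    return match.group(1) if match else None


url = "/shirts/view.asp?color=blue&size=m"  # hypothetical page URL
print(group_from_param(url, "color"))
print(group_from_regex(url, r"[?&]color=([^&]+)"))
print(group_from_param("/shirts/view.asp", "color"))
```

Each distinct value of “color=” becomes its own row in the report, and URLs without the parameter simply produce no value.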

Comment:  Being able to see only five tabulated parameters is just not going to work for most of my clients.   I can’t even think of a web site I work with that has only five parameters that need tabulating.

Can I assign pages to content groups based on rules?

In Google Analytics, yes.  But your rules will work with URL parameters only if you haven’t suppressed those parameters in the GA Reporting View Settings.

In Webtrends, yes.  And you can use parameters in rules whether you have suppressed them in report views or not.  Another difference in the details:  Webtrends allows content groups to be based on numeric value logic as well as text, while I’d have to use regex to do that in Google Analytics.
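To show what "using regex to do numeric logic" looks like, here is a small Python comparison for a made-up rule: put a page in a group when a parameter value is 100 or more.  The threshold and the test values are invented, and the regex assumes whole numbers with no leading zeros.

```python
import re

THRESHOLD = 100  # hypothetical cut-off for a hypothetical numeric parameter


def matches_numeric(value):
    """Numeric logic: compare the value as a number."""
    return value.isdecimal() and int(value) >= THRESHOLD


def matches_regex(value):
    """Regex workaround: three or more digits with no leading zero means 100 or more."""
    return re.fullmatch(r"[1-9][0-9]{2,}", value) is not None


for value in ["7", "99", "100", "250", "1000"]:
    print(value, matches_numeric(value), matches_regex(value))
```

Both give the same answers here, but the regex has to be rewritten by hand for every new threshold (try writing one for "between 37 and 412"), while the numeric version just changes a number.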

Comment:  Suppressing parameters in report views brings order to the chaotic-looking GA pages reports.   It feels like a big compromise to have to sacrifice content groups at the same time.  But hey, that’s what Excel and APIs are for.

Can I drill down to see metrics for the individual pages in the content groups?

In GA, yes.

In Webtrends, no.  I’d have to create a 2D report of Content Groups over Pages.  Easy enough to create and apply, but still an extra step.

Comment:  I can’t decide whether it’s too big a hassle to create that 2D report in Webtrends.  Um … no.

What metrics can I have?

In Google Analytics, only seven metrics are shown:  Page Views, Unique PageViews (equivalent to WT’s Visits), Average Time on Page, Entrances, Bounce Rate, % Exit, and Page Value.  You can also get other measures, sort of, using a Secondary Dimension (but the resulting report needs to be exported and sorted).

In Webtrends, I can apply any of the dozens of out of the box measures as well as any measure I can make up.  In addition, there’s a “Content Group Duration” report that supplies time spent viewing pages in the content group (total and average), and a “Content Groups of Interest” feature that provides a Unique Visitors metric encompassing past visits (i.e. how many unique visitors have ever seen this content group).

Comment:  No comment needed!

Are Content Groups retroactive?

In Google Analytics, no.

In Webtrends, no … but you do have 90 days of replay analysis available in OnDemand, and infinite re-analysis with OnPremises.

Comment:  Another “no comment needed.”

Final comment:  much as I love many other GA features (which I think will be addressed by the upcoming Webtrends Explore) I really can’t live with GA’s current CG limitations except for uncomplicated sites.  We’ll see what the future brings.

Doubles! Reasons for Discrepancies between Webtrends and Google Analytics Visit Counts

Three reasons why Google Analytics sees more visits than Webtrends does

Google Analytics usually shows more visits than Webtrends does, for the same site, same time.

There are three reasons:

  1. If a visit starts before midnight and finishes after midnight, Google Analytics counts two visits.  Webtrends counts one visit.
  2. If, in the middle of a visit, a page view arrives with a different campaign (organic search, paid search, or any hit with utm_campaign= in it), Google Analytics counts two visits.  In other words, if a visitor who has your site open in one tab uses a campaign or search link in another tab to come to the site separately, Google Analytics considers that second action the start of another visit.  The same thing happens if the visitor backs out of the site and then returns via another search or campaign.  In all of these cases, Webtrends counts one visit.  (Note: this assumes the visitor is moving around with no gaps of 30 minutes or more.)
  3. If you have WT or GA tags on two or more domains, Google Analytics will start a new visit when you cross domains.  The exception is when the sites are linked and the links have been specially coded to transfer a Google Analytics visitor ID.   Webtrends counts one visit.  The exceptions for Webtrends are Safari (and soon Firefox, and maybe eventually other browsers), or any situation where third party cookies are not accepted.
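The first two reasons can be sketched as a tiny visit counter in Python.  This is my own simplified model for one visitor, not either vendor’s actual sessionization logic: the function, the flags, and the sample hits are invented, and reason 3 (crossing domains) is not modeled at all.

```python
from datetime import datetime, timedelta

TIMEOUT = timedelta(minutes=30)  # a gap this long starts a new visit in both models


def count_visits(hits, split_at_midnight=False, split_on_campaign=False):
    """Count visits for ONE visitor.

    hits: list of (timestamp, campaign_or_None), sorted by timestamp.
    """
    visits = 0
    prev_time = None
    current_campaign = None
    for when, campaign in hits:
        new_visit = (
            prev_time is None
            or when - prev_time >= TIMEOUT
            or (split_at_midnight and when.date() != prev_time.date())
            or (split_on_campaign and campaign is not None
                and campaign != current_campaign)
        )
        if new_visit:
            visits += 1
            current_campaign = campaign
        prev_time = when
    return visits


# Hypothetical visitor: browsing across midnight, then arriving again via a campaign link.
hits = [
    (datetime(2014, 3, 1, 23, 50), None),
    (datetime(2014, 3, 1, 23, 58), None),
    (datetime(2014, 3, 2, 0, 5), None),
    (datetime(2014, 3, 2, 0, 10), "spring_sale"),
]

print("Webtrends-style count:", count_visits(hits))
print("GA-style count:", count_visits(hits, split_at_midnight=True, split_on_campaign=True))
```

Same four hits, same visitor, never more than eight minutes between page views, and the two counting styles already disagree.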

If you know of any extra wrinkles to this or other reasons for the higher visit count in Google Analytics, let us know!

In 2D Reports, Be Careful When Combining Hit-Based and Visit-Based Dimensions

There are two types of Webtrends report dimensions: Hit and Visit. In reports with nested dimensions, you have to be a little careful of the combinations.

There are rules about how dimensions can be combined in 2D reports.  The rules come from the important distinction between hit dimensions and visit dimensions.

A visit-based dimension applies to the whole visit – things that never change during the visit.  The referrer does not change.  The browser does not change.  And so on.

A hit-based dimension can change from hit to hit.  The URL changes.  The day of the week can change.

Rule 1:
In two-dimension custom reports, the first dimension must be broader than the second.

Visit is broader than hit.  Hit is finer-grained than visit.

So you can use these combinations of Primary > Secondary dimensions in a custom report:

Visit > Visit
Visit > Hit
Hit > Hit

And you must NEVER use this combo:

Hit > Visit

For the latter, you will get a report with results, but it will freeze your soul if you look at it too closely, and astute consumers of your data will laugh at you, or worse.
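If you want a sanity check before building a report, Rule 1 boils down to a one-line test.  The Python below is just an illustration: the function is mine, and the lookup table holds only a handful of dimensions, typed the way this post classifies them.

```python
# A few dimensions and their types, as classified in this post (not a complete list).
DIMENSION_TYPE = {
    "entry page": "visit",
    "search phrase": "visit",
    "content group": "hit",
    "pages": "hit",
}


def is_valid_2d(primary, secondary):
    """Rule 1: every combination is fine except a hit dimension over a visit dimension."""
    return not (
        DIMENSION_TYPE[primary] == "hit" and DIMENSION_TYPE[secondary] == "visit"
    )


print(is_valid_2d("entry page", "search phrase"))  # Visit > Visit
print(is_valid_2d("entry page", "pages"))          # Visit > Hit
print(is_valid_2d("content group", "pages"))       # Hit > Hit
print(is_valid_2d("pages", "entry page"))          # Hit > Visit
```

Only the last combination fails the check.  Note that the check says nothing about Rule 2, which still applies to the Hit > Hit case.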

Rule 2:
For Hit-Hit 2D reports, both events must happen in the same hit!

Of course, Rule 1 raises the question of which dimensions are hit-based and which are visit-based.  It’s not always intuitive, and it’s definitely not in the UI or, as far as I can tell, in the documentation.  Here’s the correct list of all the currently available Dimension choices and how Webtrends categorizes them.  Pay attention, because you’d probably guess wrong on some.

Hit Dimensions:

  • browser
  • browser version
  • content group
  • cookie parameter
  • day of week
  • directory
  • download
  • extension
  • hour of day
  • pages
  • query parameter
  • any custom drilldown
  • query parameter (when collected on “all hits” or “hits that match xxx” or “most recent value”)
  • query string (when collected on “all hits” or “hits that match xxx” or “most recent value”)
  • referrer (labeled “per hit”)
  • return code
  • server
  • time period
  • url
  • url with parameters

Visit-based Dimensions

  • ad campaign
  • agent
  • area code
  • authenticated username
  • city
  • country
  • dma
  • domain name
  • duration
  • entry page
  • entry request
  • exit page
  • geography drilldown
  • MSA
  • network
  • network type
  • new vs returning
  • organization
  • page views
  • platform
  • PMSA
  • query parameter (when collected on “first occurrence” or “last occurrence”)
  • query string (when collected on “first occurrence” or “last occurrence”)
  • referring page (the one labeled “initial per visit”)
  • referring site
  • referring top level domain
  • search engine
  • search keywords
  • search phrase
  • state
  • throughput
  • time zone
  • top level domain
  • visitor
  • visitor segment (WT.seg and WT.vhseg)


What is “Direct Traffic” ?

What is “direct traffic” in your referrer reports? Lots of things. More than you thought. Here are all the known reasons for the Direct Traffic classification.

Since Yahoo traffic is quickly turning into “direct traffic,” it seemed like a good idea to re-post this (see item 18).  This post is one of the most frequently cited posts we’ve ever done.

“Direct Traffic” is a legacy label that no longer makes sense.   Once upon a time, in long-ago simpler days (approximately 2003), the absence of a referrer in log files could only mean that somebody typed your site’s name into the browser’s address window, or used a bookmark, which amounts to the same thing.

No longer.

Here’s our current list of reasons for an empty referrer field, a.k.a. Direct Traffic, or as it should be called, “Unknown Referrer Traffic.”  (Webtrends 10 does call it “Unknown Referrer,” I’m happy to say.)

Although this list will help you find ways to reduce your Direct Traffic to more realistic numbers (i.e. closer to just reporting on #1), Direct Traffic is still a mess.  Or, as Jacques Warren once said, “I, for one, never use Direct Traffic in my reports and analyses anymore. It’s full of unreliable crap.”

  1. Somebody really did type in the address, or used a bookmark to get to your page.
  2. They clicked on a link in an email.  (Recently, this includes webmail servers also.  They don’t pass “mail.yahoo.com” and the like any more.)
  3. The link was in a document, Excel workbook, or PDF
  4. The link was in Skype, GTalk, or AIM
  5. The link was in a mobile app and opened your site in a (mobile) browser.  For example, clicking on a URL in a tweet viewed in a Twitter app or client, or a mention seen in a Facebook app.
  6. The link originates at a secure (https:) page and your page is not secure (http) (this sometimes includes web mail servers)
  7. The link originates at a secure (https:) page and your page is secure (https:) for certain browsers, but not all
  8. Spiders and bots were working from a list of URLs from a previous crawl (this one mostly applies to server logs, rarely to SDC)
  9. Spiders and bots may be programmed to suppress the referrer information (this one mostly applies to server logs, rarely to SDC)
  10. The link to your site was in Javascript (this is mostly a problem with IE).  Javascript links to your site include those that open your site in a new browser window, or any kind of javascript redirect.  Many banners’ links are programmed this way.
  11. The link to your site is from within a Flash application (mostly a problem with IE) (there are a lot of ways to do this in Flash so there may be exceptions)
  12. Your landing page redirects to another page via a 302 temporary server-side redirect
  13. The link was on an intranet or some other web site behind a proxy or corporate gateway that was set up to strip referrers from requests
  14. The visitor has made changes to their browser that suppress the referrer information
  15. The visitor has set one of your pages as their browser’s home page or a pinned tab.  This is especially a problem where you’re a big company and your employees have the site as their home page … but you should be filtering out your own company’s IP addresses in the first place.
  16. Another site has put your page content into an iFrame and coded the frame to suppress the referrer, in order to make it difficult for you to find out who is framing your content
  17. Certain A/B situations where visits directed to the B page group are redirected via 302  from the control page (A) to B too quickly for the tag to fire on A.  Check with your A/B vendor about whether this might happen with their product.
  18. Starting early 2014, they came from any Yahoo Search, if your landing page is http rather than https.  The anonymization of Yahoo Search traffic is due to be complete on March 31, 2014.
  19. Starting early 2014, they came from Bing secure search (secure search is optional to Bing users at this time, but may in the future follow Yahoo’s lead).