Basing a dimension on a subpart of a URL
This post is about the “Advanced” button that you see (and probably ignore) whenever you define a custom dimension or measure. The how-tos we describe will seem a little daunting, but … you owe it to yourself to at least read the first part of this post. You need to know what is possible.

The Advanced button appears in the “Based On” part of the configuration. It leads to a very powerful way to simplify complicated elements. You can set up the dimension so that it uses a subpart or section of an element (a URL, query parameter, referrer, etc).
The Advanced button functionality can save a whole lot of potential hassle compared to the alternative, which is to change code on a page so that the logs or SDC tags collect extra information. Yay, WebTrends, for allowing us to do this from the UI! Our programmers are grateful because of the work they don’t have to do. And our end users are grateful because we can do more, faster (although they don’t KNOW how grateful they really are).
IMPORTANT NOTE: We’re keeping this post simple from this point forward by describing only what it can do with a custom dimension based on URL. BUT …
- It works with any kind of custom dimension, not just ones based on URL. Query parameters, referrer, user agent strings, cookies … the possibilities are huge.
- It works for custom measures too. For example, you can use it to extract an individual numerical value from a long cookie.
- It’s not available when setting up custom filters. After our first surprise about this, we realized that it’s actually not needed for custom filters — the existing filtering logic can do everything this functionality would achieve.
End of IMPORTANT NOTE.
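To make the custom-measure case a bit more concrete, here is a minimal Python sketch of pulling one number out of a long cookie. Everything specific in it is hypothetical: the cookie string, the `cart_total` field name and the `extract_measure` helper are made up for illustration, and Python's `re` module is only standing in for whatever regex handling WebTrends uses.

```python
import re

# Hypothetical cookie value; the field names are invented for illustration.
cookie = "uid=abc123; cart_total=4599; theme=dark"

# Capture only the digits that follow the (hypothetical) cart_total field.
CART_TOTAL = re.compile(r"cart_total=([0-9]+)")


def extract_measure(cookie_value):
    """Return the captured number as an int, or None if the field is absent."""
    match = CART_TOTAL.search(cookie_value)
    return int(match.group(1)) if match else None


print(extract_measure(cookie))
```

The parentheses play the same role they do in the regex examples later in this post: they mark the one piece of the string that is kept.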
Getting back to the “Based On” screen … click on that Advanced button, and the simple “Based On” screen expands to show a bunch of fields and choices, shown below.

The expanded screen offers three different operations (WebTrends usability people, please observe that the screen design doesn’t do a good job of making clear that there are three different operations).
In this post, we’re only discussing the three radio buttons in the middle. Below is the same screen shot as above, with the portion we’re discussing shown in heavy red outline. (FYI the other two functions, not discussed in this post, are “Override Default String” and the ultra-valuable Translation (Lookup) file function. Lookup files are mentioned in this other post of ours. What, you’re not already using lookup files? Hmm, we’ll have to write a whole post on it.)

Okay, let’s go back to the part of the screen we care about, the center part that allows us to select a section or subpart of the URL. Remember, that’s our original topic?
When you first get to this expanded screen, the first radio button of the center (outlined in heavy red) section is turned on by default; it is “Use Full String.” The other two radio buttons are two ways to force your dimension to use only a subpart or section of the URL.
- The first of the two additional radio buttons allows you to dissect the URL using a “pattern match” – i.e. the characters before and after the chunk you want to isolate.
- The other radio button does the same thing using regular expressions which, in the hands of a master, are able to do just about anything.
What this “base the dimension on a subpart” operation does is best conveyed by examples. After you’ve read the examples, those of you who are daunted can leave. But at least get this far!
Example 1: Your URLs are like this:
/discount/Pineapple_Puppy/coupon/ 105135RED01EP001.html
/discount/Catamaran/coupon/ 8299937-fda.html
/discount/Star_Bar/coupon/ 29348-203849-fda.html
/discount/Cigartown/coupon/ 29348-203849-fda.html
and you want your dimension to use only the values that appear between /discount/ and /coupon/, i.e.:
Pineapple_Puppy
Catamaran
Star_Bar
Cigartown
Example 2: Your faceted search produces, for every search menu choice made by the visitor, a URL that looks like this:
/facetedsearch/search/|F_1_25|F_2_8|F_19_3|
/facetedsearch/search/|F_2_6|F_4_8|F_7_21|F_9_1|
and you want your dimension to show only the numbers that appear after the F_2 combination, in other words:
8
6
Example 3: Your URLs are like this:
/shc/s/ s_10153_12605_Clothing_Juniors_Shorts +Capris
/shc/s/ s_10358_12988_Tools_Juniors_Shorts +Capris
and you want your dimension to show only the text in the fourth underscore-delimited field in /shc/s/ hits, thereby reducing the above to just:
Clothing
Tools
How to isolate what you’re after
This is the daunting how-to part and you may leave if you really think you don’t need to know any more. But realize that those who keep forging on will find that it’s really not so bad, will sleep better, and will feel sorry for those who left.
Example 1. In the URL “/discount/Catamaran/coupon/ 8299937-fda.html”, you want to use only the value that appears between “/discount/” and “/coupon/”, when “/discount/” and “/coupon/” are in the URL immediately before and after. In this example, in other words, you want the value “Catamaran”.
Sample fixed pattern method: “/discount/%val%/coupon/”
Sample regex method: “/discount/([^/]*)/coupon/”
We say “sample method” because there are a lot of ways to do it.
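If you want to try the sample regex before typing it into the UI, a few lines of Python will do. Treat this as a sketch only: Python's `re` module is standing in for WebTrends' regex handling (an assumption, since the two may differ in details), and `extract_subpart` is a made-up helper name. The URLs are copied from Example 1.

```python
import re

# Sample regex from Example 1: keep whatever sits between /discount/ and /coupon/.
PATTERN = re.compile(r"/discount/([^/]*)/coupon/")

# URLs copied as they appear in Example 1.
urls = [
    "/discount/Pineapple_Puppy/coupon/ 105135RED01EP001.html",
    "/discount/Catamaran/coupon/ 8299937-fda.html",
    "/discount/Star_Bar/coupon/ 29348-203849-fda.html",
    "/discount/Cigartown/coupon/ 29348-203849-fda.html",
]


def extract_subpart(url):
    """Return the captured section, or None when the URL does not match."""
    match = PATTERN.search(url)
    return match.group(1) if match else None


values = [extract_subpart(u) for u in urls]
print(values)
```

The `[^/]*` inside the parentheses means "any run of characters that are not a slash", which is what stops the capture at the next path segment.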
Example 2. In the URL “/facetedsearch/search/|F_1_25|F_2_8|F_19_3|”, you want to use only the numeral after “|F_2_”. In this case, “8” is what you’re after.
Fixed pattern method: can’t use Fixed Pattern because the “|F_2_” does not always occupy the first spot in the string. The fixed pattern method starts at the beginning of the string, see. (WebTrends, could you see if you could change this?)
Sample regex method: “\|F_2_([0-9]*)”
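The same kind of quick check works for the faceted search URLs. Again this is a sketch that assumes Python's `re` behaves closely enough to WebTrends' engine for a pattern this simple; `extract_facet_2` is a hypothetical helper name, and the third URL (one with no F_2 choice) is invented to show the no-match case.

```python
import re

# Sample regex from Example 2: the digits that follow "|F_2_", wherever it occurs.
FACET_2 = re.compile(r"\|F_2_([0-9]*)")

urls = [
    "/facetedsearch/search/|F_1_25|F_2_8|F_19_3|",       # from the post
    "/facetedsearch/search/|F_2_6|F_4_8|F_7_21|F_9_1|",  # from the post
    "/facetedsearch/search/|F_1_25|F_19_3|",             # invented: no F_2 facet
]


def extract_facet_2(url):
    """Return the F_2 value as a string, or None when F_2 is not in the URL."""
    match = FACET_2.search(url)
    return match.group(1) if match else None


values = [extract_facet_2(u) for u in urls]
print(values)
```

Because the search scans the whole string, it does not matter whether “|F_2_” is the first facet or the second. The underscore after the 2 also keeps the pattern from matching a facet such as F_25.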
Example 3. In the URL “/shc/s/ s_10153_12605_Clothing_Juniors_Shorts +Capris”, you want to use only the text that’s in the fourth position of the longish string following “/shc/s/”. The longish string’s fields are separated by underscores _, you’ll notice, and the method will take advantage of that fact. In this example, you want to isolate “Clothing”.
Fixed pattern method: can’t use Fixed Pattern because the characters preceding and following the desired section are unpredictable.
Sample regex method: “/shc/s/[^_]*_[^_]*_[^_]*_([^_]*)”
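The field-skipping idea can be tested the same way. In this Python sketch every skipped field is written as `[^_]*` (any run of non-underscore characters) followed by a literal underscore, and only the fourth field is captured. Python's `re` is assumed to be a fair stand-in for WebTrends here, and `extract_fourth_field` is a hypothetical name.

```python
import re

# Skip three underscore-delimited fields after /shc/s/, then capture the fourth.
FOURTH_FIELD = re.compile(r"/shc/s/[^_]*_[^_]*_[^_]*_([^_]*)")

# A variant with the * missing on the third skipped field, for comparison.
ONE_CHAR_VARIANT = re.compile(r"/shc/s/[^_]*_[^_]*_[^_]_([^_]*)")

# URLs copied as they appear in Example 3.
urls = [
    "/shc/s/ s_10153_12605_Clothing_Juniors_Shorts +Capris",
    "/shc/s/ s_10358_12988_Tools_Juniors_Shorts +Capris",
]


def extract_fourth_field(url):
    """Return the fourth underscore-delimited field, or None on no match."""
    match = FOURTH_FIELD.search(url)
    return match.group(1) if match else None


values = [extract_fourth_field(u) for u in urls]
print(values)

# Without the *, the third field may be only one character long,
# so multi-digit fields such as 12605 make the whole pattern fail.
print(ONE_CHAR_VARIANT.search(urls[0]))
```

Each skipped field needs its own `*`. Leave one off and the pattern quietly matches nothing, which in a report would most likely look like an empty dimension rather than an error.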
PostScript
Whoa. You can see that the regex method can be a little sophisticated-looking. We highly recommend the method we described in our Frabjous Regular Expressions post, to wit: “Find somebody who knows regular expressions, and ask them to do it.”
But don’t let that prevent you from thinking about the potential, in your analytics, of this very powerful little corner of the WebTrends custom reporting functionality. Find that regex maven. Or learn regex yourself (it’s more fun than Sudoku). Either way, go to the next level.

6 comments
Kudos! About time something clear is written about this. Should directly enter the manual!
This was in front of me all the time. I could have saved a lot of work and hassle getting it into the page code. This would be great for dissecting (your word) a bread crumb trail. A landmark post for me.
Beautiful article about a very useful mechanism in WT, as you say.
But just my 2 cents on example 2:
You say “you want to use only the numeral after ” and then use ([a-zA-Z0-9]*) in the regexp.
English is not my mother tongue, but if “numeral” is equivalent to numeric value (and the sample URLs make me think it is, as only digits occur after F_XXX_), then the regexp should merely use ([0-9]*); it would be alphanumeric otherwise…
I admit, just a detail…
This works great with referring page, which, naturally, cannot be rewritten using URL Search/Replace. For instance, some advertising networks tend to add cookie IDs on redirect, which causes endless unique referrers.
Marco – you are absolutely right. I’ve changed the original. Thanks!
Winter – yes, referring *page* is a perfect use for it, much more so than domain or site.
Have now added a post that works with this principle based on referring page >> http://www.webtrendsoutsider.com/2009/cool-custom-report-google-organic-rank-dimension/