Thursday, March 17, 2016

Hedgehog Persona Tool Part VI

(Squared Euclidean) Distance and going the extra mile to get it.

In my previous blog post I touched on the concept of Squared Euclidean Distance between patterns. So what is this Distance and why do we care? It is the numerical representation of a difference between two Sitecore patterns. Going back to an example from Persona Tool let's look at several patterns:

The yellow one almost maxes out on Analytics but is quite low on the Customer Relations and Campaigns and Targeting keys. The pink one is obviously very different: high on Innovation and Customer Relations and very low on Technology and Analytics. To measure this difference we would use Sitecore's GetDistance method from the Sitecore.Analytics.Patterns namespace in the Sitecore.Analytics assembly. I decompiled this method and this is what it looks like:

Pattern's Space.Dimensions is the number of Profile Keys. In our case it would be 6. Following the logic, we see that for each Profile Key the value in one pattern is subtracted from the value in the other, and the result is multiplied by itself. Each squared difference is then added to a cumulative value, which is our distance.
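The logic described above can be sketched in a few lines of C#. This is not the decompiled Sitecore method; it is a minimal illustration that assumes each pattern is available as a plain array of doubles, one value per Profile Key and in the same key order. The names PatternMath and SquaredDistance are made up for this example.

```csharp
using System;

public static class PatternMath
{
    // Sum of the squared per-key differences between two patterns.
    // Assumption: both arrays hold one value per Profile Key, in the same order.
    public static double SquaredDistance(double[] first, double[] second)
    {
        if (first == null) throw new ArgumentNullException("first");
        if (second == null) throw new ArgumentNullException("second");
        if (first.Length != second.Length)
            throw new ArgumentException("Patterns must have the same number of dimensions.");

        double distance = 0.0;
        for (int i = 0; i < first.Length; i++)
        {
            double difference = first[i] - second[i]; // compare one key
            distance += difference * difference;      // square it and accumulate
        }
        return distance;
    }
}
```

For two three-key patterns { 5, 1, 1 } and { 1, 4, 1 } the differences are 4, -3 and 0, so the distance is 16 + 9 + 0 = 25.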

Here's a snapshot from watching two keys and how they compare to each other. The first key ([0]) happens to be Analytics.

Here's a great blog post on visualizing pattern cards by Adam Conn. You can download his module and watch the current visit's pattern change before your eyes!

Hedgehog Persona Tool takes this process of pattern matching to a whole new level. As previously discussed, Sitecore patterns are preset and may not accurately reflect visitor behavior. You can compare this to planning a conference: you expect visitors from certain states and prepare badges (Pattern Cards) for them with state names. But as visitors arrive you might realize that some of them come from totally different areas and your badges (Pattern Cards) are irrelevant. The Hedgehog Persona Tool lets us see actual visitor behavior in real time. Each visit stored in Mongo is turned into a pattern, matched against other patterns and then assigned to a cluster of similar patterns for a much more detailed and accurate analysis.

If you'd like more information about persona development and marketing strategy, reach out to Hedgehog Digital Marketing Innovation Team.

Thursday, March 3, 2016

Hedgehog Persona Tool Part V

Look who's getting engaged!

The Persona Tool helps us know our visitors better, and the evolution of the Persona Tool brings more and more new features. The one I want to focus on this time is the exceptional, extraordinary and essential Engagement Value (EV) per cluster of visits.

Once we pick one of our Sitecore profiles (i.e. Audience Segment) in the dropdown we would see all our visits with Audience Segment data combined into one cluster that represents an average visit to our site. Let's take it one step further and reduce our distance. I know: what on earth is distance? For our current purposes, a distance here is a difference between one pattern and another. Take a look at the radar charts in the image below and observe that the three figures are easy to differentiate. The difference between each one is represented as a numerical value in Sitecore: that is our 'distance'.

So, in the slider I reduce my distance to 16% (of the largest possible distance between patterns representing our visits). This action is highlighted in red.

The tool runs the analysis and displays three resulting clusters of visits. Within each cluster all visits are 'averaged out' and represented as one pattern. You can see those patterns as the radar charts. There is information about each cluster: the percentage of visits in the cluster and the average Engagement Value (EV), highlighted in green in the image above.
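To make the two per-cluster figures concrete, here is a small sketch of how they could be computed. It is not the Persona Tool's code; ClusterSummary, ClusterStats and Summarize are hypothetical names, and the Engagement Values are assumed to be available as a simple list of integers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ClusterSummary
{
    public double PercentageOfVisits { get; set; }
    public double AverageEngagementValue { get; set; }
}

public static class ClusterStats
{
    // engagementValues: the EV of every visit in one cluster.
    // totalVisits: the number of visits across all clusters.
    public static ClusterSummary Summarize(IList<int> engagementValues, int totalVisits)
    {
        if (engagementValues == null || engagementValues.Count == 0)
            throw new ArgumentException("A cluster needs at least one visit.");
        if (totalVisits < engagementValues.Count)
            throw new ArgumentException("totalVisits cannot be smaller than the cluster.");

        return new ClusterSummary
        {
            PercentageOfVisits = 100.0 * engagementValues.Count / totalVisits,
            AverageEngagementValue = engagementValues.Average()
        };
    }
}
```

A cluster holding three visits with EVs of 10, 20 and 30, out of six visits overall, would come out as 50% of visits with an average EV of 20.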

There is plenty of information on Engagement Value as a concept out there on the web. For example, I really like Martina Welander's post on it, "Sitecore's marketing features for developers: What is 'Engagement Value'?" Essentially we set up Goals in Sitecore and assign a value to each goal. Those values can range from low to high. Then we associate certain content or events (such as downloading a document) with goals. As users achieve them (i.e. download those documents) Sitecore stores that value (Engagement Value) inside the Interactions collection in Mongo for each visit.

In our specific example we can see that the largest cluster (the one containing a whopping 61% of our visits) has the largest average Engagement Value (17.56). This is not always the case. Depending on your data you might discover that a small slice of your site's visitors is actually the most 'engaged'!

Those are the people who are achieving the goals we set for them: they are downloading important documents, visiting pages we want them to visit most, signing up for events and services that mean a lot for our business. If we are an e-commerce site they are clicking the Buy button, thereby increasing our revenue.

Wouldn't it be great to know who they are in terms of their behavior on the site? What content do they view? What is their pattern? Well, this is what this cluster's radar chart is all about: it shows us the pattern of our most active (or 'engaged') visitors.

In my next post I will write more about Distance: what it is, how to get it, why it matters, and how the Persona Tool can help you best utilize this metric. If you'd like more information about persona development and marketing strategy, reach out to Hedgehog Digital Marketing Innovation Team.

Thursday, February 25, 2016

Hedgehog Persona Tool Part IV

What to Expect when using Mongo Query to get Sitecore Profile data in .NET

After configuring the Persona Tool to work with profile data in Mongo and creating and populating a mock collection of visit data using .NET driver for Mongo, I saw that the continued development of the Persona Tool required using MongoDB Query to get Sitecore profile data in .NET. The excellent blog post MongoDB C# Driver CheatSheet by my esteemed colleague Derek Hunziker was my "Go to" when working with Mongo Query for .NET.

The MongoDB.Driver.Builders namespace offers a lot of functionality for working with Mongo data. The one we are particularly interested in here is a class called Query. If you open the Object Browser you'll see a plethora of operators.

In the code snippet below I am using .And (I am combining all filters within that And), .GTE (greater than or equal to), .LTE (less than or equal to), and .Exists.

This is quite intuitive and would not intimidate somebody familiar with SQL and/or Linq. However, I ran into some difficulties querying Sitecore xDB profile data. It just so happens that Sitecore's schema for profiles is set up like this: Profiles is not a JSON array of objects but rather an object in itself, as shown in the image below. I erroneously assumed that I could query for Audience Segment by doing something like "Profiles[0]" or "Profiles["Audience Segment"]" but neither of those worked. In order to query successfully I needed "Profiles.Audience Segment".

I passed a profile name (i.e. "Audience Segment") along with a date range to my method that reads visit data for this profile. Once I got the collection I built the string I mentioned above: currentProfileString = "Profiles." + profileName, which comes out as "Profiles.Audience Segment".

Then I constructed the query object. I filtered by StartDateTime and EndDateTime.

'Query.Exists(currentProfileString)' is equivalent to '.Where(x => x.Profiles != null && x.Profiles.ContainsKey(profileName))' in Linq.

Human translation: I asked the database to return visit data for documents where profile node "Audience Segment" exists AND where visit start date was within a given range.
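A query along those lines can be sketched as follows. This is a minimal sketch rather than the tool's actual method: it assumes the legacy 1.x .NET driver (the one that exposes MongoDB.Driver.Builders.Query), and the names VisitQueries, BuildProfileQuery, from and to are made up for illustration. The field names StartDateTime and EndDateTime are the ones mentioned above.

```csharp
using System;
using MongoDB.Driver;
using MongoDB.Driver.Builders;

public static class VisitQueries
{
    // Builds: profile node exists AND the visit falls inside the date range.
    public static IMongoQuery BuildProfileQuery(string profileName, DateTime from, DateTime to)
    {
        // Dot notation reaches into the Profiles object, e.g. "Profiles.Audience Segment".
        string currentProfileString = "Profiles." + profileName;

        return Query.And(
            Query.Exists(currentProfileString),
            Query.GTE("StartDateTime", from),
            Query.LTE("EndDateTime", to));
    }
}
```

The resulting IMongoQuery can then be passed to the collection's Find method to get back the matching visits.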

It's very informative to look inside the VisitData class offered by Sitecore.Analytics. From there this can be turned into a custom object, serialized or used as is. Below I have highlighted the Profiles property (essentially our profile data) and Value (which is actually an Engagement Value).

In my next blog post I'll discuss how exceptional, extraordinary and essential Engagement Value is, and how Persona Tool analyzes this important piece of information from each visit. If you'd like more information about persona development and marketing strategy, please reach out to Hedgehog Digital Marketing Innovation Team.

Thursday, February 18, 2016

Hedgehog Persona Tool Part III

Writing mock Sitecore Profile data as BSON documents with .NET driver for Mongo

I’ve spoken in previous posts about how useful Profiles and Pattern Cards can be and how profile data is handled by Mongo, and as I continued to work on the Persona Tool I saw a need to create and populate a mock collection of visit data. To accomplish that, I decided to work with .NET driver for Mongo and form my BSON documents on the back end. First things first: in my custom class responsible for communicating with Mongo I have a constructor that expects a connection string:

new AnalyticsDatabase(ConfigurationManager.ConnectionStrings["analytics"].ConnectionString);

In the ConnectionStrings.config file I have the following:

<add name="analytics" connectionString="mongodb://localhost:27001/my_analytics_db" />

I have references to MongoDB.Bson and MongoDB.Driver in my project. You can get the latest packages from NuGet.

My next step is to instantiate a MongoClient, then get the server and locate the database:

var client = new MongoClient(_connectionString);
var server = client.GetServer();
var database = server.GetDatabase("my_analytics_db");

I left the original Interactions collection intact and created a new one for my purposes.

var test = database.CreateCollection("Test");

As we know a Mongo collection is a collection of BSON documents. Therefore I created the following list:

List<BsonDocument> docs = new List<BsonDocument>();

Then I made a loop of however many fake visits I wanted and formed each document.
First I want to create the Values node for my document (highlighted in yellow below):

I pick a random value between 1 and 20 for the Count (which is a count of times someone in the real world would look at my content associated with any of the Profile Cards under the current Profile).

var count = rnd.Next(1, 21); // the upper bound of Random.Next is exclusive, so this yields 1 through 20

Then I create the values for the Values node:

var newValuesDoc = new BsonDocument {
{ "analytics", GetRandomFloat()},
{ "campaigns and targeting", GetRandomFloat() },
{ "content management", GetRandomFloat() },
{ "technology", GetRandomFloat() } };

Make sure that your custom method that returns a random float never exceeds the ceiling for a key: MaxValue on the Profile Key (i.e. "analytics") times Count. Please see my previous blog post, which explains Count and Values in Mongo Profile data and how they relate to each other in detail.
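One way to respect that ceiling is sketched below. The GetRandomFloat() used in the snippet above takes no arguments; this variant is a hypothetical version that receives the random generator, the key's MaxValue and the Count explicitly so the limit is visible. The class name MockProfileValues is also made up for this example.

```csharp
using System;

public static class MockProfileValues
{
    // Returns a random value between 0 and maxValue * count,
    // which is the most a visitor could have accumulated on one key.
    public static float GetRandomFloat(Random rnd, int maxValue, int count)
    {
        if (rnd == null) throw new ArgumentNullException("rnd");
        if (maxValue < 0 || count < 0)
            throw new ArgumentException("maxValue and count must not be negative.");

        double ceiling = (double)maxValue * count;
        return (float)(rnd.NextDouble() * ceiling); // NextDouble() is in [0, 1)
    }
}
```

With a MaxValue of 5 and a Count of 4, every generated value stays between 0 and 20.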

Now that we have all our values we add them up and put them in the Total.

float total = 0;
foreach(BsonElement em in newValuesDoc.Elements)
       total += (float)em.Value.ToDouble(); // avoids a culture-sensitive string round trip

Next I created the Profile Node:

var myProfileNode = new BsonDocument {
{ "Count", count },
{ "Total", total },
{ "ProfileName", "Audience Segment" },
{ "Values", newValuesDoc }};

var profileDoc = new BsonDocument("Audience Segment", myProfileNode);

Give it the dates you desire and make the document that would represent a visit:

var doc = new BsonDocument{
{"_id", Guid.NewGuid()}, // new Guid() would give every document the same all-zero id
{"t", "VisitData"},
{"ContactId", Guid.NewGuid()},
{"StartDateTime", *give it some DateTime*},
{"EndDateTime", *give it some DateTime*},
{"SaveDateTime", *give it some DateTime*},
{"Profiles", profileDoc},
{"Value", rnd.Next(20, 100)} };


Then finally outside the loop insert the newly created batch of documents:
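A minimal sketch of this last step, assuming the same legacy 1.x .NET driver as the snippets above. As far as I can tell, CreateCollection in that driver returns a command result rather than a collection handle, so the sketch fetches the collection by name before inserting. MockVisitWriter and InsertMockVisits are hypothetical names.

```csharp
using System.Collections.Generic;
using MongoDB.Bson;
using MongoDB.Driver;

public static class MockVisitWriter
{
    // Writes all mock visit documents to the "Test" collection in one call.
    public static void InsertMockVisits(MongoDatabase database, List<BsonDocument> docs)
    {
        MongoCollection<BsonDocument> collection = database.GetCollection<BsonDocument>("Test");

        // Skip the round trip entirely when the loop produced nothing.
        if (docs.Count > 0)
        {
            collection.InsertBatch(docs);
        }
    }
}
```

Inserting once after the loop, rather than once per document, keeps the number of calls to the server down.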


In a perfect world it’s best to work with real life visit data or use a tool like JMeter to generate visits to the site, but for putting together a POC or testing, mocking Profile data works.

With the code above users can create a legitimate Mongo collection containing Profile data. In my next blog post in this series I will cover some particularities of reading it using Mongo Query. If you'd like more information about persona development and marketing strategy, please reach out to Hedgehog Digital Marketing Innovation Team.

Thursday, February 11, 2016

Hedgehog Persona Tool Part II

Profiles, Patterns, Cards and how Mongo keeps track of it all.

In the first post in this series, I discussed how useful Profiles and Pattern Cards can be (read that post first if you haven’t yet!). But there are a lot of tools, terms and details involved, and it can be so confusing! There are profiles, profile keys, profile cards and profile card values. Then there are patterns and pattern cards. What is it all about? I’m going to go through some of those details, clarify a few things and also show how profile data is handled by Mongo.

1. Profile is a category Sitecore uses to define criteria by which we track visitors' behavior. In the example site I used in my previous blog post, Persona Tool Part I, Audience Segment would be a profile.

2. Profile Key is an attribute of a Profile. In our example Analytics is one aspect under Audience Segment. We need to set MinValue to the lowest value in the scale and MaxValue to the highest. So what is MaxValue? Can it be 197? How about 12? It could be any integer really but in most cases a scale from 0 to 5 is sufficient.

3. Profile Card (or its Profile Card Value) is a preset collection of profile key values combined. It is these cards that get assigned to content items. You can pick and choose values for each Profile Key (such as Technology) to reflect the degree of relevancy. In this example Technology is quite a bit more relevant to a content item that will be associated with the Developer profile card.

4. Pattern Cards are used by Sitecore to match a visitor profile in real time with its closest pattern. They don't have to contain exactly the same values as a corresponding Profile Card but in most cases it makes sense to mirror values between the two. For example, you can see in the image above that the Developer Profile Card shows identical metrics for Analytics, Campaigns, Content Management and Technology as the Developer Pattern Card (below).

Now let's look at the way profile data is represented in a MongoDB collection. The one we need is called Interactions. Think of it as visit data, which contains a wealth of information about interactions with our site. What we are interested in is the Profiles node. In the screen shot below I have highlighted the Audience Segment profile that we have been using so far. Some things, like ProfileName and PatternLabel, are self-explanatory. Others need a bit more investigation.

Count is the visit "score", or a way of measuring how many times the visitor clicked on content associated with any profile card. Count is of particular interest because this value, along with the Max Value on each profile key, is involved in calculating the accumulated profile key value. Take analytics (5). This number will never be more than the Max Value for Analytics in Sitecore (5) multiplied by Count in MongoDB (4), that is, 20. We shouldn't worry about this auto-generated value too much unless we are mocking profile data and need to know what value to put in each profile key in Mongo. I will discuss this further in my next blog post, Hedgehog Persona Tool Part III, coming out next Thursday. Stay tuned!

Values are profile key values accumulated by our visitor.

Total is the sum of all accumulated values between all profile keys (analytics, campaigns and targeting, content management and technology combined).

PatternID is an internal way to identify a pattern.

In my upcoming blog posts I will touch on more aspects of working with Sitecore Profile data, such as reading it with Mongo Query and writing BSON documents to MongoDB using the .NET driver for Mongo. And if you'd like more information about persona development and marketing strategy, please reach out to Hedgehog Digital Marketing Innovation Team.

Tuesday, February 2, 2016

Hedgehog Persona Tool Part I

Do we really know our visitors?

Profile and Pattern Cards can be very useful, but using them the first time requires that the marketers make a lot of assumptions about their visitors. Marketers create an initial series of Profile and Pattern Cards and as visitors navigate the site they are matched to one of these established cards. Sitecore does that by calculating Squared Euclidean distance between a visit’s pattern and each Pattern Card. This distance is expressed in a numerical value (double). Whichever pattern happens to be “closest” is the one the visitor is assigned to.
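The matching step described above amounts to "compute the distance to every card and keep the smallest". Here is a minimal sketch of that idea, not Sitecore's implementation: it assumes the visit's pattern and each Pattern Card are plain arrays of doubles with one value per Profile Key in the same order, and PatternMatcher, SquaredDistance and FindClosestCard are names made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class PatternMatcher
{
    // Sum of squared per-key differences between two patterns.
    public static double SquaredDistance(double[] first, double[] second)
    {
        if (first.Length != second.Length)
            throw new ArgumentException("Patterns must have the same number of dimensions.");

        double distance = 0.0;
        for (int i = 0; i < first.Length; i++)
        {
            double difference = first[i] - second[i];
            distance += difference * difference;
        }
        return distance;
    }

    // Returns the name of the Pattern Card with the smallest distance to the visit.
    public static string FindClosestCard(double[] visitPattern, IDictionary<string, double[]> patternCards)
    {
        if (patternCards == null || patternCards.Count == 0)
            throw new ArgumentException("At least one Pattern Card is required.");

        string closestName = null;
        double closestDistance = double.MaxValue;
        foreach (KeyValuePair<string, double[]> card in patternCards)
        {
            double distance = SquaredDistance(visitPattern, card.Value);
            if (closestName == null || distance < closestDistance)
            {
                closestName = card.Key;
                closestDistance = distance;
            }
        }
        return closestName;
    }
}
```

Note that the visit is always assigned to some card, however far away the nearest one is.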

There’s a great project you can download and play with. I installed it on Sitecore 8.0 (rev. 150621) and connected it to MongoDB 3.0. I also added more Profile Keys and one additional Pattern Card. In the example below we have a Developer Pattern Card. Sitecore creates a radar chart representing the card.

As I navigate the Launch Sitecore site I am matched to the very Pattern Card we were looking at: Developer.

The question is: how does a marketer know what the actual pattern for the visit looks like? Sitecore determined the closest pattern and essentially forces it upon the visit. What if erroneous assumptions were made? What if there are large groups of visitors that don’t even come close to any of the established Pattern Cards?

My project began with this idea in mind, because it would be great if there was a tool that could help us accurately gather this information. The program that I am currently developing at Hedgehog Development was originally masterminded by Mike Edwards. The tool looks at all visits to your site, determines their patterns and groups them together according to a Squared Euclidean Distance between patterns that the user chooses. At the maximum possible distance all patterns are aggregated and averaged into one grouping that represents an average visitor to a given site. As the distance is decreased, more groups are formed, with patterns differing from each other by that new, smaller distance. What’s great is that a user can lay the groups over their Sitecore Pattern Cards and visualize the difference.

The screen shot below shows an average visit to my site. The radar chart displays all Profile Keys for this Profile (such as Analytics, Technology, etc). The grouping tells us what percentage of visits were aggregated into this grouping, how many visits the percentage represents, and a value for each Key.
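To give a feel for how a distance setting turns into groups, here is a deliberately simple sketch. It is not the Persona Tool's algorithm, which is not shown in this post; it is a basic "join the nearest group if it is close enough, otherwise start a new one" approach, chosen only because it is short. VisitCluster, VisitClustering, Group and maxDistance are all hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public class VisitCluster
{
    public double[] Centroid; // running average of the patterns in this group
    public int VisitCount;
}

public static class VisitClustering
{
    static double SquaredDistance(double[] a, double[] b)
    {
        double distance = 0.0;
        for (int i = 0; i < a.Length; i++)
        {
            double difference = a[i] - b[i];
            distance += difference * difference;
        }
        return distance;
    }

    // Assumes every pattern has the same number of Profile Key values.
    public static List<VisitCluster> Group(IEnumerable<double[]> visitPatterns, double maxDistance)
    {
        var clusters = new List<VisitCluster>();
        foreach (double[] pattern in visitPatterns)
        {
            // Find the existing group whose average pattern is nearest.
            VisitCluster nearest = null;
            double nearestDistance = double.MaxValue;
            foreach (VisitCluster cluster in clusters)
            {
                double distance = SquaredDistance(pattern, cluster.Centroid);
                if (distance < nearestDistance)
                {
                    nearest = cluster;
                    nearestDistance = distance;
                }
            }

            if (nearest != null && nearestDistance <= maxDistance)
            {
                // Fold the visit into the group's running average.
                for (int i = 0; i < pattern.Length; i++)
                    nearest.Centroid[i] += (pattern[i] - nearest.Centroid[i]) / (nearest.VisitCount + 1);
                nearest.VisitCount++;
            }
            else
            {
                clusters.Add(new VisitCluster { Centroid = (double[])pattern.Clone(), VisitCount = 1 });
            }
        }
        return clusters;
    }
}
```

With a very large maxDistance everything collapses into one averaged group; as maxDistance shrinks, visits that are too far from every existing group start groups of their own. The result of this particular approach depends on the order in which visits are processed, which is one reason a real tool might use something more robust.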

I can "turn on" one of my Pattern Cards (Content Owner for instance) and see the difference between my average visitor and a Pattern Card. This is incredibly helpful to understanding my users and the content they are looking at on my site.

The tool is currently undergoing enhancements to make it more informative, intuitive, and to offer additional features.

If you'd like more information about persona development and marketing strategy, reach out to Hedgehog Digital Marketing Innovation Team.