The phrase “Web Analytics” can refer to several different things, from putting a page counter on a few key pages of your Web site to using sophisticated Web server log analysis software to analyze visitor navigation patterns on your site.
The second and more traditional approach to Web analytics is “log file analysis”, in which the log files that Web servers use to record all server transactions are also used to analyze Web site traffic. This topic is discussed in some detail below in the “Log File Analysis” section.
The following section was taken from the Wikipedia “Web Analytics” entry:
On-site Web Analytics Technologies
In addition, other data sources may be added to augment the data, for example e-mail response rates, direct mail campaign data, sales and lead information, user performance data such as click heat mapping, or other custom metrics as needed.
Web Server Log File Analysis
Web servers record some of their transactions in a log file. It was soon realised that these log files could be read by a program to provide data on the popularity of the website. Thus arose web log analysis software.
The Early Years
In the early 1990s, web site statistics consisted primarily of counting the number of client requests (or hits) made to the web server. This was a reasonable method initially, since each web site often consisted of a single HTML file. However, with the introduction of images in HTML, and web sites that spanned multiple HTML files, this count became less useful. The first true commercial Log Analyzer was released by IPRO in 1994.
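To make the idea of counting hits concrete, here is a minimal sketch. It assumes the server writes its log in the Common Log Format; the sample lines, addresses and paths are invented for illustration. Note how the request for the image inflates the hit count, which is the weakness described above.

```python
import re
from collections import Counter

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

# Invented sample lines, for illustration only
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/1995:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [10/Oct/1995:13:55:37 -0700] "GET /logo.gif HTTP/1.0" 200 4011',
    '192.0.2.7 - - [10/Oct/1995:14:02:10 -0700] "GET /index.html HTTP/1.0" 200 2326',
]


def count_hits(lines):
    """Count every parsable request (hit), keyed by the requested path."""
    hits = Counter()
    for line in lines:
        match = CLF_PATTERN.match(line)
        if match:
            hits[match.group("path")] += 1
    return hits


hits = count_hits(SAMPLE_LOG)
total_hits = sum(hits.values())
print(total_hits, dict(hits))
```

Here two people looked at one page each, yet the log shows three hits, because the logo image was requested separately.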
Page Views and Visits (Sessions)
Two units of measure were introduced in the mid-1990s to gauge more accurately the amount of human activity on web servers. These were page views and visits (or sessions). A page view was defined as a request made to the web server for a page, as opposed to a graphic, while a visit was defined as a sequence of requests from a uniquely identified client that expired after a certain amount of inactivity, usually 30 minutes. Page views and visits are still commonly displayed metrics, but are now considered rather unsophisticated measurements.
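The two definitions above can be sketched in a few lines. This is only an illustration: the list of graphic file extensions, the client IDs and the request data are assumptions, and real log analyzers apply more elaborate rules.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # inactivity window named in the text
GRAPHIC_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png")  # assumed, not exhaustive


def is_page_view(path):
    """A request counts as a page view unless it is for a graphic."""
    return not path.lower().endswith(GRAPHIC_EXTENSIONS)


def count_visits(requests):
    """Count visits: a client's request starts a new visit when it arrives
    more than SESSION_TIMEOUT after that client's previous request."""
    last_seen = {}
    visits = 0
    for client_id, when, _path in sorted(requests, key=lambda r: r[1]):
        previous = last_seen.get(client_id)
        if previous is None or when - previous > SESSION_TIMEOUT:
            visits += 1
        last_seen[client_id] = when
    return visits


# Hypothetical requests: (client id, time, requested path)
requests = [
    ("client-a", datetime(2009, 6, 1, 9, 0, 0), "/index.html"),
    ("client-a", datetime(2009, 6, 1, 9, 0, 1), "/logo.gif"),
    ("client-a", datetime(2009, 6, 1, 9, 10, 0), "/products.html"),
    ("client-b", datetime(2009, 6, 1, 9, 5, 0), "/index.html"),
    ("client-a", datetime(2009, 6, 1, 10, 0, 0), "/index.html"),
]

page_views = sum(1 for _, _, path in requests if is_page_view(path))
visits = count_visits(requests)
print(page_views, visits)
```

In this sample, client-a returns 50 minutes after its last request, so that request opens a second visit for the same client.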
Search Engines Complicate Matters
The emergence of search engine spiders and robots in the late 1990s, along with web proxies and dynamically assigned IP addresses for large companies and ISPs, made it more difficult to identify unique human visitors to a website. Log analyzers responded by tracking visits by cookies, and by ignoring requests from known spiders.
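A rough sketch of both responses follows, assuming each log record already carries a user-agent string and a cookie ID. The field names, the short spider token list and the sample records are illustrative assumptions, not any particular analyzer's rules.

```python
# Illustrative, far from complete: substrings found in some crawler user agents
KNOWN_SPIDER_TOKENS = ("googlebot", "bingbot", "slurp")


def is_spider(user_agent):
    """Return True if the user agent contains a known spider token."""
    ua = user_agent.lower()
    return any(token in ua for token in KNOWN_SPIDER_TOKENS)


def unique_human_visitors(records):
    """Count distinct cookie IDs among requests that are not from spiders."""
    return len({
        r["cookie_id"]
        for r in records
        if r["cookie_id"] and not is_spider(r["user_agent"])
    })


# Hypothetical records: two people share one proxy IP address
records = [
    {"ip": "198.51.100.4", "cookie_id": "v1",
     "user_agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"},
    {"ip": "198.51.100.4", "cookie_id": "v2",
     "user_agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"},
    {"ip": "203.0.113.9", "cookie_id": None,
     "user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"},
]

humans_by_cookie = unique_human_visitors(records)
humans_by_ip = len({r["ip"] for r in records if not is_spider(r["user_agent"])})
print(humans_by_cookie, humans_by_ip)
```

Counting by IP address would report one human visitor here, while counting by cookie reports two, which is why cookies became the preferred identifier.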
The extensive use of web caches also presented a problem for log file analysis. If a person revisits a page, the second request will often be retrieved from the browser’s cache, and so no request will be received by the web server. This means that the person’s path through the site is lost. Caching can be defeated by configuring the web server, but this can result in degraded performance for the visitor to the website.
Page Tagging to The Rescue
Concerns about the accuracy of log file analysis in the presence of caching, and the desire to be able to perform web analytics as an outsourced service, led to the second data collection method, page tagging or ‘Web bugs’.
The web analytics service also manages the process of assigning a cookie to the user, which can uniquely identify them during their visit and in subsequent visits.
With the increasing popularity of Ajax-based solutions, an alternative to the use of an invisible image is to implement a call back to the server from the rendered page. In this case, when the page is rendered on the web browser, a piece of Ajax code would call back to the server and pass information about the client that can then be aggregated by a web analytics company. This is in some ways flawed by browser restrictions on the servers which can be contacted with XMLHttpRequest objects.
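Whether the call is made by requesting an invisible image or by Ajax, the information usually travels as parameters on a request to the analytics server. The sketch below shows one way such a request could be built and then decoded on the collecting side. The endpoint URL and the parameter names (page, ref, vid) are hypothetical and do not belong to any real analytics product.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical collection endpoint, for illustration only
COLLECTOR_URL = "https://analytics.example.com/collect"


def build_beacon_url(page, referrer, visitor_id):
    """Build the URL a tagged page would request to report a page view."""
    query = urlencode({"page": page, "ref": referrer, "vid": visitor_id})
    return f"{COLLECTOR_URL}?{query}"


def parse_beacon_url(url):
    """Decode the reported values on the collecting side."""
    params = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in params.items()}


beacon = build_beacon_url("/products.html", "https://www.example.org/", "visitor-123")
report = parse_beacon_url(beacon)
print(report)
```

The visitor ID stands in for the cookie value that the analytics service assigns, so repeat requests from the same browser can be tied together.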
Log File Analysis vs Page Tagging
Both log file analysis programs and page tagging solutions are readily available to companies that wish to perform web analytics. In some cases, the same web analytics company will offer both approaches. The question then arises of which method a company should choose. There are advantages and disadvantages to each approach.
Advantages of Log File Analysis
The main advantages of log file analysis over page tagging are as follows:
- The web server normally already produces log files, so the raw data is already available. To collect data via page tagging requires changes to the website.
- The data is on the company’s own servers, and is in a standard, rather than a proprietary, format. This makes it easy for a company to switch programs later, use several different programs, and analyze historical data with a new program. Page tagging solutions involve vendor lock-in.
- Logfiles contain information on visits from search engine spiders. Although these should not be reported as part of the human activity, it is useful information for search engine optimization.
Advantages of Page Tagging
The main advantages of page tagging over log file analysis are as follows:
- Page tagging can report on events which do not involve a request to the web server, such as interactions within Flash movies, partial form completion, and mouse events such as onClick, onMouseOver, onFocus, onBlur, etc.
- The page tagging service manages the process of assigning cookies to visitors; with log file analysis, the server has to be configured to do this.
- Page tagging is available to companies who do not have access to their own web servers.
Understanding Page Tagging
A great way to gain an understanding of page tagging – the technology as well as how it’s implemented – is to take a very close look at Google Analytics (UA, GA4 and GTM).
In order to do this, I’m going to assume that you have a Web site (personal or business) and that you know how to access and modify the pages on that Web site. If you do NOT have these skills or Web site accessibility, then the person/company who does these tasks for you can easily follow these directions.
The quickest way to get up to speed on Google Analytics is to take the tour of their free service. Click on the following link (make sure your sound is turned on) to begin the tour:
Now let’s go to the Google (Web) Analytics Support Center and follow their instructions for creating, installing and using their Analytics tools:
Google Analytics Support Center (www.google.com/support/analytics/)
Log File Analysis
What I describe here is how I used the Unica NetTracker “On Demand” Web Server Log Analysis Service (discontinued – read NetTracker’s history on Wikipedia) to monitor key Web site traffic statistics on a monthly basis, as well as several other page-specific parameters I monitored using this service. (We switched to Google Analytics on this site in mid-2009.)
I used to oversee the marketing activity on a fairly large commercial Web site. For a fee, the Web hosting company for this site used to provide Web server log analysis using NetTracker. Here’s a list of the statistics I kept track of on a month-to-month basis:
Number of pages viewed:
Number of estimated visits:
Number of unique visitors:
Weekly, Daily Statistics (Monthly Averages):
Number of pages viewed per day:
Number of pages viewed per visit:
Length of visit (minutes):
Number of visits per day:
Number of visits per week:
Ave. # unique visitors per day:
Ave. # new visitors per day:
Ave. # repeat visitors per day:
Ave. visitor repeat rate:
All of these statistics are generated by the NetTracker “On Demand” Service as it analyzes the Web site server logs (“Web logs”) for this site.
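Several of the averages in this list can be derived from a handful of monthly totals. The sketch below shows the arithmetic only; it is not how NetTracker computed its figures. The input numbers are invented, and two definitions are assumptions: unique visitors are taken as new plus repeat visitors, and the repeat rate as repeat visitors divided by unique visitors.

```python
def monthly_averages(pages_viewed, visits, new_visitors, repeat_visitors, days_in_month):
    """Derive per-day, per-week and per-visit averages from monthly totals."""
    unique_visitors = new_visitors + repeat_visitors  # assumed definition
    visits_per_day = visits / days_in_month
    return {
        "pages_per_day": pages_viewed / days_in_month,
        "pages_per_visit": pages_viewed / visits,
        "visits_per_day": visits_per_day,
        "visits_per_week": visits_per_day * 7,
        "repeat_rate": repeat_visitors / unique_visitors,  # assumed definition
    }


# Invented totals for a 30-day month
stats = monthly_averages(
    pages_viewed=30000,
    visits=6000,
    new_visitors=3000,
    repeat_visitors=1000,
    days_in_month=30,
)
print(stats)
```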
What Good Are All These Stats?
Quite simply, what I am looking for in this set of statistics on a month-to-month basis is growth – growth in number of pages viewed, growth in total number of visitors, growth in number of unique visitors, growth in number of new visitors, growth in average amount of time spent on the site, etc.
(Note: much of this measuring is made possible by placing “tags” or “cookies” on your Web site visitors’ computers when they first visit the site, so these stats are vulnerable to mistakes if visitors regularly delete these cookies on their computers. Because deletion of cookies is not a common procedure for most computer owners at this point, the month-to-month stat comparisons I use are valid for the type of growth patterns I’m looking for.)
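Month-to-month growth for any of these statistics is a simple percentage change. A small sketch, using invented page view totals:

```python
def percent_growth(previous, current):
    """Percentage change from the previous month's value to the current one."""
    if previous == 0:
        raise ValueError("previous month's value must be non-zero")
    return (current - previous) * 100 / previous


# Invented totals: pages viewed in two consecutive months
growth = percent_growth(previous=12000, current=13800)
print(f"{growth:.1f}%")
```

A negative result means the statistic shrank compared with the previous month.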
Page-By-Page Web Site Traffic Analysis
Another important use of this Web server log analysis software/service is the ability to observe Web site traffic on a page-by-page basis. Let me give you a recent example.
On this particular Web site that we used to monitor with NetTracker, the most important visitor activity we encouraged was requesting more detailed information on the highlighted products.
If visitors want to perform this activity while on the site, as you would expect most Web visitors would want to do, there is a specific Web page-based form they are asked to complete. Upon completion, they click on the “Submit” button, the form is sent in email form to our email server, and a “Thank You” page is displayed to the visitor so that s/he knows that the form was successfully submitted.
Of course, we are never satisfied with the total number of Web site visitors who complete this process, so we used NetTracker analysis results to look at the number of visits to the “Request More Info” Web page and compared that to the number of “visits” to the “Thank You for Requesting More Info” page.
What we saw was the typical “shopping cart abandonment” pattern you have probably read about with online e-commerce sites. In other words, the Web site content was compelling enough to get a fair number of visitors to go to the “Request More Info” form, but based on the number of “Thank You” page displays, we could see that many visitors “abandoned” the “Request More Info” exercise before completing it.
Based on this analysis, we guessed that one of the main causes of this visitor pattern was that too much of the info on the Info Request Form was required to be filled in before the form was considered completed and the submission process could work.
We reduced the required info on the form by over half: the requested info remained the same, but only certain fields had to be filled in. Since making that change, the number of completed forms submitted has gone up significantly.
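The comparison of the two page counts described above boils down to one ratio. A minimal sketch, with invented visit counts rather than figures from the site:

```python
def abandonment_rate(form_visits, thank_you_visits):
    """Share of visits to the form page that never reached the Thank You page."""
    if form_visits == 0:
        raise ValueError("no visits to the form page")
    return 1 - thank_you_visits / form_visits


# Invented counts: visits to the form page and to the Thank You page
rate = abandonment_rate(form_visits=400, thank_you_visits=100)
print(f"{rate:.0%} of form visits were abandoned")
```

Tracking this ratio before and after a change to the form gives a direct measure of whether the change helped.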
Where Are Web Site Visitors Coming From?
Another important use of this software/service is helping you understand where your visitors are coming from and how they got to your site. In particular, you can not only see which search engines are sending you the most traffic, but you can also see what search phrases were used by those visitors to find your site.
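The search phrase can be recovered when the referring URL recorded in the log carries the query string. The sketch below assumes that is the case; the table of host markers and parameter names is a small illustrative sample and the referrer URLs are invented.

```python
from urllib.parse import urlsplit, parse_qs

# Illustrative sample: host marker -> query parameter that holds the phrase
SEARCH_QUERY_PARAMS = {"google.": "q", "bing.": "q", "yahoo.": "p"}


def search_phrase(referrer):
    """Return (search engine host, phrase) for a search referrer, else None."""
    parts = urlsplit(referrer)
    host = parts.netloc.lower()
    for marker, param in SEARCH_QUERY_PARAMS.items():
        if marker in host:
            values = parse_qs(parts.query).get(param)
            return (host, values[0]) if values else None
    return None


from_search = search_phrase("https://www.google.com/search?q=web+analytics+tools")
from_link = search_phrase("https://www.example.org/links.html")
print(from_search, from_link)
```

Referrers that do not match a known search engine, or that carry no phrase, return None and can be reported as ordinary referring sites.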
Finally, because this NetTracker “On Demand” Web server log analysis service was “hosted” by the same company that hosted the Web site, I could access NetTracker statistics from any computer that had Internet/Web access.