Website statistics
This document was written and licensed for publication by MDA.
What are Web Statistics?
A website is a document or series of documents that are held on a central computer and 'served' on request to a remote user through their browser software. As part of this process, the central computer or 'server' retains useful information such as the number of requests it receives, which pages it delivers to users and at what time of day. This information is called 'website statistics', and has an important role to play in almost every aspect of your website, from design to ongoing management and development.
Website statistics in their raw state are kept as 'HTTP server logs' or 'log files'. These log files are kept on the server and updated each time it receives a request for a web page. If you open a log file, you are presented with a long string of text and numbers. It is therefore necessary to analyse the log files in order to extract the information they contain in a meaningful way.
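As a rough illustration of what this analysis involves, the following Python sketch splits one log line into named fields. It assumes the server writes the widely used NCSA 'combined' log format; your own server may be configured differently. The sample line, the host name and the function name `parse_line` are invented for illustration.

```python
import re
from datetime import datetime

# Assumption: the server uses the NCSA "combined" log format.
# Other servers and configurations will need a different pattern.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Convert the text fields that are really dates or numbers.
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry

# Invented sample line, not taken from a real server.
sample = (
    'host.example.ac.uk - - [12/Nov/2002:14:05:31 +0000] '
    '"GET /collections/index.html HTTP/1.0" 200 5120 '
    '"http://www.example.org/links.html" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"'
)
entry = parse_line(sample)
print(entry["time"].hour, entry["path"], entry["status"], entry["bytes"])
```

Once each line has been broken down in this way, the individual fields can be counted and compared, which is what log analysis software does on a larger scale.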
How Can I Analyse the Log Files?
There are two main ways of analysing log files:
- Arrange for the company that hosts your website to analyse them and display the information via a web page, which only you can access;
- Download the log files from your server and analyse them yourself via a separate piece of software.
There are pros and cons to either approach. Site statistics provided by your host company are easy to implement and require little technical expertise. Analysis of your own log files provides a greater degree of flexibility in the information you are able to gather and the format in which it is presented.
Whichever method you use, analysing the log files should give you at least the following information:
- Date and time a page in your website was requested;
- The name of the internet service provider (ISP) through which the request was received;
- The page requested;
- The country from which the request originated;
- The hyperlink through which a user arrived at your site (where relevant);
- The search-term which a user entered into a search engine to find your site (where relevant);
- A 'response code' produced by your server indicating whether or not the page requested was delivered to the user successfully;
- The 'amount' of information transferred to the user, measured in 'bytes';
- The technical specifications of the computer used to access your website, including browser type, operating system and additional software plugins.
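Some of these items are worked out from other fields rather than stored directly. The search term, for example, is usually embedded in the address of the referring page. The sketch below shows one way it might be extracted in Python; the parameter names in `SEARCH_PARAMS` and the function name `search_terms` are assumptions for illustration, since each search engine chooses its own.

```python
from urllib.parse import urlparse, parse_qs

# Assumption: query-string parameter names that search engines use for the
# search term. Real engines vary, so extend this list for your own logs.
SEARCH_PARAMS = ("q", "query", "p")

def search_terms(referrer):
    """Return the search phrase embedded in a referrer URL, or None."""
    query = parse_qs(urlparse(referrer).query)
    for name in SEARCH_PARAMS:
        if name in query:
            return query[name][0]
    return None

# Invented referrer addresses for illustration.
from_search = search_terms("http://www.google.com/search?q=museum+documentation&hl=en")
from_link = search_terms("http://www.example.org/links.html")
print(from_search)
print(from_link)
```

A referrer that is an ordinary link rather than a search results page yields no search term, which is why this item only appears 'where relevant'.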
If you are using a dynamic database-driven site, or one which requires a user to login before accessing your pages, you will have access to a much richer set of information, including profiles of past visits and full details of which pages your user has visited during their last visit.
How to Use Web Statistics
Each time a person visits your site, it represents an expression of interest in your organisation and what it is doing. By learning more about your users through web statistics you can tailor your website so that it is more engaging and more relevant to the people who use it. It is therefore essential that a structured and practical approach to statistics and their analysis underpins your web strategy.
The following is a quick introduction to the various applications of statistics in developing and managing your site:
- Date and time of request - this information, aggregated over time, provides an insight into patterns of usage of your website. Do people mostly visit in the afternoon or the morning? Does the number of visits drop sharply over the weekend? This information can then be used to plan a range of activities - routine maintenance can be scheduled for your quietest times, while launches and promotions can be timed for your busiest;
- The name of the ISP - the ISP will tell you a great deal about what kinds of people are visiting your website. For example, if the majority of your users connect through an ISP whose domain name ends in '.ac.uk' then they are using a computer based in an academic institution. If the domain name ends in '.com' then your users are predominantly commercial. You can use this information to refine your content so that it is more relevant to your majority audience;
- The page requested - the server makes a note each time a page is requested. This will tell you which are the most popular pages on your website, and which are visited the least. You can use this information to refine your navigation, making it easier for users to reach popular pages from the homepage, or re-write content on the less popular pages;
- The country of origin - in many cases, the log files will be able to tell you the country from which a request originated. This information can be used to gain an insight into the audiences for your website. Are a significant number of your users coming from non-English speaking countries? If so, should you consider providing a dynamic translation service to help them access your content?
- The pages that link to your site - this information can be used to find out the routes by which users arrive at your website, and also to identify organisations to approach to arrange reciprocal web links;
- Search engine keywords - many users rely on search engines to navigate around the Web. It is therefore important to ensure that you are listed with the main search engines using appropriate keywords (see the fact sheet on Website Promotion for further information). The list of terms can be used to refine the keywords you use on your website, and to ensure that people stand a better chance of reaching your site;
- The response-code - this records whether each requested page was delivered to the user successfully or unsuccessfully. Where a page could not be delivered, this is most commonly due to a broken link in your website. You can use this information to check your links and ensure that people are able to navigate your site smoothly;
- User's technical specifications - this information is vital in informing the design and functionality of your website. It will tell you the proportion of your users that run either Netscape, Internet Explorer or any other type of web browser, and also the extent to which they are able to access interactive or multimedia content.
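Several of the questions above come down to simple counting. The following Python sketch shows how the busiest hour, the most popular page and the most frequently reported broken link might be found. The sample records are invented for illustration, and it assumes that response code 200 means a page was delivered and 404 means it was not found.

```python
from collections import Counter

# Invented sample records: (hour of request, page requested, response code).
# In practice these would come from your own analysed log files.
requests = [
    (9, "/index.html", 200),
    (14, "/index.html", 200),
    (14, "/visit/opening-times.html", 200),
    (14, "/old-page.html", 404),
    (15, "/index.html", 200),
    (15, "/old-page.html", 404),
]

# Date and time of request: how many requests arrived in each hour?
by_hour = Counter(hour for hour, _, _ in requests)

# The page requested: which pages were delivered most often?
popular = Counter(page for _, page, code in requests if code == 200)

# The response code: which requests failed because the page was not found?
broken = Counter(page for _, page, code in requests if code == 404)

print(by_hour.most_common(1))
print(popular.most_common(1))
print(broken.most_common())
```

Log analysis software performs the same kind of tally across every line of the log and presents the results as tables and charts.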
Using web statistics in this way is known as a 'goal-oriented' approach. It demonstrates how important this information can be in the strategic development of your website.
Increasingly, organisations need to justify expenditure on web services in terms of impact, user needs, and formal business planning. Website statistics remain a relatively 'blunt' way of gathering purely quantitative information about the role of your website. They do, however, play a fundamental role in making the case for future funding and development.
There are some features of the Internet which mean that your website statistics will never be 100% reliable. These include:
- Page caching - some browsers create an offline copy of a web page once they have requested it. When the user returns to this page, the browser displays the copy instead of the version on your website. This will not register as a 'visit' to that page on your website;
- Proxy servers - many computer networks process requests through a 'proxy server', which will also 'cache' copies of web pages. This means that when someone on a network requests a page, they may in fact receive this 'cached' copy. This will not register as a 'visit' to that page on your website;
- IP Addresses - an IP (Internet Protocol) address is an identity that is assigned to the user when they log on to the Internet. A network of computers may all be using the same IP address (because the address is allocated to the proxy server instead of the user). This means that different people accessing your site from one network will all be registered as the same user.
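The effect of shared IP addresses can be seen in a small Python sketch. The addresses and pages are invented for illustration: three of the four requests arrive from one address, which might be a single person or a whole network behind a proxy server.

```python
# Invented sample: (client address, page requested).
requests = [
    ("192.0.2.10", "/index.html"),
    ("192.0.2.10", "/visit.html"),
    ("192.0.2.10", "/index.html"),
    ("198.51.100.7", "/index.html"),
]

total_requests = len(requests)

# Each distinct address is counted once, however many people share it.
distinct_addresses = len({address for address, _ in requests})

print(total_requests, distinct_addresses)
```

The log alone cannot show how many people sit behind each address, so a count of distinct addresses is best treated as a rough estimate rather than a head count.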
Other Ways of Measuring the Impact of Your Website
In addition to the analysis of log files, there are a number of other methods by which the performance of a website can be evaluated. These include:
- e-Commerce - If your website is equipped to handle commercial transactions, then you can evaluate it in terms of revenue generated. Alternatively, if you display third-party advertising or 'deep-sell' products by linking to other suppliers, then you can monitor the 'click-through' rate - a measure of how many users click on the link or banner to view the products;
- Online communities - if you offer users the chance to take part in a 'community' then the level of involvement is an important index of the success of your website. How many users have registered as part of your community? How often do they re-visit the site? How active are they in discussion via email or bulletin boards? How often do new users join? All of these factors give an invaluable insight into the success of your online community, as well as guiding ongoing development;
- Self-sustainability - to what extent do you have to keep your site 'alive'? Do you have sole responsibility for updating content, or are your users able to contribute actively to the creation of new information and resources? The greater the level of active involvement from users, the greater the chances of long-term survival for your website.
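The 'click-through' rate mentioned under e-Commerce can be worked out with simple arithmetic. The Python sketch below assumes the common definition of clicks divided by the number of times the link or banner was displayed, expressed as a percentage. The function name `click_through_rate` and the figures are invented for illustration.

```python
def click_through_rate(clicks, impressions):
    """Return clicks as a percentage of impressions (0.0 if never displayed)."""
    if impressions == 0:
        return 0.0
    return 100.0 * clicks / impressions

# Invented figures: a banner displayed 1,500 times and clicked 45 times.
rate = click_through_rate(45, 1500)
print(f"{rate:.1f}%")
```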
Where Can I Go for Further Advice and Information?
Advice on all aspects of gathering website statistics, including tutorials.
[www.hotwired.lycos.com/webmonkey/e-business/tracking/]
Stout, Rick. Website Stats: Tracking Hits and Analysing Web Traffic (ISBN: 0-07-882236-X)
This fact sheet was compiled with generous assistance from Nick Poole of Resource: the Council for Museums, Archives and Libraries, Nick Gander of the former Yorkshire Museums Council and the former Committee for Area Museum Councils.
MDA provides impartial resources for all aspects of collections information management. For more information please contact: Collections Link - 0845 838 4000.
November 2002