An Investigation of Website Ranking Methods
by
Suckkee Lee
Doctoral Candidate
Department of Advertising CMA 7.142
College of Communication
The University of Texas at Austin
Austin, TX 78712
Phone: (512) 471-1101 (Fax: 471-7018)
Email: suckkee@mail.utexas.edu
and
John D. Leckenby
Everett D. Collier Centennial Chair
in Communication
john.leckenby@mail.utexas.edu
http://uts.cc.utexas.edu/~tecas/
Department of Advertising
College of Communication
The University of Texas at Austin
Austin, Texas 78712
Paper to be presented at
1998 Annual Conference of
American Academy of Advertising
Lexington, KY
Abstract
Website rankings produced by three measurement methods and three sorting criteria are compared and analyzed in order to determine the impact of method and criterion on site rankings. Findings of the study indicate that website rankings generated by different measurement methods differ more than they agree with each other. Website rankings generated by different sorting criteria are also found to differ more than they agree, especially among the websites with higher popularity and traffic volume. To rectify this weakness of relying on a single criterion, a Traffic Index (TI), which is a composite index of site reach, frequency and duration, is proposed.
The World Wide Web (WWW), as a part of the Internet, has been growing very rapidly (see Table 1), and it is receiving much attention from advertising practitioners and academicians (ARF 1995, Allen 1996, Bhatia 1997, Berthon et al. 1996a, Berthon et al. 1996b, Carter 1995, Cohen 1996, Danner 1997, Dolinar 1995, Economist 1997, FIND/SVP 1997, Gattuso 1995, Hoffman and Novak 1996, Lebow 1995, Murphy 1996, Randall 1997, Rebello 1996). It is also gaining wide popularity among ordinary computer users. In fact, the WWW has become almost synonymous with the Internet, although there are other important applications of the Internet such as e-mail, file transfer protocol (FTP) and telnet. Because of the wide popularity of the Web among the public and the advertisers' need for information about various websites, several research firms and other organizations announce lists of popular websites on a regular basis. Examples include "Top 25 Websites" by RelevantKnowledge, "Top 50 Websites" by PC Meter of Media Metrix, "Hot100" by Web21 and "Top 100 Sites" by PC Magazine.
Presumably, each announcing organization is making every effort to be accurate and fair in selecting and ranking the popular websites. However, since each organization is using a yardstick that is not exactly the same as the others, it is reasonable to suspect that website rankings may vary from list to list. For example, a site ranked high by one method may not be ranked similarly, or may not even make it to the list of selected sites at all if a different selection method/criterion is applied. This is because there is no industry-wide standard for website traffic measurement. There are, however, joint projects and organizations such as the Internet Advertising Bureau (IAB), the Audit Bureau of Verification Service (ABVS) of the Audit Bureau of Circulation (ABC) and the Coalition for Advertising Supported Information and Entertainment (CASIE) whose purpose is to reach an industry-wide consensus about the standards for Web traffic measurement and analysis.
With this current status of the Web industry in mind, the first purpose of this study is to compare three different measurement methods and three sorting criteria. The second purpose is to demonstrate what impact different measurement methods and criteria have on the rankings of websites. Finally, the study proposes a measurement method of its own, Reading Yesterday with Traffic Index (TI) as the standard, that can be used for website traffic measurement and analysis.
Points of Investigation
There are two aspects of site traffic measurement to consider: a method and its criterion. The criterion is examined first. Criteria are used to rank websites in descending order, so the choice of criterion concerns what to measure among the various aspects of site traffic. This matters because "site traffic" has more than one aspect once the concept is deconstructed. For instance, how many people visit a site can be one variable, while how long they stay once they arrive can be another. How often they come back to the same site can be yet another. Therefore, three constructs or criteria are chosen for analysis: reach, as an indication of how many different people visit a site; frequency, as an indication of how often people visit the same site; and duration, as an indication of how long people stay at the site. Definition of each criterion is presented later in this section.
Meanwhile, method relates to the question of how to measure the criterion. Broadly speaking, there are two measurement approaches currently employed by the Web industry: site-centric and user-centric. The central difference between these two approaches lies in the source of data collection. A site-centric tracking system collects the audience data from the site server's log, whereas a user-centric system collects the audience data from the personal computers of its sample respondents. Each system has pros and cons, but further discussion of the weaknesses and strengths of each tracking system is beyond the scope of this paper.
A user-centric method is chosen for analysis in this study because it can provide both demographic data on the audience and information about their web-browsing activities, and because its results are projectable to the universe of web users. This user-centric measurement method is named Method 1, or simply M1. The second method chosen for analysis, M2, is the Recent Reading method, a traditional tool for readership measurement in print media. The third method, M3, is the Reading Yesterday method proposed by the authors (see Appendix for further explanation of the Reading Yesterday method). The Recent Reading (M2) and Reading Yesterday (M3) methods are included for comparison with the user-centric, mechanical measurement method (M1) for three reasons: a) they too can provide demographic data on the respondents along with matching information about their exposure to media; b) they are statistically projectable to the universe; and c) unlike M1, which uses a mechanical or electronic metering device to measure the web-browsing activities of its sample respondents, both M2 and M3 depend on respondents' memory. The major difference between M2 and M3 is the length of the recall period: M2 asks respondents to recall the events of the past 7 days, while M3 asks about the last 24 hours.
To summarize, three measurement methods and three criteria are selected for analysis. The three methods are M1 for the user-centric, mechanical measurement method, M2 for the Recent Reading method and M3 for the Reading Yesterday method. The three criteria are site reach, frequency and duration. The differences that the three methods make on site rankings, i.e., the method difference, are investigated in data analysis part 1, and those that the three criteria make, i.e., the criterion difference, are explored in data analysis part 2. Definitions of sorting criteria used in the analyses of the study follow.
Definition of Criteria
Data Collection
Three sets of data were collected for the three investigated methods: M1 for a user-centric, mechanical measurement method, M2 for the Recent Reading Method and M3 for the Reading Yesterday method. A data set for M1 was obtained from a Web research company in the eastern part of the U.S. The company utilizes a user-centric tracking system by measuring activities of Web usage among its panel participants. Its definition of the universe of Web users is all people age 12 or older in the U.S. who used the Web at least once in the last month at home, work or college with Mac or PC with Windows 3.1, 95 and NT operating systems.
The initial data set for M1 contained the records of 744 unique visitors who were measured on a daily basis between September 29, 1997 and October 26, 1997. Those 744 unique visitors visited the top 99 sites, producing 13,039 records or visits. From these 13,039 records, only the records related to the top 25 sites were extracted and used for analysis of M1. The initial sorting criterion used to extract and rank the top 25 sites was site reach for the four-week period mentioned above. Site frequency was used as a tie-breaker if two or more sites had the same reach. Rankings and reach of the top 25 sites for the four-week measurement period are presented in Table 2. Table 3 summarizes the extracted data and the demographics of the panel participants.
Data sets for M2 and M3 were gathered by setting up a survey page on the Web and asking respondents to visit the authors' survey page (at URL: http://uts.cc.utexas.edu/~audience) and participate in the survey (see Appendix for a sample of the survey). The same list of the top 25 sites used in M1 was used again in M2 and M3 data collection. A small financial incentive in the form of a sweepstakes entry was given to the survey participants in order to increase the response rate.
For M2 data, respondents were asked whether or not they visited the top 25 sites "during the last seven days." For M3 data collection, they were asked whether or not they visited the 25 sites "in the last 24 hours," how much time they spent at the sites and how often they visit the sites. The survey lasted for three weeks from January 19 to February 8, 1998, and a total of 511 people (64.2% male and 35% female) participated in the survey. Table 4 shows the geographic distribution of all survey participants. However, in an effort to close the gaps between the universe definitions of the three methods, responses from outside the U.S. are excluded from the M2 and M3 data sets. Therefore, only the responses from 450 people in the U.S. are used for data analysis for M2 and M3.
It is worth noting here that these data can be used not only for site traffic analysis in order to rank the sites but also as inputs to reach/frequency estimation models, as Hong and Leckenby (1996 and 1997) have shown.
Data Analysis
Analysis 1. Impact of methods on rankings of websites: Method Difference
The focus of the first part of the analysis is to compare the ranked orders of websites produced by different methods and to see whether different methods generate significantly different rankings. For the sorting criterion, M1 uses site reach for four weeks; M2 uses weekly site reach averaged over 3 weeks; M3 uses Traffic Index (TI) averaged over 21 days. Traffic Index (TI) is derived from the following equation.
Equation for Traffic Index (TI)
$TI = R \times F \times D$
Where
R, F and D denote daily site reach, frequency and duration respectively.
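To make the index concrete, the following minimal sketch multiplies the daily reach, frequency and duration reported for three sites in Table 7. The function name `traffic_index` and the dictionary layout are assumptions made for this illustration, not part of the study's procedure. Because the published TI values are averaged over 21 days while the inputs here are already-rounded averages, the products agree with Table 7 only approximately.

```python
# Illustrative sketch: Traffic Index as the product of daily reach,
# frequency and duration. Names and data layout are assumptions.

def traffic_index(reach: float, frequency: float, duration: float) -> float:
    """Return TI = R x F x D for one site."""
    return reach * frequency * duration

# (R, F, D) values as reported in Table 7
sites = {
    "yahoo.com": (0.421, 0.55, 20.8),
    "netscape.com": (0.244, 0.36, 40.1),
    "cnn.com": (0.145, 0.68, 13.9),
}

ti = {name: traffic_index(*rfd) for name, rfd in sites.items()}

# Rank the sites by TI, highest first
ranking = sorted(ti, key=ti.get, reverse=True)

for rank, name in enumerate(ranking, start=1):
    print(f"{rank}. {name}: TI = {ti[name]:.3f}")
```

The computed products (roughly 4.82, 3.52 and 1.37) are close to, but not identical with, the TI column of Table 7 (4.794, 3.557 and 1.379), and the resulting order of the three sites is the same.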
Rankings of the 25 sites and the criterion that is used by each method for its rankings are presented in Table 5. Based on the three rankings M1-, M2- and M3-Rank in Table 5, Kendall's tau (Hays 1988) is calculated on the three pairs of the rankings M1-M2, M2-M3 and M3-M1 to find out "concordance" between the pairs, which is shown in Table 6.
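As a rough illustration of the concordance calculation, the sketch below computes Kendall's tau for the three rank columns of Table 5. It assumes the simple tie-free form of the statistic, (concordant pairs minus discordant pairs) divided by the total number of pairs, which applies here because each rank column is a permutation of 1 to 25. The function name `kendall_tau` is hypothetical; the study's own computation may have used a statistical package.

```python
from itertools import combinations

def kendall_tau(ranks_a, ranks_b):
    """Kendall's tau for two tie-free rankings of the same items."""
    concordant = discordant = 0
    for i, j in combinations(range(len(ranks_a)), 2):
        # Positive product: the pair is ordered the same way in both rankings
        product = (ranks_a[i] - ranks_a[j]) * (ranks_b[i] - ranks_b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

# Ranks from Table 5, in the table's row order
# (altavista.digital.com first, zdnet.com last)
m1 = [10, 24, 4, 21, 17, 22, 23, 5, 16, 6, 25, 13, 7,
      8, 2, 9, 15, 3, 14, 18, 19, 11, 20, 1, 12]
m2 = [2, 11, 20, 25, 7, 13, 12, 5, 19, 9, 17, 18, 4,
      8, 6, 21, 16, 3, 24, 23, 14, 15, 22, 1, 10]
m3 = [4, 18, 16, 22, 3, 8, 6, 11, 21, 12, 15, 10, 7,
      13, 5, 20, 17, 2, 24, 23, 14, 19, 25, 1, 9]

tau_m1_m2 = kendall_tau(m1, m2)
tau_m1_m3 = kendall_tau(m1, m3)
tau_m2_m3 = kendall_tau(m2, m3)

print(f"M1-M2: {tau_m1_m2:.3f}")
print(f"M1-M3: {tau_m1_m3:.3f}")
print(f"M2-M3: {tau_m2_m3:.3f}")
```

With 25 sites there are 300 pairs per comparison, and the three values reproduce the coefficients reported in Table 6 (.353, .300 and .693).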
The first finding from Table 6 is the low level of agreement (.353 and .300) between the mechanical measurement method (M1) and the memory-based measurement methods (M2 and M3), whereas the agreement between the latter two is relatively high (.693). This discrepancy in concordance between the pairs should come as no surprise because both M2 and M3 use memory-based measurement approaches while M1 uses a mechanical measurement approach. The discrepancy suggests only that there is a difference between the rankings generated by the two measurement approaches (M1 vs. M2 and M3). However, this is not to say that the agreement between M2 and M3, both of which are memory-based methods, is very satisfactory. Although it is debatable whether or not 69.3% agreement is acceptable, it should also be taken into account that at least some sites (30.7% of the sites) get significantly different rankings depending on which of the two methods is employed. Certainly, the discrepancy in site rankings is much greater in the M1-M2 and M1-M3 pairs, where site rankings differ much more than they agree (35.3% and 30% agreement respectively).
Secondly, the overall level of agreement across the three pairs (44.9% on average) is low, or modest at best. In other words, ranking methods differ more than they agree with each other's rankings. Although it is beyond the scope of this study to find out which method generates the most accurate rankings, it is safe to say that different measurement methods do not generate similar or comparable site rankings.
Analysis 2. Impact of criteria on site rankings: Criterion Difference
The focal point of the second part of the analysis is to find out the differences that different sorting criteria make to site rankings. Since M3 contains all three criteria (reach, frequency and duration) while the other methods have only two (reach and frequency), the M3 data set is used for investigation of the criterion difference.
Table 7 is prepared from the M3 data. In Table 7, where the sites are sorted by Traffic Index (TI), three other rankings by the reach, frequency and duration criteria are also presented. Additionally, the standard deviation of the three rankings by reach, frequency and duration (the "StDev(3Ranks)" column in the table) is calculated in order to locate the sites whose rankings fluctuate widely depending on the sorting criterion. Standard deviations of five or greater are marked in bold letters.
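The rank-spread statistic can be sketched as follows. The StDev(3Ranks) column of Table 7 appears consistent with the sample standard deviation (n - 1 in the denominator) of each site's three ranks; for example, ranks 1, 2 and 5 give 2.08. The variable names and the dictionary layout are assumptions for this illustration.

```python
from statistics import stdev  # sample standard deviation (n - 1)

# (R-Rank, F-Rank, D-Rank) from Table 7, listed in TI-rank order
rank_triples = {
    "yahoo.com": (1, 2, 5),
    "netscape.com": (2, 7, 1),
    "cnn.com": (5, 1, 13),
    "altavista.digital.com": (3, 5, 18),
    "microsoft.com": (7, 15, 2),
    "sportszone.com": (11, 4, 6),
    "infoseek.com": (4, 10, 17),
    "download.com": (15, 6, 3),
    "zdnet.com": (10, 8, 7),
    "hotmail.com": (13, 3, 12),
    "excite.com": (6, 12, 15),
    "geocities.com": (9, 16, 4),
    "lycos.com": (8, 17, 8),
    "usatoday.com": (12, 11, 11),
    "hotbot.com": (16, 9, 14),
    "aol.com": (18, 14, 9),
    "msnbc.com": (17, 13, 22),
    "amazon.com": (14, 22, 19),
    "webcrawler.com": (19, 19, 20),
    "msn.com": (21, 20, 10),
    "four11.com": (20, 23, 21),
    "att.net": (23, 25, 16),
    "tripod.com": (22, 21, 23),
    "pathfinder.com": (24, 18, 25),
    "whowhere.com": (25, 24, 24),
}

THRESHOLD = 5.0  # cut-off used in the text to flag widely fluctuating sites

spread = {site: stdev(ranks) for site, ranks in rank_triples.items()}
volatile = [site for site, sd in spread.items() if sd >= THRESHOLD]

# Dictionaries keep insertion order, so the first 13 keys are the top 13 by TI
top_13 = list(rank_triples)[:13]
volatile_in_top_13 = [site for site in volatile if site in top_13]

print(f"Sites with StDev >= {THRESHOLD}: {len(volatile)}")
print(f"Of these, in the top 13 by TI: {len(volatile_in_top_13)}")
```

Run on the Table 7 ranks, the sketch flags nine sites, eight of which fall in the top 13 by TI.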
Based on Table 7, Kendall's tau is computed again for the three pairs of rankings (reach-frequency, frequency-duration and duration-reach) in order to find out the pairwise concordance levels. The result is shown in Table 8.
The first finding from Table 8 is very similar to what was found in the previous analysis, i.e., the concordance levels between the rankings sorted by different criteria are low, or modest at best, and far from satisfactory. The average Kendall's tau for the three pairs stands at 43.6%, which leads to the interpretation that site rankings generated by different criteria differ more than they agree. Although the agreement in the reach-frequency pair (.54) is a little better than that of the other two pairs (.43 and .34), the overall picture of agreement in site rankings, or the lack of it, supports the conclusion that site rankings differ considerably depending on which criterion is used.
The second finding from Table 7 is that the impact of different criteria is more severe, generally speaking, on the higher ranked sites than the lower ranked sites. As one can see easily from a cursory examination of locations of the bold letters in the "StDev(3ranks)" column in Table 7, eight sites out of the top 13 sites show standard deviations of five or greater while only one site, msn.com, out of the bottom 12 sites does so.
Overall, nine sites out of 25 show standard deviations of 5 or greater, which means that these sites can easily sway 5 rankings up or down depending on which criterion is employed in their rankings. A sway of this magnitude (5 rankings up or down) is considerable when there are only 25 ranks altogether. For example, the cnn.com site would claim the top spot if return visits by the audience (frequency) were used as the standard for site rankings. The frequency standard would also put the hotmail.com site in third place, whereas it would be ranked 13th if the standard were how many different people visit the site. The same frequency standard, however, would not favor the microsoft.com site, which would be ranked much higher by the duration standard. Conversely, the altavista.digital.com site ranks much higher by the reach standard than by the duration standard. Certainly, there are some sites whose rankings are not swayed much by the sorting criterion. Rankings of the usatoday.com, webcrawler.com and whowhere.com sites, all of which are ranked in the bottom half, are not affected much by the selection of standard. In other words, they all rank similarly regardless of the sorting criterion.
To summarize, site rankings differ considerably depending on which criterion is used for the rankings, and the impact of the sorting criterion seems to be much greater on the higher ranked sites than on the lower ranked sites. As to the question of which criterion is the more accurate standard, there is no clear choice among the three criteria investigated in this study. Rather, this is a question that each advertiser should decide individually when buying space on websites. If the goal of the media campaign is to expose the ads to as many different people as possible, site rankings by the reach criterion would better serve the purpose. If the goal is, for example, to expose the ads repeatedly to a "small but loyal" segment of the audience, the frequency or duration criteria are a better fit for the campaign. By the same token, if the goal of selecting and ranking websites is to see, comparatively, how much "traffic" or exposure the sites receive from their audience, the proposed Traffic Index (TI) criterion would be a more accurate ruler, since TI reflects all facets of "site traffic" in a fashion that is fair to all sites.
Another noteworthy point from Table 7 is that, in terms of the traffic volume of websites, the discrepancy between two adjacent sites increases very markedly as their rankings get higher (see the "TI" column in the table). For instance, the difference in Traffic Index between the sites ranked 1st and 2nd is much greater (about 1,200 times greater) than the difference between the 24th and 25th sites. This discrepancy should have some implications for the pricing of websites.
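The ratio quoted above can be checked from the rounded TI values in Table 7, with subscripts denoting TI rank (a notation introduced here only for this check). Because the two lowest values are reported to just three decimals, the result is approximate.

$$
\frac{TI_{1} - TI_{2}}{TI_{24} - TI_{25}} = \frac{4.794 - 3.557}{0.007 - 0.006} = \frac{1.237}{0.001} \approx 1237
$$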
Conclusion
The impact of different measurement methods and criteria on the ranked order of websites is quite suggestive, if not conclusive. As far as the three investigated methods (M1, M2 and M3) are concerned, the site rankings generated by them are more different than they are similar. This is also the case with the criterion difference. Site rankings sorted by the three different criteria (reach, frequency and duration) differ much more than they "concur" or agree. The discrepancy in site rankings due to the choice of ranking criterion is more prominent among the sites with higher popularity and more traffic volume. In order to equalize this discrepancy, a composite index for site traffic, Traffic Index (TI), is proposed as a standard measure for site rankings.
Two limitations of the study should be mentioned here. The first limitation is the discrepancy in time, about 3 months, between data collection periods for M1, M2 and M3. The second limitation of the study is the discrepancy in the universe definitions for the three methods, i.e., the universe for M1 may or may not be comparable to those for M2 and M3. Although the authors tried to synchronize these two external factors as closely as possible in this study, given the practical and political issues involved, it is still possible to hypothesize that they too might have affected the results of the study to some extent. It is hoped that their impact will be investigated in the future.
Finally, in regard to M3 (the Reading Yesterday method using TI), which the authors proposed in this study, one of its benefits is the addition of the time and frequency factors (the duration and frequency criteria) to the analysis of site traffic. Another benefit of M3 is the increased accuracy in measurement of site exposure. The Internet is often claimed to be an interactive medium, and one of the constructs that is quantitatively indicative of interactivity (and qualitatively, to a lesser extent) is the duration of time a user spends on the medium, as well as the frequency of the user's repeated visits to it. As was shown previously in the second part of the analysis, there are sites that many people visit but spend little time on and/or return to infrequently, whereas there are sites that not as many people visit initially but where visitors spend more time and/or return frequently. Therefore, a standard or sorting criterion such as Traffic Index (TI) that reflects not only the reach level of a site but also the other aspects of exposure may be a more effective ruler when measuring how much audience exposure there is to a website.
Tables
Table 1. Number of Hosts and Domains Advertised in the DNS
| Date | Hosts | Domains |
| Jan 97 | 16,146,000 | 828,000 |
| Jul 96 | 12,881,000 | 488,000 |
| Jan 96 | 9,472,000 | 240,000 |
| Jul 95 | 6,642,000 | 120,000 |
| Jan 95 | 4,852,000 | 71,000 |
| Jul 94 | 3,212,000 | 46,000 |
| Jan 94 | 2,217,000 | 30,000 |
| Jul 93 | 1,776,000 | 26,000 |
| Jan 93 | 1,313,000 | 21,000 |
Source: Internet Domain Survey (CommerceNet, Jan. 1997, at URL: http://www.nw.com/zone/WWW/report.html)
Table 2. Rankings and Reach of Top 25 Sites by M1
| Rank | Site | Reach* | Rank | Site | Reach* |
| 1 | yahoo.com | 56.72 | 14 | Pathfinder.com | 9.14 |
| 2 | Microsoft.com | 44.62 | 15 | Msnbc.com | 8.74 |
| 3 | Netscape.com | 41.53 | 16 | four11.com | 7.93 |
| 4 | aol.com | 35.35 | 17 | cnn.com | 6.72 |
| 5 | excite.com | 33.60 | 18 | Tripod.com | 6.59 |
| 6 | Geocities.com | 26.48 | 19 | Whowhere.com | 6.18 |
| 7 | Infoseek.com | 24.19 | 20 | Usatoday.com | 6.18 |
| 8 | lycos.com | 20.97 | 21 | Sportszone.com | 6.05 |
| 9 | msn.com | 20.43 | 22 | Download.com | 6.05 |
| 10 | Altavista.digital.com | 16.40 | 23 | att.net | 6.05 |
| 11 | Webcrawler.com | 11.29 | 24 | hotbot.com | 5.91 |
| 12 | zdnet.com | 10.48 | 25 | amazon.com | 5.91 |
| 13 | hotmail.com | 9.81 | | | |
* For the four-week measurement period.
Table 3. Summary of M1 Data and Demographics of Panel Participants
| N | Total Traffic (Visits) to the Top 25 Sites | Measurement Period |
| 725 | 9,343 | 4 weeks (Sep. 29 to Oct. 26, 1997) |
Demographics
| Gender | Average Age | N |
| Female | 34.7 | 278 (38.3%) |
| Male | 36.3 | 447 (61.7%) |
Table 4. Geographic Distribution of Respondents of M2 and M3 Survey
| Total N | Foreign | Domestic |
| 511 | 61 (11.9%) | 450 (88.1%) |

| Country | N | State | N | State | N |
| Australia | 10 | AL | 2 | NE | 1 |
| Belgium | 1 | AR | 3 | NH | 1 |
| Brazil | 1 | AZ | 3 | NJ | 5 |
| Canada | 7 | CA | 69 | NM | 2 |
| Guam | 2 | CO | 6 | NV | 2 |
| Hungary | 1 | DC | 1 | NY | 14 |
| Ireland | 1 | FL | 3 | OH | 6 |
| Korea | 1 | GA | 3 | OK | 3 |
| Malta GC | 1 | IL | 8 | PA | 10 |
| Mexico | 3 | IN | 12 | SC | 1 |
| Netherlands | 3 | KS | 2 | TN | 4 |
| Norway | 1 | LA | 3 | TX | 216 |
| Qatar | 1 | MA | 11 | UT | 5 |
| Scotland | 1 | MD | 2 | VA | 11 |
| Sweden | 23 | MI | 4 | VT | 1 |
| Turkey | 1 | MN | 5 | WA | 9 |
| UK | 1 | MO | 2 | WI | 1 |
| Ukraine | 1 | MS | 1 | | |
| Yugoslavia | 1 | NC | 18 | | |
Table 5. Rankings of 25 Sites by M1, M2 and M3
| Site | 4-Week Reach by M1 (%) | Weekly Reach by M2 (%) | Traffic Index by M3 | M1-Rank | M2-Rank | M3-Rank |
| Altavista.digital.com | 16.40 | 40.27 | 1.071 | 10 | 2 | 4 |
| Amazon.com | 5.91 | 13.56 | 0.088 | 24 | 11 | 18 |
| aol.com | 35.35 | 8.63 | 0.177 | 4 | 20 | 16 |
| att.net | 6.05 | 2.33 | 0.017 | 21 | 25 | 22 |
| cnn.com | 6.72 | 25.62 | 1.379 | 17 | 7 | 3 |
| download.com | 6.05 | 12.88 | 0.557 | 22 | 13 | 8 |
| sportszone.com (espn) | 6.05 | 13.01 | 0.621 | 23 | 12 | 6 |
| excite.com | 33.60 | 26.16 | 0.416 | 5 | 5 | 11 |
| four11.com | 7.93 | 8.90 | 0.035 | 16 | 19 | 21 |
| geocities.com | 26.48 | 17.95 | 0.364 | 6 | 9 | 12 |
| hotbot.com | 5.91 | 11.10 | 0.193 | 25 | 17 | 15 |
| hotmail.com | 9.81 | 10.55 | 0.424 | 13 | 18 | 10 |
| infoseek.com | 24.19 | 32.19 | 0.572 | 7 | 4 | 7 |
| lycos.com | 20.97 | 21.10 | 0.341 | 8 | 8 | 13 |
| microsoft.com | 44.62 | 25.89 | 0.623 | 2 | 6 | 5 |
| msn.com | 20.43 | 4.52 | 0.042 | 9 | 21 | 20 |
| msnbc.com | 8.74 | 11.10 | 0.089 | 15 | 16 | 17 |
| netscape.com | 41.53 | 40.00 | 3.557 | 3 | 3 | 2 |
| pathfinder.com | 9.14 | 3.56 | 0.007 | 14 | 24 | 24 |
| tripod.com | 6.59 | 3.97 | 0.012 | 18 | 23 | 23 |
| usatoday.com | 6.18 | 12.33 | 0.279 | 19 | 14 | 14 |
| Webcrawler.com | 11.29 | 12.19 | 0.077 | 11 | 15 | 19 |
| Whowhere.com | 6.18 | 4.11 | 0.006 | 20 | 22 | 25 |
| yahoo.com | 56.72 | 66.16 | 4.794 | 1 | 1 | 1 |
| zdnet.com | 10.48 | 14.25 | 0.424 | 12 | 10 | 9 |
Table 6. Pairwise Concordance (Kendall's tau) between Rankings by M1, M2 and M3 (N=25)
| | | M1-Rank | M2-Rank |
| Correlation Coefficient | M2-Rank | .353* | |
| | M3-Rank | .300* | .693** |
* Correlation is significant at the .05 level (2-tailed)
** Correlation is significant at the .01 level (2-tailed)
Table 7. Rankings of 25 Sites by Reach, Frequency and Duration Criteria
| Site | R* | F* | D* | TI* | R-Rank | F-Rank | D-Rank | StDev(3Ranks)** | TI-Rank |
| yahoo.com | 0.421 | 0.55 | 20.8 | 4.794 | 1 | 2 | 5 | 2.08 | 1 |
| netscape.com | 0.244 | 0.36 | 40.1 | 3.557 | 2 | 7 | 1 | 3.21 | 2 |
| cnn.com | 0.145 | 0.68 | 13.9 | 1.379 | 5 | 1 | 13 | 6.11 | 3 |
| altavista.digital.com | 0.218 | 0.38 | 12.9 | 1.071 | 3 | 5 | 18 | 8.14 | 4 |
| microsoft.com | 0.116 | 0.21 | 25.0 | 0.623 | 7 | 15 | 2 | 6.56 | 5 |
| sportszone.com (espn) | 0.075 | 0.40 | 20.7 | 0.621 | 11 | 4 | 6 | 3.61 | 6 |
| infoseek.com | 0.167 | 0.26 | 13.0 | 0.572 | 4 | 10 | 17 | 6.51 | 7 |
| download.com | 0.060 | 0.38 | 24.5 | 0.557 | 15 | 6 | 3 | 6.24 | 8 |
| zdnet.com | 0.077 | 0.30 | 18.5 | 0.424 | 10 | 8 | 7 | 1.53 | 9 |
| hotmail.com | 0.073 | 0.41 | 14.1 | 0.424 | 13 | 3 | 12 | 5.51 | 10 |
| excite.com | 0.125 | 0.25 | 13.3 | 0.416 | 6 | 12 | 15 | 4.58 | 11 |
| geocities.com | 0.084 | 0.20 | 21.3 | 0.364 | 9 | 16 | 4 | 6.03 | 12 |
| lycos.com | 0.093 | 0.20 | 18.1 | 0.341 | 8 | 17 | 8 | 5.20 | 13 |
| usatoday.com | 0.074 | 0.26 | 14.7 | 0.279 | 12 | 11 | 11 | 0.58 | 14 |
| hotbot.com | 0.051 | 0.28 | 13.8 | 0.193 | 16 | 9 | 14 | 3.61 | 15 |
| aol.com | 0.047 | 0.23 | 16.6 | 0.177 | 18 | 14 | 9 | 4.51 | 16 |
| msnbc.com | 0.048 | 0.24 | 7.8 | 0.089 | 17 | 13 | 22 | 4.51 | 17 |
| amazon.com | 0.060 | 0.12 | 12.6 | 0.088 | 14 | 22 | 19 | 4.04 | 18 |
| webcrawler.com | 0.041 | 0.17 | 11.1 | 0.077 | 19 | 19 | 20 | 0.58 | 19 |
| msn.com | 0.018 | 0.16 | 15.0 | 0.042 | 21 | 20 | 10 | 6.08 | 20 |
| four11.com | 0.034 | 0.11 | 9.3 | 0.035 | 20 | 23 | 21 | 1.53 | 21 |
| att.net | 0.015 | 0.08 | 13.1 | 0.017 | 23 | 25 | 16 | 4.73 | 22 |
| tripod.com | 0.018 | 0.14 | 4.8 | 0.012 | 22 | 21 | 23 | 1.00 | 23 |
| pathfinder.com | 0.015 | 0.19 | 2.6 | 0.007 | 24 | 18 | 25 | 3.79 | 24 |
| whowhere.com | 0.014 | 0.09 | 4.7 | 0.006 | 25 | 24 | 24 | 0.58 | 25 |
* R, F, D, and TI denote daily reach, frequency, duration and Traffic Index respectively.
** StDev(3Ranks) denotes the standard deviation among the 3 rankings by reach, frequency and duration.
Table 8. Pairwise Concordance (Kendall's tau) between Rankings by Reach, Frequency and Duration (N=25)
| | | Reach Rank | Frequency Rank |
| Correlation Coefficient | Frequency Rank | .540** | |
| | Duration Rank | .427** | .340* |
* Correlation is significant at the .05 level (2-tailed)
** Correlation is significant at the .01 level (2-tailed)
References
Advertising Research Foundation (1995), "The Top 10 Insights About Measuring," The Effects Of Advertising, November Workshop.
Allen, Mike (1996), "Testing Whether Internet Readers Will Pay," The New York Times, Sep. 16, Section D, 2.
Berthon, Pierre, Leyland Pitt, and Richard T. Watson (1996a), "The World Wide Web as an advertising medium: toward an understanding of conversion efficiency," Journal of Advertising Research, v.36, n.1 (Jan-Feb), 43-54.
Berthon, Pierre, Leyland Pitt, and Richard T. Watson (1996b), "Re-surfing W3: research perspectives on marketing communication and buyer behavior communications on the Worldwide Web," International Journal of Advertising, v.15, n.4 (November), 287-301.
Bhatia, Manish (1997), "Web Audience Measurement: Issues, Challenges and Solutions," Nielsen Interactive Services at URL: http://www.nielsenmedia.com/sfpres.
Blake, Paul (1997), "Customized news arena promises growth: online customized news services," Information Today, n.2, v.14, 10.
Brown, Michael (1994), "Estimating newspaper and magazine readership," in Measuring Media Audience, Raymond Kent, ed., New York, NY: Routledge. 105-45.
Carter, Meg (1995), "Sunset Times For Papers?" Marketing Week, July 7, 28.
Cohen, Jodi B. (1996), "Measuring the Web audience," Editor & Publisher, v.129, n.26, June 29, 37.
CommerceNet (1997), "Internet Domain Survey," at URL: http://www.nw.com/zone/WWW/report.html, January.
Danner, John (1997), Jupiter at URL: http://www.jup.com/newsletter/webtrack/feature.shtml.
Dolinar, L. (1995), "Experts say the standard measure on on-line usage may grossly exaggerate the popularity of some Internet features," Newsday, July 4, B19.
FIND/SVP (1997), "Beyond the Hype: Internet ‘Indispensable’ To Many, Disposable To Others," at URL: http://www.find.com/findsvp/0506.html/tigebd1, May 6.
Gattuso, Greg (1995), "Website 'hits' viewed vague as measurement," Direct Marketing, August, v.58, n.4, 8-9.
Giuliano, Vince (1997), "The Internet is redefining distance, time and community," The New York Times on the Web, Mar. 6.
Hays, William L. (1988), Statistics, 4th ed., Orlando, FL: Holt, Rinehart and Winston, 845-847.
Hoffman, Donna L. and Thomas P. Novak (1996), "Marketing in hypermedia computer-mediated environments: conceptual foundations," Journal of Marketing, v.60, n.3 (July), 50-58.
Hond, Maurice de and Walter Huzen (1983), "New approach to readership surveys - the media scanner," in Readership Research: Montreal 1983, Proceedings of the 2nd International Symposium, Harry Henry, ed., Amsterdam, The Netherlands: Elsevier Science Publishers B.V., 137-142.
Hong, Jongpil and John D. Leckenby (1996), "Audience Measurement and Media Reach/Frequency Issues in Internet Advertising," Proceedings of American Academy of Advertising, Vancouver, B.C., Canada, April.
Hong, Jongpil and John D. Leckenby (1997), "Reach/Frequency Estimation for the World Wide Web," Proceedings of American Academy of Advertising, St. Louis, MO, March.
Lebow, Irwin (1995), Information highways and byways: from the telegraph to the 21st century, New York: IEEE Press.
Lysaker, Richard L. (1983), "The audience levels produced by the 'claimed first time reading' method," in Readership Research: Montreal 1983, Proceedings of the 2nd International Symposium, Harry Henry, ed., Amsterdam, The Netherlands: Elsevier Science Publishers B.V., 149-160.
Murphy, Ian P. (1996), "On-line ads effective? Who knows for sure?" Marketing News, v.30, n.20, Sept 23, 1, 38.
Opfer, Gunda (1987), "The Idea of Shortening the Period of Recall," in Readership Research: Theory and Practice, Harry Henry, ed., Amsterdam, The Netherlands: Elsevier Science Publishers B.V., 72-80.
Randall, Neil (1997), "The New Cookie Monster," PC Magazine, v.6, n.8, Apr. 22, 211-214.
Rebello, Kathy (1996), "Special Report: Making Money on the Net," Business Week, September 23, 104-118.
The Economist (1997), "Suited, Surfing and Shopping," at URL: http://www.economist.com/issue/25-01-97/wb8565.html.
Woods, Bob (1997), "New Sites Move Into PC-Meter Top Website," Newsbytes Jan. 14.
Appendix
Example of survey questionnaire for the Recent Reading and Reading Yesterday methods
[Homepage of site, with its name in caption]
Q1) During the past 6 months, have you ever read or looked into this Web site?
1. Yes, I am sure I have 2. I am not sure 3. No, I am sure I have not (please skip to next site)
Q2) During the past 7 days, did you happen to read or look into this Web site?
1. Yes, I am sure I have 2. I am not sure 3. No, I am sure I have not
Q3) Did you go to this Web site in the past 24 hours?
1. Yes (please continue to #4) 2. No (please skip to #5)
Q4) How much time did you spend when you went to this Web site in the past 24 hours?
1. less than a minute 2. 1-5 minutes 3. 6-10 minutes 4. 11-30 minutes 5. 31-60 minutes 6. over 1 hour
Q5) How often on average do you go to this Web site?
1. less than once a week 2. 1-2 times a week 3. 3-6 times a week 4. almost daily 5. several times a day 6. none of the above
Q6) How much time on average do you spend when you go to this Web site?
1. less than a minute 2. 1-5 minutes 3. 6-10 minutes 4. 11-30 minutes 5. 31-60 minutes 6. over 1 hour 7. none of the above
Questions 1 and 2 are for M2 (Recent Reading method) and Questions 3 to 6 are for M3 (Reading Yesterday method).
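The skip instructions printed in Q1 and Q3 can be sketched as a small routing function. This is only an illustrative encoding of the questionnaire above, not part of the original instrument; the names `next_question`, `questions_asked` and the `NEXT_SITE` sentinel are hypothetical.

```python
# Hypothetical encoding of the skip instructions printed in Q1 and Q3.
NEXT_SITE = "NEXT_SITE"  # assumed sentinel: move on to the next website


def next_question(current: str, answer: int) -> str:
    """Return the question that follows `current`, given the chosen option number."""
    if current == "Q1" and answer == 3:
        # "No, I am sure I have not (please skip to next site)"
        return NEXT_SITE
    if current == "Q3" and answer == 2:
        # "No (please skip to #5)"
        return "Q5"
    default_order = {
        "Q1": "Q2", "Q2": "Q3", "Q3": "Q4",
        "Q4": "Q5", "Q5": "Q6", "Q6": NEXT_SITE,
    }
    return default_order[current]


def questions_asked(answers: dict) -> list:
    """Walk the questionnaire for one site and list the questions actually asked."""
    path, current = [], "Q1"
    while current != NEXT_SITE:
        path.append(current)
        current = next_question(current, answers[current])
    return path
```

For example, a respondent who answers option 2 to Q3 is routed past Q4, so only Q1, Q2, Q3, Q5 and Q6 are asked for that site.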
About the Reading Yesterday method
The measurement method proposed by the authors in this study is named Reading Yesterday (RY). It derives from the measurement method called "First Reading Yesterday" (FRY), also phrased as "(for the) first time, read yesterday." A brief explanation of the original First Reading Yesterday method and of Reading Yesterday follows.
First Reading Yesterday is a measurement method often employed in readership research for print media. The method is widely used in some European countries, particularly Denmark, the Netherlands and Norway (for further discussion of the First Reading Yesterday method, see Brown 1994, Hond and Huzen 1983, Lysaker 1983, and Opfer 1987).
There is not much difference between the Recent Reading and First Reading Yesterday methods; basically, First Reading Yesterday is a variation of Recent Reading. They are the same in that both try to measure memory in the form of recall. They differ in the recall window: Recent Reading asks about the last issue period preceding the day of the survey, while First Reading Yesterday asks about "yesterday or in the last 24 hours," i.e., the day preceding the day of the survey. Specifically, First Reading Yesterday first asks respondents whether or not they saw a publication yesterday. Those who say yes are then asked whether yesterday was the first time they read that issue (the age of the issue does not matter). Only those who answer the second question affirmatively are counted toward the average issue readership (AIR). To obtain the AIR, First Reading Yesterday multiplies the number of people who said that they read an issue of a publication for the first time yesterday either by the number of days in the issue period of the publication or by the claimed reading frequency of the subjects.
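The first of the two projections described above (multiplying yesterday's first-time readers by the number of days in the issue period) can be written out as a minimal sketch. The function name and the sample figures are hypothetical and serve only to show the arithmetic; the alternative projection by claimed reading frequency is not covered here.

```python
def project_air(first_time_readers_yesterday: int, issue_period_days: int) -> int:
    """Project average issue readership (AIR) from one day's first-time readers.

    Implements only the issue-period variant of the projection: the count of
    respondents who read an issue for the first time yesterday is multiplied
    by the number of days in the publication's issue period.
    """
    if first_time_readers_yesterday < 0 or issue_period_days <= 0:
        raise ValueError("counts must be non-negative and the issue period positive")
    return first_time_readers_yesterday * issue_period_days


# Hypothetical figures: 40 first-time readers yesterday, weekly publication.
weekly_air = project_air(40, 7)
```

With these made-up figures, 40 first-time readers of a weekly publication project to an AIR of 280 readers.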
The underlying logic of the First Reading Yesterday method is that no matter how many times an issue of a publication is read and how old it is, there can be only one first reading event for one person and one issue. This logic gives First Reading Yesterday one distinct advantage over Recent Reading, especially when the issue period of a publication is long or difficult to define, as is often the case with websites: it places a lighter burden on respondents’ memory and thus improves recall accuracy. To put it simply, it is much easier to remember what happened yesterday than what happened, for instance, five days ago.
The drawback of First Reading Yesterday is that it introduces a larger error into the calculation of the AIR, because it projects weekly and monthly readership from daily readership. This drawback can be reduced by increasing the sample size.
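Why the projection enlarges the error, and why a larger sample helps, can be illustrated with a short sketch. It assumes simple random sampling and a binomial model for the share of respondents who read yesterday; this error model is an assumption introduced here for illustration and is not taken from the study.

```python
import math


def projected_standard_error(daily_share: float, sample_size: int, days: int) -> float:
    """Standard error of a readership share projected from one day to `days` days.

    Assumes (hypothetically) a simple random sample, so the daily share has
    standard error sqrt(p * (1 - p) / n); multiplying the share by `days`
    multiplies its standard error by the same factor.
    """
    if not 0.0 <= daily_share <= 1.0 or sample_size <= 0 or days <= 0:
        raise ValueError("invalid share, sample size, or number of days")
    return days * math.sqrt(daily_share * (1.0 - daily_share) / sample_size)


# Hypothetical figures: 10% of respondents read yesterday, weekly projection.
se_small_sample = projected_standard_error(0.10, 100, 7)
se_large_sample = projected_standard_error(0.10, 400, 7)
```

Under these assumptions the weekly projection carries seven times the standard error of the daily share, and quadrupling the sample size halves it.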
The advantage of applying First Reading Yesterday to website traffic measurement, in addition to increased recall accuracy, is that it is compatible with the issue period of most websites. Although some websites are not updated for more than a week, most websites are updated daily, even when their print counterpart is a weekly magazine. As a matter of fact, most popular commercial websites are updated continuously.
Because of this continuous updating, however, the First Reading Yesterday method per se cannot be applied to website traffic measurement. The "first" concept in First Reading Yesterday applies every time a Web user goes to a website, so every visit amounts to a "first" visit. The underlying logic of First Reading Yesterday, that there can be only one first reading event for one person and one issue, does not hold (or, to be more precise, becomes redundant) in the measurement of website traffic, since most websites are updated daily or several times a day. Therefore, the "first" concept is dropped in the application to website traffic measurement, resulting in Reading Yesterday as the name of the measurement method proposed in this study. For websites that are not updated very frequently, the frequency and duration variables should reflect the lack of frequent updating, because their total traffic volume will decrease accordingly.
Furthermore, unlike First Reading Yesterday as used in a readership study of a print publication, Reading Yesterday in website traffic measurement does not have to differentiate a publication from an issue of the publication. This is because the distinction between a publication and its issues is blurred in most websites. Therefore, the concept of "issue" in the Average Issue Readership (AIR), which is commonly used for print vehicles, does not apply to website vehicles. Instead, the average traffic readership of a website, or simply average readership (per day, per week, or per month), takes the place of the AIR.
With regard to the Traffic Index (TI) criterion used in Reading Yesterday, its concept is quite similar to that of GRPs (Gross Rating Points) in traditional media, except that it adds one more factor to the equation: the time spent by visitors at a website. This time factor brings the level of interactivity of a website into the equation of site traffic analysis.
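The relationship between GRPs and the TI can be sketched as follows. GRPs are conventionally computed as reach (in percent) times average frequency; the multiplicative form used for the TI below is an assumption made purely for illustration, and the definition of the TI given in the body of the paper takes precedence. The site names and figures are hypothetical.

```python
def gross_rating_points(reach_pct: float, avg_frequency: float) -> float:
    """Conventional GRPs: reach (percent of audience) times average frequency."""
    return reach_pct * avg_frequency


def traffic_index(reach_pct: float, avg_frequency: float, avg_minutes: float) -> float:
    """Illustrative TI: GRPs times average visit duration (ASSUMED multiplicative form)."""
    return gross_rating_points(reach_pct, avg_frequency) * avg_minutes


# Hypothetical sites: (reach in percent, average visits, average minutes per visit).
sites = {
    "Site A": (40.0, 2.0, 3.0),
    "Site B": (30.0, 2.0, 5.0),
}

# Rank the same sites by each criterion, highest value first.
by_grp = sorted(sites, key=lambda s: gross_rating_points(*sites[s][:2]), reverse=True)
by_ti = sorted(sites, key=lambda s: traffic_index(*sites[s]), reverse=True)
```

With these made-up figures Site A leads on GRPs (80 against 60), while Site B leads on the illustrative TI (300 against 240), which shows how adding the time factor can reorder a ranking.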