Using Reach/Frequency for Web Media Planning*
By
John D. Leckenby
Everett D. Collier Centennial Chair
in Communication
john.leckenby@mail.utexas.edu
Department of Advertising
College of Communication
The University of Texas at Austin
Austin, Texas 78712-1092
And
Jongpil Hong
Doctoral Candidate
jphong@mail.utexas.edu
Department of Advertising
College of Communication
The University of Texas at Austin
Austin, Texas 78712
published in
Journal of Advertising Research
January/February 1998
*
The authors wish to gratefully acknowledge the assistance of Steve Coffey of Media Metrix (URL: http://www.mediametrix.com) for his invaluable support in the collection of the data which serve as the basis for this study.
Abstract
The potential usefulness and viability of the twin concepts of reach and frequency in traditional media planning are discussed and examined in this study. It is suggested that reach/frequency estimation will become the next main issue once Web audience measurement issues are addressed. Results of testing standard and non-standard reach/frequency estimation methods on a sample of 7,162 respondents show that, of the six models studied, all except the binomial estimate reach/frequency within acceptable limits of error. One of the oldest and simplest models, the Beta Binomial Distribution (a.k.a. the Metheringham Method), provided the greatest accuracy of estimation. This finding illustrates the simplicity of audience exposure patterns to the top 50 sites included in this study. Even at this early and unsettled stage in the development of audience exposure patterns for the Web, methods of reach/frequency estimation can provide accurate estimates of the audiences of media schedules containing this media type.
Introduction
Reach and frequency estimation of media schedules for major media types has become the common method of operation for media directors and planners in advertising agencies over the last 50 years or so. All indications point to an increasing trend in the usage of reach/frequency models for traditional media types as seen in studies of the practices of media directors in the top 200 advertising agencies in the U.S. In 1982, 87.9% of these agencies used estimation models for reach and 74.5% used them for the frequency distribution estimation. The corresponding figures for 1993 were 90.5% for reach and 87.3% for the frequency distribution (Leckenby and Kim 1994). While these increases in 1993 are not dramatic, it is clear the mode of operation in media planning continues to rely upon reach/frequency estimation and the trend is upward rather than downward in usage. These estimations are conducted using probability as well as non-probability models of proprietary and non-proprietary natures.
With the advent of the new Internet World Wide Web (the Web) as a medium for advertisers, the issue arises as to the applicability of the standard operating approach in media planning as applied to this new medium. The main questions are:
These questions assume importance because of the rapid growth of this medium and its projected consequences.
The Internet has been the fastest growing medium in the last few years. User demand on the Internet has experienced a dramatic increase since its launch. Current estimates place the number of Internet users between 20 and 52 million, with projections of the number of users by the year 2000 ranging from 120 million to 200 million (Meeker 1997). In January 1997, a Gartner Group survey predicted that there would be 150 million Internet users by 1998. Project 2000 (URL: http://www2000.ogsm.vanderbilt.edu) estimated in April that there were 28.8 million U.S. households with potential access to the Internet (Holsendolph 1997). Intelliquest (URL: http://www.intelliquest.com) currently estimates the number of Internet users at 52 million while MediaMetrix (URL: http://www.mediametrix.com) estimates 25 million home users.
The capabilities of the Internet add up to a potential market that is literally worldwide. In a relatively short period, the Internet has come to be perceived as so important that companies feel compelled to have a Web presence. InterNIC reports that, through December 1996, there were 897,662 registered Internet domains (domains are the unique names that identify an Internet site), of which 73% were created in 1996 alone. Of the total sample, 796,039 (or 89%) were commercial (".com") domains. The number of Internet hosts (a host is any computer whose services are available on the Internet), tracked by Network Wizards, has shown similarly explosive growth (Meeker 1997).
Another research firm, Internet Info, reported in August 1997 that there were 419,360 commercial domains (".com"), a 139 percent increase from the first half of the year. And according to Manning Selvage & Lee, 66 percent of the Fortune 500 companies have Internet access. Forrester Research (URL: http://www.forrester.com) reports that 46 percent of Fortune 500 executives think the Internet will either have "a huge or significant" impact on their sales processes over the next three years (Holsendolph 1997). According to International Data Corporation, the estimated number of Fortune 500 companies with a Web presence increased in 1996 from 175 to nearly 400 (an increase from 35% to 80% penetration). It is an important barometer for how quickly the Web is becoming a mainstream channel for major corporations' marketing, communications and business transactions (Meeker 1997).
Forrester Research also reports that in 1997, the Internet's business-to-business revenue is expected to be $8 billion, and business-to-consumer revenue, $1.1 billion. By 2000, business-to-business revenue is expected to reach $105 billion and business-to-consumer revenue $6.6 billion (Holsendolph 1997).
The Internet Advertising Bureau reported advertisers spent $129.5 million online in the first quarter of 1997, an increase of 18% from the preceding quarter (URL: http://www.iab.net). While this is extremely small compared to other media types, the rate of increase is expected to accelerate.
It is estimated that there were 167 million PC users worldwide by the end of 1996, and about 84 million PCs are expected to ship in 1997. PC shipments are expected to pass TV shipments in the next year or two. Moreover, record-high sales of modems and networking equipment imply that PC connectivity is on the rise. All of these lend credibility to the idea that the Internet as a medium for delivering information and entertainment content may become a significant alternative to TV (Meeker 1997).
Reach and Frequency for the Web?
With respect to the question of reach/frequency applicability to this new medium, it is clear that history is on the side of the usage of the "old tools" in the "new media." When radio and television came along as new media available to advertisers, many of the techniques in reach/frequency estimation as well as the terminology of magazines and newspapers were applied to these new media. As noted earlier, 87.9 percent of media planners apply reach/frequency models to schedules consisting of several different media types (Leckenby and Kim 1994). It also worked the other way around; gross rating points (GRPs), invented for the then-new medium of television, have come to be applied to magazines and other media. History points toward an interactive relationship between media planning applications in the "old" and the "new" media.
This question, however, can be addressed from at least two vantage points: (1) the conceptual issues involved; and (2) the technical issues inherent in measuring audiences of the Web.
Conceptual Issues. If it is the only objective of the use of Web advertising as part of the overall marketing plan to generate immediate sales, then it may be the case that knowledge of the reach/frequency for a media schedule of sites is not critical. In this case, the so-called "click-through" model of media buying or some yet-to-be-developed improvement may suffice. Even in this model borrowed from direct marketing, however, the question arises as to "why" the particular click-through rate occurred? It would be worthwhile, for example, to know the answer to this question in order to make some generalizations about the creative approach. A debate is rising over whether or not click-through tells advertisers anything worth knowing and whether it should be a factor in pricing online ad campaigns. "Measuring click-through rates is tantamount to counting how many people open a direct mail piece," says Ben Addoms, senior VP/sales & marketing at MatchLogic Inc., a third-party provider of media management services for online ad campaigns (Min's New Media Report 1997).
However, if one of the goals is to engage in "brand building" through the Web as well as traditional media, then reach/frequency information assumes greater importance. Recent studies point to the effectiveness of even simple Web banners in increasing brand awareness. For example, one study conducted on behalf of the Internet Advertising Bureau by MBinteractive, a unit of researcher Millward Brown International, shows that awareness of brands participating in the study increased 5%, on average. In addition, 49% of study participants said they recalled seeing a tested ad banner on a particular site (URL: http://adage.com/interactive/articles/19970616/article3.html). And the content of these banners is getting more involving as in Intervu's V-Banner Video (URL: http://www.intervu.com) or in Infoseek's Java-enabled banners (URL: http://www.infoseek.com). And, of course, if the site is considered as an ad, then brand building is certainly a concern of most Web sites.
Another principal conceptual issue concerns the eventual development of the Web itself. Will it become a mass medium or a "one-to-one" communication medium? If it is the former, then cost-per-thousand (CPM) based upon reach estimations is critical; for the latter, click-throughs or some variation may be more suitable. The advent of "push technology" and new developments in the Web TV arena seem to point toward the "broadcast model" of the Web. Most likely, the medium will become a mixture of both macro and micro communication modes as it is essentially at the current time. With a potential audience of some 70% of 50 million people (e.g., Yahoo and Netscape) for a few sites, the Web already is a mass medium. Of course, it is also true that penetration of the Web in households in the U.S. has a long way to travel to the magical number of 70% considered by agency media planners to represent a viable mass medium. David Yoder, VP-media at Anderson & Lembke, San Francisco, estimates in Business Marketing that, assuming only half of the readers will see the given ad, the cost per thousand impressions ranges from about $40 to $70 for magazines. Even with the still-small Web audience in mass media terms, he estimates this number compares favorably with the cost of 1,000 impressions of a banner ad on a high-tech Web site (Neal and Maddox 1997).
At the current time, it may be the case then that the traditional concepts of reach/frequency are appropriate as means of selecting media schedules of Web sites in some instances while this may not be true in others.
Technical Issues. There are a number of technical issues that face the measurement of Web "audience" specifically in relation to reach/frequency estimation which will require some consideration.
A Bifocal View of Reach/Frequency for the Web
After an extensive review of the current practices in traditional and Web media types, two prominent thinkers in the Web arena (Novak and Hoffman 1996, URL: http://www2000.ogsm.vanderbilt.edu/novak/Web.standards/Webstand.html) concluded that at least two different levels of measurement of audiences were appropriate at the current time: (1) Exposure measures and (2) Interactivity measures.
From this bifocal view of measurement of audiences of the Web, "click-through" is a measure of interactivity while reach/frequency are indicators of exposure to either the site, page or ad on the Web page. It is clear that most consider the distinguishing characteristic of the Web to be interactivity. This would explain the current interest in the click-through as a measure of interest. However, exposure to sites remains a key issue since it addresses the question: what is the potential number of people who can either mentally or physically interact with the site (page, banner ad)?
From another vantage point, the click-through represents an attempt to measure the "many-to-many" aspects of communication on the Web while reach/frequency attempts to assess the "one-to-many" aspects of communication inherent in the broadcast model of the Web on larger sites.
At the moment, given the discussion above of both conceptual and technical issues facing the usage of reach/frequency for the Web, it seems clear that such applications will capture part but not all of the "central meaning" of communication on the Web. Interactivity will probably not be captured by reach/frequency. Click-throughs or some other similar measure will undoubtedly measure this. Nonetheless, it is not necessarily the case that the audience must interact in order to convey a brand image, for example, successfully. However, it is not difficult to imagine a media planner's interest in the number of people who click a series of banners one or more times in some time period, which we could name "banner-click reach." Also, the media planner might wish to know the number of people who clicked a series of banners on a site one time, two times and so forth up to the number of rotated banners on the site, which might be called "banner-click frequency." Reach/frequency estimation methods could conceivably provide such information based upon generalized banner-object audience measurement across the major sites.
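The two proposed measures can be made concrete with a small sketch. The Python below is illustrative only: the function name, the click-event format, and the decision to count each distinct banner once per person are assumptions introduced here, and the toy click data are hypothetical rather than drawn from any measurement service.

```python
from collections import Counter

def banner_click_reach_frequency(click_events, n_banners, n_respondents):
    """Return (reach, frequency) for a set of rotated banners.

    click_events: iterable of (respondent_id, banner_id) pairs in the period.
    Assumption: repeat clicks on the same banner by one person count once.
    frequency[k] is the proportion of respondents who clicked exactly k banners.
    """
    distinct_clicks = set(click_events)
    banners_per_person = Counter(respondent for respondent, _ in distinct_clicks)
    level_counts = Counter(banners_per_person.values())
    # Respondents with no recorded click fall in the zero-click level.
    level_counts[0] = n_respondents - len(banners_per_person)
    frequency = [level_counts[k] / n_respondents for k in range(n_banners + 1)]
    reach = 1 - frequency[0]  # proportion clicking at least one banner
    return reach, frequency

# Hypothetical toy data: 5 panel members, 3 rotated banners ("a", "b", "c").
events = [(1, "a"), (1, "b"), (1, "a"), (2, "a"), (3, "c")]
reach, frequency = banner_click_reach_frequency(events, n_banners=3, n_respondents=5)
print(reach, frequency)
```

Counting raw clicks rather than distinct banners would be an equally defensible reading of "banner-click frequency"; the choice would depend on how banner-object audiences come to be measured.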
Just as reach/frequency would probably not be employed at the local level to choose between eight or nine club newsletters in which to advertise to a very small, specialized audience, so reach/frequency will also not be applicable in the case of specialized, one-to-one communication for small sites. But to the extent that a push, broadcast, macro model of communication is extant on the Web, reach/frequency will be necessary to measure and capture that aspect of communication in this new medium.
As a practical matter, Steve Coffey of Media Metrix, a user-centric Web audience measurement company, indicated in e-mail communication in October 1997 that his firm is receiving a good deal of pressure from leading advertising agency media planners to provide reach/frequency estimation methods. E-mail communication at this same time with an executive of a leading media planning software supplier also indicated this same pressure.
There are many issues facing audience measurement of the Web at the moment as indicated by Novak and Hoffman (1996), among others. Once these issues, some of which have been noted above, have been addressed, it is likely the estimation of reach and frequency for Web schedules will assume center stage for media planners. This study attempts to point in that direction in anticipation of such developments. It is not suggested, through this application, that measurement and related issues have been resolved at this time.
The Current Study
Given the above backdrop and proviso, the issue of the viability of estimating reach and frequency for a media schedule consisting exclusively of Web vehicles serves as the focus of this study. With data currently available to advertisers, standard and non-standard media reach/frequency models developed for magazines, television and other media can be applied to the new Web medium. The main question concerns how accurate these old methods may be in the new media environment.
First, a selection of models to be applied will briefly be described. Second, the development of data to test the models will be discussed. Results of the testing of these models are presented followed by some conclusions and implications of this study.
Models
Six models which have served as the basis for performance comparisons in magazine and television media schedules, among others, will serve as the estimation methods in this study of Web reach/frequency: (1) Binomial Distribution (BIN); (2) Beta Binomial Distribution (BBD); (3) Sequential Aggregation Distribution (SAD); (4) Dirichlet Multinomial Distribution (DMD); (5) Hofmans Beta Binomial Distribution (HBBD); and (6) Conditional Beta Distribution (CBD).
These six models have been studied extensively and are selected to represent the spectrum of methods available for reach/frequency estimation (Chandon 1976; Danaher 1988a, b; Danaher 1989; Headen, Klompmaker and Rust, 1979; Hofmans 1969; Ju 1990; Kishi 1987; Leckenby and Kim 1992; Lee 1988; Rice 1985; Rust 1986; and Rust and Leone 1984, Kim 1994). In addition, many of these approaches have been available for use in proprietary formats for several years (Lancaster 1987; and Telmar 1980). These six models are described in the Appendix.
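As a rough illustration of the simplest of the six models, the sketch below computes a binomial exposure distribution in Python. It assumes, as an illustrative reading not spelled out in this section, that every insertion has the same exposure probability p, taken as the mean of the average audiences of the schedule's vehicles, and that n is the total number of insertions. The function name is hypothetical. Under these assumptions the first three values agree, to two decimals, with the Binomial column of Table 2 for the ZDNet, MapQuest and Excite schedule.

```python
from math import comb

def binomial_exposure_distribution(avg_audiences, inserts_per_vehicle=2):
    """Return [P(0 exposures), ..., P(n exposures)] under a simple binomial model.

    Assumption (illustrative): one common exposure probability p, the mean of
    the vehicles' average audience proportions, applies to all n insertions.
    """
    n = len(avg_audiences) * inserts_per_vehicle
    p = sum(avg_audiences) / len(avg_audiences)
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Average audiences (as proportions) of ZDNet, MapQuest and Excite from Table 1.
dist = binomial_exposure_distribution([0.0384, 0.0188, 0.1264])
reach = 1 - dist[0]  # proportion of the sample exposed at least once
print([round(100 * x, 2) for x in dist[:3]], round(100 * reach, 2))
```

The other five models relax the equal-probability assumption in different ways; they are not sketched here because their details are given only in the Appendix.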
Test Procedure
The history of such studies shows that, most frequently, input data for the models and benchmark data for accuracy assessment of those models are syndicated data such as those developed routinely by Simmons Market Research Bureau, A. C. Nielsen or, prior to their demise, Arbitron, Inc. Fortunately, the syndicated data currently available for the Web do provide the type of information needed for an initial test of reach/frequency model performance.
Data were obtained from Media Metrix, The PC Meter Company (URL: http://www.mediametrix.com) on individual respondents for two weeks of March 1997. Media Metrix operates a user-centric tracking system that shows which Web sites respondents visited during any given time period. The company uses composite name and address lists for the universe of U.S. households from which it identifies personal computer users. It installs proprietary software on the computers of panel participants who agree to let the company monitor their activities on the personal computer and on the Web. The participants are compensated with a typical incentive program including gifts and chances in a sweepstakes program. Under guaranteed anonymity, they also fill out detailed questionnaires about their own characteristics.
The sample of n=7,162 respondents who were measured and were active Web users in the two weeks covering March 1-14, 1997, consisted of 4,560 males and 2,602 females. A total of 5,407 people were active in week #1 while 5,530 were active in week #2. People active in both measurement periods amounted to 3,775.
For these data, the presumed "publication interval," therefore, is considered to be one week. This is consistent with the direction banner ads seem to be going in the industry. For example, David Henderson of DoubleClick (URL: http://www.doubleclick.net) now estimates that response rates to banner ads drop by one-half from the second to third time someone sees the ad (Marx 1996). This finding, coupled with the speed of the Web as a communication medium, undoubtedly will lead to shorter lives for media buys and ad placements. Currently, it is estimated 80% of all online ad placements are banners (Jupiter Communications, URL: http://www.jup.com). In their quest to delineate the "standards issue" in Web advertising, Novak and Hoffman recommended a week as the planning interval (Novak and Hoffman 1996).
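The panel counts reported above are mutually consistent, which a short inclusion-exclusion check makes explicit. The variable names in this Python sketch are chosen here for illustration; the figures are those given in the text.

```python
# Panel counts reported for the two weeks covering March 1-14, 1997.
active_week1 = 5407
active_week2 = 5530
active_both_weeks = 3775

# Inclusion-exclusion: respondents active in at least one of the two weeks.
active_either_week = active_week1 + active_week2 - active_both_weeks

# The same total should result from the male/female split of the sample.
males, females = 4560, 2602
print(active_either_week, males + females)  # 7162 7162
```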
Based upon these data, average site audience, cumulative site audience, and between-vehicle duplication (cross-pairs) were defined on a weekly basis using the definitions used by SMRB (Simmons Market Research Bureau 1989). The Average Site Audience was calculated as:
Average Site Audience = (Site viewers in Week 1 + Site viewers in Week 2) / 2 (1)
The Cumulative Site Audience was calculated as:
Cumulative Site Audience = (Site viewers in Week 1 or Site viewers in Week 2 or Both Viewers) (2)
The Between-Vehicle Duplication was calculated as:
Between-Vehicle Duplication = (Site 1 and Site 2 Viewers) (3)
These data were divided by 7,162 (measurement sample size) to provide estimates of the audience proportions which are input to the media reach/frequency estimation models. The above three media vehicle statistics for each of the top 50 sites served as input data for the model estimations. Table 1 shows the calculated average site audiences for the 50 sites as well as their cumulative audiences.
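A minimal sketch of definitions (1) through (3), assuming each site's weekly viewers are held as a set of respondent identifiers, might look as follows in Python. The function names and the ten-person toy panel are hypothetical, and the duplication is computed here within a single week, which is one possible reading of definition (3).

```python
def site_audience_measures(week1_viewers, week2_viewers, sample_size):
    """Average and cumulative audience proportions for one site (definitions 1 and 2)."""
    average = (len(week1_viewers) + len(week2_viewers)) / 2 / sample_size
    # Set union counts each respondent once, whether seen in one week or both.
    cumulative = len(week1_viewers | week2_viewers) / sample_size
    return average, cumulative

def between_vehicle_duplication(site_a_viewers, site_b_viewers, sample_size):
    """Proportion of the sample that viewed both sites (definition 3)."""
    return len(site_a_viewers & site_b_viewers) / sample_size

# Hypothetical toy panel of 10 respondents, identified by integer IDs.
sample_size = 10
site_a_week1, site_a_week2 = {1, 2, 3}, {2, 3, 4, 5}
site_b_week1 = {3, 6}

avg, cum = site_audience_measures(site_a_week1, site_a_week2, sample_size)
dup = between_vehicle_duplication(site_a_week1, site_b_week1, sample_size)
print(avg, cum, dup)
```

In the study itself the divisor would be the measurement sample size of 7,162 rather than the toy value used here.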
Benchmark Tabulations
For this study, 560 schedules of Web site vehicles were developed to provide a reasonable test of the models' efficacy. Schedules were constructed by randomly selecting vehicles from the 50 sites available in the study. Forty schedules were developed for each of 14 schedule sets, where a set consisted of schedules with a given number of vehicles, from two to 15. Each vehicle contained two insertions, to be compatible with the tabulation and measurement systems over week #1 and week #2. The resulting exposure distributions therefore ranged in size from 4 insertions in total (two vehicles) to a maximum of 30 insertions (15 vehicles).
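The schedule design just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure actually used: vehicles are drawn without replacement within a schedule, the site names are placeholders, and the seed and function name are hypothetical.

```python
import random

def build_test_schedules(sites, sizes=range(2, 16), per_size=40, inserts=2, seed=0):
    """Build per_size random schedules for each schedule size in sizes.

    Each schedule maps a vehicle name to its number of insertions.
    Assumption: vehicles are sampled without replacement within a schedule.
    """
    rng = random.Random(seed)
    schedules = []
    for size in sizes:
        for _ in range(per_size):
            vehicles = rng.sample(sites, size)
            schedules.append({vehicle: inserts for vehicle in vehicles})
    return schedules

sites = [f"site{i:02d}" for i in range(1, 51)]  # placeholders for the 50 sites
schedules = build_test_schedules(sites)
total_inserts = [sum(s.values()) for s in schedules]
print(len(schedules), min(total_inserts), max(total_inserts))  # 560 4 30
```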
Tabulation involves, for each schedule, counting person-by-person exposure to each of the vehicles on each of the two measurement phases. This results in the "true" answer for the sample of exposure to the schedule vehicles over two occasions. This shows, in a two-vehicle schedule, for example, the proportion of the sample exposed no times, one time, two times, three times or four times to the vehicles in the schedule.
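The tabulation step can be illustrated with a short Python sketch. The data layout (a dictionary from respondent to the set of site/week pairs that respondent saw), the function name and the five-person toy panel are all assumptions made here for illustration.

```python
from collections import Counter

def tabulate_exposure_distribution(exposures, schedule, n_respondents):
    """Observed proportions of the sample exposed 0, 1, ..., len(schedule) times.

    exposures: dict mapping respondent_id -> set of (site, week) pairs seen.
    schedule:  list of (site, week) insertions making up the schedule.
    """
    counts = Counter()
    for seen in exposures.values():
        # Count how many of the schedule's insertions this person was exposed to.
        counts[sum(1 for insertion in schedule if insertion in seen)] += 1
    # Respondents with no recorded site visits have zero exposures.
    counts[0] += n_respondents - len(exposures)
    return [counts[k] / n_respondents for k in range(len(schedule) + 1)]

# Hypothetical two-vehicle schedule: sites "A" and "B", one insertion per week.
schedule = [("A", 1), ("A", 2), ("B", 1), ("B", 2)]
exposures = {
    1: {("A", 1), ("A", 2)},
    2: {("A", 1), ("B", 2), ("C", 1)},
    3: {("B", 1)},
}
observed = tabulate_exposure_distribution(exposures, schedule, n_respondents=5)
print(observed)  # [0.4, 0.2, 0.4, 0.0, 0.0]
```

The observed reach of the toy schedule is one minus the first element, here 0.6.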
The reach and frequency distributions for these schedules were then estimated using the six models. The performance of the models is assessed using the standard error criteria described below. It should be noted that among the top 50 sites included in this study, four sites were of a non-commercial nature (government or university, for example). The presumption is that the behavior across these sites is consistent with that of the commercial sites.
Performance Evaluation Criteria
Definition of Error
In evaluating performances of different models, their accuracy depends partly on the manner in which error is defined in the study. In this study, two different error factors, error in reach estimation (AER) and error in the exposure distribution (APE), are adopted from previous studies (Kishi and Leckenby 1982; Leckenby and Kishi 1984). Danaher (1992) also used these definitions of AER and APE (he renamed AER as RER, "relative error in reach," and APE as EPOR, "error in exposure probabilities over schedule reach").
The error in the reach estimates for the test schedules was defined as the absolute value of the difference between the observed and predicted reach in terms of percentage.
Average percentage error in reach (AER):

$$\mathrm{AER} = \frac{1}{K} \sum_{i=1}^{K} \frac{|o_i - e_i|}{o_i} \qquad (4)$$

where:
$o_i$ = observed reach of schedule $i$
$e_i$ = estimated reach of schedule $i$
$K$ = total number of schedules.
The error at each exposure level is simply defined as the absolute difference between the observed and the estimated frequencies.
Average percentage error in exposure distribution (APE):

$$\mathrm{APE} = \frac{1}{K} \sum_{i=1}^{K} PE_i \qquad (5)$$

where:

$$PE_i = \frac{\sum_{j} |o_{ij} - e_{ij}|}{\sum_{j} o_{ij}}$$

$PE_i$ = percentage error in schedule $i$
$o_{ij}$ = observed frequency at exposure level $j$ of schedule $i$
$e_{ij}$ = estimated frequency at exposure level $j$ of schedule $i$
$\sum_{j} o_{ij}$ = observed reach of schedule $i$ (the sum over exposure levels $j \ge 1$)
$K$ = total number of schedules.
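The two error measures can be checked against the sample schedule of Table 2. The Python sketch below takes the Observed and Binomial columns of that table and treats both sums in the percentage error as running over exposure levels of one or more, an interpretation adopted here because it reproduces the AER and APE reported in Table 2 for the binomial model. Function names are hypothetical, and only a single schedule is averaged (K = 1).

```python
def reach(dist):
    """Reach in percent: share of the sample exposed at least once."""
    return sum(dist[1:])

def reach_error(observed, estimated):
    """|o_i - e_i| / o_i for one schedule, expressed in percent."""
    o, e = reach(observed), reach(estimated)
    return 100 * abs(o - e) / o

def distribution_error(observed, estimated):
    """PE_i: summed absolute errors over levels j >= 1, relative to observed reach."""
    abs_err = sum(abs(o - e) for o, e in zip(observed[1:], estimated[1:]))
    return 100 * abs_err / reach(observed)

def average(values):
    """Mean over the K schedules."""
    return sum(values) / len(values)

# Observed and Binomial columns of Table 2 (percent exposed 0, 1, ..., 6 times).
observed = [72.63, 20.01, 5.84, 1.10, 0.38, 0.04, 0.00]
binomial = [68.46, 26.78, 4.36, 0.38, 0.02, 0.00, 0.00]

aer = average([reach_error(observed, binomial)])
ape = average([distribution_error(observed, binomial)])
print(round(aer, 2), round(ape, 2))  # 15.24 34.23
```

For the full study, the lists passed to the averaging function would hold one value per schedule, 560 in all.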
Results
Tables 2 and 3 show two sample schedules consisting of three vehicles and eight vehicles, respectively. The simplicity of these schedules reflects that generally found throughout the n=560 schedules in the study. That is, there is generally only one mode in each distribution; this is unlike magazines where typically two or three modes or more may be observed, depending upon the number of inserts in the schedule.
Tables 6 and 7 show the statistics for between-vehicle (cross-pair) and within-vehicle (self-pair) duplication rates for the Web in this study and the magazines of the 1979 SMRB study, respectively. It is interesting to note that the between-vehicle duplication rates of the 50 sites in this study match almost identically those of the 98 magazines in the 1979 study by SMRB.
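The numbers of distinct vehicle pairs noted beneath Tables 6 and 7 follow directly from the number of vehicles in each study, since every unordered pair of different vehicles is one cross-pair. A brief Python check, with variable names chosen here for illustration:

```python
from math import comb

# Unordered pairs of distinct vehicles: n choose 2.
web_site_pairs = comb(50, 2)   # 50 Web sites in this study
magazine_pairs = comb(98, 2)   # 98 magazines in the 1979 SMRB study
print(web_site_pairs, magazine_pairs)  # 1225 4753
```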
The Average Percentage Errors in Reach (AER)
The Average Percentage Errors in Reach are shown in Table 4. These errors tend to be small and consistent with those found in other media types in studies of the type developed here. Table 5 shows the results for magazine studies in New Zealand on AGB data, two SMRB magazine data studies, and one SMRB television study.
The best-performing model for the error in reach criterion was HBBD with an AER of 2.23%. This was followed by the CBD at 2.43% and the BBD with 2.50% AER.
The Average Percentage Errors in the Exposure Distribution (APE)
The Average Percentage Errors in the Distribution (APE) for the sample schedules are also shown in Table 4.
These errors are relatively low compared with those for other media types. This can be seen by examining Table 5 for other magazine and television study results on some of the same models studied here. For magazines, the APEs tend to be between 13 and 34 percent. For the Web schedules studied here, the range is between 8 percent and 24 percent, excluding the binomial model (Table 4).
Of the six models studied, the Hofmans' Beta Binomial Distribution (HBBD), Conditional Beta Distribution (CBD), and the Beta Binomial Distribution (BBD) have the lowest APE in distribution.
The fact that the BBD performs so well in these schedules for the Web compared to its history of performance in other media types is an indication of the simplicity of the distributions for these Web media schedules.
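The model rankings discussed in this section can be read directly off Table 4. The short Python sketch below simply sorts the Table 4 values by each criterion; the dictionary layout and variable names are illustrative choices.

```python
# Table 4: model -> (AER %, APE %) over the 560 Web schedules.
table4 = {
    "Binomial": (21.00, 41.50),
    "BBD": (2.50, 8.63),
    "CBD": (2.43, 9.66),
    "MSAD": (3.51, 18.80),
    "DMD": (6.67, 24.10),
    "HBBD": (2.23, 9.78),
}

# Rank models from smallest to largest error on each criterion.
by_aer = sorted(table4, key=lambda model: table4[model][0])
by_ape = sorted(table4, key=lambda model: table4[model][1])
print(by_aer)
print(by_ape)
```

The same three models lead on both criteria, although their order differs: HBBD has the smallest error in reach while BBD has the smallest error in the exposure distribution.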
Conclusion
This is the first publicly available study of which the authors are aware that has formally examined the estimation characteristics of existing reach/frequency models in the Internet medium environment using data readily available to advertisers from a syndicated service. In this sense, this study is pioneering.
This study has examined the performance of existing reach/frequency estimation models for a sample of 560 Web media schedules ranging in size from two to fifteen vehicles and four to thirty insertions. Results show that the models perform as well on Internet data as they did for either magazine or television data. Further, the audience exposure patterns as indicated by the uni-modal distributions are rather simple compared to other media types of the traditional variety.
Duplication of site visits in 1997 is at about the same level for the top 50 sites studied here as it was for 98 magazines measured by SMRB in 1979. This also points to the relative simplicity of viewing behavior on the Web at this point in its development as well as some interesting parallels with the magazine audience behavior of almost 20 years ago.
It would be appropriate to conduct such work as this using different definitions of site publication period to determine the effect of this variable on estimation accuracy. This can be undertaken with data such as those utilized in this study.
A study of this type should also now be applied to data from other sources that may overcome some of the limitations inherent in the current data.
One of the inherent difficulties in defining average audience in the traditional manner as in this study concerns the entrenched perceptions of the Web industry. Average audience of the variety here will always be a lower figure than would be found from site-centric measurement of "visits" in some time period. But this issue has been faced many times before when a new methodology for audience measurement has been introduced to an existing, traditional medium.
Most importantly, the Advertising Research Foundation should undertake a study to address those technical issues outlined in this paper so that the ambiguities, which were dealt with somewhat arbitrarily in this study, could be resolved.
It is now clear, however, that the reach and frequency distributions can be estimated quite accurately for schedules of Web vehicles. To mix these vehicles with other traditional media types and then estimate reach/frequency will need to await resolution by the industry of the problems of conceptual and technical natures noted earlier in this study.
Table 1
Vehicle Average Audience and Cumulative Audience of WWW Sites in the Study
| 50 Sites in This Study | Average Audience (%) | Cumulative Audience (%) |
|---|---|---|
| AOL.COM | 26.88 | 40.70 |
| NETSCAPE.COM | 23.44 | 33.90 |
| YAHOO.COM | 22.40 | 34.45 |
| MICROSOFT.COM | 13.98 | 21.20 |
| WEBCRAWLER.COM | 13.73 | 23.79 |
| EXCITE.COM | 12.64 | 21.08 |
| GEOCITIES.COM | 9.68 | 15.82 |
| MSN.COM | 9.32 | 13.85 |
| INFOSEEK.COM | 8.83 | 14.44 |
| LYCOS.COM | 7.91 | 13.54 |
| DIGITAL.COM | 7.41 | 11.92 |
| ATT.NET | 5.89 | 8.06 |
| PRODIGY.COM | 5.75 | 9.02 |
| COMPUSERVE.COM | 4.94 | 8.18 |
| ZDNET.COM | 3.84 | 6.37 |
| NETCOM.COM | 3.57 | 5.75 |
| EARTHLINK.NET | 3.52 | 5.84 |
| PATHFINDER.COM | 3.34 | 6.02 |
| INTER.NET | 3.05 | 5.14 |
| FOUR11.COM | 3.04 | 5.43 |
| SWITCHBOARD.COM | 2.78 | 5.15 |
| CONCENTRIC.NET | 2.77 | 4.61 |
| DISNEY.COM | 2.69 | 4.43 |
| TRIPOD.COM | 2.58 | 4.44 |
| SPORTSZONE.COM | 2.43 | 3.94 |
| WEATHER.COM | 2.42 | 4.09 |
| CNET.COM | 2.25 | 3.84 |
| REALAUDIO.COM | 2.15 | 4.05 |
| AMATEURS.COM | 2.15 | 4.01 |
| MINDSPRING.COM | 2.14 | 3.63 |
| HOTBOT.COM | 2.14 | 3.49 |
| ANGELFIRE.COM | 2.10 | 3.78 |
| WHOWHERE.COM | 2.09 | 3.88 |
| WEBCOM.COM | 2.03 | 3.56 |
| DOWNLOAD.COM | 1.97 | 3.56 |
| CNN.COM | 1.97 | 3.16 |
| MAPQUEST.COM | 1.88 | 3.42 |
| UIUC.EDU | 1.82 | 3.46 |
| DEMON.CO.UK | 1.77 | 3.31 |
| UMICH.EDU | 1.77 | 3.31 |
| MIT.EDU | 1.70 | 3.27 |
| REAL.COM | 1.67 | 3.16 |
| CRIS.COM | 1.66 | 3.02 |
| SHAREWARE.COM | 1.63 | 3.00 |
| HOLOWWW.COM | 1.62 | 2.99 |
| NASA.GOV | 1.61 | 2.97 |
| BEST.COM | 1.58 | 2.93 |
| SONY.COM | 1.50 | 2.69 |
| TELEPORT.COM | 1.49 | 2.81 |
| INFOSPACE.COM | 1.35 | 2.53 |
Table 2
Sample Exposure Distribution for a Three-vehicle Schedule
(Two inserts each in ZDNet.com, MapQuest.com, and Excite.com)
| # of exp. | Observed (%) | Binomial (%) | BBD (%) | CBD (%) | MSAD (%) | DMD (%) | HBBD (%) |
|---|---|---|---|---|---|---|---|
| 0 | 72.63 | 68.46 | 72.31 | 72.12 | 72.72 | 72.04 | 71.90 |
| 1 | 20.01 | 26.78 | 20.56 | 20.51 | 19.24 | 20.14 | 21.18 |
| 2 | 5.84 | 4.36 | 5.54 | 6.35 | 6.80 | 7.20 | 5.48 |
| 3 | 1.10 | .38 | 1.30 | .86 | 1.07 | .32 | 1.20 |
| 4 | .38 | .02 | .25 | .15 | .15 | .29 | .21 |
| 5 | .04 | .00 | .03 | .00 | .00 | .00 | .02 |
| 6 | .00 | .00 | .00 | .00 | .00 | .00 | .00 |
| Errors: AER | --- | 15.24 | 1.17 | 1.86 | .33 | 2.16 | 2.89 |
| Errors: APE | --- | 34.23 | 4.31 | 5.52 | 7.38 | 8.80 | 6.61 |
Table 3
Sample Exposure Distribution for an Eight-vehicle Schedule
(Two inserts each in MIT.edu, NASA.gov, Halowww.com, Microsoft.com, Demon.co.uk, Digital.com, Webcrawler.com, and Prodigy.com)
| # of exp. | Observed (%) | Binomial (%) | BBD (%) | CBD (%) | MSAD (%) | DMD (%) | HBBD (%) |
|---|---|---|---|---|---|---|---|
| 0 | 46.68 | 37.50 | 46.92 | 45.94 | 47.38 | 45.00 | 47.45 |
| 1 | 28.92 | 37.93 | 28.34 | 28.46 | 25.11 | 27.73 | 27.84 |
| 2 | 13.73 | 17.98 | 14.11 | 15.85 | 16.78 | 19.13 | 13.89 |
| 3 | 6.41 | 5.31 | 6.36 | 6.56 | 7.42 | 4.64 | 6.36 |
| 4 | 2.61 | 1.09 | 2.66 | 2.34 | 2.51 | 2.79 | 2.72 |
| 5 | 1.12 | .16 | 1.04 | .66 | .64 | .29 | 1.09 |
| 6 | .32 | .01 | .38 | .15 | .13 | .35 | .41 |
| 7 | .17 | .00 | .12 | .03 | .02 | .00 | .14 |
| 8 | .06 | .00 | .04 | .00 | .00 | .04 | .05 |
| 9 | .00 | .00 | .00 | .00 | .00 | .00 | .00 |
| 10 | .00 | .00 | .00 | .00 | .00 | .00 | .00 |
| 11 | .00 | .00 | .00 | .00 | .00 | .00 | .00 |
| 12 | .00 | .00 | .00 | .00 | .00 | .00 | .00 |
| 13 | .00 | .00 | .00 | .00 | .00 | .00 | .00 |
| 14 | .00 | .00 | .00 | .00 | .00 | .00 | .00 |
| 15 | .00 | .00 | .00 | .00 | .00 | .00 | .00 |
| 16 | .00 | .00 | .00 | .00 | .00 | .00 | .00 |
| Errors: AER | --- | 17.22 | .45 | 1.39 | 1.31 | 3.15 | 1.29 |
| Errors: APE | --- | 32.58 | 2.36 | 7.16 | 16.57 | 17.98 | 2.94 |
Table 4
Summary of Average Error Calculations
Web schedules (n=560)
| | Error Type | |
| Model | AER (%) | APE (%) |
| Binomial | 21.00 | 41.50 |
| BBD | 2.50 | 8.63 |
| CBD | 2.43 | 9.66 |
| MSAD | 3.51 | 18.80 |
| DMD | 6.67 | 24.10 |
| HBBD | 2.23 | 9.78 |
Table 5-1
New Zealand 1985 AGB Magazine Data (n=600)
| | Error Type | |
| Model | AER (%) | APE (%) |
| DMDLK | 2.23 | 15.80 |
| LOGLIN | 1.20 | 12.83 |
| CANEX | 2.19 | 14.82 |
| BBD | 11.85 | 50.40 |
Table 5-2
SMRB US 1979 Magazine Data (n=515)
| | Error Type | |
| Model | AER (%) | APE (%) |
| DMDLK | 2.77 | 17.68 |
| CANEX | 3.08 | 19.91 |
| BBD | 6.46 | 33.23 |
| CBD | 3.12 | 13.76 |
| MSAD | 3.13 | 15.27 |
Table 5-3
SMRB US 1984 Magazine Data (n=508)
| | Error Type | |
| Model | AER (%) | APE (%) |
| DMDLK | 1.73 | 23.30 |
| CANEX | 3.26 | 25.83 |
| BBD | 4.92 | 33.59 |
| CBD | 3.26 | 17.85 |
| MSAD | 5.84 | 20.58 |
Table 5-4
TV SMRB Data 1984
| | Error Type | |
| Model | AER (%) | APE (%) |
| ALBET | 2.7 | 12.5 |
| BBD-LD | 3.9 | 5.3 |
| DMDLK | 3.5 | 12.6 |
| BBD-IE | 3.1 | 13.7 |
Table 6
Average Within- and Between-Vehicle Duplication in This Study
| | Mean | Standard Deviation | Maximum | Minimum |
| Observed Within* | 1.90 | 3.00 | 13.06 | 1.3 |
| Observed Between** | .42 | .68 | 8.20 | .02 |
* 50 Web Sites
** 1,225 distinct vehicle pairs
Table 7
Observed and Random Within- and Between-Vehicle Duplication
SMRB Magazines 1979 Study
(Boyd and Leckenby 1985)
| | Mean | Standard Deviation | Maximum | Minimum |
| Observed Within* | 2.40 | 3.30 | 20.60 | .20 |
| Random Within | .40 | .10 | 7.40 | .00 |
| Observed Between** | .30 | .50 | 8.10 | .00 |
| Random Between | .20 | .30 | 6.80 | .00 |
* n=98 magazines
** n=4,753 distinct vehicle pairs
Appendix
Description of Models Utilized in This Study
Binomial Distribution (BIN)
The binomial distribution model is the implicit benchmark against which the performance of any other model should be compared, since it is the simplest of them all. Its assumptions are that vehicles are homogeneous, that individuals are homogeneous, and that vehicle exposures are mutually independent events, so that exposure to one vehicle does not modify the probability of exposure to any other vehicle. The binomial model generally overestimates reach.
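As a rough illustration of the binomial benchmark, the sketch below treats a schedule as N independent insertions with one common exposure probability p. The value of p is not given in this appendix; here it is back-solved from the zero-exposure cell of the Binomial column in Table 2 (68.46% for six insertions), an assumption made only for the example.

```python
from math import comb


def binomial_exposure(n_insertions, p):
    """P(k exposures) for k = 0..n under a common exposure probability p."""
    return [comb(n_insertions, k) * p**k * (1 - p) ** (n_insertions - k)
            for k in range(n_insertions + 1)]


n = 6                         # two insertions in each of three vehicles
p = 1 - 0.6846 ** (1 / n)     # assumed: back-solved from Table 2's zero cell
dist = binomial_exposure(n, p)

for k, prob in enumerate(dist):
    print(f"{k} exposures: {100 * prob:.2f}%")
print(f"reach: {100 * (1 - dist[0]):.2f}%")
```

With p recovered this way, the one- and two-exposure cells come out close to the 26.78% and 4.36% printed in the Binomial column of Table 2.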
Beta Binomial Distribution (BBD)
This is one of the oldest known models, developed by Richard Metheringham in the 1960s (Metheringham 1964) for use in advertising agency media planning. It is one of the simplest of the models studied and, therefore, has been used frequently in practice (Leckenby and Kim 1994). Generally speaking, it is also the least accurate of the known models except the binomial distribution (Leckenby and Ju 1990). It nevertheless serves as a benchmark for more complex models and may be appropriate in certain media situations; if the Internet exhibits low between-vehicle duplication, for example, this method may be appropriate. The main problem with the BBD lies in its estimation of between-vehicle (cross-pair) duplication, which results in low overall estimation accuracy.
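For reference, the beta binomial exposure distribution can be sketched as below. This is the standard textbook form of the distribution, not a reproduction of the fitting procedure used in this study; the shape parameters alpha and beta are hypothetical values chosen only to run the example.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bbd_pmf(n, alpha, beta):
    """P(k exposures out of n insertions) under a beta binomial distribution."""
    return [comb(n, k) * exp(log_beta(alpha + k, beta + n - k) - log_beta(alpha, beta))
            for k in range(n + 1)]


# Hypothetical parameters: mean exposure probability alpha / (alpha + beta) = 0.06.
alpha, beta = 0.3, 4.7
dist = bbd_pmf(6, alpha, beta)
mean_exposures = sum(k * p for k, p in enumerate(dist))
print([round(100 * p, 2) for p in dist], round(mean_exposures, 3))
```

Compared with a binomial distribution with the same mean, the beta binomial puts more mass on zero exposures, which is the direction of the difference between the Binomial and BBD columns in Tables 2 and 3.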
Sequential Aggregation Distribution (SAD)
This method uses Morgensztern's reach formula to estimate reach (Chandon 1976). Each vehicle's marginal probability distribution is then developed separately using the BBD, which overcomes the between-vehicle duplication problem of the BBD used alone. These marginal distributions are combined sequentially: at each step a two-dimensional joint exposure distribution is formed and then collapsed along its main diagonals into a marginal distribution, which is combined with the next vehicle's marginal distribution. Different orders of vehicle aggregation are known to produce different results (Lee 1988). This method is often used in practice (Leckenby and Kim 1994) and is very accurate (Leckenby and Ju 1990), if theoretically inelegant.
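The collapsing step can be illustrated with a deliberately simplified sketch. It assumes the two marginal distributions are independent, so the joint distribution is just their outer product; the method described above does not make that assumption, and under independence the order of aggregation would not matter. The marginal values are hypothetical.

```python
def collapse_joint(marginal_a, marginal_b):
    """Build a joint table from two marginals and collapse it by total exposures.

    Simplifying assumption: independence, so joint[i][j] = a[i] * b[j].
    Cells with the same i + j (one diagonal of the table) are summed.
    """
    total = [0.0] * (len(marginal_a) + len(marginal_b) - 1)
    for i, pa in enumerate(marginal_a):
        for j, pb in enumerate(marginal_b):
            total[i + j] += pa * pb
    return total


def aggregate(marginals):
    """Fold vehicle marginals into one schedule distribution, one vehicle at a time."""
    combined = [1.0]  # start with certainty of zero exposures
    for marginal in marginals:
        combined = collapse_joint(combined, marginal)
    return combined


# Hypothetical marginals for three vehicles with two insertions each (0, 1, 2 exposures).
vehicles = [[0.90, 0.07, 0.03], [0.88, 0.09, 0.03], [0.92, 0.06, 0.02]]
schedule = aggregate(vehicles)
print([round(p, 4) for p in schedule])
```

The result has one cell for each possible total from zero to six exposures, mirroring the layout of Table 2.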
Dirichlet Multinomial Distribution (DMD)
The Dirichlet multinomial distribution is a member of the family of multivariate Pólya-Eggenberger distributions. It has also been called the 'compound multinomial distribution' (Chandon 1976) and the 'n-dimensional basic beta distribution' (Mauldon 1959).
Unlike the univariate distribution models, the DMD model attempts to preserve individual vehicle heterogeneity. Many of the univariate models, in contrast, average all vehicles into one composite vehicle and then treat a schedule of n(i) insertions in each of i vehicles as one vehicle with the sum of the n(i) insertions.
In the DMD model, each vehicle is treated as heterogeneous and the probability of exposure to each vehicle is incorporated to obtain the exposure distribution parameters. The Dirichlet distribution provides the distribution of probabilities of individuals to be exposed to none, any one, any two, or up to all vehicles assuming one insertion in each vehicle.
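For readers who want the functional form, the sketch below evaluates the generic Dirichlet-multinomial probability mass function. It is the textbook distribution only: how the study maps a schedule onto categories and estimates the parameters is not described in this appendix, so the category counts and alpha values here are hypothetical.

```python
from math import exp, lgamma


def dmd_pmf(counts, alphas):
    """Dirichlet-multinomial probability of one vector of category counts."""
    n = sum(counts)
    a_total = sum(alphas)
    log_p = lgamma(n + 1) + lgamma(a_total) - lgamma(n + a_total)
    for x, a in zip(counts, alphas):
        log_p += lgamma(x + a) - lgamma(x + 1) - lgamma(a)
    return exp(log_p)


def compositions(n, k):
    """All ways to split n trials across k categories."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


alphas = [2.0, 0.5, 0.2]  # hypothetical Dirichlet parameters
total = sum(dmd_pmf(c, alphas) for c in compositions(3, 3))
print(round(total, 6))
```

Summing over every possible count vector is a quick check that the probabilities form a proper distribution.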
Hofmans Beta Binomial Distribution (HBBD)
Hofmans (1969) developed a method to calculate reach for any combination of media based on Agostini's estimation method. The Hofmans reach estimation differs slightly from Agostini's, however, in that a separate 'K' value is used for each duplication pair rather than the single constant 'K' of Agostini's formula.
Despite its theoretical superiority, Chandon (1976) found that the Hofmans accumulation model did not significantly improve on Agostini's model. However, the Hofmans model for cumulative net coverage has been found to produce better reach estimates than the beta binomial distribution model (Leckenby and Boyd 1984, Kishi 1983).
A major drawback of the Hofmans accumulation formula is that it requires reach data for three issues (Chandon 1976, p. 109). Since the syndicated data services do not provide accumulation data for more than two issues, the Hofmans model is limited in use and is consequently less popular than the BBD model.
In this application, after the estimation of reach is completed using Hofmans' model, a beta binomial distribution is fit to this reach post hoc using the method of means and zeros (Anscombe 1950).
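A means-and-zeros fit for the beta binomial can be sketched as follows: the two parameters are chosen so that the fitted distribution reproduces a target mean number of exposures and a target zero-exposure proportion (one minus reach). This is one plausible reading of the step, not the authors' code; the bisection solver and the input values are assumptions made for illustration.

```python
from math import sqrt


def bbd_zero_cell(n, alpha, beta):
    """P(0 exposures in n insertions) under a beta binomial distribution."""
    prob = 1.0
    for j in range(n):
        prob *= (beta + j) / (alpha + beta + j)
    return prob


def fit_means_and_zeros(n, mean_exposures, zero_cell):
    """Find (alpha, beta) matching a given mean and zero-exposure proportion.

    With p = mean / n fixed, the zero cell falls steadily from 1 - p toward
    (1 - p) ** n as s = alpha + beta grows, so s is found by bisection.
    """
    p = mean_exposures / n
    if not (1 - p) ** n < zero_cell < 1 - p:
        raise ValueError("zero cell is outside the range a beta binomial can match")
    low, high = 1e-9, 1e9
    for _ in range(200):
        mid = sqrt(low * high)  # bisect on a log scale
        if bbd_zero_cell(n, p * mid, (1 - p) * mid) > zero_cell:
            low = mid   # zero cell still too large: increase alpha + beta
        else:
            high = mid
    s = sqrt(low * high)
    return p * s, (1 - p) * s


# Illustrative inputs (assumptions): six insertions, mean exposures of 0.3671
# computed from the Observed column of Table 2, and the HBBD zero cell of 71.90%.
alpha, beta = fit_means_and_zeros(6, 0.3671, 0.7190)
print(round(alpha, 4), round(beta, 4), round(bbd_zero_cell(6, alpha, beta), 4))
```

The range check reflects a property of the distribution itself: for a given mean, a beta binomial can only produce a zero cell between the binomial value and the value obtained when all exposure is concentrated in a fixed share of the audience.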
Conditional Beta Distribution (CBD)
This method was developed by Leckenby and Kim and reported in Kim (1994) as an improvement on the traditional use of the Beta Binomial Distribution (Metheringham 1964). The model assumes that each vehicle's marginal distribution is Beta Binomial, with each individual in the population characterized by a personal probability of exposure to a given vehicle. Its primary assumption is that exposures by individuals to a given vehicle, conditional on their having previously been exposed to one insertion or to no insertions in that vehicle, also follow a Beta Binomial. That is, the conditional distributions of exposure all follow a BBD. In addition, the model employs the Markov assumption developed by Danaher (1992) to convolve the joint exposure distribution, as estimated using the CANEX model (Danaher 1991), with each conditional distribution to form the final collapsed exposure distribution. Unlike the Danaher (1992) approach, this is a non-random convolution.
References
Anscombe, F. J. (1950), "Sampling Theory of the Negative Binomial and Logarithmic Distributions," Biometrika, 37.
Bogart, Leo (1994), Strategy in Advertising: Matching Media and Messages to Markets and Motivations, 3rd edition, (Lincolnwood, IL: NTC Books).
Business Wire (1998), "Media Metrix Publishes Complete December Rankings of Top 25 World Wide Websites At Home and At Work," January 20.
Chandon, Jean-Louis (1976), A Comparative Study of Media Exposure Models. Unpublished doctoral dissertation, Northwestern University.
Coalition for Advertising Supported Information and Entertainment Guidelines (1996). [URL: http://www.commercepark.com/AAAA/casie/index.html].
Danaher, Peter J. (1988a), "Parameter Estimation for the Dirichlet-Multinomial Distribution Using Supplementary Beta-Binomial Data," Communications in Statistics, A17, 6 (June), 777-88.
____________(1988b), "A Loglinear Model for Predicting Magazine Audiences," Journal of Marketing Research, 25 (November), 356-62.
____________(1989), "An Approximate Loglinear Model for Predicting Magazine Audiences," Journal of Marketing Research, 26 (November), 473-9.
___________(1991), "A Canonical Expansion Model for Multivariate Media Exposure Distributions: A Generalization of the 'Duplication of Viewing Law'," Journal of Marketing Research, 28 (October), 361-7.
___________(1992), "Some Statistical Modeling Problems in the Advertising Industry: A Look at Media Exposure Distributions," The American Statistician, 46 (November), 254-60.
Forrester Research (1996). [URL: http://www.forrester.com].
Headen, Robert S., Jay E. Klompmaker, and Roland Rust (1979), "The Duplication of Viewing Law and Television Media Schedule Evaluation," Journal of Marketing Research, 16 (August), 333-40.
Hofmans, Pierre (1969), "Measuring the Cumulative Net Coverage of Any Combination of Media," Journal of Marketing Research, 3 (August), 269-78.
Holsendolph, Ernest (1997), "It's Time to Jump on the Bandwagon," The Atlanta Journal and Constitution, September 28, Business Sec. p. 01G.
Interactive PR & Marketing News (1997), "Market Research," December 12, Vol. 4, No. 36.
Ju, Kuen-Hee (1990), Simple Approaches to Modeling Advertising Media Exposures. Unpublished doctoral dissertation, The University of Texas at Austin, Austin, Texas.
Jupiter Communications Inc. (1997). [URL: http://www.jup.com].
Kim, Heejin (1994), A Conditional Beta Distribution Model for Advertising Media Reach/Frequency Estimation, The University of Texas at Austin, Austin, Texas.
Kishi, Shizue (1987), "Exposure Distribution Models in Print, Spot-TV, and Mixed-Media Schedules: Empirical Test on Japanese Data," The Annual Conference of the European Marketing Academy, Ontario, Canada.
______________(1983), Exposure Distribution Models in Advertising Media. Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign, Urbana, Illinois.
______________ and John D. Leckenby (1982), "A Test of the Direct/Indirect BBD and Other Exposure Distribution Models," in Proceedings of the American Academy of Advertising, 46-52.
Lancaster, Kent M. (1987), "Optimizing Advertising Media Plans Using ADOPT on the Microcomputer," Paper presented at the Fourth AMA Microcomputers in Marketing Workshop, University of Hawaii at Manoa, Honolulu.
Leckenby, John D. and Heejin Kim (1994), "How Media Directors View Reach/Frequency Estimation: Now and a Decade Ago," Journal of Advertising Research, 34 (September/October), 9-21.
_______________ and Heejin Kim (1992), "Unresolved Issues in Media Reach/Frequency Models," in Proceedings of the American Academy of Advertising, 100-106.
_______________ and Kuen-Hee Ju (1990), "Advances in Media Decision Models," Current Issues and Research in Advertising, 311-357.
_______________ and Shizue Kishi (1984), "The Dirichlet Multinomial Distribution as a Magazine Exposure Model," Journal of Marketing Research, 21 (February), 100-106.
_______________ and Marsha M. Boyd (1984), "An Improved Beta Binomial Reach/Frequency Model for Magazines," Current Issues and Research in Advertising, 1-24.
Lee, Hae Kap (1988), Sequential Aggregation Advertising Media Models. Unpublished doctoral dissertation, University of Texas at Austin, Austin, Texas.
Marx, Wendy (1996), "How to Make Web Ads More Effective," in NetMarketing, supplement to Advertising Age, December 1996, p. 1.
Mauldon, J. M. (1959), "A Generalization of the Beta-distribution," The Annals of Mathematical Statistics, 30 (June), 509-520.
Min's New Media Report (1997), "Will Click-Through Stick As A Web Ad Gauge?" Vol. 3, No. 23, September 1.
Meeker, Mary (1997), "An Update on Internet Usage Trends/Forecasts," Computer Technology Review, 17 (6), 20-21.
Metheringham, Richard A. (1964), "Measuring the Net Cumulative Coverage of a Print Campaign," Journal of Advertising Research, 4 (December), 23-28.
Neal, Mollie and Kate Maddox (1997), "Net Marketing: Using the Net: Direct Sales vs. Branding," Business Marketing, June 1, 25-27.
Novak, Thomas P. and Donna L. Hoffman (1996), "New Metrics for New Media: Toward the Development of Web Measurement Standards," [URL: http://www2000.ogsm.vanderbilt.edu/novak/Web.standards/Webstand.html].
PR Newswire (1998), "Media Metrix Study Shows Netscape as Top Internet Site for the Year Among Business Users," January 30.
Rice, Marshall D. (1985), Television Exposure Distribution Models in Advertising Media, Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign, Urbana, IL.
Rust, Roland T. (1986), Advertising Media Models: A Practical Guide. Lexington, MA: Lexington Books.
______________ and Robert P. Leone (1984), "The Mixed Media Dirichlet-Multinomial Distribution: A Model for Evaluating Television-Magazine Advertising Schedules," Journal of Marketing Research, 21 (February), 89-99.
Simmons Market Research Bureau (1989). Simmons Technical Guide.
Telmar Media Systems, Inc. (1980), "Selected Systems for Media Planning and Buying," New York: Telmar Media Systems, Inc.