Internet Reach/Frequency Estimation Accuracy by Data Collection Method
by
Hyo-Gyoo Kim
Doctoral Student
hgkim@mail.utexas.edu
http://www.ciadvertising.org
Department of Advertising
College of Communication
The University of Texas at Austin
Austin, Texas 78712
and
John D. Leckenby
Everett D. Collier Centennial Chair in Communication
john.leckenby@mail.utexas.edu
http://www.ciadvertising.org
Department of Advertising
College of Communication
The University of Texas at Austin
Austin, Texas 78712
paper to be presented to
2000 Annual Conference
American Academy of Advertising
Newport, Rhode Island
April 2000
Abstract
Six exposure distribution models commonly used for magazines and television are tested and compared on their performance in estimating exposure distributions and reach across two different Web audience data sets. The results show that all but one of the six models estimate reach and frequency within acceptable limits of error. This finding again supports Leckenby and Hong's (1998) application of existing exposure distribution models to the Web audience. The relationship between the two data sets is then investigated across the six models in terms of two kinds of error. For each error term, the correlations between the two data sets are very weak, indicating that the models produce errors differently depending on the data set. Additional MANOVA results show that both the data collection method and the model tested affect the size of the errors in exposure distribution models.
1. Introduction
Media
exposure estimation is important in that it offers the basis for calculating
the number of prospects reached with media vehicles at each level of frequency
of exposure. This information may then be used by planners to compare alternative
media schedules as regards "effective" reach, as well as other criteria,
such as cost per thousand (CPM).
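The criteria named above can be illustrated with a small calculation. The following is a minimal sketch only: the exposure distribution, the budget, the audience size, and the threshold of three exposures for "effective" reach are all hypothetical values chosen for illustration, and CPM is computed here on gross impressions, which is one common convention.

```python
def effective_reach(distribution, min_exposures=3):
    """Share of the audience exposed at least `min_exposures` times.

    distribution[k] is the proportion of the audience exposed exactly k times.
    The default threshold of 3 is an illustrative assumption.
    """
    return sum(share for k, share in enumerate(distribution) if k >= min_exposures)


def cost_per_thousand(total_cost, audience_size, distribution):
    """Cost per thousand gross impressions delivered by the schedule."""
    average_frequency = sum(k * share for k, share in enumerate(distribution))
    impressions = audience_size * average_frequency
    return total_cost / (impressions / 1000.0)


# Hypothetical exposure distribution for a four-insertion schedule (k = 0..4).
dist = [0.50, 0.20, 0.15, 0.10, 0.05]
reach = 1.0 - dist[0]                    # exposed at least once
eff = effective_reach(dist, min_exposures=3)
cpm = cost_per_thousand(10_000.0, 1_000_000, dist)
```

Two schedules with the same reach can differ sharply in effective reach, which is why the full exposure distribution, and not reach alone, is the object of estimation.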
As shown in the results of Leckenby and Kim's survey (1994), it is ordinary procedure for media planners to employ one or more reach/frequency exposure distribution models in their decision-making when developing media schedules. In this survey, the majority (87.9%) of the respondents' agencies used at least one computerized model to estimate reach and frequency distribution.
Furthermore,
an examination of the relationship between the history of media development
and estimation models shows the main concepts of the previous models were
adapted to each newly developed medium. Continuing such adaptation, researchers
have now tried to apply reach and frequency estimation models to the new Web
medium. Leckenby and Hong (1998) examined the possibility of applying six
existing media estimation models to the Web. They revealed a high feasibility
of application for the existing estimation models to the Web and found five
out of six models ranged within an acceptable error level, compared to other
previous studies.
But
at this time the direct application of the existing estimation models to the
Web has some limitations due to Web characteristics. The Web is somewhat different
from previous media in terms of "interactivity." Still, numerous
conflicts exist in the definition of the Web either as a one-to-one or a many-to-many
medium (Novak and Hoffman, 1996; Pavlik, 1996). If the former view is followed,
the importance of the reach/frequency estimation model diminishes. But if
the latter view is followed, the reach/frequency estimation model gains power.
Additionally, on the Web, one can access the medium and/or the advertisements as often as desired, as with magazines and newspapers, but unlike television and radio, where advertisements are presented only when the media are accessed. Here the concept of "duration" becomes another problematic issue.
At this point, a definitive view of what Web advertising is and what should be measured must be decided. If the objective of Web advertising is to generate immediate sales or to prompt certain actions such as clicks, the reach/frequency schedule of sites may not be critical. In this case, researchers should measure the direct relationship between the size of the advertising budget and sales, or the "click-through" rate. But if the goal is linked to "brand building," then reach/frequency information might gain great importance.
Therefore, the issues of "interactivity" (whether or not the audience clicked certain advertisements) and "duration" (how long consumers see the advertisements) are further dimensions to be resolved, but these are beyond the scope of this research.
First,
this study will expand the application scope of existing reach/frequency estimation
models to include the Web medium, as a follow-up comparison to Leckenby and
Hong's (1998) study. Six estimation models will be examined, using two data
sets with exactly the same schedule.
Second, this study will investigate the relationship between two kinds of Web audience data sets in terms of errors in exposure distribution and reach estimates. This comparison provides the opportunity to identify better data collection methods on the Web.
2. Literature Review
Most
previous studies regarding media reach and frequency exposure distribution
models have focused on the creation of new models or on improving existing
models, comparing different distribution models. There has been considerable
research conducted on the development of reach/frequency estimation models
for different media (Boyd and Leckenby, 1985; Chandon, 1976; Danaher, 1988,
1989, 1991, 1992a; Headen et al., 1979; Ju and Leckenby, 1989; Kim and Leckenby,
1993; Leckenby and Boyd, 1984; Leckenby and Hong, 1998; Leckenby and Kishi,
1982, 1984; Leckenby and Rice, 1985; Lee, 1988; Rice and Leckenby, 1986; Rust,
1985, 1986; Rust and Leone, 1984). Leckenby and Rice (1986) tried to explain
the "declining reach" phenomenon in exposure distribution models.
Recently, as was mentioned earlier, Leckenby and Hong (1998) tried to apply
previous distribution models to Web audience estimation for the first time.
Their results have opened the possibility of applying existing estimation
models to the newly developed Web as an advertising medium.
There
have been numerous developments in reach and frequency estimation models throughout
those studies, but at the same time, it can be seen that two main issues are
critical regarding reach and frequency exposure distribution model testing.
Most models were tested on only one data set, partly because of the availability
of data sets, and partly because of time consumption in data tabulation. Another
problem that prevents comparability of results across studies is the lack
of agreement on standard error definitions by which model performance is judged
in the tests. Therefore, when an attempt is made to determine which model
is superior in performance to another, it becomes extremely difficult because
authors use different error criteria (Leckenby and Ju, 1990). These are some
of the error definitions that have been used by various researchers to assess
model performance: Average Error in Reach, Average Error in the Distribution,
Average Percentage Error in Distribution, 95 Percent Confidence Interval,
+/- Percent of Actual Reach, Total Error in Distribution, Number of Over,
Under and Within +/- 5 Percent at Each Exposure Level.
The
use of different data sets across tests poses one set of problems, but the
use of different error criteria across tests makes it impossible to make any
comparison except on a rank order basis.
3. Methodology
Broadly speaking, the two most used methods of measurement on the Web are user-centric and site-centric systems. The main difference between these two approaches lies in the source of data collection. A site-centric tracking system collects data from the site server's log. Site-centric measures therefore generally do not provide unduplicated counts of individuals. A user-centric system, by contrast, collects data from the personal computers of its sample participants. Since counts of unique users of a site are required for reach and frequency estimation, user-centric data are appropriate. Recently, the new concept of an ad-centric method was developed, but it is almost the same as the site-centric method except for the server used (FAST, 1999).
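The distinction between duplicated and unduplicated counts can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout `(user, site, period)`, the site names, and the function name `observed_distribution` are hypothetical and do not describe the actual format of either company's data. Each vehicle-period pair is treated as one insertion, and a panelist is counted at most once per insertion.

```python
from collections import Counter


def observed_distribution(records, vehicles, periods, panel_size):
    """Return [P(0), P(1), ..., P(n)] for n = len(vehicles) * len(periods)."""
    slots = {(v, p) for v in vehicles for p in periods}
    seen = {}  # user -> set of scheduled (site, period) slots visited
    for user, site, period in records:
        if (site, period) in slots:
            seen.setdefault(user, set()).add((site, period))
    counts = Counter(len(s) for s in seen.values())
    dist = [counts.get(k, 0) / panel_size for k in range(len(slots) + 1)]
    dist[0] = (panel_size - len(seen)) / panel_size  # panelists never exposed
    return dist


# Hypothetical user-centric log for a panel of five people.
records = [
    ("u1", "a.example", 1), ("u1", "a.example", 1),  # repeat visit, same slot
    ("u1", "b.example", 2), ("u2", "a.example", 2),
    ("u3", "c.example", 1),                          # site not in the schedule
]
vehicles = ["a.example", "b.example"]
dist = observed_distribution(records, vehicles, periods=(1, 2), panel_size=5)
reach = 1.0 - dist[0]  # unduplicated share of the panel exposed at least once
# A server log would only see the raw number of hits on the scheduled sites.
server_hits = sum(1 for _, site, _ in records if site in vehicles)
```

In this toy log the scheduled sites record four hits, but only two of the five panelists were reached. The hit count alone cannot recover that figure, which is why user-level records are needed.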
In this study, the authors used two Web audience data sets, collected by two companies, Media Metrix (MM) and Relevant Knowledge (RK). Both are well-known, user-centric Web data collection companies, and they recently merged into MM. During the merger, the two companies had to settle on a single data collection method, because each had used a different one: MM used software-based data collection, and RK used a Java-based collection method. With MM, participants had to install software that logged their Web behavior and had to send their data to the company by mail on a monthly basis. MM essentially tried to emulate the A.C. Nielsen Company's system for collecting data on TV viewers. MM's first tracking system was named "PC Meter" after Nielsen's "People Meter."
Conversely, RK's participants did not have to send their data by traditional mail, because the Java-based program could automatically retrieve the participants' Web behavior daily.
Data:
MM's data were collected during two weeks of March 1997. The sample of 7,162
respondents consisted of 4,560 (63.7%) males and 2,602 (36.3%) females. A
total of 5,407 people were measured in Week 1, while 5,530 were measured in
Week 2. The people duplicated in the measurement periods amounted to 3,775.
The
data set for RK contains the records of 725 unique visitors who were measured
on a daily basis during September and October 1997. Those 725 unique visitors
visited the top 25 sites, producing 13,039 visits. The respondents consisted
of 447 (61.7%) males and 278 (38.3%) females.
Models: Six existing models will be used as the estimation methods in this study: Binomial Distribution (BIN), Beta Binomial Distribution (BBD), Conditional Beta Distribution (CBD), Morgensztern's Sequential Aggregation Distribution (MSAD), Dirichlet Multinomial Distribution (DMD), and Hofmans Beta Binomial Distribution (HBBD). These six models have been widely studied in the academic literature and were examined in Leckenby and Hong's (1998) study.
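For readers unfamiliar with these models, the two simplest ones can be sketched in their textbook forms. This is an illustration under assumptions, not the estimation procedure used in this study: the parameter values below are hypothetical, and how each model's parameters are fitted from single-insertion and pairwise audience data is described in the cited studies, not here.

```python
from math import comb, exp, lgamma


def binomial_distribution(p, n):
    """BIN: every person has the same exposure probability p on each of n insertions."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def _log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_distribution(alpha, beta, n):
    """BBD: exposure probability varies across people as Beta(alpha, beta)."""
    return [
        comb(n, k) * exp(_log_beta(alpha + k, beta + n - k) - _log_beta(alpha, beta))
        for k in range(n + 1)
    ]


n = 4                                                # total insertions
bin_dist = binomial_distribution(0.10, n)
bbd_dist = beta_binomial_distribution(0.5, 4.5, n)   # same mean: 0.5 / 5.0 = 0.10
bin_reach = 1.0 - bin_dist[0]
bbd_reach = 1.0 - bbd_dist[0]
```

With the same average exposure probability, the binomial form yields a higher reach than the beta binomial form, because it ignores differences among people in their likelihood of exposure.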
Schedule:
This study examined error factors in multi-vehicle, two-insertion advertisements
on a total of 560 Web schedules for each of the six models. Vehicles were
randomly selected from top-visited Web sites of those times. Forty completely
randomized schedules were developed for each of fourteen schedule sets, where
a set consisted of from two to fifteen vehicles. The number of insertions
in each vehicle was limited to two, since observed data were available only
for one and two insertions. This provided exposure distributions in a size
range from four insertions total to thirty insertions total.
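The schedule design described above can be sketched as follows. This is a minimal sketch under assumptions: the pool of 25 site names, the random seed, and the function name `build_schedules` are hypothetical, since the actual vehicle pool and randomization procedure are not given here.

```python
import random


def build_schedules(vehicle_pool, sizes=range(2, 16), per_size=40,
                    insertions=2, seed=0):
    """Draw `per_size` random schedules for each schedule size in `sizes`.

    Each schedule maps a vehicle to its number of insertions.
    """
    rng = random.Random(seed)
    schedules = []
    for size in sizes:                    # 2, 3, ..., 15 vehicles: fourteen sets
        for _ in range(per_size):         # forty schedules per set
            vehicles = rng.sample(vehicle_pool, size)
            schedules.append({v: insertions for v in vehicles})
    return schedules


pool = [f"site{i:02d}" for i in range(1, 26)]   # hypothetical pool of 25 sites
schedules = build_schedules(pool)
totals = [sum(s.values()) for s in schedules]   # total insertions per schedule
```

Fourteen sets of forty schedules give the 560 schedules, and two insertions in each of two to fifteen vehicles give totals from four to thirty insertions.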
Definition
of error: Estimation accuracy can be a critically important attribute
of exposure distribution models. Accuracy has been defined as the ability
of an exposure distribution model to faithfully reproduce observed distributions
as derived from sample data.
But in evaluating the performance of different models, as mentioned earlier, accuracy depends on the manner in which error is defined in the study. In this study, two error measures, average error in reach estimation (AER) and average error in the exposure distribution (APE), were adopted from previous studies (Danaher, 1992; Kishi and Leckenby, 1982; Leckenby and Kishi, 1984).
The error in the reach estimate for each test schedule was defined as the absolute difference between the observed and the estimated reach, expressed as a percentage. The error at each exposure level was defined simply as the absolute difference between the observed and the estimated frequencies. Detailed error definitions can be found in previous studies (Leckenby and Kishi, 1982; Leckenby and Hong, 1998).
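One plausible reading of these two definitions is sketched below. It rests on assumptions that should be checked against the cited studies: reach and frequencies are taken as proportions of the sample, errors are expressed in percentage points, the distribution error sums the absolute differences over all exposure levels including zero, and the two example schedules are hypothetical. The cited studies may normalize the errors differently.

```python
def reach_error(observed, estimated):
    """Absolute difference in reach, in percentage points.

    Each argument is an exposure distribution [P(0), P(1), ..., P(n)].
    """
    obs_reach = 1.0 - observed[0]
    est_reach = 1.0 - estimated[0]
    return abs(obs_reach - est_reach) * 100.0


def distribution_error(observed, estimated):
    """Sum of absolute differences over exposure levels, in percentage points."""
    return sum(abs(o - e) for o, e in zip(observed, estimated)) * 100.0


def average(values):
    values = list(values)
    return sum(values) / len(values)


# Two hypothetical schedules: (observed distribution, estimated distribution).
pairs = [
    ([0.60, 0.25, 0.15], [0.55, 0.30, 0.15]),
    ([0.40, 0.40, 0.20], [0.45, 0.30, 0.25]),
]
aer = average(reach_error(o, e) for o, e in pairs)         # averaged over schedules
ape = average(distribution_error(o, e) for o, e in pairs)  # averaged over schedules
```

Under these assumptions a model can have a small AER and a much larger APE, since errors at individual exposure levels can offset one another in the reach figure.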
4. Results
Reach Estimation: The six models were compared on the accuracy of their reach estimation. Table 1 contains the average percentage error in reach (AER) of each model across the 560 schedules. As can be observed in Table 1, the CBD, MSAD, BBD, and HBBD models outperformed the other two in reach estimation; the average reach error of each of those four models was below 5%.
Generally, these results are consistent with those found in Leckenby and Hong's study (1998). In both studies, the BIN model performed worst, while the others were within acceptable limits of error. When only RK's results are considered, the rank order of the models was identical to that of the previous study, which had used MM's data.
This result again supports the application of existing estimation models to the Web medium.
Exposure
Distributions: The average percentage error in exposure distribution (APE)
across 560 schedules was obtained for the six models by dividing the sum of
the percentage errors by 560.
As
is shown below, all models had relatively large errors. Among the six models
tested, the CBD and MSAD models performed better than the other four, with
average percentage errors in exposure distributions of 15.59% and 16.74%,
respectively.
These
results revealed some differences with the previous study (Leckenby and Hong,
1998): First, the error sizes were bigger in this study, compared to the previous
ones, except for MSAD; second, the best-performing order was also different
except for the BIN model, which had the biggest error size in both studies.
At the same time, however, the APE of every model except BIN was within acceptable limits of error in both studies.
Table 1. Average Percentage Error in Reach (AER) and Average Percentage Error in Exposure Distribution (APE) (N=560)
| Model | Data | AER Mean | AER Mean (both data sets) | AER Std. Dev. | APE Mean | APE Mean (both data sets) | APE Std. Dev. |
|---|---|---|---|---|---|---|---|
| BBD | RK | 5.5 | 4.2 | 4.8 | 37.8 | 27.6 | 15.0 |
| | MM | 2.8 | | 2.2 | 17.3 | | 10.0 |
| BIN | RK | 20.5 | 20.0 | 9.3 | 50.0 | 44.6 | 24.4 |
| | MM | 19.4 | | 6.7 | 39.2 | | 15.5 |
| CBD | RK | 4.8 | 3.6 | 4.8 | 21.4 | 15.6 | 11.0 |
| | MM | 2.4 | | 1.7 | 9.8 | | 5.1 |
| DMD | RK | 6.8 | 6.0 | 10.3 | 30.0 | 25.8 | 17.1 |
| | MM | 5.2 | | 7.2 | 21.6 | | 14.6 |
| HBBD | RK | 3.9 | 4.3 | 2.8 | 36.0 | 27.0 | 15.1 |
| | MM | 4.7 | | 3.6 | 17.9 | | 9.2 |
| MSAD | RK | 5.7 | 4.2 | 6.1 | 22.1 | 16.7 | 11.0 |
| | MM | 2.5 | | 2.4 | 11.3 | | 6.3 |
It can be seen that MM's data generally produced smaller errors, the AER for the HBBD model being the exception. All of the differences between the two data sets, for each of the six models and both error terms, were statistically significant at the .05 probability level. This result can be attributed partly to sample size (MM's sample is almost ten times larger than RK's) and partly to the different data collection methods. This is investigated further in the tests that follow.
Correlation Test: Table 2 shows a total of 12 correlation coefficients between the RK and MM errors, one for each of the six models on each of the two error terms, average error in reach (AER) and average error in exposure distribution (APE).
Generally, the correlations between the two data sets' errors were very low. The highest correlation for AER is found for the DMD model, with a coefficient of .310. The highest correlation for APE is .335, between RK's and MM's data in the BIN model. A correlation coefficient of about .3, however, is not generally considered large. The correlations for the remaining pairs are either negligible, ranging from .10 to .24, or not significant at the .05 probability level.
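Table 2. Correlation Coefficients between RK and MM Errors by Model
| Correlation | BBD | BIN | CBD | DMD | HBBD | MSAD |
|---|---|---|---|---|---|---|
| AER | .10* | .13* | .01 | .31* | .05 | .04 |
| APE | .07 | .34* | .11* | .20* | -.06 | .24* |
* Significant at the .05 probability level.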
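The computation behind each cell of Table 2 can be sketched as follows, assuming the coefficients are Pearson product-moment correlations computed across schedules (the type of coefficient is not stated here). The five pairs of error values are hypothetical; in the study each model would contribute 560 pairs, one per schedule.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical reach errors of one model on the same five schedules,
# measured once against the RK data and once against the MM data.
rk_errors = [5.0, 3.0, 8.0, 4.0, 6.0]
mm_errors = [2.0, 3.0, 2.5, 1.0, 3.5]
r = pearson_r(rk_errors, mm_errors)
```

A coefficient near zero means that knowing a model's error on a schedule in one data set says little about its error on the same schedule in the other.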
Given the paper on Media Metrix's Web site (Media Metrix, 1998), these results are somewhat interesting. That paper stressed that the audience estimates of the two companies, Media Metrix and Relevant Knowledge, were almost the same even though different methods were used to collect the data, and it reported a correlation coefficient of .989 between the ratings of the two services.
This is quite a different story from the present study, in which the two data sets were compared in terms of two kinds of errors, AER and APE, in six reach and frequency models. It is unclear which definitions were used in that paper, but it seems plausible that the differences in results came from the different definitions used.
MANOVA Test: A MANOVA test was used to examine the effects of the data collection method ("Methods") and the model used ("Models") on the two kinds of errors (AER and APE).
Table 3. Standardized Coefficients and Inferential Statistics
| MANOVA | Error | Wilks’ Lambda | F (d.f.) | Std. Discriminant Function Coefficients | F |
|---|---|---|---|---|---|
| Methods | AER | .77 | 1006.31* (1, 6708) | -.65 | 143.50* |
| | APE | | | 1.33 | 1568.00* |
| Models | AER | .42 | 702.72* (5, 6708) | 1.11 | 1346.37* |
| | APE | | | -.17 | 639.4 |
| Method * Models | AER | .92 | 53.50* (5, 6708) | -1.18 | 16.04* |
| | APE | | | 1.33 | 33.68* |
In Table 3, Wilks’ Lambda
and the F test show that "Methods," "Models," and "Methods
by Models" are all significant at the probability level of .05. This
can be interpreted thus: The "Methods" used affects the mean of
each error and the "Models" used also affect the mean of the errors.
The interactive effect of "Methods by Models" also affects each
error.
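For reference, the Wilks' Lambda statistic can be sketched for the simplest case. This is a simplified illustration, not the analysis reported in Table 3: it covers a one-way design with a single factor and two dependent variables, whereas the study used two factors with an interaction, and the AER and APE values below are hypothetical. Lambda is the ratio of the determinant of the within-group sums of squares and cross-products (SSCP) matrix to that of the total SSCP matrix.

```python
def mean2(rows):
    """Mean of each of the two dependent variables."""
    n = len(rows)
    return (sum(x for x, _ in rows) / n, sum(y for _, y in rows) / n)


def sscp(rows, center):
    """Symmetric 2x2 SSCP matrix about `center`, returned as (sxx, sxy, syy)."""
    sxx = sum((x - center[0]) ** 2 for x, _ in rows)
    sxy = sum((x - center[0]) * (y - center[1]) for x, y in rows)
    syy = sum((y - center[1]) ** 2 for _, y in rows)
    return (sxx, sxy, syy)


def det2(m):
    """Determinant of a symmetric 2x2 matrix stored as (sxx, sxy, syy)."""
    sxx, sxy, syy = m
    return sxx * syy - sxy * sxy


def wilks_lambda(groups):
    """det(within-group SSCP) / det(total SSCP) for a one-way design."""
    all_rows = [row for rows in groups.values() for row in rows]
    total = sscp(all_rows, mean2(all_rows))
    within = (0.0, 0.0, 0.0)
    for rows in groups.values():
        part = sscp(rows, mean2(rows))
        within = tuple(w + p for w, p in zip(within, part))
    return det2(within) / det2(total)


# Hypothetical (AER, APE) pairs for four schedules under each collection method.
groups = {
    "RK": [(5, 30), (6, 34), (4, 29), (7, 35)],
    "MM": [(3, 18), (2, 15), (4, 20), (3, 19)],
}
lam = wilks_lambda(groups)
```

Values of Lambda near 1 indicate that the factor explains little of the joint variation in the two errors, and values near 0 indicate that it explains most of it.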
Each factor’s standardized coefficients show which dependent variable contributes more to the overall differences. For the "Methods" factor, APE (1.33) contributes more to the overall differences than AER (-.65). For the "Models" factor, the result is the opposite: AER (1.11) contributes more than APE (-.17).
When "Models" and "Methods" are considered at the same time, AER (-1.18) and APE (1.33) contribute almost equally to the overall differences.
5. Conclusion
The purpose of this research was to compare two differently collected data sets in terms of the errors of six reach and frequency estimation models on Web schedules. Data from two well-known, user-centric data collection companies, Relevant Knowledge and Media Metrix, were used. A total of 560 multi-vehicle Web media schedules were constructed randomly, with two advertisement insertions in each vehicle.
For this purpose, first, the performance of the six exposure distribution models was tested across the exposure schedules, and the accuracy of their reach estimates was evaluated. Five of the six models, which are mainly used for traditional media such as magazines and television, were successfully adapted to the Web medium. This result again supports the previous study by Leckenby and Hong (1998).
Second, the relationship among the data collection method, the test models, and the two errors, the average percentage error in reach (AER) and the average percentage error in exposure distribution (APE), was examined.
The correlation coefficients were generally very low. This implies that each model produces errors differently across the 560 media schedules depending on the data, even on the same schedules. It may therefore be that one kind of data collection method is better than another for estimating reach and frequency in media schedules.
Finally,
those relationships were examined again, using the MANOVA technique. Here
it was found that the data collection method and test-model can also produce
different error sizes.
This research clearly has some limitations, which also suggest directions for further study. Even though two data sets were used to compare the media models, the time and period of data collection, the definition of the universe, and especially the sample size differed between them. Those elements could therefore affect the conclusion that different data collection methods can affect reach and frequency estimation on the Web. The definition of error is also problematic, but it should, it is believed, be decided according to the purpose of the research and the media plan.
Bibliography
Boyd, Marsha and John Leckenby (1985), "Random Duplication in Reach/Frequency Estimation," Current Issues and Research in Advertising, 96-113.
Chandon, Jean-Louis (1986), A Comparative Study of Media Exposure Models, New York, NY: Garland Publishing, Inc.
Danaher, Peter (1988), "A Log Linear Model for Predicting Magazine Audiences," Journal of Marketing Research 25 (4), 356-62.
_________ (1989), "An Approximate Log Linear Model for Predicting Magazine Audiences," Journal of Marketing Research 26 (4), 473-9.
_________ (1991), "A Canonical Expansion Model for Multivariate Media Exposure Distributions: A Generalization of the 'Duplication of Viewing Law'," Journal of Marketing Research 28 (3), 361-67.
_________ (1992a), "A Markov-Chain Model for Multivariate Magazine-Exposure Distributions," Journal of Business and Economic Statistics 10 (4), 401-7.
_________ (1992b), "Some Statistical Modeling Problems in the Advertising Industry: A Look at Media Exposure Distributions," The American Statistician 46, 254-60.
FAST (1999), "FAST Principles of Online Media Audience Measurement," (URL: http://www.fastinfo.org/measurement/pages/index.cgi/audiencemeasurement).
Headen, Robert S., Jay E. Klompmaker, and Roland Rust (1979), "The Duplication of Viewing Law and Television Media Schedule Evaluation," Journal of Marketing Research 16 (3), 333-40.
Hong, Jongpil (1998), Advertising Media Models for Internet Reach/Frequency Estimation, Unpublished doctoral dissertation, The University of Texas at Austin.
Ju, Kuen-Hee and John Leckenby (1989), "Performance of a Simple Reach/Frequency Model," In Proceedings of the American Academy of Advertising, American Academy of Advertising.
Kim, Heejin and John Leckenby (1993), "A Test of the Canonical Expansion Reach/Frequency Model," In Proceedings of the American Academy of Advertising, American Academy of Advertising.
Kishi, Shizue and John Leckenby (1981), "Error Factors in Exposure Distribution Models," In Proceedings of the American Academy of Advertising, American Academy of Advertising.
_________ and _________ (1982), "A Test of the Direct/Indirect BBD and Other Exposure Distribution Models," In Proceedings of the American Academy of Advertising, American Academy of Advertising.
Leckenby, John and Marsha Boyd (1984), "An Improved Beta Binomial Reach/Frequency Model for Magazines," Current Issues and Research in Advertising, 1-24.
_________ and Jongpil Hong (1998), "Using Reach/Frequency for Web Media Planning," Journal of Advertising Research 38 (1), 7-20.
_________ and Heejin Kim (1992), "Unsolved Issues in Media Reach/Frequency Models," In Proceedings of the American Academy of Advertising. American Academy of Advertising, 100-106.
_________ and Shizue Kishi (1982a), "Performance of Four Exposure Distribution Models," Journal of Advertising Research 22 (2), 35-44.
_________ and _________ (1982b), "How Media Directors View Reach/Frequency Estimation," Journal of Advertising Research 22 (4).
_________ and _________ (1984), "The Dirichlet Multinomial Distribution as a Magazine Exposure Model," Journal of Marketing Research 21 (1), 100-106.
_________ and Marshall Rice (1985), "A Beta Binomial Network TV Exposure Model Using Limited Data," Journal of Advertising 14 (3), 25-31.
_________ and _________ (1986), "The Declining Reach Phenomenon in Exposure Distribution Model," Journal of Advertising 15 (3), 13-20.
_________ and Nugent Wedding (1982), Advertising Management, Columbus, OH: Grid Publishing Inc.
Media Metrix Inc. (1998), "A Comparison of World Wide Web Audience Estimates Using Two Different Approaches," (URL: http://www.rkinc.com/Methodology/Convergence.html).
Novak, Thomas and Donna Hoffman (1996), "New Metrics for New Media: Toward the Development of Web Measurement Standards," (URL: http://www2000.ogsm.vanderbilt.edu/novak/Web.standards/Webstand.html).
Pavlik, John V. (1996), New Media Technology: Cultural and Commercial Perspectives, Boston, MA: Allyn & Bacon.
Rice, Marshall and John Leckenby (1984), "Predicting Within-Vehicle Television Duplication," In Proceedings of the American Academy of Advertising, American Academy of Advertising.
Rust, Roland (1985), "Selecting Network Television Advertising Schedules," Journal of Business Research 13, 483-94.
_________ and Jay Klompmaker (1981), "Improving the Estimation Procedure for the Beta Binomial TV Exposure Model," Journal of Marketing Research 18 (4), 442-8.
_________ and Robert Leone (1984), "The Mixed Media Dirichlet Multinomial Distribution: A Model for Evaluating Television-Magazine Advertising Schedules," Journal of Marketing Research 21 (1), 89-99.