Internet Reach/Frequency Estimation Accuracy by Data Collection Method

by


Hyo-Gyoo Kim

Doctoral Student
hgkim@mail.utexas.edu
http://www.ciadvertising.org
Department of Advertising
College of Communication
The University of Texas at Austin
Austin, Texas 78712

And

John D. Leckenby

Everett D. Collier Centennial Chair in Communication
john.leckenby@mail.utexas.edu
http://www.ciadvertising.org
Department of Advertising
College of Communication
The University of Texas at Austin
Austin, Texas 78712

 

paper to be presented to

2000 Annual Conference

American Academy of Advertising

Newport, Rhode Island

April 2000


 


Abstract

    Six exposure distribution models commonly used for magazines and television are tested and compared on how well they estimate exposure distributions and reach across two different Web audience data sets. The tests show that all but one of the six models estimate reach and frequency within acceptable limits of error. This finding again confirms Leckenby and Hong's (1998) earlier application of existing exposure distribution models to the Web audience. The relationship between the two data sets is then investigated across the six models in terms of two kinds of error. The correlations between the two data sets are very weak, which indicates that the models produce errors differently depending on the data set. An additional MANOVA test shows that both the data collection method and the model tested produce different error sizes in exposure distribution models.

 

1. Introduction

Media exposure estimation is important in that it offers the basis for calculating the number of prospects reached with media vehicles at each level of frequency of exposure. This information may then be used by planners to compare alternative media schedules as regards "effective" reach, as well as other criteria, such as cost per thousand (CPM).

As the results of Leckenby and Kim's survey (1994) show, media planners routinely employ one or more reach/frequency exposure distribution models when developing media schedules. In this survey, the majority (87.9%) of the respondents' agencies used at least one computerized model to estimate reach and frequency distributions.

Furthermore, the history of media development and estimation models shows that the main concepts of earlier models were adapted to each newly developed medium. Continuing such adaptation, researchers have now tried to apply reach and frequency estimation models to the Web. Leckenby and Hong (1998) examined the possibility of applying six existing media estimation models to the Web. They found that applying the existing models to the Web was highly feasible: five of the six models fell within an acceptable error level, compared with previous studies.

At this time, however, the direct application of existing estimation models to the Web has some limitations that stem from the characteristics of the Web. The Web differs from previous media in terms of "interactivity." There is still considerable disagreement over whether the Web should be defined as a one-to-one or a many-to-many medium (Novak and Hoffman, 1996; Pavlik, 1996). If the former view is followed, the importance of the reach/frequency estimation model diminishes; if the latter view is followed, it gains power. Additionally, on the Web, one can access the medium and/or the advertisements as much as desired, as in magazines and newspapers, but unlike television and radio, where advertisements are presented only when the media are accessed. Here the concept of "duration" poses another problem.

At this point, it must be decided what Web advertising is and what should be measured. If the objective of Web advertising is to generate immediate sales or to prompt an action such as a click, the reach/frequency schedule of sites may not be critical. In this case, researchers should measure the direct relationship between the advertising budget and sales, or the "click-through" rate. But if the goal is linked to "brand building," then reach/frequency information might gain great importance.

Therefore, the issues of "interactivity"--or whether the audience clicked certain advertisements or not--and "duration"--or how long the consumers are seeing the advertisements--remain to be resolved, but they are beyond the scope of this research.

First, this study will expand the application scope of existing reach/frequency estimation models to include the Web medium, as a follow-up comparison to Leckenby and Hong's (1998) study. Six estimation models will be examined, using two data sets with exactly the same schedule.

Second, this study will investigate the relationship between two kinds of Web audience data sets in terms of errors in exposure distribution and reach estimates. This comparison provides the opportunity to develop better ideas about data collection methods on the Web.

2. Literature Review

Most previous studies regarding media reach and frequency exposure distribution models have focused on the creation of new models or on improving existing models, comparing different distribution models. There has been considerable research conducted on the development of reach/frequency estimation models for different media (Boyd and Leckenby, 1985; Chandon, 1976; Danaher, 1988, 1989, 1991, 1992a; Headen et al., 1979; Ju and Leckenby, 1989; Kim and Leckenby, 1993; Leckenby and Boyd, 1984; Leckenby and Hong, 1998; Leckenby and Kishi, 1982, 1984; Leckenby and Rice, 1985; Lee, 1988; Rice and Leckenby, 1986; Rust, 1985, 1986; Rust and Leone, 1984). Leckenby and Rice (1986) tried to explain the "declining reach" phenomenon in exposure distribution models. Recently, as was mentioned earlier, Leckenby and Hong (1998) tried to apply previous distribution models to Web audience estimation for the first time. Their results have opened the possibility of applying existing estimation models to the newly developed Web as an advertising medium.

There have been numerous developments in reach and frequency estimation models throughout those studies, but at the same time, it can be seen that two main issues are critical regarding reach and frequency exposure distribution model testing. Most models were tested on only one data set, partly because of the availability of data sets, and partly because of time consumption in data tabulation. Another problem that prevents comparability of results across studies is the lack of agreement on standard error definitions by which model performance is judged in the tests. Therefore, when an attempt is made to determine which model is superior in performance to another, it becomes extremely difficult because authors use different error criteria (Leckenby and Ju, 1990). These are some of the error definitions that have been used by various researchers to assess model performance: Average Error in Reach, Average Error in the Distribution, Average Percentage Error in Distribution, 95 Percent Confidence Interval, +/- Percent of Actual Reach, Total Error in Distribution, Number of Over, Under and Within +/- 5 Percent at Each Exposure Level.

The use of different data sets across tests poses one set of problems, but the use of different error criteria across tests makes it impossible to make any comparison except on a rank order basis.

3. Methodology

Broadly speaking, the two most widely used measurement methods on the Web are user-centric and site-centric systems. The main difference between the two approaches lies in the source of data collection. A site-centric tracking system collects data from the site server's log; site-centric measures therefore generally do not provide unduplicated counts of individuals. A user-centric system, by contrast, collects data from the personal computers of its sample participants. Since reach and frequency estimation requires counts of unique users of a site, user-centric data are appropriate. More recently, an ad-centric method has been developed, but it is almost the same as the site-centric method except for the server used (FAST, 1999).

In this study, the authors used two Web audience data sets, collected by two well-known, user-centric Web data collection companies, Media Metrix (MM) and Relevant Knowledge (RK). The two companies recently merged into MM. During the merger, they had to settle on a single data collection method, because each had used a different one: MM used software-based data collection, and RK used a JAVA-based method. With MM, participants had to install software that logged their Web behavior and had to send their data to the company by mail on a monthly basis. MM essentially tried to emulate the A.C. Nielsen Company's system for collecting data from TV viewers; MM's first tracking system was named "PC Meter" after Nielsen's "People Meter."

Conversely, RK's participants did not have to send their data by traditional mail, because the JAVA-based program retrieved the participants' Web behavior automatically each day.

Data: MM's data were collected during two weeks of March 1997. The sample of 7,162 respondents consisted of 4,560 (63.7%) males and 2,602 (36.3%) females. A total of 5,407 people were measured in Week 1, while 5,530 were measured in Week 2. The people duplicated in the measurement periods amounted to 3,775.

The data set for RK contains the records of 725 unique visitors who were measured on a daily basis during September and October 1997. Those 725 unique visitors visited the top 25 sites, producing 13,039 visits. The respondents consisted of 447 (61.7%) males and 278 (38.3%) females.

Models: Six existing models will be used as the estimation methods in this study: Binomial Distribution (BIN), Beta Binomial Distribution (BBD), Conditional Beta Distribution (CBD), Morgensztern's Sequential Aggregation Distribution (MSAD), Dirichlet Multinomial Distribution (DMD), and Hofmans Beta Binomial Distribution (HBBD). These six models have been studied widely in the academic literature and were examined in Leckenby and Hong's (1998) study.
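As a rough illustration of the two simplest models named above, the following Python sketch computes an exposure distribution and reach under the binomial and beta binomial forms. These are the general textbook forms of the two distributions, not the fitting procedures used in this study, and the schedule size and parameter values (mean_exposure_prob, alpha, beta) are hypothetical.

```python
from math import comb, exp, lgamma


def binomial_distribution(n, p):
    """P(exactly k exposures), k = 0..n, when everyone shares exposure probability p."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def _log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_distribution(n, alpha, beta):
    """P(exactly k exposures), k = 0..n, when exposure probability varies as Beta(alpha, beta)."""
    return [
        comb(n, k) * exp(_log_beta(alpha + k, beta + n - k) - _log_beta(alpha, beta))
        for k in range(n + 1)
    ]


def reach(distribution):
    """Reach is the share of the audience exposed at least once."""
    return 1.0 - distribution[0]


# Hypothetical schedule: two vehicles with two insertions each.
n_insertions = 4
mean_exposure_prob = 0.10  # assumed average audience per insertion

bin_dist = binomial_distribution(n_insertions, mean_exposure_prob)
# alpha / (alpha + beta) = 0.10, so both models share the same mean exposure.
bbd_dist = beta_binomial_distribution(n_insertions, alpha=0.2, beta=1.8)

print("BIN reach:", round(reach(bin_dist), 4))
print("BBD reach:", round(reach(bbd_dist), 4))
```

With the same mean exposure probability, the two forms give different reach figures in this toy case because the beta binomial lets exposure probability vary across people, while the binomial treats everyone alike.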

Schedule: This study examined error factors in multi-vehicle schedules with two insertions per vehicle, using a total of 560 Web schedules for each of the six models. Vehicles were randomly selected from the most-visited Web sites of the period. Forty completely randomized schedules were developed for each of fourteen schedule sets, where the sets ranged from two to fifteen vehicles. The number of insertions in each vehicle was limited to two, since observed data were available only for one and two insertions. This provided exposure distributions in a size range from four insertions total to thirty insertions total.
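The schedule design described above can be sketched in Python as follows. This is only an illustration of the counts involved: the pool of 25 placeholder site names, the fixed seed, and sampling without replacement within a schedule are assumptions, not details reported in the study.

```python
import random


def build_schedules(sites, schedules_per_set=40, min_vehicles=2,
                    max_vehicles=15, insertions_per_vehicle=2, seed=0):
    """Draw random schedules: one set per vehicle count, a fixed number of schedules per set."""
    rng = random.Random(seed)
    schedules = []
    for n_vehicles in range(min_vehicles, max_vehicles + 1):
        for _ in range(schedules_per_set):
            # Pick distinct vehicles for this schedule (assumed: no repeats).
            vehicles = rng.sample(sites, n_vehicles)
            schedules.append({site: insertions_per_vehicle for site in vehicles})
    return schedules


# Placeholder site names; the size of the pool is an assumption.
sites = [f"site_{i:02d}" for i in range(1, 26)]
schedules = build_schedules(sites)
sizes = [sum(schedule.values()) for schedule in schedules]

print(len(schedules), min(sizes), max(sizes))  # 560 4 30
```

Fourteen sets of forty schedules give the 560 schedules, and two insertions in each of two to fifteen vehicles give the range of four to thirty total insertions.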

Definition of error: Estimation accuracy can be a critically important attribute of exposure distribution models. Accuracy has been defined as the ability of an exposure distribution model to faithfully reproduce observed distributions as derived from sample data.

But, as mentioned earlier, the measured accuracy of different models depends on how error is defined in the study. In this study, two error measures, average error in reach estimation (AER) and average error in the exposure distribution (APE), were adopted from previous studies (Danaher, 1992; Kishi and Leckenby, 1982; Leckenby and Kishi, 1984).

The error in the reach estimate for a test schedule was defined as the absolute difference between the observed and the estimated reach, expressed as a percentage. The error at each exposure level was defined simply as the absolute difference between the observed and the estimated frequency. Detailed error definitions can be found in previous studies (Leckenby and Kishi, 1982; Leckenby and Hong, 1998).
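One possible reading of these two error measures is sketched below in Python. It assumes that each distribution is given as the percent of the audience at 0, 1, 2, ... exposures, that reach error is an absolute difference in percentage points, and that the distribution error for a schedule is the sum of absolute differences over exposure levels. The exact definitions are those in the cited studies; the numbers here are hypothetical.

```python
def reach_error(observed, estimated):
    """Absolute difference between observed and estimated reach, in percentage points."""
    observed_reach = 100.0 - observed[0]   # everyone not at zero exposures
    estimated_reach = 100.0 - estimated[0]
    return abs(observed_reach - estimated_reach)


def distribution_error(observed, estimated):
    """Sum of absolute differences across all exposure levels (assumed definition)."""
    return sum(abs(o - e) for o, e in zip(observed, estimated))


def average(values):
    values = list(values)
    return sum(values) / len(values)


# Hypothetical (observed, estimated) distributions: percent at 0, 1, 2, 3, 4 exposures.
test_schedules = [
    ([70.0, 15.0, 8.0, 4.0, 3.0], [68.0, 18.0, 8.0, 4.0, 2.0]),
    ([60.0, 20.0, 10.0, 6.0, 4.0], [63.0, 18.0, 10.0, 5.0, 4.0]),
]

aer = average(reach_error(o, e) for o, e in test_schedules)
ape = average(distribution_error(o, e) for o, e in test_schedules)
print(aer, ape)  # 2.5 6.0
```

Averaging the per-schedule errors over all test schedules yields one AER and one APE figure per model and data set.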

4. Results

Reach Estimation: The six models were compared on the accuracy of their reach estimates. Table 1 contains the average percentage error in reach (AER) of each model across the 560 schedules. As can be observed in Table 1, the CBD, MSAD, BBD, and HBBD models outperformed the other two in reach estimation. Averaged over the two data sets, those four models had errors within 5%.

Generally, these results are consistent with those found in Leckenby and Hong's study (1998). In both studies, the BIN model performed worst, but the others were within the acceptable limits of error. When only RK's results are considered, its rank order was identical with the previous study, which had used MM's data.

This result again supports the application of existing estimation models to the Web medium.

Exposure Distributions: The average percentage error in exposure distribution (APE) across 560 schedules was obtained for the six models by dividing the sum of the percentage errors by 560.

As is shown below, all models had relatively large errors. Among the six models tested, the CBD and MSAD models performed better than the other four, with average percentage errors in exposure distributions of 15.59% and 16.74%, respectively.

These results revealed some differences with the previous study (Leckenby and Hong, 1998): First, the error sizes were bigger in this study, compared to the previous ones, except for MSAD; second, the best-performing order was also different except for the BIN model, which had the biggest error size in both studies.

At the same time, the APE of every model except BIN was within acceptable limits of error in both studies.

Table 1. Average Percentage Error in Reach (AER) and Average Percentage Error in Exposure Distribution (APE)  (N=560) 

Model  Data  AER Mean  AER Mean (RK+MM)  AER Std. Dev.  APE Mean  APE Mean (RK+MM)  APE Std. Dev.
BBD    RK    5.5       4.2               4.8            37.8      27.6              15.0
       MM    2.8                         2.2            17.3                        10.0
BIN    RK    20.5      20.0              9.3            50.0      44.6              24.4
       MM    19.4                        6.7            39.2                        15.5
CBD    RK    4.8       3.6               4.8            21.4      15.6              11.0
       MM    2.4                         1.7            9.8                         5.1
DMD    RK    6.8       6.0               10.3           30.0      25.8              17.1
       MM    5.2                         7.2            21.6                        14.6
HBBD   RK    3.9       4.3               2.8            36.0      27.0              15.1
       MM    4.7                         3.6            17.9                        9.2
MSAD   RK    5.7       4.2               6.1            22.1      16.7              11.0
       MM    2.5                         2.4            11.3                        6.3


It can be seen that MM's data generally produced smaller errors, the one exception being the AER for the HBBD model. For all six models, the differences between the two data sets in both error terms were statistically significant at the .05 probability level. This result can be attributed partly to sample size, i.e., MM's sample is almost ten times larger than RK's, and partly to the different data collection methods. This is investigated further in the following analysis:

Correlation Test: Table 2 shows a total of 12 correlation coefficients, one for each of the 6 models on each of the two error measures, average error in reach (AER) and average error in exposure distribution (APE).

Generally, the correlations between the two data sets' errors for each model were very low. The highest correlation for average error in reach (AER) is .310, for the DMD model. The highest correlation for average error in exposure distribution (APE) is .335, between RK's and MM's data for the BIN model. A correlation coefficient of about .3, however, is generally not considered large. The correlations for the remaining pairs are either negligible, ranging from .10 to .24, or not significant at the .05 probability level.
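The kind of comparison behind each coefficient can be sketched in Python as a Pearson correlation between the errors one model produces on the same schedules under the two data sets. The error values below are hypothetical and the helper name pearson_r is illustrative; the study's own coefficients are based on 560 schedules per model.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical reach errors for one model on the same six schedules.
rk_errors = [5.1, 3.2, 7.4, 4.0, 6.3, 2.9]
mm_errors = [2.0, 3.1, 2.6, 1.9, 3.4, 2.2]

r = pearson_r(rk_errors, mm_errors)
print(round(r, 2))
```

A coefficient near zero would mean that a schedule with a large error under one data set does not tend to have a large error under the other.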

  Table 2. Correlations Between RK's and MM's Data in Terms of Two Kinds of Error, AER and APE   (N=560) 

Model        BBD    BIN    CBD    DMD    HBBD   MSAD
Correlation
  AER        .10*   .13*   .01    .31*   .05    .04
  APE        .07    .34*   .11*   .20*   -.06   .24*

* p ≤ .05

These results are somewhat interesting in light of a paper on Media Metrix's Web site (Media Metrix, 1998). That paper stressed that the audience estimates of the two companies, Media Metrix and Relevant Knowledge, were almost the same, even though different methods were used to collect the data. It reported a correlation coefficient of .989 between the ratings of the two services.

This is quite a different story from the one told by this study, in which the two data sets were compared in terms of two kinds of error, AER and APE, across six reach and frequency models. It is unclear which definitions were used in that paper, but it seems plausible that the differences in results came from the different error definitions.

MANOVA Test: The MANOVA test was used to see the relationship among the data collection methods (Methods) and the models used (Models) and two kinds of errors (AER and APE).

 Table 3. Standardized Coefficients and Inferential Statistics

Factor           Variable  MANOVA Wilks’ Lambda  MANOVA F  (d.f.)     Std. Discriminant Function Coefficient  F
Methods          AER       .77                   1006.31*  (1, 6708)  -.65                                    143.50*
                 APE                                                  1.33                                    1568.00*
Models           AER       .42                   702.72*   (5, 6708)  1.11                                    1346.37*
                 APE                                                  -.17                                    639.4
Method * Models  AER       .92                   53.50*    (5, 6708)  -1.18                                   16.04*
                 APE                                                  1.33                                    33.68*

* p ≤ .05


In Table 3, Wilks’ Lambda and the F test show that "Methods," "Models," and "Methods by Models" are all significant at the .05 probability level. This can be interpreted as follows: the "Methods" used affect the mean of each error, and the "Models" used also affect the mean of the errors. The interaction of "Methods by Models" also affects each error.

Each factor’s standardized coefficients show which dependent variable contributes more to the overall differences. For the "Methods" factor, the 'APE (1.33)' variable contributes more to the overall differences than the 'AER (-.65)' variable. For the "Models" factor, the result was the opposite: the 'AER (1.11)' variable contributes more than the 'APE (-.17)' variable.

When "Models" and "Methods" are considered together, the 'AER (-1.18)' and 'APE (1.33)' variables contribute almost equally to the overall differences.
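For readers unfamiliar with the statistic, the following Python sketch shows how Wilks' Lambda can be computed for a single factor with two dependent variables, as the ratio of the determinant of the within-group sums of squares and cross-products matrix to that of the total matrix. It is a simplified one-factor illustration, not the two-factor design with interaction analyzed in this study, and the (AER, APE) pairs are hypothetical.

```python
def mean_vector(rows):
    """Mean of each of the two dependent variables."""
    n = len(rows)
    return (sum(a for a, _ in rows) / n, sum(b for _, b in rows) / n)


def sscp(rows, center):
    """Sums of squares and cross-products about center, as (sxx, sxy, syy)."""
    sxx = sum((a - center[0]) ** 2 for a, _ in rows)
    sxy = sum((a - center[0]) * (b - center[1]) for a, b in rows)
    syy = sum((b - center[1]) ** 2 for _, b in rows)
    return sxx, sxy, syy


def det2(matrix):
    """Determinant of a symmetric 2x2 matrix stored as (sxx, sxy, syy)."""
    sxx, sxy, syy = matrix
    return sxx * syy - sxy * sxy


def wilks_lambda(groups):
    """Wilks' Lambda = det(within-group SSCP) / det(total SSCP)."""
    all_rows = [row for rows in groups.values() for row in rows]
    total = sscp(all_rows, mean_vector(all_rows))
    within = [0.0, 0.0, 0.0]
    for rows in groups.values():
        for i, value in enumerate(sscp(rows, mean_vector(rows))):
            within[i] += value
    return det2(within) / det2(total)


# Hypothetical (AER, APE) pairs per schedule, grouped by data collection method.
groups = {
    "RK": [(5.5, 37.0), (4.8, 22.0), (6.8, 30.5), (3.9, 35.0)],
    "MM": [(2.8, 17.0), (2.4, 10.5), (5.2, 21.0), (4.7, 18.5)],
}

print(round(wilks_lambda(groups), 3))
```

Values close to 1 indicate that the group means are nearly the same on the two errors taken together, while smaller values indicate greater separation between the groups.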

5. Conclusion

The purpose of this research was to compare two differently collected data sets in terms of the errors of six reach and frequency estimation models on Web schedules. Data from two well-known, user-centric data collection companies, Relevant Knowledge and Media Metrix, were used. A total of 560 multi-vehicle Web media schedules were constructed randomly, with two advertisement insertions in each vehicle.

First, the performance of six exposure distribution models was tested across exposure schedules, and the accuracy of their reach estimates was evaluated. Five of the six models, which are mainly used for traditional media such as magazines and television, could be adapted successfully to the Web medium. This result again confirmed the previous study by Leckenby and Hong (1998).

Second, the relationship among the data collection method, the test models, and the two errors, the average percentage error in reach (AER) and the average percentage error in exposure distribution (APE), was examined.

The correlation coefficients were generally very low. This implies that, across the 560 media schedules, each model produces errors differently depending on the data, even on the same schedules. It can therefore be supposed that one kind of data collection method would be better than another for estimating reach and frequency in media schedules.

Finally, those relationships were examined again using MANOVA. It was found that both the data collection method and the model tested produce different error sizes.

It is clear, however, that this research has some limitations, which also suggest directions for further study. Even though two data sets were used to compare the media models, the two differed in the timing and period of data collection, the definition of the universe, and especially the sample size. Those elements could therefore affect the conclusion that different data collection methods can affect reach and frequency estimation on the Web. The definition of error is also problematic, but it is believed that this should be decided according to the purpose of the research and the media plan.

Bibliography

 Boyd, Marsha and John Leckenby (1985), "Random Duplication in Reach/Frequency Estimation," Current Issues and Research in Advertising, 96-113.

 Chandon, Jean-Louis (1986), A Comparative Study of Media Exposure Models, New York, NY: Garland Publishing, Inc.

 Danaher, Peter (1988), "A Log Linear Model for Predicting Magazine Audiences," Journal of Marketing Research 25 (4), 356-62.

 _________ (1989), "An Approximate Log Linear Model for Predicting Magazine Audiences," Journal of Marketing Research 26 (4), 473-9.

 _________ (1991), "A Canonical Expansion Model for Multivariate Media Exposure Distributions: A Generalization of the 'Duplication of Viewing Law'," Journal of Marketing Research 28 (3), 361-67.

 _________ (1992a), "A Markov-Chain Model for Multivariate Magazine-Exposure Distributions," Journal of Business and Economic Statistics 10 (4), 401-7.

 _________ (1992b), "Some Statistical Modeling Problems in the Advertising Industry: A Look at Media Exposure Distributions," The American Statistician 46, 254-60.

 FAST (1999), "FAST Principles of Online Media Audience Measurement," (URL: http://www.fastinfo.org/measurement/pages/index.cgi/audiencemeasurement).

 Headen, Robert S., Jay E. Klompmaker, and Roland Rust (1979), "The Duplication of Viewing Law and Television Media Schedule Evaluation," Journal of Marketing Research 16 (3), 333-40.

 Hong, Jongpil (1998), Advertising Media Models for Internet Reach/Frequency Estimation, Unpublished doctoral dissertation, The University of Texas at Austin.

 Ju, Kuen-Hee and John Leckenby (1989), "Performance of a Simple Reach/Frequency Model," In Proceedings of the American Academy of Advertising, American Academy of Advertising.

 Kim, Heejin and John Leckenby (1993), "A Test of the Canonical Expansion Reach/Frequency Model," In Proceedings of the American Academy of Advertising, American Academy of Advertising.

 Kishi, Shizue and John Leckenby (1981), "Error Factors in Exposure Distribution Models," In Proceedings of the American Academy of Advertising, American Academy of Advertising.

 _________ and _________ (1982), "A Test of the Direct/Indirect BBD and Other Exposure Distribution Models," In Proceedings of the American Academy of Advertising, American Academy of Advertising.

 Leckenby, John and Marsha Boyd (1984), "An Improved Beta Binomial Reach/Frequency Model for Magazines," Current Issues and Research in Advertising, 1-24.

 _________ and Jongpil Hong (1998), "Using Reach/Frequency for Web Media Planning," Journal of Advertising Research 38 (1), 7-20.

 _________ and Heejin Kim (1992), "Unsolved Issues in Media Reach/Frequency Models," In Proceedings of the American Academy of Advertising. American Academy of Advertising, 100-106.

 _________ and Shizue Kishi (1982a), "Performance of Four Exposure Distribution Models," Journal of Advertising Research 22 (2), 35-44.

 _________ and _________ (1982b), "How Media Directors View Reach/Frequency Estimation," Journal of Advertising Research 22 (4).

 _________ and _________ (1984), "The Dirichlet Multinomial Distribution as a Magazine Exposure Model," Journal of Marketing Research 21 (1), 100-106.

 _________ and Marshall Rice (1985), "A Beta Binomial Network TV Exposure Model Using Limited Data," Journal of Advertising 14 (3), 25-31.

 _________ and _________ (1986), "The Declining Reach Phenomenon in Exposure Distribution Model," Journal of Advertising 15 (3), 13-20.

 _________ and Nugent Wedding (1982), Advertising Management, Columbus, OH: Grid Publishing Inc.

 Media Metrix Inc. (1998), "A Comparison of World Wide Web Audience Estimates Using Two Different Approaches," (URL: http://www.rkinc.com/Methodology/Convergence.html).

 Novak, Thomas and Donna Hoffman (1996), "New Metrics for New Media: Toward the Development of Web Measurement Standards," (URL: http://www2000.ogsm.vanderbilt.edu/novak/Web.standards/Webstand.html).

 Pavlik, John V. (1996), New Media Technology: Cultural and Commercial Perspectives, Boston, MA: Allyn & Bacon.

 Rice, Marshall and John Leckenby (1984), "Predicting Within-Vehicle Television Duplication," In Proceedings of the American Academy of Advertising, American Academy of Advertising.

 Rust, Ronald (1985), "Selecting Network Television Advertising Schedules," Journal of Business Research 13, 483-94.

 _________ and Jay Klompmaker (1981), "Improving the Estimation Procedure for the Beta Binomial TV Exposure Model," Journal of Marketing Research 18 (4), 442-8.

 _________ and Robert Leone (1984), "The Mixed Media Dirichlet Multinomial Distribution: A Model for Evaluating Television-Magazine Advertising Schedules," Journal of Marketing Research 21 (1), 89-99.