Data collection methods fall into one of two categories: observational studies and designed experiments. This article examines the differences between the two, highlighting the advantages and disadvantages of each.
An observational study looks at what has happened without trying to influence it in any way. Observational studies identify the characteristics of a population and the relationships between variables in the population. Data collection tools such as interviews, questionnaires, and surveys would be used in an observational study.
Observational studies of entire populations can be costly, time consuming, and even impossible in many cases. Therefore, a sample of the population is used instead. The goal is to gather data from the sample that can be generalized to the entire population. Different sampling techniques can be used. These include random sampling, stratified sampling, systematic sampling, and cluster sampling.
Random sampling is exactly what the name implies: a random sample of the population is chosen. Various tools can be used to pick random samples, including charts, tables, and graphing calculators. Stratified sampling involves separating the population into nonoverlapping groups and then sampling from each group, systematic sampling involves choosing every kth individual from the population, and cluster sampling involves “selecting all individuals within a randomly selected collection or group of individuals” (Sullivan III, p. 25).
Two types of errors can occur in observational studies: sampling errors and nonsampling errors. Sampling errors occur because a sample was used to estimate information about an entire population. Nonsampling errors occur as a direct result of the survey process. Examples of nonsampling errors include:
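The four sampling techniques described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 numbered individuals, the two strata, the ten clusters, and all function names are hypothetical and do not come from Sullivan.

```python
import random

def simple_random_sample(population, n, rng):
    """Pick n individuals so that each one is equally likely to be chosen."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample from each nonoverlapping group (stratum)."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def systematic_sample(population, k, rng):
    """Pick a random start among the first k individuals, then take every kth one."""
    start = rng.randrange(k)
    return population[start::k]

def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters and keep every individual inside them."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical individuals numbered 1..100
strata = {"under_40": population[:60], "40_and_over": population[60:]}  # assumed groups
clusters = {f"block_{i}": population[i * 10:(i + 1) * 10] for i in range(10)}  # assumed clusters

random_sample = simple_random_sample(population, 10, rng)
stratified = stratified_sample(strata, 5, rng)
systematic = systematic_sample(population, 10, rng)
clustered = cluster_sample(clusters, 2, rng)
```

Each call returns a list of selected individuals. The stratified version here takes the same number from every stratum for simplicity; a real study might instead sample each stratum in proportion to its size.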
– Nonresponse – one or more individuals in the sample do not respond to the interview, questionnaire, or survey.
– Respondent Dishonesty – one or more individuals in the sample do not supply truthful responses.
– Poor Design – the interview, questionnaire, or survey is ambiguous or designed poorly, resulting in questionable data.
– Incomplete Frame – the list of individuals in a population is not complete.
– Data Entry – errors in data entry, such as typographical errors.
(Sullivan III, pp. 31-33)
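The difference between the two error types can be illustrated with a small simulation. The sketch below assumes a hypothetical population of 10,000 ages and an invented nonresponse rule (nobody over 65 answers); neither assumption comes from the source.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 ages between 18 and 90.
population = [rng.randint(18, 90) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Sampling error: the sample mean differs from the population mean
# simply because only 100 individuals were measured.
sample = rng.sample(population, 100)
sample_mean = statistics.mean(sample)
sampling_error = sample_mean - population_mean

# Nonsampling error (nonresponse): assume, purely for illustration,
# that sampled individuals older than 65 never respond.
respondents = [age for age in sample if age <= 65]
respondent_mean = statistics.mean(respondents)
nonresponse_shift = respondent_mean - sample_mean
```

Drawing a larger sample tends to shrink `sampling_error`, but it does nothing to correct `nonresponse_shift`, because that distortion comes from the survey process rather than from the sample size.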
A designed experiment attempts to influence results through the application of a specific treatment, and then tries to isolate the effects on a specific variable. Designed experiments identify the cause of relationships between variables by controlling one or more of the variables.
There are six steps that must be followed when conducting an experiment: identify the problem, determine the factors that affect the response variable, determine the number of experimental units, determine the level of the predictor variables, collect and process the data, and test the claim (Sullivan III, pp. 36-37).
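As a rough illustration of the steps that fix the number of experimental units and the levels of the predictor variable, the sketch below deals units out to treatment levels at random. Random assignment is a common practice assumed here, not something the list above states, and the unit names, the three levels, and the function name are hypothetical.

```python
import random

def assign_treatments(units, levels, rng):
    """Shuffle the units, then deal them out to the treatment levels in turn."""
    shuffled = units[:]  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    groups = {level: [] for level in levels}
    for i, unit in enumerate(shuffled):
        groups[levels[i % len(levels)]].append(unit)
    return groups

rng = random.Random(7)  # fixed seed so the sketch is repeatable
units = [f"unit_{i}" for i in range(12)]      # assumed: 12 experimental units
levels = ["control", "low_dose", "high_dose"]  # assumed: 3 levels of one predictor
groups = assign_treatments(units, levels, rng)
```

With 12 units and 3 levels, each group receives 4 units. The response variable would then be measured within each group and compared when testing the claim.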
The most common problem with designed experiments is ethics. It is often unethical to conduct a particular experiment, so an observational study must be used instead. For example, a researcher may wish to establish whether alcohol consumption causes cirrhosis of the liver. Determining causation would require a designed experiment. However, it would be unethical to conduct this kind of experiment, because it could actually cause cirrhosis in the subjects. Therefore, the researcher would instead use an observational study that examines people who consume alcohol and monitors their rate of liver cirrhosis. The observational study cannot determine whether alcohol consumption causes liver cirrhosis, but it can determine whether the two are associated.
Sullivan III, M. Statistics: Informed Decisions Using Data. Upper Saddle River, NJ: Prentice Hall.