What do we call the characteristic of the individual to be measured or observed?

Notes on Topic 1:
Basic Statistical Concepts

    Statistics, Science, and Observations

      Science is based on the empirical method: a set of methods for making observations, that is, for systematically obtaining information.


      Observations are the basic empirical "stuff" of science.


      Statistics is a set of methods and rules for organizing, summarizing and interpreting information.

      The methods and rules enable scientific researchers to describe and analyze the observations they have made. Statistical methods are tools for science.

      Science consists of methods for making observations;
      Statistics consists of methods for describing and analyzing the observations.

      Here are some of the "observations" we gathered in the survey we did on the first day of class in 1997 and 1998.

      Populations & Samples

      A population is the set of all individuals of interest in a particular study. We will also refer to populations of scores.


      A sample is a set of individuals selected from a population, usually intended to represent the population in a study. We will also refer to samples of scores.

      The data we gathered in class are a "sample" of scores obtained with a sample of individuals. The population we sampled from is the population of UNC undergraduates.


      A Parameter is a value, usually a numerical value, that describes a Population. A Parameter may be obtained from a single measurement, or it may be derived from a set of measurements from the Population.


      A Statistic is a value, usually a numerical value, that describes a Sample. A Statistic may be obtained from a single measurement, or it may be derived from a set of measurements from the Sample.

      Here are some "statistics" computed from our sample of data:


      Data (plural) are measurements or observations. A data set is a collection of measurements or observations. A datum (singular) is a single measurement or observation and is commonly called a data-value, a score, or a raw score.

      Descriptive Statistics

      Descriptive Statistics are statistical procedures used to summarize, organize and simplify data. It is also the branch of statistical activity focusing on the use of such procedures. These procedures are the focus of chapters 1 through 5.
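As a minimal sketch of what "summarize, organize and simplify" can look like in practice, the snippet below computes a few common descriptive statistics with Python's standard `statistics` module. The GPA values and the names `gpa_scores` and `summary` are hypothetical and chosen only for illustration; they are not the data gathered in class.

```python
import statistics

# Hypothetical GPA scores for a small sample (illustrative values only).
gpa_scores = [2.8, 3.1, 3.4, 3.4, 3.7, 3.9]

# Collect a handful of descriptive statistics in one place.
summary = {
    "n": len(gpa_scores),                        # number of scores
    "mean": statistics.mean(gpa_scores),         # arithmetic average
    "median": statistics.median(gpa_scores),     # middle of the sorted scores
    "min": min(gpa_scores),
    "max": max(gpa_scores),
    "sample_sd": statistics.stdev(gpa_scores),   # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Six raw scores are reduced to a short table of values that is easier to read and compare than the scores themselves.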

      Statistical Visualization

      Statistical visualization refers to recently developed computational statistical procedures used to visually summarize, organize and simplify data. The statistical system we are using is named ViSta, for "Visual Statistics", because it includes statistical visualization.

      A statistical visualization of our data is shown below. It shows the relationship between GPA and Satisfaction with the UNC experience. Higher satisfaction is associated with higher GPA.

      Exploratory Statistics

      Exploratory statistics is the process of exploring data by using descriptive and visualization methods to "see what the data seem to say" (Tukey, 19??). It is also the branch of statistics that focuses on this process.

      Inferential Statistics

      Inferential Statistics consist of techniques that allow us to study samples and then to make generalizations about the populations from which the samples were selected. It is also the branch of statistical activity focusing on the use of such procedures. These procedures are the focus of chapters 8 through the remainder of the text. The groundwork for statistical inference is laid in chapters 6 and 7.

      Sampling Error

      Sampling error is the discrepancy, or amount of error, that exists between a sample statistic and the corresponding population parameter.
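The relationship between a parameter, a statistic, and sampling error can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the population is simulated (1,000 scores with mean near 50), the sample size of 25 is arbitrary, and the variable names are hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population of 1,000 scores (simulated, not real data).
population = [random.gauss(50, 10) for _ in range(1000)]
parameter = statistics.mean(population)   # population mean: a parameter

# Select 25 individuals at random from the population.
sample = random.sample(population, 25)
statistic = statistics.mean(sample)       # sample mean: a statistic

# Sampling error: discrepancy between the statistic and the parameter.
sampling_error = statistic - parameter

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
print(f"sampling error:              {sampling_error:.2f}")
```

Drawing a different sample (for example, by changing the seed) gives a different statistic and therefore a different sampling error, while the parameter stays fixed.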

      The Scientific Method and the Design of Experiments

      Science attempts to discover orderliness in the universe - to discover regularity in changes. Something that can change is called a variable.


      A variable is a characteristic or condition that changes or has different values for different individuals. In the data we gathered, the variables include "Gender", "Age", etc.

      A constant is a characteristic or condition that does not vary, and is the same for every individual.

      The Correlational Method

      The scientific method in which two (or more) variables are observed without manipulation (i.e., as they exist naturally) to see if there is any relationship between them.

      The correlational method cannot establish cause-and-effect: Correlation is not causation!

      The data we gathered are an example of the correlational method. We can say that "Higher satisfaction is associated with higher GPA", but we can't say that "Higher GPA causes higher satisfaction" (or the converse).
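To make "associated with" concrete, here is a minimal sketch that computes the Pearson correlation coefficient for two observed variables. The `gpa` and `satisfaction` values are hypothetical, not the class survey data, and `pearson_r` is an illustrative helper written for this example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical scores for five students (illustrative values only).
gpa = [2.5, 3.0, 3.2, 3.6, 3.9]
satisfaction = [2, 3, 3, 4, 5]

r = pearson_r(gpa, satisfaction)
print(f"r = {r:.2f}")  # a value near +1 indicates a strong positive association
```

A large positive `r` describes how the two variables move together in the observed data. It says nothing about which variable, if either, causes the other.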

      The Experimental Method

      The scientific method which can establish a cause-and-effect relationship between two (or more) variables. Some important points:
      1. The researcher manipulates one variable and observes what happens on the other. More than one variable may be manipulated or observed.
      2. To correctly establish cause-and-effect, the researcher must exercise some control over the experimental situation to ensure that some other variable(s) do(es) not influence the relationship being watched.
      3. Random Assignment can be used to eliminate other variables' influence on results.
      4. The experimental conditions must be identical, other than differing on values of the manipulated variable.
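Random assignment (point 3 above) can be sketched as shuffling the participants and splitting the shuffled list. This is one simple way to do it, not a prescribed procedure; the function `randomly_assign`, the participant labels, and the two-group design are all assumptions made for illustration.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)       # local generator; seed is optional
    shuffled = list(participants)   # copy so the input is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "experimental": shuffled[half:]}

# Twenty hypothetical participants labelled P01 ... P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomly_assign(participants, seed=42)

print("control:     ", groups["control"])
print("experimental:", groups["experimental"])
```

Because chance alone decides who lands in which group, characteristics such as age or prior ability are not expected to differ systematically between the groups.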

      Independent Variable (also called the predictor variable)

      The variable which is manipulated by the researcher.

      Dependent Variable (also called the response variable)

      The variable which is observed by the researcher for changes in order to assess the effect of the treatment. (The treatment is the manipulation of the predictor variable.)

      Confounding Variable

      An uncontrolled variable that is unintentionally allowed to vary systematically with the independent variable. It confounds the results (bad, bad, bad!).

      The control group

      This is a condition of the independent variable that does not receive the experimental treatment. Usually, the control group receives either no treatment or a placebo treatment.

      The experimental group

      This is a condition of the independent variable that does receive an experimental treatment. There may be several experimental groups.

      The Quasi-Experimental Method

      Examines differences between pre-existing groups of subjects (such as men vs. women) or differences between groups of scores obtained at different times (before and after treatment).


      A hypothesis is a prediction about the outcome of an experiment. In experimental research, a hypothesis makes a prediction about how the manipulation of the independent (predictor) variable will affect the dependent (response) variable.


    Data are measurements of observations which involve categorizing, ordering or using numbers to characterize amount. Several levels of measurement are involved. These in turn determine what statistics can be computed. Measurements may also be discrete or continuous.

      Scales (Levels) of Measurement

        Nominal

        The nominal level of measurement labels observations so that they fall into different categories. Football jersey numbers and home street addresses are common examples.

        In ViSta, nominal variables are called "Category" variables.

        Ordinal

        The ordinal level of measurement consists of categories that are ordered in a sequence. Order of finish in a race is a common example.

        In ViSta, ordinal variables are called "Ordinal" variables.

        Interval

        The interval level of measurement consists of ordered categories where all of the categories are intervals of exactly the same size. Temperature in degrees Celsius or Fahrenheit is a common example. Here, equal differences between numbers reflect equal differences in magnitude of the observed variable.

        Ratio

        The ratio level of measurement is an interval scale with an absolute zero point. Length and weight are common examples. Here, ratios of numbers reflect ratios of variable magnitude.

        In ViSta, interval and ratio variables are called "Numeric" variables.
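The practical difference between interval and ratio scales can be sketched by checking whether a ratio of two measurements survives a change of units. The temperatures and lengths below are hypothetical values, and the helper names are made up for this example.

```python
def celsius_to_kelvin(c):
    """Convert degrees Celsius to kelvins."""
    return c + 273.15

def meters_to_cm(m):
    """Convert meters to centimeters."""
    return m * 100

# Interval scale: Celsius has no absolute zero, so ratios are not meaningful.
c_low, c_high = 10.0, 20.0
ratio_celsius = c_high / c_low                                        # 2.0
ratio_kelvin = celsius_to_kelvin(c_high) / celsius_to_kelvin(c_low)   # about 1.035

# Differences, however, are preserved: 10 degrees either way.
diff_celsius = c_high - c_low
diff_kelvin = celsius_to_kelvin(c_high) - celsius_to_kelvin(c_low)

# Ratio scale: length has an absolute zero, so ratios survive a unit change.
m_short, m_long = 2.0, 4.0
ratio_m = m_long / m_short                                            # 2.0
ratio_cm = meters_to_cm(m_long) / meters_to_cm(m_short)               # 2.0

print(f"Celsius ratio: {ratio_celsius:.3f}, Kelvin ratio: {ratio_kelvin:.3f}")
print(f"Celsius difference: {diff_celsius:.1f}, Kelvin difference: {diff_kelvin:.1f}")
print(f"Meter ratio: {ratio_m:.1f}, centimeter ratio: {ratio_cm:.1f}")
```

20 degrees Celsius is not "twice as hot" as 10 degrees Celsius, because the ratio changes when the same temperatures are expressed in kelvins. A 4 m board is twice as long as a 2 m board in any unit of length.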

      Discrete and Continuous Variables

        Discrete

        A discrete variable has separate, indivisible categories. No values can exist between two neighboring categories.

        Continuous

        A continuous variable has an infinite number of possible values falling between any two observed values.

      Mathematical Notation

      In statistical calculations you will constantly be required to add a set of values to find a specific total. We use algebraic expressions to represent the values being added.
      For example, X means "scores on a variable": X = [1 2 3] refers to a variable with three observations, which are 1, 2, and 3.
      We will use the Greek letter Sigma ($\Sigma$) to signify the summation process. Thus, we write $\Sigma X$ for "the sum of the scores on X"; for X = [1 2 3], $\Sigma X = 1 + 2 + 3 = 6$.

      Note that
      1. All calculations within parentheses are done first.
      2. Squaring (or raising to any other exponent) is done second.
      3. Multiplying and dividing are done third, and should be completed in order from left to right.
      4. Adding and subtracting (including summation) are done last, and should be completed in order from left to right.
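The effect of the order of operations on a summation can be checked directly. The sketch below uses X = [1 2 3] from the notes; the three expressions are illustrative examples chosen for this sketch and are not necessarily the ones discussed elsewhere in these notes.

```python
X = [1, 2, 3]

# Sigma X + 1: no parentheses, so the scores are summed, then 1 is added once.
sum_then_add = sum(X) + 1                        # (1 + 2 + 3) + 1 = 7

# Sigma (X + 1): parentheses first, so 1 is added to every score before summing.
add_then_sum = sum(x + 1 for x in X)             # 2 + 3 + 4 = 9

# Sigma (X + 1)^2: parentheses, then squaring, then summation.
add_square_sum = sum((x + 1) ** 2 for x in X)    # 4 + 9 + 16 = 29

print(sum_then_add, add_then_sum, add_square_sum)  # 7 9 29
```

The first two results differ by exactly the number of scores minus one, because adding inside the summation applies the constant once per score rather than once overall.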

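      The following term, which is called the "squared sum", works as shown: the scores are summed first, and the total is then squared. For X = [1 2 3],

      $(\Sigma X)^2 = (1 + 2 + 3)^2 = 6^2 = 36$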

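      Because of the order of operations, the following term, which is called "the sum of squares", works as shown: each score is squared first, and the squared values are then summed. For X = [1 2 3],

      $\Sigma X^2 = 1^2 + 2^2 + 3^2 = 1 + 4 + 9 = 14$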
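The difference between the squared sum and the sum of squares is easy to verify in code. This minimal sketch uses X = [1 2 3] from the notes; the variable names are chosen for illustration.

```python
X = [1, 2, 3]

# Squared sum: sum the scores first, then square the total.
squared_sum = sum(X) ** 2                 # (1 + 2 + 3)^2 = 36

# Sum of squares: square each score first, then sum the squared values.
sum_of_squares = sum(x ** 2 for x in X)   # 1 + 4 + 9 = 14

print(squared_sum, sum_of_squares)  # 36 14
```

The two quantities are generally not equal, so it matters which one a formula calls for.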

      Consider how the following summation equation works:

      On the other hand, the next summation equation works differently:

      Finally, consider how this last summation equation works:

    What is being measured or observed?

    The dependent variable is the variable that is being measured or tested in an experiment.

    What are observable and measurable characteristics?

    A measurable or observable trait or characteristic of an organism is called a phenotype.

    What is the characteristic of interest being measured?

    A variable is a characteristic of interest to be measured for each unit in the sample.

    What is a measurable characteristic of a sample called?

    A measurable characteristic of a population, such as a mean or standard deviation, is called a parameter; but a measurable characteristic of a sample is called a statistic.


