XLStat - Descriptive Statistics

Use this module to calculate a set of descriptive statistics for one or several categorical or quantitative variables, and to create graphical or semi-graphical displays used for exploratory data analysis.

List of statistics calculated for quantitative data (descriptors that take weighting into account are shown in bold):

  • No. of values used: number of values actually used in calculations, i.e. non-missing values with a weight not equal to 0
  • No. of values ignored: number of values ignored during calculations, i.e. missing values or values with a weight of 0
  • No. of min. val.: number of values equal to the minimum value
  • % of min. val.: percentage of values equal to the minimum value
  • Minimum: minimum value
  • 1st quartile: value below which 25 % of the data are located
  • Median: value below which 50 % of the data are located
  • 3rd quartile: value below which 75 % of the data are located
  • Maximum: maximum value
  • Range: difference between the maximum and the minimum
  • Sum of the weights: for weighted data, the sum of the weights for values used in calculations
  • Total: sum of the values (may be weighted)
  • Mean: sum of the values (may be weighted), divided by the number of values used, or by the sum of the weights if the data are weighted
  • Geometric mean: mean that is less affected by high values than the arithmetic mean. The geometric mean is not defined for data containing negative or zero values
  • Harmonic mean: mean that is barely affected by a few values that are much higher than the others, but is sensitive to much smaller values
  • Kurtosis (Pearson): coefficient that represents the peaked or flattened shape of a distribution compared to a Gaussian distribution
  • Skewness (Pearson): coefficient that measures the asymmetry of a distribution about its mean
  • Kurtosis: kurtosis coefficient as calculated by Excel
  • Skewness: skewness coefficient as calculated by Excel
  • CV (standard deviation/mean): coefficient of variation, a measure of relative dispersion obtained by dividing the standard deviation by the mean. It lets you compare the dispersion of variables that have different units or very different means
  • Sample variance: variance of the data (for unweighted data, the denominator is n, the size of the sample)
  • Estimated variance: estimate of the variance of the population from which the data were sampled (unbiased estimator: for unweighted data, the denominator is n-1, where n is the size of the sample)
  • Standard deviation of a sample: square root of the variance of the data
  • Estimated standard deviation: square root of the estimated variance of the population from which the data were sampled
  • Mean absolute deviation: dispersion measure equal to the average of the absolute deviations of the values from the mean
  • Standard deviation of the mean: square root of the ratio of the estimated variance to the number of values used in the calculation
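
The mean and dispersion measures listed above can be sketched in a few lines of Python. This is only an illustration of the definitions, not XLStat's implementation. The function name dispersion_stats and the dictionary keys are hypothetical. Two points are assumptions rather than facts from this page: for weighted data the estimated variance is divided by the sum of the weights minus 1, and the CV is computed from the sample standard deviation (denominator n).

```python
import math

def dispersion_stats(values, weights=None):
    """Mean and dispersion measures for one variable (illustrative sketch)."""
    if weights is None:
        weights = [1.0] * len(values)
    # Keep only non-missing values with a non-zero weight.
    used = [(x, w) for x, w in zip(values, weights) if x is not None and w != 0]
    n = len(used)                                    # no. of values used
    sw = sum(w for _, w in used)                     # sum of the weights
    mean = sum(w * x for x, w in used) / sw
    ss = sum(w * (x - mean) ** 2 for x, w in used)   # weighted sum of squared deviations
    sample_var = ss / sw         # denominator is n for unweighted data
    est_var = ss / (sw - 1)      # n-1 for unweighted data; weighted form is an assumption
    sample_sd = math.sqrt(sample_var)
    return {
        "n_used": n,
        "n_ignored": len(values) - n,
        "mean": mean,
        "sample_variance": sample_var,
        "estimated_variance": est_var,
        "sample_sd": sample_sd,
        "estimated_sd": math.sqrt(est_var),
        "cv": sample_sd / mean,  # assumption: CV uses the sample SD
        "mean_abs_dev": sum(w * abs(x - mean) for x, w in used) / sw,
        "sd_of_mean": math.sqrt(est_var / n),
    }
```

For example, dispersion_stats([2, 4, 4, 4, 5, 5, 7, 9]) gives a mean of 5, a sample variance of 4 and an estimated variance of 32/7, which shows the effect of the n versus n-1 denominator.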

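The geometric and harmonic means and the two pairs of shape coefficients can be illustrated in the same way, for unweighted data only. The skewness_excel and kurtosis_excel functions follow the formulas documented for Excel's SKEW and KURT worksheet functions. The two "moment" functions are an assumption about what the Pearson coefficients compute (ratios of central moments, with 3 subtracted from the kurtosis so that a Gaussian distribution gives 0); this page does not state the exact formula. All function names are hypothetical.

```python
import math

def geometric_mean(xs):
    # Undefined for zero or negative values, as noted in the list above.
    if any(x <= 0 for x in xs):
        raise ValueError("geometric mean requires strictly positive values")
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def harmonic_mean(xs):
    # Assumes strictly positive values; dominated by the smallest ones.
    return len(xs) / sum(1.0 / x for x in xs)

def _central_moment(xs, k):
    m = sum(xs) / len(xs)
    return sum((x - m) ** k for x in xs) / len(xs)

def skewness_moment(xs):
    # Assumed Pearson form: m3 / m2^(3/2)
    return _central_moment(xs, 3) / _central_moment(xs, 2) ** 1.5

def kurtosis_moment(xs):
    # Assumed Pearson form: m4 / m2^2 - 3 (excess kurtosis)
    return _central_moment(xs, 4) / _central_moment(xs, 2) ** 2 - 3.0

def _z_power_sum(xs, k):
    # Sum of ((x - mean) / s)^k, with s the estimated (n-1) standard deviation.
    n = len(xs)
    m = sum(xs) / n
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))
    return sum(((x - m) / s) ** k for x in xs)

def skewness_excel(xs):
    # Excel SKEW; needs at least 3 values.
    n = len(xs)
    return n / ((n - 1) * (n - 2)) * _z_power_sum(xs, 3)

def kurtosis_excel(xs):
    # Excel KURT; needs at least 4 values.
    n = len(xs)
    lead = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    return lead * _z_power_sum(xs, 4) - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
```

On small samples the two versions of each coefficient differ noticeably: for the values 1, 2, 3, 4, 5 the moment-based kurtosis is -1.3, while the Excel formula gives -1.2.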
    Charts created for quantitative variables

  • box plots
  • univariate scattergrams
  • collection of bivariate scattergrams
  • Q-Q plots
  • P-P plots
  • stem and leaf plots
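
Of the displays listed above, the stem and leaf plot is the semi-graphical one, and it is simple enough to sketch directly. The following is a minimal illustration for non-negative integers, using the tens as stems and the units as leaves. The function name stem_and_leaf is hypothetical, and the layout is not meant to match XLStat's output.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf display for non-negative integers."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)   # stem = tens, leaf = units digit
    lines = []
    # Include empty stems so that gaps in the data remain visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(d) for d in stems.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}".rstrip())
    return lines

print("\n".join(stem_and_leaf([12, 15, 21, 27, 27, 43])))
```

Each output line shows one stem followed by its sorted leaves, so the display keeps the individual values while giving a rough picture of the distribution.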

    List of statistics calculated for categorical data

    Summary for all variables:
  • No. of categories: number of categories for the variable
  • Mode: the category that occurs most often, or that has the highest weight (if the data are weighted)
  • Mode frequency: for non-weighted data, frequency of the mode
  • Mode weight: for weighted data, weight of the mode
  • % mode: percentage of the mode
  • Rel. freq. mode: relative frequency of the mode


    Statistics table for each variable:
  • Frequency: for unweighted data, frequency of the category
  • Weight: for weighted data, weight of the category
  • %: percentage of the category
  • Rel. freq.: relative frequency of the category
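
The summary and the per-category table described above can be sketched as follows. This is an illustration of the definitions only, not XLStat's implementation. The function name categorical_summary and the dictionary keys are hypothetical, and the handling of ties for the mode (the first category encountered wins) is an assumption.

```python
from collections import defaultdict

def categorical_summary(categories, weights=None):
    """Mode summary and per-category table for one categorical variable."""
    if weights is None:
        weights = [1] * len(categories)   # unweighted: weights act as counts
    totals = defaultdict(float)
    for c, w in zip(categories, weights):
        if c is not None and w != 0:      # skip missing values and zero weights
            totals[c] += w
    grand_total = sum(totals.values())
    table = {
        c: {
            "weight": t,                        # frequency when unweighted
            "rel_freq": t / grand_total,        # between 0 and 1
            "percent": 100.0 * t / grand_total,
        }
        for c, t in totals.items()
    }
    mode = max(totals, key=totals.get)    # assumption: first category wins ties
    return {
        "n_categories": len(totals),
        "mode": mode,
        "mode_weight": totals[mode],
        "mode_percent": table[mode]["percent"],
        "mode_rel_freq": table[mode]["rel_freq"],
        "table": table,
    }
```

For the values a, b, a, c, a, b this reports three categories, with mode a, a mode frequency of 3, a percentage of 50 and a relative frequency of 0.5.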


    Charts created for categorical variables

  • histograms
  • pie charts

Copyright © 2008 Kovach Computing Services, Anglesey, Wales. All Rights Reserved. Portions copyright Addinsoft, Provalis Research, and Data Description Inc.

Last modified 25 January, 2008