Example: Basic Statistical Functions for Arrays

These notes present a few functions that calculate basic statistics on arrays of doubles, namely:

  • helper functions for neatly printing an array of doubles
  • min/max
  • mean (i.e. average)
  • variance; informally, a measure of how far a set of numbers is spread out from its average
  • standard deviation; this is the square root of the variance, which we calculate with the sqrt function from C’s standard math library, math.h

The assert macro is used in most of the following functions to ensure that the size of the passed-in array makes sense. If the expression passed to assert is true, then it does nothing. But if the expression is false, then it prints an error message and deliberately aborts the program.

We use assertions to check that something we think is true at a certain point in the program really is true.
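
A minimal sketch of this idea, using a made-up function first_element that is not part of the source code below. The assertion both documents and checks the assumption that the array is non-empty. Keep in mind that assert is compiled out when the macro NDEBUG is defined before assert.h is included, so it is best kept for catching programming mistakes rather than for checks that must always run.

```c
#include <assert.h>

// Hypothetical helper for illustration only: returns the first element.
// The assertion checks our assumption that arr has at least 1 element.
double first_element(const double* arr, int n) {
    assert(n > 0);  // prints an error and aborts if n <= 0
    return arr[0];
}
```

With double a[] = {1.5, 2.5, 3.5}, the call first_element(a, 3) returns 1.5, while first_element(a, 0) would fail the assertion and abort the program (unless NDEBUG is defined).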

Source Code

#include <stdio.h>
#include <math.h>
#include <assert.h>

// Prints the given array C-style, e.g. { 3.2, 1.9, -4.5 }.
// Careful: all values are printed with one digit after the decimal point.
void print_array(double* arr, int n) {
    assert(n >= 0);
    if (n == 0) {
        printf("{}");
    } else if (n == 1) {
        printf("{ %.1f }", arr[0]);
    } else { // n >= 2
        printf("{ %.1f", arr[0]);
        for(int i = 1; i < n; i++) {
            printf(", %.1f", arr[i]);
        }
        printf(" }");
    }
}

// Same as print_array, but prints a newline at the end.
// Careful: all values are printed with one digit after the decimal point.
void println_array(double* arr, int n) {
    print_array(arr, n);
    printf("\n");
}

// Returns the smallest value in arr.
// Assumes arr has at least 1 element.
double min(double* arr, int n) {
    assert(n > 0);
    double minSoFar = arr[0];
    for(int i = 1; i < n; i++) {
        if (arr[i] < minSoFar) {
            minSoFar = arr[i];
        }
    }
    return minSoFar;
}

// Returns the largest value in arr.
// Assumes arr has at least 1 element.
double max(double* arr, int n) {
    assert(n > 0);
    double maxSoFar = arr[0];
    for(int i = 1; i < n; i++) {
        if (arr[i] > maxSoFar) {
            maxSoFar = arr[i];
        }
    }
    return maxSoFar;
}

// Returns the sum of the elements in arr.
double sum(double* arr, int n) {
    assert(n >= 0);
    double result = 0.0;
    for(int i = 0; i < n; i++) {
        result += arr[i];
    }
    return result;
}

// Returns the average of the elements in arr.
// Assumes arr has at least 1 element (otherwise we would divide by 0).
double mean(double* arr, int n) {
    assert(n > 0);
    return sum(arr, n) / n;
}

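// Returns the variance of the elements in arr, i.e. the average of the
// squared differences from the mean (this is the population variance,
// which divides by n).
// Assumes arr has at least 1 element.
// See: https://en.wikipedia.org/wiki/Variance#Discrete_random_variable
double variance(double* arr, int n) {
    assert(n > 0);
    double m = mean(arr, n);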
    double result = 0.0;
    for(int i = 0; i < n; i++) {
        double t = arr[i] - m;
        result += t * t;
    }
    return result / n;
}

// Returns the standard deviation of the elements in arr, which is the
// square root of the variance.
// Assumes arr has at least 1 element.
// See: https://en.wikipedia.org/wiki/Standard_deviation#Discrete_random_variable
double std_dev(double* arr, int n) {
    return sqrt(variance(arr, n));
}

int main() {
    const int n = 4;
    double a[] = {6.4, 8, 7.2, 5.0};
    println_array(a, n);
    printf("\n");
    printf("     min  = %.1f\n", min(a, n));
    printf("     max  = %.1f\n", max(a, n));
    printf("    mean  = %.1f\n", mean(a, n));
    printf("variance  = %.1f\n", variance(a, n));
    printf("std. dev. = %.1f\n", std_dev(a, n));
}