Simple Statistics with PHP

Update: Added hint to sort the array beforehand.

I work as a researcher, and one of my day-to-day duties is processing measurement results and producing graphs that make those results visible and understandable. Although most of the processing is specific to a particular experiment, general statistics functions are needed every so often.

Out of habit, I usually use PHP to process my log files. Although PHP was certainly not intended for this purpose, I often use it for small command-line scripts that would take too long to write and debug in bash. I know that Python is probably the scripting language of choice for these applications, but I have not found the time to dig into Python far enough to quickly hack together the necessary scripts.

Anyway, to make life easier for others in the future, below you can find the functions that I use to calculate the median, the average, the standard deviation, and the quartiles. Each function takes an array containing one measurement result per element and calculates the respective statistic over all values in the array. The quartile functions (and therefore the median) expect the array to be sorted in ascending order, for example with sort(), before it is passed in.

The way I calculate the quartiles is based on this German source: http://www.univie.ac.at/ksa/elearning/cp/quantitative/quantitative-86.html.

function Median($Array) {
  return Quartile_50($Array);
}

function Quartile_25($Array) {
  return Quartile($Array, 0.25);
}

function Quartile_50($Array) {
  return Quartile($Array, 0.5);
}

function Quartile_75($Array) {
  return Quartile($Array, 0.75);
}

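function Quartile($Array, $Quartile) {
  // Expects a sorted, zero-indexed array; $Quartile is a fraction between 0 and 1.
  $pos = (count($Array) - 1) * $Quartile;

  // Index of the lower neighbour (as an integer) and the fractional distance past it.
  $base = (int) floor($pos);
  $rest = $pos - $base;

  // Interpolate linearly between the two neighbouring values if an upper one exists.
  if( isset($Array[$base+1]) ) {
    return $Array[$base] + $rest * ($Array[$base+1] - $Array[$base]);
  } else {
    return $Array[$base];
  }
}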

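function Average($Array) {
  // Arithmetic mean; an empty array has no mean, so return null instead of dividing by zero.
  if( count($Array) === 0 ) {
    return null;
  }

  return array_sum($Array) / count($Array);
}

function StdDev($Array) {
  // Sample standard deviation (divides by n - 1); undefined for fewer than two values.
  if( count($Array) < 2 ) {
    return null;
  }

  $avg = Average($Array);

  // Sum of squared deviations from the mean.
  $sum = 0;
  foreach($Array as $value) {
    $sum += pow($value - $avg, 2);
  }

  return sqrt($sum / (count($Array) - 1));
}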
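
Since Quartile only gives meaningful results on sorted input, it can be handy to have a variant that sorts the data itself. The following is a minimal sketch of that idea, not a replacement for the functions above: the name QuartileOfUnsorted and the null return for empty input are my own assumptions, and the interpolation is simply the same linear scheme that Quartile uses.

```php
<?php
// Hypothetical helper, not part of the original functions: sorts a copy of
// the input and then applies the same linear interpolation as Quartile().
function QuartileOfUnsorted(array $values, float $fraction): ?float {
  if (count($values) === 0) {
    return null; // assumption: no data means no quartile
  }

  // Arrays are passed by value, so the caller's array stays untouched.
  // sort() also reindexes the copy from 0, which the lookups below rely on.
  sort($values);

  $pos  = (count($values) - 1) * $fraction;
  $base = (int) floor($pos);
  $rest = $pos - $base;

  if (isset($values[$base + 1])) {
    return $values[$base] + $rest * ($values[$base + 1] - $values[$base]);
  }
  return (float) $values[$base];
}

$samples = [7, 1, 3, 5]; // deliberately unsorted; sorted order is 1, 3, 5, 7

echo QuartileOfUnsorted($samples, 0.25), "\n"; // 2.5
echo QuartileOfUnsorted($samples, 0.5), "\n";  // 4
echo QuartileOfUnsorted($samples, 0.75), "\n"; // 5.5
```

Sorting inside the function costs a sort per call, so when several quartiles are needed for the same large data set it is probably cheaper to sort once up front and call Quartile directly.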

2 Comments

  1. Fred

    Thanks a lot. It's what I was looking for. Actually I will use the Highcharts library to display the data. Thanks. Fred

  2. Luke

    Thanks for this! However, unless I am mistaken, the Quartile function will not work properly unless you first sort the input array. This can be done before passing the array but I found it useful to just add sort($Array) into the Quartile function directly.