Category Archives: MySQL

Adding median() to MySQL

For long-tailed data, the median is often more appropriate than the mean for characterizing central tendency. MySQL (and most relational databases) don’t have a built-in aggregate function for median like they do for mean, which MySQL calls avg(). Of course, smart people on the Internet have written User-Defined Functions (UDFs) that add this functionality.

Originally, someone posted this collection of statistics-related UDFs for MySQL 4, but they don’t work with MySQL 5.1, the current stable version. However, someone updated the median() function to make it compatible with version 5. The instructions at that link work for MySQL on the Mac with a couple modifications.

Here’s how to make it work:

  1. Download udf_median.cc.
  2. You probably need to have installed Apple’s Developer Tools (Xcode, etc.) to compile the UDF. See below, though — if you are running exactly the same version of MySQL as me, you can just download my compiled version.

    If you do have Xcode installed, you can open Terminal, navigate to udf_median.cc’s directory, and compile it with this command:

    g++ -o udf_median.so -shared -I/usr/local/mysql/include udf_median.cc
  3. Put the resulting shared library (udf_median.so) into MySQL’s lib/plugin directory:

    sudo cp udf_median.so /usr/local/mysql/lib/plugin/
  4. Connect to MySQL, probably as root. Run this query to load the function:

    CREATE AGGREGATE FUNCTION median RETURNS REAL SONAME 'udf_median.so';

If you’re running the 64-bit Intel (aka x86_64 or x64) Mac version of MySQL 5.1.40, you can also download my compiled udf_median.so and skip ahead to the cp step.