Nov
10
2009
Our research team at UC Berkeley wrote a paper this year about stated preferences for a mate and actual communication behavior in an online dating system. It’s called “Who’s right and who writes: People, profiles, contacts, and replies in online dating,” and I’ll be presenting it HICSS 43 in January. Here’s the abstract:
In this analysis of profiles and messaging behavior on a major online dating service, we find that, consistent with predictions of evolutionary psychology, women as compared to men state more restrictive preferences for their ideal date. Furthermore, women contact and reply to others more selectively than men. Additionally, we identify connections among messaging behavior, textual self-descriptions in dating profiles, and relationship-relevant traits such as neuroticism.
And here’s a figure from the paper showing what ages people say they’re seeking as well as the ages of the people they actually contact and reply to:

Nov
09
2009
For long-tailed data, the median is often more appropriate than the mean for characterizing central tendency. MySQL (and most relational databases) don’t have a built-in aggregate function for median like they do for mean, which MySQL calls avg(). Of course, smart people on the Internet have written User-Defined Functions (UDFs) that add this functionality.
Originally, someone posted this collection of statistics-related UDFs for MySQL 4, but they don’t work with MySQL 5.1, the current stable version. However, someone updated the median() function to make it compatible with version 5. The instructions at that link work for MySQL on the Mac with a couple modifications.
Here’s how to make it work:
- Download udf_median.cc.
-
You probably need to have installed Apple’s Developer Tools (Xcode, etc.) to compile the UDF. See below, though — if you are running exactly the same version of MySQL as me, you can just download my compiled version.
If you do have Xcode installed, you can open Terminal, navigate to udf_median.cc’s directory, and compile it with this command:
g++ -o udf_median.so -shared -I/usr/local/mysql/include udf_median.cc
- Put the resulting shared library (udf_median.so) into MySQL’s lib/plugin directory:
sudo cp udf_median.so /usr/local/mysql/lib/plugin/
- Connect to MySQL, probably as root. Run this query to load the function:
CREATE AGGREGATE FUNCTION median RETURNS REAL SONAME 'udf_median.so';
If you’re running the 64-bit Intel (aka x86_64 or x64) Mac version of MySQL 5.1.40, you can also download my compiled udf_median.so and skip ahead to the cp step.
Nov
06
2009
I’ve recently started working with R on 64-bit Macs with enough RAM to make a 64-bit binary worth using. You can download R compiled for the x86_64 platform here. (It will install a new GUI application called R64.app. Make sure you run that one.) But when I went to load one of the modules I use the most, RMySQL, which allows you to load data frames directly as the result of a MySQL query, I found that the x86_64 binary version from CRAN didn’t work. The solution?
- Download and install MySQL. Make sure you get the x86_64 version. As of this writing, it’s labeled “Mac OS X 10.5 (x86_64).” You don’t need to install the startup item for this to work, just the main package.
- Download the source package of RMySQL.
- Open a terminal and navigate to the directory where the RMySQL tar.gz file was downloaded.
-
Run the following command:
R CMD INSTALL --configure-args='--with-mysql-dir=/usr/local/mysql' \
RMySQL_0.7-4.tar.gz
That command is correct with the current version of RMySQL (0.7-4) and the default install location for MySQL (/usr/local/mysql). Change those values as appropriate if you have a different version or different install location.