
Prediction of Overall Mortality from Fitbit heart rate data

From what I can tell, the Fitbit API returns heart rate data at an effective temporal resolution of about 10 seconds (mean: 9.98 s, min: 5 s, median: 10 s, max: 15 s). Curiously, you are more likely to get either a 5 s or a 15 s interval than a 10 s interval. Using Mathematica, as before, we can plot the distribution of times between samples returned by the Fitbit API:


[Figure: distribution of the time intervals between heart rate samples returned by the Fitbit API]
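For readers without Mathematica, the same interval statistics can be sketched in a few lines of Python. This is only an illustration: the `samples` list is made up, and it assumes the API response has already been reduced to (time of day, bpm) pairs within a single day.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, median

# Hypothetical samples, invented for illustration: (time of day, bpm).
samples = [
    ("08:00:00", 62),
    ("08:00:05", 63),
    ("08:00:20", 63),
    ("08:00:30", 64),
    ("08:00:35", 66),
    ("08:00:50", 65),
    ("08:01:00", 65),
]

def sample_intervals(rows):
    """Return the gaps, in seconds, between consecutive samples.

    Assumes the rows are sorted and do not cross midnight.
    """
    times = [datetime.strptime(t, "%H:%M:%S") for t, _ in rows]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]

intervals = sample_intervals(samples)
histogram = Counter(intervals)  # how often each gap length occurs

print("mean:", mean(intervals), "median:", median(intervals))
print("min:", min(intervals), "max:", max(intervals))
for gap, count in sorted(histogram.items()):
    print(f"{gap:4.0f} s: {count}")
```

Plotting `histogram` as a bar chart gives the kind of distribution described above.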
That resolution is still (although just barely) usable for measuring heart rate recovery, the change in your heart rate some time t after you stop exercising. For most things you can measure on a wearable, any one datapoint is next to useless; the key is to look at first and second derivatives, such as gradual trends in how your heart rate drops following a few minutes on the treadmill.

The key medical study is probably the October 1999 article in NEJM, Heart-rate recovery immediately after exercise as a predictor of mortality. The conclusion of that paper is that “A delayed decrease in the heart rate during the first minute after graded exercise, which may be a reflection of decreased vagal activity, is a powerful predictor of overall mortality”. Their standard for a ‘delayed’ decrease was a drop of ≤ 12 beats per minute from the heart rate at peak exercise, measured 1 minute after cessation of exercise. Since Fitbit is probably not in the “mortality prediction” market, ~10 s temporal resolution is fine; for medical researchers, however, it would be nice to have data at a somewhat higher temporal resolution.
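As a rough illustration of the 1-minute criterion, here is a minimal Python sketch. The function names and the sample values are hypothetical. Two assumptions are mine, not the study's: the heart rate at the moment exercise stops stands in for the heart rate at peak exercise, and linear interpolation is used to bridge the irregular ~10 s sampling.

```python
def interpolate_bpm(samples, t):
    """Linearly interpolate bpm at time t (s) from sorted (t, bpm) pairs."""
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)
    raise ValueError("t is outside the sampled range")

def heart_rate_recovery(samples, t_stop, delay=60.0):
    """Drop in bpm between the end of exercise and `delay` seconds later."""
    return interpolate_bpm(samples, t_stop) - interpolate_bpm(samples, t_stop + delay)

# Hypothetical samples: (seconds relative to the end of exercise, bpm),
# spaced irregularly at 10 to 15 s, with no sample exactly at 0 s or 60 s.
samples = [(-20, 160), (-5, 166), (5, 162), (20, 155), (30, 150),
           (45, 146), (55, 143), (70, 137), (80, 134)]

hrr = heart_rate_recovery(samples, t_stop=0.0)
delayed = hrr <= 12  # the study's threshold for a 'delayed' decrease

print(f"HRR = {hrr} bpm, delayed: {delayed}")  # HRR = 23.0 bpm, delayed: False
```

Because neither the stop time nor the 60 s mark coincides with a sample, both endpoints are interpolated, which is where the coarse sampling starts to matter.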

Directional Quantile Envelopes – making sense of 2D and 3D point clouds

Imagine some large multidimensional dataset; one of the things you might wish to do is to find outliers and, more generally, to say something statistically defined about the structure of clusters of points within that space. One of my favorite techniques for doing that is to use directional quantile envelopes, developed and implemented by Anton Antonov and described here and here. Antonov considers a set of uniformly distributed directions and constructs the lines (or planes) that separate the points into quantiles; if you consider enough directions, and do this a few times, you are left with lines (or planes) that define a curve (or surface) that envelops some quantile q of your data. The figures show a cloud of points with some interesting structure and the surface for q = 0.7, with and without the data.
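To make the construction concrete, here is a simplified 2D sketch in plain Python. It is not Antonov's implementation: the function names are hypothetical, the point cloud is synthetic, and the envelope is represented only implicitly, as the intersection of one half-plane per direction. Each half-plane on its own holds roughly a fraction q of the points, so the fraction inside all of them is generally smaller than q.

```python
import math
import random

def quantile(values, q):
    """Empirical q-quantile, interpolating linearly between order statistics."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def directional_quantiles(points, q, n_directions=60):
    """For each unit direction u, find c such that a fraction q of the
    projections p . u fall at or below c. Returns a list of (u, c)."""
    half_planes = []
    for k in range(n_directions):
        angle = 2 * math.pi * k / n_directions
        u = (math.cos(angle), math.sin(angle))
        projections = [x * u[0] + y * u[1] for x, y in points]
        half_planes.append((u, quantile(projections, q)))
    return half_planes

def inside_envelope(point, half_planes):
    """True if the point lies inside every half-plane p . u <= c."""
    x, y = point
    return all(x * u[0] + y * u[1] <= c for u, c in half_planes)

# Synthetic cloud: a Gaussian blob plus one obvious outlier.
random.seed(0)
cloud = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(500)]
cloud.append((10.0, 10.0))

half_planes = directional_quantiles(cloud, q=0.9)
outliers = [p for p in cloud if not inside_envelope(p, half_planes)]
fraction_inside = 1 - len(outliers) / len(cloud)

print("far point flagged:", (10.0, 10.0) in outliers)
print("fraction inside envelope:", round(fraction_inside, 3))
```

Increasing `n_directions` tightens the polygonal envelope; extending the sketch to 3D would mean sampling directions on a sphere and intersecting half-spaces instead.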

Beyond general data analytics, the directional quantile envelope approach has at least one more application, which is in image processing and segmentation. Imagine taking a picture of a locally smooth blob-like object in the presence of various (complicated) artifacts and noise. You could throw the usual approaches at this problem (gradient filter, distance transform, morphological operations, watershed, …), but in many of those approaches you end up having to tune dozens of parameters empirically until things “look nice”, which is unsettling. What you would really like to do is to detect, localize, and reconstruct the emitting object in a statistically defined, principled manner, and this is what directional quantile envelopes allow you to do.

With a quantile envelope, you can compactly communicate what you did to the raw imaging data to get some final picture of a cell or organoid, rather than reporting an inscrutable succession of filters, convolutions, and adaptive nonlinear thresholding steps. The figure shows a cell nucleus imaged with a confocal microscope; in reality, the cell nucleus is quite smooth, but various imaging artifacts result in the appearance of “ears”, which can be detected as outliers via directional quantile envelopes.