Tuesday, December 10, 2013

Guest post: Where did the Chapman-Kolmogorov name come from?

This is a guest post by Peter Guttorp. Peter and I are working on updating Peter's book Stochastic Modeling of Scientific Data. To provide the reader with productive distractions, we are sprinkling the book with "historical boxes." Also, let's be honest, it keeps us entertained as well, because we are both math history junkies. Well, Peter does real research in the history of mathematics (e.g., here) , so he is a real historian, but I am definitely a junkie. While working on one of the background chapters, we put a historical box for Chapman and Kolmogorov next to the first appearance of the Chapman-Kolmogorov equation, but then wondered why we did this, because Peter knew that neither Chapman nor Kolmogorov were responsible for its discovery. This is, of course, not surprising in light of the Stigler's law, but then who gave this equation its modern name? Peter started digging and this is what he has found so far:
The origin of the Chapman-Kolmogorov equation dates at least back to Bachelier’s 1900 dissertation on the theory of speculation (but it may be earlier than that). Bachelier did not give it a name.  
In physics it is sometimes called Smoluchovsky’s equation, after Smoluchovsky’s 1906 rediscovery of Brownian motion (preceded by Bachelier and by Einstein (1905)). Smoluchovksy’s work was extended by Chapman (1928) to the case of nonstationary transition kernels. It was independently developed by Kolmogorov in his fundamental 1933 Mathematische Annalen paper. 
But where did the name come from? The first uses of the Chapman-Kolmogorov name are in Feller’s 1940 Transactions of the AMS paper, where he says that it is “known as the equation of Chapman and Kolmogoroff” . The 1940 thesis of Lundberg also uses this term, referring to Frechet (1938) and Kolmorogov (1931). Going to Frechet’s book, he does not give the equation a name, but designates it “as a mnemotechnical tool” with the letter I. 
In 1936, Feller writes in his Mathematische Annalen paper “Finally, from the rule of combination of probabilities, what Chapman and von Smoluchowski called the fundamental equality follows”. 
So what happened between 1936 and 1940? Did the name get invented in the corridors of the University of Stockholm (where Feller was a lecturer, Lundberg a PhD student, and Cramer the professor)?
If you know anything about the origin of the name Chapman-Kolmogorov equation, please get in touch with Peter and/or me. We are very curious!

Monday, October 14, 2013

Passing Armadillo matrices by value and by reference

Rcpp and RcppArmadillo make it relatively easy to integrate R code and  C++ code. In fact the resulting C++ code looks so much as R code that it is easy to forget some basic rules by which a C++ coder must play. This is especially true for someone like me, who wrote relatively little C++ code in the past.

Here is an example of a typical calculation that may get one in trouble. Suppose you have a function that takes two matrices from R, converts them into Armadillo matrices and passes these functions to other functions to do some repetitive calculations. For example, imagine passing transition and emission probability matrices to execute the forward-backward algorithm on multiple realizations of the same hidden Markov model.

Below you see my C++ code, saved in the file "matrix_code.cpp" that has two sets of functions: a) doStuff1(), twoMatrices1(); b) doStuff2(), twoMatrices2(). The functions twoMatrices1() and twoMatrices2() are exported to R via Rcpp attributes. The twoMatrices1() calls doStuff1() 100 times, while twoMatrices2() calls doStuff2() 100 times. The only difference between the two sets of functions is in that doStuff1() passes two Armadillo matrices by value, which requires copying these matrices. In contrast, doStuff2() passes Armadillo matrices by reference without calling copy-constructor under the hood.


// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
#include <Rcpp.h>  
using namespace Rcpp; 
// First passing Armadillo matrices by value
void doStuff1(arma::mat c, arma::mat d){
  double x = c(1,1) + d(1,1);
// [[Rcpp::export]]
void twoMatrices1(NumericMatrix A, NumericMatrix B) {
    arma::mat M1(A.begin(), A.nrow(), A.ncol(), false);
    arma::mat M2(B.begin(), B.nrow(), B.ncol(), false);
 
    for (int i = 0; i < 100; i++){
      doStuff1(M1, M2);
    }
// Now passign Armadillo matrices by reference
void doStuff2(const arma::mat& c, const arma::mat& d){
  double x = c(1,1) + d(1,1);
}

// [[Rcpp::export]]
void twoMatrices2(NumericMatrix A, NumericMatrix B) {
    arma::mat M1(A.begin(), A.nrow(), A.ncol(), false);
    arma::mat M2(B.begin(), B.nrow(), B.ncol(), false);
 
    for (int i = 0; i < 100; i++){
      doStuff2(M1, M2);
    }
}


Now I am going to call twoMatrices1() and twoMatrices2() from R, using rbenchmark package to compare their speed. I am going to use fairly large matrices (100x100), so the cost of moving them around is significant.

library(RcppArmadillo)
library(rbenchmark)
 
sourceCpp("matrix_code.cpp")
 
x = matrix(1,100,100)
y = matrix(1,100,100)
 
res <- benchmark(twoMatrices1(x, y), twoMatrices2(x, y), 
                 columns = c("test", "replications", "elapsed", "relative"), 
                 order="relative", replications=1000)
print(res)
 
                test replications elapsed relative
2 twoMatrices2(x, y)         1000   0.006    1.000
1 twoMatrices1(x, y)         1000   0.829  138.167

Created by Pretty R at inside-R.org

As you can see, passing two 100x100 Armadillo matrices by value is more than 100 times slower than passing these matrices by reference. Perhaps not a very surprising outcome, but an important one to keep in mind when porting R code to C++.

Friday, October 11, 2013

Dissertation Acknowledgments

Jonathan Eisen posted his dissertation acknowledgments. I think this is a great idea. Maybe it is just me, but when I read someone's dissertation, not necessarily as a committee member, I really enjoy reading the acknowledgements. For me, these personal comments place the work in the context of "real life" and personal relationships. 

Here are my acknowledgments:

First and foremost, I would like to thank my mom and dad for creating many opportunities for my intellectual growth. Feeling my parents' constant love and support makes my scientific journey so much more meaningful! I am very thankful to my middle and high school mathematics teachers, Maya Radionovna Matihina and Svetlana Efimovna Gaydamak, for introducing me to the beautiful world of mathematical reasoning and abstract thinking.  I am very grateful to my undergraduate advisor, Galina Evgenievna Samkova, for patiently listening to my naive ideas and taking seriously my modest attempts to produce new mathematical results. Special thanks goes to Stephen Krone, who hooked me onto mathematical biology by regularly taking me to the bioinformatics seminars at the University of Idaho. My first scientific collaborators, Zaid Abdo and Paul Joyce, became my very good friends and influenced my scientific career more than they probably realize.    
UCLA Department of Biomathematics has been my second home for the last four years. I would like to thank the departmental staff, David Tomita, Martha Riemer, Wendy Hagar, and Fred Hughes, for taking care of administrative hurdles so well. During my first years at UCLA, I received a lot of advice and guidance from my fellow Biomathematics students. I am very grateful to Robert Rovetti for being a wonderful colleague and a good friend. Robert thoroughly read and constructively criticized Chapters 3 and 4 of this dissertation. Intellectual discussions with Benjamin Redelings, John O'Brien, and Alex Alekseyenko were certainly worth inhaling cigarette smoke and drinking really bad coffee from vending machines. 
All members of my dissertation committee played important roles in my graduate school development. I enjoyed collaborating with Christina Kitchen, who is very passionate about both statistics and biology. I am grateful to Janet Sinsheimer for giving me many opportunities to assist her in graduate and undergraduate level teaching. Elliot Landaw provided me with excellent academic guidance at the beginning of my studies and was extremely supportive throughout all these years. Ken Lange has frequently given me always timely and wise pieces of advice. Chapter 6 would not be possible without
his enthusiasm about Markov chain induced counting processes and his encouragement to pursue this line of research. 
The chair of my dissertation committee, Marc Suchard, is the greatest advisor one could wish for! I am very grateful to Marc for introducing me to interesting problems and for not revealing solutions to these problems, giving me an opportunity to become an independent researcher. It is very difficult to summarize everything I have learned from Marc in one paragraph. Perhaps the most important lesson that Marc taught me is how not to be afraid: not to be afraid to attack challenging problems, not to be afraid to seek rigorous solutions for these problems, and not to be afraid to go somewhere no one has gone before.    
This dissertation would never be finished without support of my wife, Yuliya, who patiently tolerated my "writing weekends'' and many other strange habits of a finishing Ph.D. student.