List of interesting articles

As my blog has grown it has become increasingly unwieldy. To help improve its usability, this page gives a collection of posts organized by topic.

Survival Analysis and Churn

  1. An introduction or “vulgarization” on Churn
  2. How to define and model Churn
  3. Survival Analysis more than just functions in R.
  4. Proportional Hazards (unpacking equation 12)
  5. Estimating Passive Churn or Life time Conversion.
  6. Weibull parameters for censored data using only SQL
  7. Plotting your Hazard

Posts 1 and 2 are probably the most useful. Posts 3 and 4 provide some background on Kaplan-Meier estimation and Cox proportional hazards that might appeal to those who are interested in such things. Post 5 is a bit advanced (and sadly not the clearest thing I have written) but presents a genuinely useful technique. Posts 6 and 7 are mostly just exercises that I found interesting; the brute-force approach to optimization using SQL might be of general interest even if the problem in post 6 is not.

Linear Models and Generalized Linear Models  

  1. A summary of Linear Models for BI
  2. Another summary of Linear Models for BI
  3. Quick and Dirty Model Selection
  4. Regularized Regression: Part A) SVD.
  5. Regularized Regression: Part B) Ridge Regression
  6. Regularized Regression: C) Bayesian Linear Regression
  7. L1 and L2 Norms
  8. In which we learn about Bayesian Evidence, Change Point Detection, and latex.
  9. Estimating User Engagement (Errors-in-variables Regression)
  10. Over-Dispersion: How sure are you?

Posts 1-7 are a short course, of sorts, on linear regression. Post 8 was my first blog post (and, I still think, one of the better ones). It shows how to calculate Bayesian evidence for a linear model and how to use that evidence to find change points in a KPI trajectory. Post 9 is an interesting math exercise. Finally, post 10 presents a Bayesian way of accounting for over-dispersion in a probit/logit regression.

Markov chain Monte Carlo

  1. In which the author gets the Reversible Jump Markov Chain Monte Carlo algorithm and trans-dimension inversion off of his chest.
  2. Parallel Tempering / Replica Exchange
  3. Rotated MCMC Sampling and Ordered Categorical Regression
  4. Computationally Efficient Estimation and a Frustrating Distribution that is Fat in the Middle

Post 1 covers the important “when one of the things you don’t know is the number of things you don’t know” estimation procedures. Posts 2 and 3 give methods of improving MCMC convergence speed. Post 4 highlights the value of precalculating as much of the likelihood function as possible in MCMC.
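
To give a flavor of the precalculation idea, here is a minimal sketch in Python. It is not the code from post 4; it assumes a plain Gaussian linear model, and the names `precompute`, `log_likelihood`, and `log_likelihood_naive` are made up for illustration. The point is that every term depending only on the data can be computed once, so each MCMC step costs a few small matrix products instead of a pass over all the rows.

```python
import numpy as np

def precompute(X, y):
    """Compute the data-only terms once, before any sampling starts."""
    return {
        "n": len(y),
        "XtX": X.T @ X,       # p x p matrix
        "Xty": X.T @ y,       # length-p vector
        "yty": float(y @ y),  # scalar
    }

def log_likelihood(beta, sigma, stats):
    """Gaussian log-likelihood using only the precomputed terms."""
    # Residual sum of squares expanded as y'y - 2 b'X'y + b'X'X b
    rss = stats["yty"] - 2.0 * (beta @ stats["Xty"]) + beta @ stats["XtX"] @ beta
    return -0.5 * stats["n"] * np.log(2.0 * np.pi * sigma**2) - 0.5 * rss / sigma**2

def log_likelihood_naive(beta, sigma, X, y):
    """The same quantity, recomputed from the raw data on every call."""
    resid = y - X @ beta
    return -0.5 * len(y) * np.log(2.0 * np.pi * sigma**2) - 0.5 * (resid @ resid) / sigma**2

# Simulated data (an assumption for the demo, not data from the blog)
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=1.5, size=1000)

stats = precompute(X, y)        # done once
beta = np.array([0.9, -1.8, 0.4])
fast = log_likelihood(beta, 1.3, stats)        # cost independent of the row count
slow = log_likelihood_naive(beta, 1.3, X, y)   # cost grows with the row count
```

Inside a sampler, only `log_likelihood` would be called at each proposal. The expanded form of the residual sum of squares can lose some numerical precision when the fit is very tight, so it is worth checking it against the naive version on a few parameter values first.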

Split testing

  1. Bayesian AB/Split testing
  2. More on Bayesian Split testing
  3. Split testing with RJMCMC

Post 1 gives a Bayesian “t-test”. Post 2 expands that test to a few other scenarios (there might be some errors there; I have not had time to go through it in detail, and the equations get big). Post 3 is a natural extension that uses the RJMCMC techniques given throughout this blog.

Posts I think are interesting but don’t really fit in anywhere

  1. Chess Visualization
  2. Rotating to the maximum correlated space
  3. A hybrid Optimizer

Post 1 is the most popular post on my blog by a long way. Post 2 is an improvement (that’s right, I said it) over principal component analysis. Post 3 gives a very useful general optimizer.

Neural Networks 

  1. Some thoughts on neural nets; the importance of random in back-propagation.
  2. Optimizing the choice of objective function
  3. Annealing Neural Networks: Another keeping it real post
  4. Bayesian Text Recognition

Posts 1 and 2 give a sort of general introduction to neural networks and optimizing their parameters. Post 3 presents two ideas on how to improve convergence speed; one works. Post 4 is an old one that I have never gotten around to finishing.