As my blog has grown, it has become increasingly unwieldy. To improve its usability, this page gives a collection of posts organized by topic.
Survival Analysis and Churn
- An introduction or “vulgarization” on Churn
- How to define and model Churn
- Survival Analysis: more than just functions in R.
- Proportional Hazards (unpacking equation 12)
- Estimating Passive Churn or Lifetime Conversion.
- Weibull parameters for censored data using only SQL
- Plotting your Hazard
Posts 1 and 2 are probably the most useful. Posts 3 and 4 provide some background on Kaplan-Meier and Cox proportional hazards that might be interesting to those who are interested in such things. Post 5 is a bit advanced (and sadly not the clearest thing I have written) but is a genuinely useful technique. Posts 6 and 7 are mostly just exercises that I found interesting; the brute-force approach to optimization using SQL might be of general interest even if the problem in 6 is not.
Linear Models and Generalized Linear Models
- A summary of Linear Models for BI
- Another summary of Linear Models for BI
- Quick and Dirty Model Selection
- Regularized Regression: Part A) SVD.
- Regularized Regression: Part B) Ridge Regression
- Regularized Regression: C) Bayesian Linear Regression
- L1 and L2 Norms
- In which we learn about Bayesian Evidence, Change Point Detection, and LaTeX.
- Estimating User Engagement (Errors-in-variables Regression)
- Over-Dispersion: How sure are you?
Posts 1-7 are a short course, of sorts, on linear regression. Post 8 was my first blog post (and, I still think, one of the better ones). It shows how to calculate Bayesian evidence for a linear model and how to use that evidence to find change points in a KPI trajectory. Post 9 is an interesting math exercise. Finally, post 10 presents a Bayesian way of accounting for over-dispersion in a probit/logit regression.
Markov chain Monte Carlo
- In which the author gets the Reversible Jump Markov Chain Monte Carlo algorithm and trans-dimension inversion off of his chest.
- Parallel Tempering / Replica Exchange
- Rotated MCMC Sampling and Ordered Categorical Regression
- Computationally Efficient Estimation and a Frustrating Distribution that is Fat in the Middle
Post 1 covers the important “when one of the things you don’t know is the number of things you don’t know” estimation procedures. Posts 2 and 3 give methods of improving MCMC convergence speed. Post 4 highlights the value of precalculating as much of the likelihood function as possible in MCMC.
Split testing
Post 1 gives a Bayesian “t-test”. Post 2 expands that test to a few other scenarios (there might be some errors there; I have not had time to go through it in detail, and the equations get big). Post 3 is a natural extension that uses the rjMCMC techniques given throughout this blog.
Posts I think are interesting but don’t really fit in anywhere
Post 1 is the most popular post on my blog by a long way. Post 2 is an improvement (that’s right, I said it) over principal component analysis. Post 3 gives a very useful general optimizer.
Neural Networks
- Some thoughts on neural nets; the importance of random in back-propagation.
- Optimizing the choice of objective function
- Annealing Neural Networks: Another keeping it real post
- Bayesian Text Recognition
Posts 1 and 2 give a sort of general introduction to neural networks and optimizing their parameters. Post 3 presents two ideas on how to improve convergence speed; one works. Post 4 is an old one that I have never gotten around to finishing.