# Plotting your Hazard

As humans we know that Misfortune stalk us at every turn. But how much hazard are we  truly in?

This post answers that important question.

Recall that the hazard function [$h(t)$] is the ration of the event time pdf [$f(t)$] to the survival function [$S(t)$]; i.e., it the instantaneous probability of failure for an observation given that survived at least that long. Also recall that a Kaplan-Meier curve (KMC) is reasonable ways to estimate the survival function….

So what is the point of this post? Can’t the hazard function be gotten directly from the observed KMC? surprisingly no. The basic problem is that a KMC is not a smooth estimate of the $S(t)$ thus $f(t) = d/dt [1- S(t)]$ is not really defined.

The simplest way to get around this problem is to use the piece wise estimate $\hat{h}(t) = d_j / (r_j \Delta_j)$ where $t \in [t_{j-1}, t_{j})$, $r_j$ is the number of observations at risk in time interval $[t_{j-1}, t_{j})$,, $d_j$ is the number of “hanging ups” at time $t_j$, and $\Delta_j = t_{j} - t_{j-1}$.

In practice this method is not satisfying as its uncertainty [$\text{Var}(\hat{h}[t]) = \hat{h}(t)^2 (r_j - d_j)/(r_j d_j)$] tends to be large and estimates for adjacent segments tend to vary largely.

So here a Bayesian approach is proposed! Yes that is right I am finally moving in to a Bayesian survival analysis problem!

As noted before Bayesian methods tend to start with a likelihood function (yes ABC does not but it sort of does because it is based on defining an approximate likelihood function… do you feel smug for pointing that out? do you wake up in the night and hug yourself? No one cares keener!) . Here Order statistics are exploited to produce the likelihood function. The percentile of the $k$‘th of $n$ observation has a beta distribution because there are $k -1$ observations smaller and $n - k$ observations bigger. The percentiles can them be fit using a collections of weighted cdfs. The number of cdf and their parameters are found using rjMCMC. I know this is a fairly vague description (frankly it is a vague reminder) however I would like to get this post done and I beleive that if you have been reading the other posts this is not that far out of an idea.

To explore the idea I have made a simulation study. In the study a collection of observed exponential random variables  are used to create a estimated a hazard function. Truncated Gaussian distributions are used for the kernel of the rjMCMC inversion. In practice it would probably be better to use truncated generalized Gaussian distributions. However then the kernel could exactly fit the simulated data which is unlikely to be possible in practice. The choice of simulated exponential distributed data set is made for ease of interpretation as the exponential has a constant hazard function.  Figure 1 left shows simulated exponential random variables (circles) plotted vs their percentiles; Fig. 1 right gives a color histogram of the posterior hazard function. The posterior distribution is distinct from the true hazard function (the constant like $y = 1$) for much of the range zero to two however it is close. And it should be noted that the data set is constantly lower than the theoretical percentile for that range.

Figure 1: Left the simulated exponential random variables (circles) plotted vs their percentiles. The blue line is the theoretical values. Right is a color histogram of the posterior hazard function. The horizontal line is the hazard function of the simulation distribution; i.e., the “true value”.

That is it for this week. I hope this post is interesting As I had this idea in my head for a while before I could find time to write it.  The code for the simulated inversion is append.

Tune in next time for more Bayesian Business Intelligence! Same Bayesian Channel, Same Bayesian Time!

g.sim.data <-function(N = 1000, lambda = 1)
{
t  = rexp(N, rate = lambda )
t. = sort(t)

index = 1:N
cdf   = (1:N)/N

out = data.frame(t., index, cdf)
return(out)

}
g.sim.data(N= 100)

#  LL  j, P1, P2,  Mu1  Mu2, S1   S2
g.predict<-function(MOD=c(225.870633018 ,  3.000000000 ,  1.000000000 ,  0.408442587 ,  0.173129261 ,  0.006416128 ,  0.992600670 ,  1.333872850 ,  0.488384568 ,  1.279163793,   0.261276892 ), X = g.sim.data(N=100), plot.line=0 )
{
# upack the pars

j      = MOD[2]
z      = MOD[3:(j+2)]
mu     = MOD[(j+3):(2*j + 2)]
sigma  = MOD[(2*j + 3):(3*j + 2)]

#print(z)

# get the probs
z.star = c(z, 0)
prob = 1:j
for (i in 1:j)
{
prob[i] = z.star[i]-z.star[i+1]
}

if (min(prob) < 0 ) {return(-Inf) } #print(prob) t.     = X$t. n = nrow(X) k = 1:n alpha = k beta = n - k + 1 #print("-----------") #print(prob[1]) CDF.hat = prob[1]*(pnorm(t., mean = mu[1], sd=sigma[1]) - pnorm(0,mean = mu[1], sd=sigma[1]))/( 1- pnorm(0,mean = mu[1], sd=sigma[1]) ) if(j >= 2) { for ( i in 2:(j)) { #print("-----------") #print(prob[i]) temp = prob[i]*(pnorm(t., mean = mu[i], sd=sigma[i]) - pnorm(0,mean = mu[i], sd=sigma[i]))/( 1- pnorm(0,mean = mu[i], sd=sigma[i]) ) CDF.hat = CDF.hat + temp } } LL = dbeta(CDF.hat, shape1 = alpha, shape2 = beta, ncp = 0, log = TRUE) LL = sum(LL, na.rm=T) #print(CDF.hat) if (plot.line == 1 ) { plot(t., X$cdf)
lines(t., CDF.hat)
}

return(LL)
}
g.predict(plot.line=1)

g.perterb<-function(M=c(-Inf, 3, 0,  0.6, 0.2,   0, 0.5,1,   1, 0.5, 0.1 ), LB=c(0, -10, 0), UB = c(1, 10, 10), Qsd = c(0.01, 0.01, 0.01)  , data=g.sim.data(N=100)) { # unpacking hte parameters LL     = M[1] j      = M[2] z      = M[3:(j+2)] mu     = M[(j+3):(2*j + 2)] sigma  = M[(2*j + 3):(3*j + 2)]    #print(M)  # make the proposal model   z.prime      = z mu.prime     = mu sigma.prime  = sigma   # make the proposal model index  = sample(1:j, 1 ) if(index >1) {z.prime[index]     = z.prime[index]     + rnorm(1, mean = 0, sd= Qsd[1])} # add random noise to old model
mu.prime[index]                  = mu.prime[index]    + rnorm(1, mean = 0, sd= Qsd[2])
sigma.prime[index]               = sigma.prime[index] + rnorm(1, mean = 0, sd= Qsd[3])

SI = sort(z, index.return=T, decreasing = T)$ix z.prime = z.prime[SI] mu.prime = mu.prime[SI] sigma.prime = sigma.prime[SI] M.prime =c(LL, j, z.prime, mu.prime, sigma.prime) #print(M.prime) #print(z.prime) #print(mu.prime) #print(sigma.prime) #print(z.prime[index]) #print(mu.prime[index]) #print(sigma.prime[index]) if ((z.prime[index] >= LB[1]) & (z.prime[index] <= UB[1]) & # check the bounds (mu.prime[index] >= LB[2]) & (mu.prime[index] <= UB[2]) & (sigma.prime[index] >= LB[3]) & (sigma.prime[index] <= UB[3]) #&(sum(prob.prime ) <= 1) ) { LL.prime = g.predict(M.prime, X = data ) # compute loglikihood M.prime[1] = LL.prime # save LL r = runif(1) # random uniform MH = exp(LL.prime - LL) # Metropolis-hasting acceptance probability value if (r <= MH) { M = M.prime } } return(M) } g.birth<-function(M=c(-Inf, 3, 0, 0.6, 0.2, 0, 0.5,1, 1, 0.5, 0.1 ), LB=c(0, -10, 0), UB = c(1, 10, 10), Qsd = c(0.01, 0.01, 0.01) , data=g.sim.data(N=100)) { # unpacking hte parameters LL = M[1] j = M[2] z = M[3:(j+2)] mu = M[(j+3):(2*j + 2)] sigma = M[(2*j + 3):(3*j + 2)] #print(M) #print(z) if (j <= 14 ) { # make the proposal model LL.prime = LL j.prime = j + 1 z.prime = c(z , runif(1, min = LB[1], max = UB[1]) ) mu.prime = c(mu, runif(1, min = LB[2], max = UB[2]) ) sigma.prime = c(sigma , runif(1, min = LB[3], max = UB[3]) ) SI = sort(z.prime, index.return=T, decreasing = T)$ix       # resort the mods in prob
z.prime     = z.prime[SI]
mu.prime    = mu.prime[SI]
sigma.prime = sigma.prime[SI]

M.prime      = c(LL.prime, j.prime, z.prime, mu.prime, sigma.prime)
#print("------ gavin this line ------------")
#print(M.prime)
LL.prime     = g.predict(M.prime, X = data)                  # get predicted values
M.prime[1]   = LL.prime

r = runif(1)                                          # random uniform
MH = exp(LL.prime - LL)                               # Metropolis-hasting acceptance probability value note that the prior and proposal cancel
if (r <= MH) {M = M.prime}                            # if accepted
}
return(M)

}

g.death<-function(M=c(-Inf, 3, 0,  0.6, 0.2,   0, 0.5,1,   1, 0.5, 0.1 ), LB=c(0, -10, 0), UB = c(1, 10, 10), Qsd = c(0.01, 0.01, 0.01)  , data=g.sim.data(N=100)) { # unpacking hte parameters LL     = M[1] j      = M[2] z      = M[3:(j+2)] mu     = M[(j+3):(2*j + 2)] sigma  = M[(2*j + 3):(3*j + 2)] #print(M) if (j >= 3 )
{
# make the proposal model
index        = sample(2:j, 1)
LL.prime     = LL
j.prime      = j - 1
z.prime      = z[-index]
mu.prime     = mu[-index]
sigma.prime  = sigma[-index]
M.prime      = c(LL.prime, j.prime, z.prime, mu.prime, sigma.prime)
LL.prime     = g.predict(M.prime, X = data)                  # get predicted values
M.prime[1]   = LL.prime
r = runif(1)                                          # random uniform
MH = exp(LL.prime - LL)                               # Metropolis-hasting acceptance probability value note that the prior and proposal cancel
if (r <= MH) {M = M.prime}                            # if accepted
}
return(M)

}

g.explore<-function(old, d)
{
Qsd. = c(0.005, 0.005, 0.005)
LB.  = c(0, -10, 0)
UB.  = c(1, 10, 10)

move.type = sample(1:3, 1) # the type of move i.e., perterb, birth, death
#print("------ this line")
#print(move.type)

if (move.type == 1) {old = g.perterb(M=old, Qsd =Qsd., LB = LB., UB=UB., data= d)}
if (move.type == 2) {old = g.birth(M=old,  Qsd =Qsd., LB = LB., UB=UB., data = d)}
if (move.type == 3) {old = g.death(M=old, Qsd =Qsd., LB = LB., UB=UB., data = d )}

return(old)
}

g.rjMCMC<-function( Nsamp = 20000, BURN = 2000)
{

data = g.sim.data(N= 100, lambda = 1)

#x11(height=4, width = 5)
par(mfrow= c(1,2))
plot(data$t., data$cdf, xlab = "time", main = "Simulated Data", ylab= "Cumlative Probability", xlim=c(0, 5))
t.f = seq(0, 7, length.out = 1001)
lines(t.f,1-exp(-t.f)  , col="blue", lwd=3 )
points(data$t., data$cdf)

Mod.old  =c(-Inf, 3, 1,  0.6, 0.2,   0, 0.5,1,   1, 0.5, 0.1 )

for(i in 1:BURN) # the burn in
{
Mod.old = g.explore(old = Mod.old, d = data)
}
print(Mod.old)

REC = list()
for(i in 1:(Nsamp-1)) # the burn in
{
Mod.old = g.explore(old = Mod.old, d = data)
REC[[i]] = Mod.old
rownames(REC) = NULL
}

#print(table(REC[,2]))
#best = which.max(REC[,1])
#g.predict(MOD=REC[best,], X = data , plot.line=1)
#lines(data$t.,1-exp(-data$t.)  , col="blue", lwd=3 )
return(REC)

}

g.Haz<-function(N = 19999, REC= samps) { hist.mat = matrix(0, nrow= N, ncol = 101) t. = seq(0, 7, length.out = 101)      for (bi in 1:N)     {         #print(bi)     MOD = REC[[bi]]     # upack the pars          j      = MOD[2]     z      = MOD[3:(j+2)]     mu     = MOD[(j+3):(2*j + 2)]     sigma  = MOD[(2*j + 3):(3*j + 2)]               #print(z)          # get the probs      z.star = c(z, 0)     prob = 1:j     for (i in 1:j)         {         prob[i] = z.star[i]-z.star[i+1]         }          #print(prob)          PDF.hat  =  prob[1]*(dnorm(t., mean = mu[1], sd=sigma[1]))/( 1- pnorm(0,mean = mu[1], sd=sigma[1])  )          if(j >= 2)
{
for ( i in 2:(j))
{
temp =  prob[i]*(dnorm(t., mean = mu[i], sd=sigma[i]))/( 1- pnorm(0,mean = mu[i], sd=sigma[i])  )
PDF.hat  = PDF.hat + temp
}
}
if (min(PDF.hat) < 0 ) {print("problem pdf")}     #print("-----------")     #print(prob[1])     CDF.hat  =  prob[1]*(pnorm(t., mean = mu[1], sd=sigma[1]) -   pnorm(0,mean = mu[1], sd=sigma[1]))/( 1- pnorm(0,mean = mu[1], sd=sigma[1])  )          if(j >= 2)
{
for ( i in 2:(j))
{
temp =  prob[i]*(pnorm(t., mean = mu[i], sd=sigma[i]) - pnorm(0,mean = mu[i], sd=sigma[i]))/( 1- pnorm(0,mean = mu[i], sd=sigma[i])  )
CDF.hat  = CDF.hat + temp
}
}
if (max(CDF.hat) > 1 ) {print("problem cdf too big")}
if (max(CDF.hat) > 1 ) {print(prob)}

if (max(CDF.hat) > 1 ) {print(CDF.hat)}
if (min(CDF.hat) < 0 ) {print("problem cdf")}

S.hat = 1 - CDF.hat
if (min(S.hat) < 0 ) {print("problem s")}

h.hat = PDF.hat/S.hat

hist.mat[bi,] =  h.hat

}

plot(t., t., type= "n", xlab = "time", ylab = "hazard function", ylim=c(0,4 ), xlim= c(0, 5))

ncolor = 21
layers = matrix(0, nrow=ncolor, ncol=101)
for (i in 1:101)
{
q = quantile(hist.mat[,i], probs=seq(0, 1, length.out=ncolor))
layers[,i] = q
}
gcol = c("white", rev(heat.colors((ncolor-1)/2) ))
#       1            2           34        4  3 21

for (i in 1:((ncolor-1)/2))
{
PY = c(layers[i,], rev(layers[ncolor+1-i,]))
Px   = c(t., rev(t.))
polygon(Px, PY, col=gcol[i], border=F)
}
lines(t., layers[((ncolor-1)/2)+1,], col=gcol[((ncolor-1)/2)+1], lwd=2)

abline(h = 1)

#print(layers)

}

samps = g.rjMCMC()
g.Haz()