# Horseshoes, Hand grenades and Slow-dancing

Things where close counts!

Hi All. This morning I had an idea about a distribution for data where close was good enough. For example what if you are trying to model where users touch on a screen given that any place they touch a target is equivalent but any where off of the target is bad. There is no reason for the the distribution to be peaked like a Gaussian; so estimates from peaked distributions will have incorrect variances. Also It is nice for the function to be made of simple functions so that it is not computationally intensive to calculate the cdf / pdf.

The basic idea for distribution came from a Prof I had in my undergrad who used the presented cdf (with $a =1$) to plot limits for students. Note I was hoping to find a better web page to link to  but Google is obsessed with some yob that is a professor at Harvard with the same name. So I call this the “Close-enough” distribution. Enough rambling!

The CDF is a piece wise function $F(x) = [h(x)+2a]/(4a)$ where $h(x) = x$ if $x$ in ( $-a$, $a$), $h(x) = 2a - a^2/x$ if $x > a$, and $h(x) = -2a - a^2/x$ if $x < -a$.

The pdf is $f(x) = 1/(4a)$ where $x$ in ( $-a$, $a$), $f(x) = a/(4x^2)$ other wise.

I would like to like to provide a most likely estimate of $a$ however I am lazy and MCMC is a good way to get an estimate.Code for the MCMC done here is given below. The PPD of estimates for a simulated example of data drawn from the Close-enough distribution with true scale parameter given by the vertical line.

Code:

g.pdf <-function (a, x=seq(-11,11, length.out=10001) )
{
f1 = 1/(4*a) + 0*x
f2 = (a/4)*x^(-2)
f3 = (a/4)*x^(-2)

f = f1
f[x>a] = f2[x>a]
f[x<(-a)] = f3[x<(-a)]

#plot(c(-11,x,11), c(0,f,0), type=”n”, xlim=c(-10, 10), xaxs=”i”, xlab=”x”, ylab=”P(x)”  )
#polygon(c(-11,x,11), c(0,f,0), col=”grey”)

return(f)

}

g.inv.cdf <-function (a, P)
{
tv = 4*a*P – 2*a

if ((tv >= -a ) & (tv <= a ))
{
x = tv
}

if ((tv > a ))
{
x = -a^2/(4*a*(P-1))
}

if ((tv < -a ))
{
x = -a/(4*P)
}

return(x)
g.rced <-function (n= 1000, A=1)
{
out = 1:n

for (i in 1:n)
{
out[i] = g.inv.cdf(P= runif(1), a=A)

}
#out = out[out>-10]
#out = out[out<10]

h=hist(out, breaks=101, freq=F)
d= g.pdf(x=h\$mids, a=A)

lines(h\$mids, d, col=”blue”)
return(out)
}

g.mcmc <-function (N = 100, n.=100)
{
d = g.rced(n=n., A=1)

LL= -Inf
a = 1

mod  = c(LL, a)
rec  = data.frame(LL= 1:N, a= 1:N)
Q.sd = 0.1

for (i in 1:N )
{
print(i)
mod.new= mod
mod.new = mod.new + rnorm(1, sd= Q.sd, mean=0)

mod.new = sum(log(g.pdf(x=d, a = mod.new)))
A = exp(mod.new-mod)
r = runif(1)

if (r<=A)
{
mod = mod.new
}
rec[i,c(1,2)] = mod
}
#print(rec)
h=hist(rec[,2], breaks=101, plot = F)

plot(h\$mids, h\$dens, xlab=”a”, ylab=”P(a)”,type=”n”)
polygon(c(h\$mids, h\$mids, h\$mids), c(0,h\$dens,0), col=”grey”, border=F)
abline(v=1, lwd=2, col=”blue”)

}