Probability of someone being home

noelo2014 · Jun 26, 2014

I'm a salesman and I have a list of 100 addresses, all equidistant from each other and from where I'm standing right now. (I know this is impossible in 3D space, but let's just assume it's possible so that travel times are equal.) I only have time to call at 10 houses, so I want to maximise the chances that I'll call at houses where someone is home.

For each address I know:

1. The time when someone was last known to be home (not necessarily the last time they were actually home)
2. The last time a call was made

How do I choose the 10 houses?

Simon Bridge · Jun 26, 2014

Good question - how have you been attempting the problem so far?
This information will help us to work out how best to help you.

You know the situation is not possible - so the salesman is in a multi-dimensional space?

noelo2014 · Jun 26, 2014

Simon Bridge said:

Good question - how have you been attempting the problem so far?
This information will help us to work out how best to help you.

You know the situation is not possible - so the salesman is in a multi-dimensional space?

Yes, I just put this in for fun, and so people wouldn't argue about factoring in the travel time between houses.

Intuitively, I'd say that if the last unsuccessful call was recent (say 20 minutes ago) and the last known time they were home was not (say 10 years ago), then they're less likely to be home than people at a house where they were last known to be home 1 hour ago.

This is actually a computer science problem for a program I'm writing where I have to scan servers and find ones that are alive. I just thought I'd pick the brains of any statistics experts out there and see if this type of problem has been solved already.

Simon Bridge · Jun 26, 2014

Actually all you need to do to avoid travel-time arguments is just to require the smallest number of houses visited.
You could also phrase it in terms of telemarketing - get the quota of replies with the fewest phone calls, and you have a data sheet from previous contacts.

But the problem statement is not detailed enough to admit a specific answer.
Presumably you would scan your fact sheet and call those who are known to be home at the time of the call.

noelo2014 · Jun 26, 2014

Simon Bridge said:

Actually all you need to do to avoid travel-time arguments is just to require the smallest number of houses visited.
You could also phrase it in terms of telemarketing - get the quota of replies with the fewest phone calls, and you have a data sheet from previous contacts.

But the problem statement is not detailed enough to admit a specific answer.
Presumably you would scan your fact sheet and call those who are known to be home at the time of the call.

Firstly, the smallest number of houses visited would be zero, which would mean zero travel time... SOLVED! No, the aim is to visit houses where people are in and not waste time traveling to empty houses.

Then let's NOT assume that if a person is home at 2pm one day, they'll be home at 2pm the next day. I know this is generally true in real life, but I'm trying to come up with a more general method.

OK, I'll try to rephrase it, because it's confusing me too and I want to get this part of the program done by today.

I have 100 phone numbers, and I need to make some telemarketing calls with a view to making some sales. I have limited time, so I need to call the numbers where people are likely to be at home. The only information I have for each number is two datetimes. The first is when someone was last known to be home at this number (i.e. the time when the last call ended, or now if it's still in progress); the second is the last time a call was made to this number, whether successful or not. For short, I'll call these values DT1 and DT2.

My task is to order the list based on this data alone such that the people most likely to be home are at the top of the list and the people least likely are at the bottom.

Now...

For a telephone number A, DT1 is 10 hours ago and DT2 is 2 hours ago. This tells me that person A was home 10 hours ago but wasn't home 2 hours ago.

For a telephone number B, DT1 is 16 hours ago and DT2 is 17 hours ago. This tells me that person B was at home 16 hours ago and was also home 17 hours ago. This would have been a successful call that lasted 1 hour and was made 17 hours ago.

Who's more likely to be home now, A or B?
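As a rough sketch of how cases A and B can be read off mechanically from DT1 and DT2: the `Contact` class, its method names, and the fixed `now` value below are hypothetical and chosen only for illustration. It assumes, as in the examples above, that DT1 earlier than DT2 means the last call went unanswered, and that otherwise the last call was answered and ended at DT1.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Contact:
    """One phone number with the two datetimes described above (hypothetical type)."""
    name: str
    dt1: datetime  # last time someone was known to be home
    dt2: datetime  # last time a call was made, answered or not

    def last_call_answered(self) -> bool:
        # An answered call ends at DT1, so DT1 >= DT2.
        # An unanswered call leaves DT1 in the past, so DT1 < DT2.
        return self.dt1 >= self.dt2

    def last_call_duration(self) -> timedelta:
        # Length of the last call; zero if nobody answered.
        if self.last_call_answered():
            return self.dt1 - self.dt2
        return timedelta(0)


now = datetime(2014, 6, 26, 12, 0)  # arbitrary "now" for the example
a = Contact("A", dt1=now - timedelta(hours=10), dt2=now - timedelta(hours=2))
b = Contact("B", dt1=now - timedelta(hours=16), dt2=now - timedelta(hours=17))

print(a.last_call_answered(), a.last_call_duration())  # False 0:00:00
print(b.last_call_answered(), b.last_call_duration())  # True 1:00:00
```

This only restates what the two datetimes say about the past; it does not by itself answer which of A or B is more likely to be home now.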

These are 'normal' everyday cases. Now for an extreme case:

For a telephone number C, DT1 is 1 minute ago and DT2 is 3 minutes ago.
And for a telephone number D, DT1 is 1 minute ago and DT2 is 10 years ago.

What does this tell us about C and D? We know both of them were definitely home 1 minute ago. There was a call with C that lasted exactly 2 minutes, and there was a call with D that lasted 10 years less 1 minute. Can we say that D is more likely to be home than C since they spent the last 10 years at home on one phone call? Not really: maybe they're housebound, but on the other hand maybe they were dying to finish that call and get outside, so it's far more likely they're not at home now.

Yes, this tells us nothing, but think about it in terms of limits: as DT1 approaches now, the likelihood of the person being home now must surely go up, and as DT2 approaches now, the likelihood of the person being home must surely go down, assuming DT1 < DT2.

I'll come back later

Simon Bridge · Jun 26, 2014

You need to have a model for the likelihood that someone will be home given when they were last called and whether they were home then.

Since you insist that we cannot assume someone is more likely to be home at 2pm just because they were last called then, it means you have only an equal a priori likelihood, irrespective of the data.
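To make "a model" concrete, here is one hypothetical example that is not proposed anywhere in this thread: a two-state (home/away) continuous-time Markov chain with constant switching rates and no time-of-day effect. The function name `p_home_now` and both default rates are made-up assumptions. Under this model only the most recent observation matters: "away at DT2" if the last call went unanswered, "home at DT1" otherwise.

```python
import math


def p_home_now(elapsed_hours: float, was_home: bool,
               leave_rate: float = 0.2, return_rate: float = 0.1) -> float:
    """P(home now) given one observation made `elapsed_hours` ago.

    Assumes a two-state home/away Markov chain with constant rates per hour.
    Both default rates are invented for illustration, not fitted to any data.
    """
    total = leave_rate + return_rate
    p_long_run = return_rate / total          # long-run fraction of time at home
    decay = math.exp(-total * elapsed_hours)  # how much the observation still counts
    if was_home:
        # Starts at 1 and decays towards the long-run fraction.
        return p_long_run + (1.0 - p_long_run) * decay
    # Starts at 0 and rises towards the long-run fraction.
    return p_long_run * (1.0 - decay)


print(p_home_now(2, was_home=False))   # case A: not home 2 hours ago
print(p_home_now(16, was_home=True))   # case B: home 16 hours ago
```

In this particular model, a number whose latest observation was "home" always scores above the long-run fraction and one whose latest observation was "away" always scores below it, so B ranks above A whatever the rates are. Other models could rank them differently.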
That seems self-defeating.

But it's your project - enjoy.

noelo2014 · Jun 26, 2014

I see your point, Simon. Anyway, I've programmed this function and went with the following idea (for anybody who's interested).

1. I find the max and min of the set of times when people were last known to be home, then normalize these times to between 0 and 1.

2. I find the max and min of the set of times when people were last called (successfully or not), normalize these times to between 0 and 1, then subtract each of them from 1 (so 0.3 becomes 0.7, etc.).

I then take the average of the two values to arrive at the final score: the higher the score, the more likely the person is to be home.

I'm going with this algorithm for my program because, since I have no other data, it seems intuitively right. OK, I could keep track of *all* the connection attempts and analyze them statistically, but I want to keep things simple. This was a surprisingly difficult problem though; maybe I'm the first to solve it. Hope it helps someone. I can't spend any more time on it.

This could be wrong though; the only way to check it is to see how it works in real life.
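The two normalization steps and the averaging described above might look roughly like this in Python. This is a sketch, not the actual program: the function names are hypothetical, and returning 0.5 when every time in a set is identical is an added assumption, since the description doesn't cover that case.

```python
from datetime import datetime, timedelta


def normalize(times):
    """Min-max scale a list of datetimes to floats between 0 and 1."""
    lo, hi = min(times), max(times)
    if hi == lo:
        # Assumption: all times equal, so use a neutral 0.5 and avoid dividing by zero.
        return [0.5] * len(times)
    return [(t - lo) / (hi - lo) for t in times]


def home_scores(records):
    """records: list of (dt1, dt2) pairs; returns one score per record."""
    seen = normalize([dt1 for dt1, _ in records])
    called = normalize([dt2 for _, dt2 in records])
    # A recent DT1 raises the score; a recent DT2 lowers it (hence 1 - c).
    return [(s + (1.0 - c)) / 2.0 for s, c in zip(seen, called)]


now = datetime(2014, 6, 26, 12, 0)  # arbitrary "now" for the example
names = ["A", "B", "C"]
records = [
    (now - timedelta(hours=10), now - timedelta(hours=2)),     # A
    (now - timedelta(hours=16), now - timedelta(hours=17)),    # B
    (now - timedelta(minutes=1), now - timedelta(minutes=3)),  # C
]
scores = home_scores(records)
ranked = sorted(zip(scores, names), reverse=True)  # highest score first
print(ranked)
```

Note that the scores depend on the min and max of the whole list, so a single extreme value (such as a DT2 from 10 years ago) compresses the normalized values of every other entry.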
