Probability That Receipt Of Email 1 AND Email 2 Was Random


Discussion Overview

The discussion revolves around calculating the probability that the receipt of two virus emails, Virus Email 1 and Virus Email 2, was random, particularly in relation to two events: Event A (an appeal to an insurance company) and Event B (a letter of complaint regarding the first email). The scope includes elements of probability theory, real-life implications, and the correlation between events and email receipts.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • Steve describes a sequence of events leading to the receipt of two virus emails and seeks to calculate the probability of their receipt being random.
  • Some participants propose that the probability of receiving an email related to an event is not uniform and may depend on the timing relative to the event.
  • One participant emphasizes the importance of knowing the distribution of the random variable when calculating probabilities, cautioning against assumptions without empirical data.
  • Another participant suggests that the assumption of perfect correlation between the events and emails may not hold, indicating the need for data to estimate probabilities accurately.
  • Steve clarifies that he lacks data to support his calculations and is trying to determine the likelihood of the emails being sent by someone related to the insurance case versus being random spam.
  • A later reply states that the available data is insufficient for meaningful analysis and suggests consulting an internet security specialist for further investigation.

Areas of Agreement / Disagreement

Participants express differing views on the ability to calculate the probability of the emails being random, with some emphasizing the need for data and proper statistical methods, while others focus on the correlation between the events and the emails. No consensus is reached regarding the probability calculation or the nature of the email sources.

Contextual Notes

Limitations include the absence of empirical data to support claims about the correlation between events and email receipts, as well as the reliance on assumptions that may not be verifiable.

Steve3
An issue I’ll call Issue 1 arose and prompted Event A to occur. A few days after Event A occurred, I received a virus email I’ll call Virus Email 1. The content of Virus Email 1 referred to something that is a direct outcome of Event A. Therefore, Virus Email 1 is directly related to Event A. Later Event B occurred. Event B was about Issue 1 and about receiving Virus Email 1. Therefore, Event B is directly related to Event A. A few days after Event B occurred, I received another virus email I’ll call Virus Email 2. The content of Virus Email 2 was identical to Virus Email 1. Therefore, Virus Email 2 is directly related to Virus Email 1.

I would like to calculate the probability that the receipt of Virus Email 1 AND Virus Email 2 was random.

Some thoughts about the problem (right or wrong ?) ...
* The probability of receiving an email any day of the year (ignoring a leap year) is 1 out of 365.
* However, the probability of receiving an email directly related to an event after the event occurs is not 1 out of 365.
* The probability that an email was random, even though it is directly related to an event and arrived after it, increases with the number of days since the event. In other words, the probability of the email being random is much higher if 100 days have passed since the event than if 3 days have passed.
* So my problem reduces to how to calculate the probability of receiving an email directly related to an event X days after the event occurs.
* Intuitively, it seems that receiving both Virus Email 1 and Virus Email 2 has a bearing on the probability that both emails were random. Generally, if the probability of receiving Virus Email 1 was 1/50 and the probability of receiving Virus Email 2 was 1/50, the probability of receiving both virus emails would be 1/50 x 1/50. Intuitively, it seems that the probability of receiving both emails would be somewhat less than that because Virus Email 1 and Virus Email 2 are related in that Event A and Event B are related.

Thanks for any thoughts and/or suggestions. Hoping to get a solution!

Steve
 
Hi Steve,

Welcome to MHB! :)

It's a little hard to follow your situation but I'll make some general comments. I'm assuming this is a real life situation instead of an exercise from a textbook.

When we are calculating probabilities, we need to know the distribution of the random variable in question. Some events are equally likely to occur on any given day (uniform) while others are not (Poisson, geometric, normal, etc.). It can be dangerous to assume that something automatically follows a certain distribution without empirical data to back that up.
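To make the point about distributions concrete, here is a minimal Python sketch that compares two possible models for "an unrelated email happens to arrive within a few days of an event". The function names, the 3-day window, and the rate of 0.01 emails per day are all hypothetical choices for illustration; nothing in this thread supplies real values for them.

```python
import math

def p_uniform_window(window_days: int, period_days: int = 365) -> float:
    """P(one email, equally likely on any day of the period, lands in the window)."""
    return window_days / period_days

def p_poisson_window(rate_per_day: float, window_days: float) -> float:
    """P(at least one email arrives in the window) if arrivals follow a Poisson process."""
    return 1.0 - math.exp(-rate_per_day * window_days)

# Hypothetical inputs: a 3-day window and an ASSUMED rate of 0.01 such emails per day.
window = 3
assumed_rate = 0.01

print("uniform model:", p_uniform_window(window))
print("Poisson model:", p_poisson_window(assumed_rate, window))

# Under the Poisson model a longer window makes a coincidental arrival more likely.
print("Poisson, 100-day window:", p_poisson_window(assumed_rate, 100))
```

The two models give different answers for the same window, and the Poisson answer moves with whatever rate you plug in. That is the reason the distribution and its parameters have to come from data rather than from a guess.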

Also, how do you know that the emails and events are perfectly correlated? Maybe they have a strong correlation but don't follow each other exactly...

Anyway, a nice formula to use when we have some prior knowledge of an event is: $\displaystyle P(A|B)=\frac{P(A \cap B)}{P(B)}=\frac{P(A) \times P(B|A)}{P(B)}$.

This requires knowing the various probabilities though. What I think you should do, if this is a real problem, is work with some data to estimate the correlations and probability. You won't be able to work on this purely theoretically. The idea that the probability of both events occurring is $\dfrac{1}{50} \cdot \dfrac{1}{50}$ is only true if the events are independent. Since you think the events are perfectly correlated, they are definitely not independent.
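As a rough illustration of why independence matters, the Python sketch below computes the joint probability two ways and then applies the conditional probability formula. The value 1/50 comes from your example; the conditional probability of 1/2 is a made-up number chosen only to show the effect of dependence, and the function names are my own.

```python
from fractions import Fraction

def joint(p_first, p_second_given_first):
    """P(first and second) = P(first) * P(second | first)."""
    return p_first * p_second_given_first

def bayes(p_a, p_b_given_a, p_b):
    """P(A | B) = P(A) * P(B | A) / P(B)."""
    return p_a * p_b_given_a / p_b

p_email = Fraction(1, 50)  # probability of each email, from the example in the thread

# Independent case: P(second | first) is just P(second).
independent = joint(p_email, p_email)

# Dependent case: ASSUME (hypothetically) that the second email follows the
# first half of the time. This 1/2 is not based on any data.
dependent = joint(p_email, Fraction(1, 2))

print("independent joint:", independent)
print("dependent joint:  ", dependent)

# Conditional probability of the first email given the second, same assumed inputs.
print("P(first | second):", bayes(p_email, Fraction(1, 2), p_email))
```

With these assumed inputs the dependent joint probability is not the product of the two individual probabilities, which is why $\dfrac{1}{50} \cdot \dfrac{1}{50}$ cannot be used once the emails are related. The actual size of the difference depends entirely on the conditional probability, and that has to be estimated from data.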

Hope this gives you something to think about. :)
 
Hello Jameson,

Thanks for your reply to my post!

I'm new as of today so I'm still just finding my way. Did I get your name right? I clicked on Reply To Thread to get here to reply back to you; was that correct?

Yes, this is a real-life situation. I appealed a decision by my insurance company to an independent review body (Event A) and three days later I received an email with a zip file attachment containing a virus (Virus Email 1). The email said that I was to appear in court and that my case would be heard by a judge in my absence if I didn't appear. It said I needed to open the attachment for the details. I knew this was a fake because courts do not send out notices by email. I immediately suspected that either the insurance company employee or her superior, who made the decision in my case, had sent the email. Shortly after receiving Virus Email 1 I wrote a letter to the superior's boss (Event B) complaining about how the decision was made by the employee and her superior and complaining about Virus Email 1. Four days after that letter was sent, I received Virus Email 2, a duplicate of Virus Email 1. Checking the headers of both emails, I found that both had the same source, but the headers were spoofed so I could not identify the sender. So now I want to determine whether both emails were likely sent by the employee and/or her superior, or were randomly sent by a spammer with dates that only coincidentally followed Event A and Event B.

As you can see, I have no data to work with to estimate the correlations and probability.

Does this make it more clear?
 
Yes, you got my name right and replied correctly. :)

I am sorry you are in this pickle right now, but to be straight with you, you don't have enough data to show anything, and the data you have is based on non-verifiable assumptions. I'm a master's student in statistics, and trying to leverage data into something meaningful is what I love to do, but you just don't have the right pieces.

In my opinion what you need is someone who specializes in internet security and could investigate further with what you have. Good luck.
 
