The problem is the following:

  • Suppose we have a line segment of length $L$ and we randomly choose (with uniform distribution) $n$ points on this segment, so that the segment is divided into $n+1$ sub-segments.
  • Now we define a random variable describing the length of one of these sub-segments, and we want to find its distribution.
  • My approach to this problem was first to construct random variables $X_i$ whose realisations are points on the line segment; we know that each has a uniform distribution:

$$ \rho_{X_i}(x) = 1/L\;\;\text{if}\;\; 0\leq x\leq L,\;\;\text{and}\;\; \rho_{X_i}(x) = 0\;\;\text{elsewhere.} $$

  • Then we define the random variables $$ D_{i} = X_i - X_{i-1}\quad\mbox{with the conventions}\quad X_0 = 0,\ X_{n+1} = L, $$ and we want to find the distribution of the random variable $D_i$.
  • After a short look at "convolution of probability distributions", I found out that the distribution of a SUM of independent random variables is the convolution of their distributions, but I don't know what the formula for a difference is.
  • My first guess was that it is a convolution but with a plus sign, $\int \rho (x+y)\,\rho (y)\,dy$ (which in the case of uniform distributions changes nothing), but the result is wrong: the convolution of two constant functions is a constant function, and I expect something different (maybe an exponential distribution). A quick simulation sketch of the spacings follows this list.
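
For what it's worth, here is the quick simulation sketch I mean (Python/NumPy and the particular $n$ are just my own choices for illustration, with $L = 1$):

```python
import numpy as np

# Monte Carlo sketch: drop n uniform points on [0, L], sort them, and look at
# the spacings D_i = X_(i) - X_(i-1), with X_(0) = 0 and X_(n+1) = L.
rng = np.random.default_rng(0)
L, n, trials = 1.0, 5, 100_000

points = rng.uniform(0.0, L, size=(trials, n))
ordered = np.sort(points, axis=1)
padded = np.concatenate(
    [np.zeros((trials, 1)), ordered, np.full((trials, 1), L)], axis=1
)
spacings = np.diff(padded, axis=1)          # shape (trials, n + 1)

# Empirical density of the first spacing D_1 = min(X_i): the histogram decays
# from d = 0, so it is clearly not a flat (uniform) density.
hist, _ = np.histogram(spacings[:, 0], bins=50, range=(0.0, L), density=True)
print(np.round(hist[:5], 2))
```

To my eye the histogram decays steadily from $d = 0$, which already rules out a flat density, but I don't know what the exact distribution is.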

I will be thankful for any advice.

  • The first issue with your approach is that there is no reason why $X_i - X_{i-1}$ must be positive. It's true that the pdf of each $X_i$ is $1/L$, but there is no ordering of these variables. What you might want to use is the order statistics, $X_{(i)}$ -- so for example, $X_{(1)} \equiv \min \{ X_i \}$. These have their own pdfs that will differ according to $i$. Have you worked with those before? – dmk, May 8, 2024 at 12:44
  • This ought to give you almost everything you need: en.wikipedia.org/wiki/… – dmk, May 8, 2024 at 13:47
  • Thanks for the hint. I have never worked with order statistics; the article gives a ready-made formula for the distribution of the $r$-th order statistic. I suppose the next steps after using that formula are to set $D_i = X_{(i)}-X_{(i-1)}$ and find $\rho_{D_i}$ by applying a convolution? I'm also looking for a derivation of this formula, because I can't see one on the wiki. – XaveryXavier, May 8, 2024 at 15:48

1 Answer


The key bit from the Wikipedia article I mentioned above is its characterization of the difference of two order statistics from a $U(0,1)$ distribution: If $U_1, U_2, \ldots, U_n \overset{\text{iid}}{\sim} U(0,1)$, then the difference between order statistics satisfies $U_{(k)} - U_{(j)} \sim \text{Beta}(k-j,\, n-(k-j)+1)$, where $k > j$.
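
If you'd like a numerical sanity check of that statement before trusting it, here is a rough sketch (Python with NumPy/SciPy is simply my choice of tools; the particular $n$, $j$, $k$ are arbitrary):

```python
import numpy as np
from scipy import stats

# Rough check of U_(k) - U_(j) ~ Beta(k - j, n - (k - j) + 1) by simulation.
rng = np.random.default_rng(1)
n, j, k, trials = 7, 2, 5, 200_000

u = np.sort(rng.uniform(size=(trials, n)), axis=1)
diff = u[:, k - 1] - u[:, j - 1]            # order statistics are 1-indexed

grid = np.linspace(0.05, 0.95, 5)
# crude density estimate: P(|diff - g| < h) / (2h)
h = 0.01
empirical = [np.mean(np.abs(diff - g) < h) / (2 * h) for g in grid]
theoretical = stats.beta.pdf(grid, k - j, n - (k - j) + 1)
print(np.round(empirical, 2))
print(np.round(theoretical, 2))
```

The two printed rows should agree up to simulation noise.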

So let's set $L=1$ for the moment. Since you are interested in consecutive order statistics, you'll have $D_i = X_{(i)} -X_{(i-1)} \sim \text{Beta}(1, n)$, for $2 \le i \le n$. If $i = 1$, we have

$$D_1 = X_{(1)} - X_{(0)} = X_{(1)} - 0 = \min\{X_i\},$$

and if $i = n+1$, we have

$$D_{n+1} = X_{(n+1)} - X_{(n)} = 1 - X_{(n)} = 1 - \max\{X_i\}. $$

Perhaps it's not too much of a surprise, given the symmetry of the $U(0,1)$ distribution, that these two edge cases actually have the same distribution. So it's enough to compute the distribution of $\min\{X_i\}$ directly, which we can do using straightforward methods:

\begin{align}
P(D_1 \le d) &= P(\min\{X_i\} \le d) \\
&= P\Big(\bigcup_{i=1}^n [X_i \le d]\Big) \\
&= P\Big(\bigcup_{i=1}^n [X_i > d]^\prime\Big) = P\Big(\Big[\bigcap_{i=1}^n [X_i > d]\Big]^\prime\Big) = 1 - P\Big(\bigcap_{i=1}^n [X_i > d]\Big) \\
&\overset{\text{iid}}{=} 1 - \prod_{i=1}^n P(X_i > d) = 1 - \prod_{i=1}^n [1 - F_X(d)] \\
&= 1 - (1 - d)^n.
\end{align}

Since $F_{D_1}(d) = 1 - (1 - d)^n$, we have the pdf $f_{D_1}(d) = n(1-d)^{n-1}$; this is the pdf of a $\text{Beta}(1,n)$ variable. (It's not difficult to show that $D_{n+1}$ has the same distribution.)
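
If you want to double-check that CDF without trusting my algebra, here is another small sketch (again Python/NumPy, with $L = 1$ and an arbitrary $n$):

```python
import numpy as np

# Compare the empirical CDF of D_1 = min(X_i) with F(d) = 1 - (1 - d)^n.
rng = np.random.default_rng(2)
n, trials = 4, 200_000

mins = rng.uniform(size=(trials, n)).min(axis=1)
for d in (0.1, 0.25, 0.5):
    print(d, np.mean(mins <= d), 1 - (1 - d) ** n)
```

The empirical and closed-form values should line up to a couple of decimal places.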

Therefore, all of your $D_i$ have the same distribution: $\text{Beta}(1,n)$.

Wait! I just remembered: I set $L = 1$ :). Hopefully it's intuitive that your more generally defined $D_i$ is a simple transformation of $\text{Beta}(1,n)$ -- namely, if $W \sim \text{Beta}(1,n)$, then $ D_i = LW$ solves your problem :).
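
One last sketch (same caveats as above: Python/NumPy, arbitrary $L$ and $n$ of my choosing) to illustrate that scaling step, comparing an interior spacing on $[0, L]$ against draws of $L \cdot W$ with $W \sim \text{Beta}(1, n)$:

```python
import numpy as np

# Scaling step: a spacing built from points on [0, L] should behave like L * W
# with W ~ Beta(1, n). Compare a few summary statistics of both.
rng = np.random.default_rng(3)
L, n, trials = 3.0, 6, 200_000

pts = np.sort(rng.uniform(0.0, L, size=(trials, n)), axis=1)
d2 = pts[:, 1] - pts[:, 0]                  # an interior spacing, D_2
w = rng.beta(1, n, size=trials)             # W ~ Beta(1, n)

print(np.mean(d2), L * np.mean(w))          # both close to L / (n + 1)
print(np.quantile(d2, 0.9), L * np.quantile(w, 0.9))
```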

...

So. If you have the result from the Wikipedia page I cited, the problem becomes fairly straightforward. If you don't...

I'm pretty rusty on this stuff. I have some idea of a path through this problem; in fact, I started down one such path before I stumbled upon that very nice result on Wikipedia. But it might take a while to put those thoughts in order. I'll see what I can do, but hopefully the above is satisfactory for the moment :).

I would be hesitant to try convolution because, if you want to try something like $X - Y = X + (-Y)$, then you're gonna have to start defining new random variables, which will have new pdfs and new supports. So it probably won't be as nice as you expect -- regardless of how nice that is :).

