AI Assistant
Transcript
00:02:210Paolo Guiotto: Okay, so…
00:17:80Paolo Guiotto: So, the topic of today is, basically, the law of large numbers.
00:33:890Paolo Guiotto: The law of large numbers, basically, is the theoretical result that justifies the empirical interpretation of probability.
00:50:400Paolo Guiotto: In the sense that, for example, if we toss a coin, well, we have one half probability to have head, and one half probability to have tail.
01:09:910Paolo Guiotto: Well, of course, we cannot predict a single experiment, but if we repeat the same experiment of tossing a coin a large number of times, we may say that 50% of the times we will have head, and 50% of the times we will have tail. This is the idea.
01:32:20Paolo Guiotto: This can be formalized in this way. So let's imagine that the outcome of the experiment is represented by a value like H or T, which stands for head or tail.
01:49:840Paolo Guiotto: But to treat this numerically, we do the following. We assume that the variable XK represents the outcome
02:07:120Paolo Guiotto: of the K-th toss of a coin.
02:19:310Paolo Guiotto: And we give the interpretation that XK
02:22:870Paolo Guiotto: equals 0, for example, means head, and XK equals 1 means tail.
02:36:90Paolo Guiotto: From the probabilistic point of view, the probability that XK is 0 is one half.
02:46:10Paolo Guiotto: And the probability that XK is 1 is one half. So in practice, these variables are Bernoulli random variables, so…
02:57:330Paolo Guiotto: the XK are Bernoulli random variables.
03:08:720Paolo Guiotto: The fact that each experiment is not affected by the previous ones is translated formally by assuming that these XK are independent random variables.
03:31:770Paolo Guiotto: Now,
03:35:00Paolo Guiotto: the averaging over the experiments can be translated by doing an average of these outcomes. So we take a number, capital N, of experiments, and we assume that this number is big.
03:52:330Paolo Guiotto: And we do the average of the outcomes, so 1 over N times the sum for K going from 1 to N of the XK.
04:01:220Paolo Guiotto: This is why we transformed these variables into numerical variables, in such a way that we can do something like this.
04:09:690Paolo Guiotto: So we may expect that if half of the times we have zeros and half of the times we have ones, this average should be about one half.
04:21:820Paolo Guiotto: So we expect that, as N gets bigger, this quantity will approach the value one half, for N going to plus infinity.
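As an editorial aside, not part of the lecture: this expectation can be checked with a quick simulation. The function name and the sample sizes are illustrative choices.

```python
import random

random.seed(0)

# Simulate N coin tosses: X_k = 0 for head, 1 for tail,
# each with probability 1/2, independent of the others.
def average_of_tosses(n):
    return sum(random.randint(0, 1) for _ in range(n)) / n

# As N grows, the empirical average approaches 1/2.
for n in (10, 1000, 100000):
    print(n, average_of_tosses(n))
```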
04:31:610Paolo Guiotto: Let's see. Intuitively, in the popular interpretation of probability, probability refers exactly to this fact: when you do a repeated number of experiments, the average outcome is what happens in reality. And this is exactly this value, one half.
04:58:870Paolo Guiotto: The value 1 half turns out to be the expected value, the common expected value of the XK.
05:06:430Paolo Guiotto: Because, since the variable is a Bernoulli variable, so it takes just two values, the expectation of XK is 0 times the indicator of the event where XK is 0, plus 1 times the indicator of the event where XK is 1. Now, the first product is 0, so it remains the expectation of the indicator where XK is 1, and that's the probability that XK is 1, which is equal to one half.
05:46:30Paolo Guiotto: So,
05:48:650Paolo Guiotto: Now, however, to prove that this is true is non-trivial, and moreover, these quantities here are averages of random variables, so they are themselves random variables.
06:06:210Paolo Guiotto: So here on the left, we have a sequence of random variables that converges, in some suitable sense (we have seen there are different senses of convergence), to the common expected value of the sequence.
06:19:610Paolo Guiotto: The way this sequence converges is important, because it changes the meaning, the interpretation, the force of this result. However, the law of large numbers should be just this fact.
06:38:30Paolo Guiotto: The law of large numbers justifies that, when you do an average of certain variables XK, where the XK are independent and identically distributed,
07:05:450Paolo Guiotto: so with the same distribution (here, for example, all the XK were Bernoulli random variables with the same distribution; you see that this distribution does not change with K),
07:18:460Paolo Guiotto: this thing will converge, in some suitable sense, to the common expected value of the XK.
07:25:700Paolo Guiotto: Well, actually, to write this, you need also that the XK are not just random variables: they must have expectations, okay? So there is some number, let's call it m, which is constantly the expected value of the XK. This is to say that it is independent of K, because they all have the same distribution, so they all have the same expected value.
07:58:960Paolo Guiotto: Provided that the XK be a sequence of variables for which the expectation is defined, so they are in L1.
08:09:110Paolo Guiotto: So the law of large numbers should tell something like this, and
08:15:420Paolo Guiotto: Of course, the problem is, is that true under these conditions, and what is the…
08:22:70Paolo Guiotto: way of convergence. So basically, we consider two families of laws of large numbers: the weak law, where the convergence is weak, in some sense,
08:36:700Paolo Guiotto: and by weak we mean either in distribution or in probability; and the strong law, where the convergence is either pointwise almost sure, or in L1, okay? But apart from these differences, the message is always the same.
08:52:800Paolo Guiotto: So we start with the weak law, and one of the classical weak laws is the Chebyshev weak law.
09:03:630Paolo Guiotto: So, let's talk about the Chebyshev weak law.
09:16:790Paolo Guiotto: As you understand, the stronger the type of convergence, the more difficult it will be to obtain the conclusion.
09:29:890Paolo Guiotto: So this is the proposition. We assume here…
09:35:600Paolo Guiotto: a little bit more restrictive assumptions on one side, and less restrictive on another. So we assume that the XK are in L2 of omega, because we need the variance, basically; so it's a bit more than L1.
09:52:700Paolo Guiotto: They are independent random variables, such that…
10:02:20Paolo Guiotto: we relax a bit the setup: we do not need that they are identically distributed.
10:12:520Paolo Guiotto: We need just that they have the same mean value, so they could be differently distributed, such that the expected value of XK, which is well-defined, because if you are in L2 you are also in L1,
10:26:920Paolo Guiotto: is constantly equal to the value m. And the variances…
10:34:270Paolo Guiotto: of the XK: we need a control on this quantity, namely that it is bounded, bounded by some constant capital M for every K.
10:46:130Paolo Guiotto: So, the variances of the XK, as a sequence, are bounded.
10:57:470Paolo Guiotto: These are positive numbers, bounded, so positive and bounded, but this is not a requirement because the variance is always positive.
11:06:740Paolo Guiotto: So, as you can see, it is… of course, if the variables are identically distributed, they have the same expectation, they have the same variance, so the two quantities are constant, and so these hypotheses are verified.
11:21:470Paolo Guiotto: But let's say that this is a little bit less restrictive. It says just they must have the same mean and bounded variance.
11:30:640Paolo Guiotto: Then we have this conclusion: if we do the average, 1 over N times the sum for K from 1 to N of the XK, so the average of the XK, this converges in probability to the value m.
11:49:420Paolo Guiotto: And moreover, there is also an important bound, which is called the Chebyshev bound.
11:58:880Paolo Guiotto: Moreover, in fact, it is the Chebyshev bound which is proved, and from the Chebyshev bound the conclusion follows.
12:21:160Paolo Guiotto: The bound says: you assess the probability that the average, 1 over N times the sum for K going from 1 to N of the XK, be away from the common mean value m by at least a value epsilon. Now, the probability that this happens is less or equal than 1 over epsilon squared N squared, times the sum for K going from 1 to N of the variances of the XK.
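As an editorial aside, the bound just stated can be checked empirically. The sketch below assumes Bernoulli(1/2) variables (mean 1/2, variance 1/4); the function names and trial counts are mine, not the lecture's.

```python
import random

random.seed(1)

# Empirically check the Chebyshev bound
#   P(|avg - m| >= eps) <= (1 / (eps^2 N^2)) * sum_k Var(X_k)
# for X_k Bernoulli(1/2): m = 1/2, Var(X_k) = 1/4.
def lhs_probability(n, eps, trials=5000):
    count = 0
    for _ in range(trials):
        avg = sum(random.randint(0, 1) for _ in range(n)) / n
        if abs(avg - 0.5) >= eps:
            count += 1
    return count / trials

def chebyshev_bound(n, eps, var=0.25):
    # (1/(eps^2 n^2)) * n * var, which simplifies to var / (eps^2 n)
    return (1 / (eps**2 * n**2)) * (n * var)

n, eps = 100, 0.1
print(lhs_probability(n, eps), chebyshev_bound(n, eps))
```

The empirical probability should never exceed the bound (up to sampling noise, which is tiny here).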
13:04:570Paolo Guiotto: Okay, basically, as you will see in a second, the Chebyshev bound implies the convergence in probability, no? Let's introduce a symbol: we call XN bar, the average, just 1 over N times the sum for K going from 1 to N of the XK.
13:33:870Paolo Guiotto: So, let me start by proving the Chebyshev bound.
13:52:450Paolo Guiotto: So we have to estimate the probability that this XN bar is away from the common mean value m by at least this epsilon.
14:05:350Paolo Guiotto: So what we do is, we just rewrite: you see, there is this 1 over N that can be carried to the right-hand side, saying that this is the probability that the modulus of the sum for K going from 1 to N of the XK, minus N times m, be greater or equal than epsilon N.
14:41:90Paolo Guiotto: Now,
14:43:60Paolo Guiotto: we put this N times m into the sum, because it is m plus m plus … plus m, N times.
14:58:650Paolo Guiotto: And we give to each XK an m, so this is transformed into the probability that the modulus of the sum for K going from 1 to N of XK minus m be greater or equal than epsilon N.
15:21:280Paolo Guiotto: Now, here we use the Chebyshev inequality.
15:26:400Paolo Guiotto: So the Chebyshev inequality
15:30:830Paolo Guiotto: says that if we have a positive random variable Y (it's enough that it is a measurable function), then the probability that Y is greater or equal than alpha, where alpha is a positive number, is less or equal than 1 over alpha times the expectation of Y. This is the standard Chebyshev bound.
16:05:300Paolo Guiotto: But here, we use a refinement of this inequality,
16:27:720Paolo Guiotto: which is based basically on the same idea. So, consider the probability that Y is greater or equal than alpha. You remind that, to prove the Chebyshev inequality, we just say this is the expected value of the indicator of the event Y greater or equal than alpha, okay?
16:46:960Paolo Guiotto: Now, the trick is to think that there is a 1 here, in front of the indicator. When you take an omega for which Y of omega is greater or equal than alpha, since alpha is positive, dividing by alpha this is equivalent to saying that 1 is less or equal than Y of omega over alpha; then you put this into the expectation, and you get the inequality.
17:16:440Paolo Guiotto: Well, since Y of omega over alpha is greater or equal than 1, and everything here is positive, the same holds for its square: you square the inequality, so 1 is less or equal than Y squared over alpha squared.
17:38:830Paolo Guiotto: And so we plug this into the expectation: since everything is positive, this is less or equal than the expectation of Y squared over alpha squared times the indicator of Y greater or equal than alpha. We carry outside this 1 over alpha squared, so we get 1 over alpha squared times the expected value of Y squared times the indicator where Y is greater or equal than alpha.
18:10:980Paolo Guiotto: And this would be the refined Chebyshev inequality: the probability that Y is greater or equal than alpha is less or equal than this. And now, since the indicator is less or equal than 1, this is also less or equal than 1 over alpha squared times the expected value of Y squared.
18:32:420Paolo Guiotto: And this is what we use.
18:34:870Paolo Guiotto: So we will use this,
18:38:970Paolo Guiotto: this form of the Chebyshev inequality: the probability that Y is greater or equal than alpha is less or equal than 1 over alpha squared times the expected value of Y squared. So, we apply this to our situation.
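The refined inequality, probability of Y greater or equal than alpha bounded by the expectation of Y squared over alpha squared, can be sanity-checked numerically. A sketch, not from the lecture: the choice of Y (the absolute value of a sum of plus-or-minus-one steps) and all names are illustrative assumptions.

```python
import random

random.seed(2)

# Numerically check P(Y >= alpha) <= E[Y^2] / alpha^2
# for a positive random variable Y; here Y = |Z| with Z a sum
# of 20 independent +1/-1 steps, so E[Y^2] = Var(Z) = 20.
samples = [abs(sum(random.choice((-1, 1)) for _ in range(20)))
           for _ in range(20000)]

alpha = 8
p = sum(1 for y in samples if y >= alpha) / len(samples)
second_moment = sum(y * y for y in samples) / len(samples)
print(p, second_moment / alpha**2)
```

The printed probability should sit well below the printed bound.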
18:59:690Paolo Guiotto: We have to estimate the probability that the modulus of the sum for K going from 1 to N of XK minus m be greater or equal than epsilon N. The sum is the Y, and epsilon N is the alpha.
19:15:380Paolo Guiotto: So the probability that the modulus of the sum for K going from 1 to N of XK minus m be greater or equal than epsilon N is less or equal than 1 over alpha squared, so 1 over epsilon squared N squared, times the expectation of the square of Y, so of the square of the absolute value of the sum for K from 1 to N of XK minus m.
20:01:690Paolo Guiotto: And now we develop the square. This modulus squared is just an algebraic square: here we are dealing with real numbers, so there is no conjugate, and this is simply the square of the sum.
20:15:980Paolo Guiotto: How can we represent this square? Well, since we have a sum, the sum for K going from 1 to N of XK minus m,
20:30:890Paolo Guiotto: It is this multiplied by itself, so by a copy of the same sum, but let's use a second index, J, going from 1 to N of XJ minus M.
20:43:740Paolo Guiotto: So, since we have the product of two sums, we use the distributive property, and this is the sum of the two indexes, J and K, ranging from 1 to N, of these products, XK minus M times XJ minus M.
21:03:740Paolo Guiotto: All this is into the expectation. So we have, here, continuing down here, 1 over epsilon square n squared. The expectation of this double sum
21:36:340Paolo Guiotto: The expectation is linear, so we can carry the double sum outside: the sum over J and K going from 1 to N of the expectation of this product, XK minus m times XJ minus m.
21:36:340Paolo Guiotto: We are almost at the end, because now we notice that if we look at the expectation of XK minus M times XJ minus M,
21:48:920Paolo Guiotto: we remind that, in our assumptions, the XK are independent.
21:57:420Paolo Guiotto: So this expectation splits, provided the two indexes are different. We have basically two cases. If the index K is different from the index J, this splits into the product of the two expectations: the expectation of XK minus m times the expectation of XJ minus m.
22:22:30Paolo Guiotto: But the expectation of XK minus m is 0, because this is equal to the expectation of XK minus m, with the constant m carried out of the expectation, and that's equal to 0, because m is the common value of the expectation. This comes from the first hypothesis, hypothesis 1:
22:52:320Paolo Guiotto: they have a common mean value.
22:55:880Paolo Guiotto: So, in particular, all these quantities are equal to 0 for K different from J.
23:03:40Paolo Guiotto: For K equal to J we cannot split, but those terms become the expectation of XK minus m squared, and that's exactly the variance of XK.
23:18:620Paolo Guiotto: So, this means that in this double sum of expectations, all terms with K different from J are 0; there remain only the terms with K equal to J. So we continue: it is equal to 1 over epsilon squared N squared times the sum for K going from 1 to N of the variance of XK, since for K equal to J the expectation of the product becomes the variance of XK.
23:55:660Paolo Guiotto: And that's the Chebyshev bound, no? So, we proved that the probability that the average XN bar is away from m by at least epsilon is less or equal than 1 over epsilon squared N squared times the sum for K equal 1 to N of the variances of the XK. And this is the Chebyshev bound.
24:30:420Paolo Guiotto: And now we can conclude with the convergence in probability.
24:36:40Paolo Guiotto: So, from this,
24:42:400Paolo Guiotto: XN bar, the average, converges in probability to m.
24:47:890Paolo Guiotto: Because you know that the convergence in probability means that exactly that probability goes to zero when n goes to infinity.
24:58:440Paolo Guiotto: So we can say that the limit in N
25:01:770Paolo Guiotto: of the probability that the distance between the average of the XK and M is greater or equal than epsilon.
25:11:890Paolo Guiotto: Well, this is less or equal than…
25:15:560Paolo Guiotto: The limit of this right-hand side, the limit when n goes to plus infinity.
25:22:250Paolo Guiotto: of 1 over epsilon square n squared sum for k going from 1 to N.
25:29:360Paolo Guiotto: of the variance of the XK.
25:33:70Paolo Guiotto: And here is where assumption 2 comes in: each variance is bounded by a universal constant capital M, independent of K.
25:55:140Paolo Guiotto: So in this sum we can replace all the variances with capital M, and we have that this is less or equal than the limit, for N going to plus infinity, of 1 over epsilon squared N squared times the sum of capital M repeated N times, which is N times capital M. We cancel one N, and so we still have the limit, as N goes to plus infinity, of capital M divided by epsilon squared N, and this limit is equal to 0.
26:29:830Paolo Guiotto: And this finishes the proof of the Chebyshev weak law.
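The point of the relaxed hypotheses (same mean and bounded variances, but not identical distributions) can be illustrated with a sketch, again not part of the lecture; the particular mixture of distributions chosen here is an arbitrary assumption.

```python
import random

random.seed(3)

# A non-identically-distributed sequence satisfying the Chebyshev weak law:
# all variables have mean 1/2 and variance bounded by 1/4.
def sample_xk(k):
    if k % 2 == 0:
        return random.random()          # uniform on (0,1): mean 1/2, var 1/12
    return float(random.randint(0, 1))  # Bernoulli(1/2):  mean 1/2, var 1/4

def average(n):
    return sum(sample_xk(k) for k in range(n)) / n

# The averages still concentrate around the common mean 1/2.
for n in (10, 1000, 100000):
    print(n, average(n))
```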
26:38:970Paolo Guiotto: Now, we don't usually do this kind of application, also because this is not the best possible way to do it, but just to illustrate
26:54:790Paolo Guiotto: how this Chebyshev bound can be used: typically, this is done in statistics. Suppose that you want to determine the order of N, the number of experiments to take, so that the probability that the average is away from the common mean value by a fixed error epsilon is smaller than a certain small probability. So, typically, an example.
27:32:520Paolo Guiotto: So, assume that you know that the XK are in L2, and that the common mean value of the XK is m.
27:50:560Paolo Guiotto: For this type of application you don't even need to know m, but you know that the variance of the XK is, for example, a specific value: here we take the value 2.
28:08:840Paolo Guiotto: So, the problem is: determine N in such a way that the probability that XN bar, the average of the XK for K going from 1 to N, mismatches this value m by more than 1% of m, so by more than m over 100, be less than 1 over 100.
28:56:380Paolo Guiotto: So, what can we do? We know that the probability that the distance between the average of the XK and the common mean value is greater or equal than epsilon is bounded by 1 over epsilon squared N squared times the sum for K going from 1 to N of the variances of the XK. This is the Chebyshev bound.
29:23:400Paolo Guiotto: Okay, so, if we put this value of epsilon inside, we have that the probability that the distance between XN bar and m is larger than m over 100 (so you are away by 1% of the mean value) is less or equal than, this m over 100 being the epsilon, 1 over, m over 100, squared, times N squared, times the sum of the variances for K from 1 to N.
30:08:90Paolo Guiotto: And since it is given that these variances are all equal to 2, so we are assuming 2,
30:14:160Paolo Guiotto: And so this sum will be equal to 2N.
30:17:990Paolo Guiotto: So the probability that the average of the XK is away from m by more than 1% of m is bounded by this; one N simplifies with one of the two Ns, so we have 2 times 10 to the 4, divided by m squared times N.
30:50:400Paolo Guiotto: You want this to be less than 1 over 100. Well, you can impose that this bound be less than 1 over 100; if this is true, the probability will be smaller than 1 over 100 as well. And this
31:14:50Paolo Guiotto: is an inequality for the number N of elements you have to sum. So, this means N greater or equal than, say, 2 times 10 to the 6, divided by m squared. So it says you have to consider a sample of this order.
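The arithmetic of this example can be packaged in a short sketch; the helper name `required_n` and its default arguments are mine, not from the lecture.

```python
# Required sample size from the Chebyshev bound, for the lecture's example:
# variance 2, error 1% of the mean m, target probability 1/100.
#   bound: var / (eps^2 * N) < prob  =>  N > var / (eps^2 * prob)
# With eps = m/100, var = 2, prob = 1/100 this gives N > 2 * 10^6 / m^2.
def required_n(var=2.0, rel_err=0.01, prob=0.01, m=1.0):
    eps = rel_err * m
    return var / (eps**2 * prob)

print(required_n(m=1.0))  # about 2 * 10**6 for m = 1
```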
31:49:300Paolo Guiotto: So this is used by statisticians, for example, to determine the size of the sample of the population you have to interview to determine an average answer, the expected answer, within a certain given error: it says you have to consider this number of people. Now, this number, in this example with a realistic m, could be very large, so there are more efficient bounds to do this job, but this is an example of how this bound can be used to determine the size of a sample.
32:46:30Paolo Guiotto: Now, let's illustrate a couple of remarkable applications of the weak law, and the first one is the so-called Monte Carlo method to compute integrals.
33:17:650Paolo Guiotto: And actually, it is mostly used in probability to compute expectations.
33:30:450Paolo Guiotto: I will make it easy with the case of integrals. Suppose that we want to compute an integral, say, from 0 to 1 of a function f of x dx.
33:56:740Paolo Guiotto: It is not important that the interval be 0, 1, okay? You can translate and rescale to every interval.
34:04:950Paolo Guiotto: Now, if the function is good, say, if F is continuous.
34:12:860Paolo Guiotto: Well, intuitively, you know how an integral arises: in the construction of the integral, you start by dividing the integration interval into a certain finite number of parts.
34:35:960Paolo Guiotto: Then, you can say that to compute this, you discretize it by a sum
34:46:940Paolo Guiotto: where you evaluate the function. Say that these points are X0, which coincides with 0, X1, and in general XK, XK plus 1, and the last one, let's call it XN.
35:02:580Paolo Guiotto: So, what we do is: if this is the function, on the interval from XK to XK plus 1 we replace the true area with an approximated area. That could be, for example, the area of this rectangle I'm coloring here. This area is,
35:23:50Paolo Guiotto: for example, the value of the function at the left point, so f at the point XK, times the length of the base, which is XK plus 1 minus XK; and then you sum up these quantities. So you do the sum for K going from 0 to N minus 1
35:42:50Paolo Guiotto: of f of XK times XK plus 1 minus XK.
35:49:450Paolo Guiotto: Now, actually, this division is a division that depends on n, because n is the number of points you are dividing the interval, and the idea is that this formula becomes exact when you send N to infinity.
36:03:100Paolo Guiotto: So, formally, the result is: the integral from 0 to 1 of f of x dx is the limit, when N goes to plus infinity, of these sums, the sum for K going from 0 to N minus 1 of f of XKN times XK plus 1 N minus XKN. It is clear that the points depend on N, since there are N of them, so I should write something like XKN: they are the points of the N-th division we are using.
36:46:140Paolo Guiotto: So there is a condition that makes this formula exact: this subdivision cannot be just any subdivision. We have to add points, and add them in such a way that the maximum distance between two consecutive points goes to zero.
37:08:960Paolo Guiotto: Otherwise, imagine that we have a lot of points concentrating here while we keep the first interval as it is: there we don't get an approximation of the area. So the condition is that the maximum distance between two consecutive points, XK plus 1 N and XKN (the maximum over the points of the N-th subdivision, so for K going from 0 to N minus 1, a quantity depending on N), goes to zero.
37:51:990Paolo Guiotto: Okay? Now, we accept this fact, which can be proved if the function f is continuous. But intuitively, it is what is behind the idea of the Riemann integral: you compute the area by replacing the true area between the function and the axis with the areas of rectangles, and then you send the number of rectangles to infinity in a way that the bases of these rectangles shrink down to zero.
38:23:850Paolo Guiotto: In particular, a particular way of doing that is dividing the interval 0, 1, in n equal parts. So if we divide the interval 0, 1,
38:35:100Paolo Guiotto: in n equal parts, this is easy, because each part has the same length, so they will have length 1 over n. So the first is 0, the second is 1 over n, the third is 2 over N, the fourth is 3 over n, and so on. The last one is 1, which is n over n.
38:55:790Paolo Guiotto: So the generic point in the middle is just K over N. So here you have the formula: XKN is just K over N, for K going from 0 to N.
39:14:270Paolo Guiotto: And in this case, it is clear that if you take the distance between two consecutive points, that quantity is always the same, because this is our point XKN, equal to K over N, and the next one is K plus 1 over N: the distance between the two is the same number, 1 over N.
39:41:520Paolo Guiotto: So, all these distances are always equal to 1 over n independently of K, and that maximum is 1 over N.
39:51:280Paolo Guiotto: And therefore, this goes to 0 when n goes to infinity. So this particular subdivision verifies the requirement that must be verified according to this theorem. So we can say that
40:07:730Paolo Guiotto: From this, it follows that the integral from 0 to 1 of the function f is the limit, for N going to plus infinity, of the sum for K going from 0 to N minus 1 of f of XK (but XK is K over N, so f of K over N) times the difference between two consecutive XK, which is equal to 1 over N. So this is the limit, for N going to plus infinity, of 1 over N times the sum for K going from 0 to N minus 1 of f of K over N.
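The limit just derived is a left-endpoint Riemann sum on the uniform division. A quick sketch, not part of the lecture (function names and the test integrand are mine), shows it converging on an integrand with a known integral:

```python
# Left-endpoint Riemann sum on the uniform division of [0, 1]:
#   (1/N) * sum_{k=0}^{N-1} f(k/N)  ->  integral of f on [0, 1].
def riemann_sum(f, n):
    return sum(f(k / n) for k in range(n)) / n

# Example: the integral of x^2 on [0, 1] is 1/3.
for n in (10, 100, 10000):
    print(n, riemann_sum(lambda x: x * x, n))
```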
41:01:930Paolo Guiotto: Now, if you look at this quantity, it is an average: it is 1 over N times the sum of certain numbers. And here is where the idea pops up: to use the law of large numbers to transform this limit into an average of random variables.
41:24:470Paolo Guiotto: Now, these values here are distributed in the interval 0, 1 in an equal way, because we are taking the division of 0, 1 into N equal parts.
41:44:650Paolo Guiotto: Okay?
41:45:890Paolo Guiotto: So, we may imagine that these could be replaced by random variables that take values in 0, 1 in a uniform way.
42:02:690Paolo Guiotto: Because, you see, these points are uniformly distributed in the interval 0, 1. I'm not giving any specific definition of what uniformly distributed means. But you can say that if you fix a subinterval of 0, 1,
42:18:110Paolo Guiotto: the number of points is proportional to the length of the interval, because they are uniformly distributed, no? And this reminds of the uniform distribution, because if you take the values of a uniform random variable, distributed in 0, 1,
42:34:870Paolo Guiotto: That variable takes value in an interval proportionally to the length of the interval, no? If U is uniform on the interval 0, 1,
42:48:980Paolo Guiotto: then the probability that U belongs to some subinterval A, B of 0, 1 is, in this case, directly equal to the length of the interval, because, as we know, the total measure of the interval 0, 1 is 1.
43:09:940Paolo Guiotto: So, a uniform random variable is distributed uniformly just like these points K over N. So the idea is: let's call XK, for now, independent random variables with uniform distribution in the interval 0, 1.
43:53:720Paolo Guiotto: Now, in this case, we notice that these variables are, of course, contained in L2 of omega, and their common expected value is constantly equal to one half. But these are not yet the random variables we want…
44:31:900Paolo Guiotto: Now, if we want to use the same notation, sorry.
44:36:300Paolo Guiotto: So, let's use the letter Y for this variable.
44:42:70Paolo Guiotto: So let's use this. It is not this the variable to which we apply the law of large numbers, so let's call YK…
44:52:940Paolo Guiotto: So the Y_k are uniformly distributed on [0, 1]: the probability that
44:58:200Paolo Guiotto: Y_k belongs to [a, b] is b − a. So let the Y_k be independent random variables, uniformly distributed on [0, 1].
45:11:630Paolo Guiotto: And the X_k? They are f of these Y_k:
45:22:270Paolo Guiotto: define X_k as f(Y_k).
45:29:560Paolo Guiotto: Now, what about these X_k? Well, since f is supposed to be continuous on the interval [0, 1],
45:43:360Paolo Guiotto: in particular f is bounded.
45:50:860Paolo Guiotto: So this means that the modulus of X_k, which is the modulus of f(Y_k),
45:56:310Paolo Guiotto: whatever the value of Y_k, is bounded by a constant, namely the infinity norm of f.
46:06:160Paolo Guiotto: This is to say that the X_k are actually in L^∞, so they are certainly contained in L².
46:15:710Paolo Guiotto: Moreover, they have a common expected value, because if you take the expectation of X_k, this is the expectation of f of
46:28:20Paolo Guiotto: Y_k.
46:31:80Paolo Guiotto: But what is this? Remember that Y_k is a uniformly distributed random variable.
46:38:920Paolo Guiotto: So this is, let's say formally, the integral over ℝ of f(y) times the density of Y_k,
46:52:740Paolo Guiotto: in dy. These are absolutely continuous random variables, so they have a density. And the density of Y_k, since these are uniformly distributed on [0, 1], is the indicator of [0, 1]
47:07:270Paolo Guiotto: divided by the length of the interval, so by 1 in this case. So this becomes the integral over ℝ of f(y) times the indicator of [0, 1],
47:20:110Paolo Guiotto: in dy.
47:22:240Paolo Guiotto: And this is the integral from 0 to 1 of f(y) dy. So, as you can see, they have a common mean value, the integral of f over [0, 1], which is exactly the quantity we want to compute.
47:41:20Paolo Guiotto: About the variances: since the X_k are bounded,
47:46:740Paolo Guiotto: the variance of X_k is bounded as well.
47:59:630Paolo Guiotto: Because the X_k are in L^∞, so they are globally bounded; and the variance, if you want, is the expected value of X_k
48:11:690Paolo Guiotto: squared minus m squared. Whatever it is, this is less than or equal to the expected value of X_k squared, and that is bounded by the infinity norm of f
48:26:360Paolo Guiotto: squared. So the variances are controlled by the infinity norm of f
48:30:750Paolo Guiotto: squared: this ‖f‖_∞² plays the role of the constant bounding the variances of these random variables.
48:39:950Paolo Guiotto: So, as you can see, these variables X_k
48:42:980Paolo Guiotto: are independent, because they are functions of independent random variables.
49:06:410Paolo Guiotto: They are in L², because they are in L^∞. They have a common expected value m, which is the integral of f over [0, 1], and they have a bound on the variances, provided by the square of the infinity norm of f.
49:24:480Paolo Guiotto: So, according to the Chebyshev bound, the probability that the average of the X_k
49:36:160Paolo Guiotto: differs from m, which is the integral from 0 to 1 of f(y) dy,
49:46:160Paolo Guiotto: by more than ε
49:51:400Paolo Guiotto: is bounded by 1 over ε² n² times
49:57:860Paolo Guiotto: the sum of the variances of the X_k.
50:08:490Paolo Guiotto: But each variance is bounded by
50:13:860Paolo Guiotto: the infinity norm of f squared,
50:17:960Paolo Guiotto: so the sum of n of them is bounded by n times that constant, and dividing by ε² n² we are left with
50:33:530Paolo Guiotto: the infinity norm of f squared over ε² n. Therefore we get this: the probability that
50:42:730Paolo Guiotto: the average of the X_k, so (1/n) times the sum for k going from 1 to n
50:54:610Paolo Guiotto: of the X_k (and the X_k are f(Y_k)), minus the integral from 0 to 1 of f(y)
51:06:660Paolo Guiotto: dy, is greater than ε, is controlled by
51:12:540Paolo Guiotto: ‖f‖_∞² / (ε² n).
51:20:720Paolo Guiotto: So let's write ‖f‖_∞² at the numerator.
51:24:10Paolo Guiotto: Now, what does this formula say?
51:27:710Paolo Guiotto: It says something curious: if you want to compute this integral, which is a purely analytic object and a priori has nothing to do with probability,
51:37:470Paolo Guiotto: you could take your function f,
51:40:130Paolo Guiotto: and evaluate it at points delivered by uniform random draws. So you pick values at random in the interval [0, 1], with the rule that these numbers must be
51:54:170Paolo Guiotto: picked in an independent way and with uniform distribution, so with the same probability of landing here or there, because that is what uniform means. Then you take the average of these values of f.
52:10:70Paolo Guiotto: And, apparently surprisingly, because you are taking points randomly, you know,
52:15:270Paolo Guiotto: this average is a good approximation of the integral.
52:22:560Paolo Guiotto: This is why it is considered a genuinely different method of computing integrals. Normally, in analysis, when you compute an integral, there are lots of approximation formulas
52:38:770Paolo Guiotto: designed specifically for computing the area, so they are driven by a geometric idea.
52:47:70Paolo Guiotto: This formula, instead, says that you could compute that integral by generating random numbers, for example, by a computer.
52:57:220Paolo Guiotto: And you can do that, because mathematical software allows it: you generate
53:04:830Paolo Guiotto: numbers randomly, with the single rule that they must be taken uniformly in [0, 1], as in this case here. If the interval is instead some [a, b], you take the numbers between a and b
53:17:380Paolo Guiotto: in a uniform way. You evaluate the function at these numbers, take the arithmetic average of the values, and you get an approximation of the integral.
53:27:340Paolo Guiotto: The formula tells you how probable it is that you are away from the true value, and it gives this bound. It is not a very good bound, because it goes to 0 only as 1/n, so it is considered a slow bound.
53:44:330Paolo Guiotto: But it gives a different way to compute integrals, and hence expectations. This idea is used, for example, to compute expectations in probability numerically,
54:00:190Paolo Guiotto: by doing this kind of averages.
54:05:540Paolo Guiotto: So this is a nice application.
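What the lecture describes is Monte Carlo integration. A minimal sketch in Python; the test function y², the sample size, and the fixed seed are illustrative choices of mine, not from the lecture:

```python
import random

def monte_carlo_integral(f, n, a=0.0, b=1.0, seed=0):
    """Approximate the integral of f over [a, b] by averaging f at n
    independent uniform draws, as justified by the weak law of large numbers."""
    rng = random.Random(seed)          # fixed seed: reproducible sketch
    total = sum(f(rng.uniform(a, b)) for _ in range(n))
    # E[f(Y)] = (1/(b-a)) * integral of f over [a, b], hence the rescaling.
    return (b - a) * total / n

# Illustrative check: the integral of y^2 over [0, 1] is exactly 1/3.
estimate = monte_carlo_integral(lambda y: y * y, n=100_000)
```

Consistent with the Chebyshev estimate in the lecture, the probability of an error larger than ε decays only like ‖f‖_∞²/(ε² n), so the accuracy improves slowly with n: a slow bound, as remarked above.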
54:10:320Paolo Guiotto: A second nice application is, maybe,
54:20:660Paolo Guiotto: well, let's say that this one too is somehow interesting: the Bernstein proof of what is called the Weierstrass approximation theorem. So, what is the Weierstrass
54:42:190Paolo Guiotto: approximation
54:45:420Paolo Guiotto: theorem?
54:49:220Paolo Guiotto: The Weierstrass approximation theorem says the following. If you have a continuous function on a closed and bounded interval [a, b]:
55:00:290Paolo Guiotto: every f continuous on a closed and bounded interval is a uniform
55:12:700Paolo Guiotto: limit
55:16:40Paolo Guiotto: of polynomials.
55:22:720Paolo Guiotto: So, what does it mean? It means that
55:28:680Paolo Guiotto: there exists a sequence, call it P_n, of polynomials
55:41:610Paolo Guiotto: such that this sequence P_n converges in infinity norm
55:48:610Paolo Guiotto: to f; and the infinity norm here is the maximum of the modulus, because we are dealing with continuous functions.
55:57:150Paolo Guiotto: Now, the classical proof of Weierstrass's theorem
56:02:550Paolo Guiotto: does not provide an explicit formula for the polynomials.
56:08:680Paolo Guiotto: It proceeds by successive approximations, so you have no direct formula saying "the polynomial is this one".
56:20:550Paolo Guiotto: Now, in the first half of the last century, a Russian mathematician, Bernstein,
56:32:900Paolo Guiotto: found an exact formula, a formula you can see, and it is nice also to see it, to build these polynomials. The theorem he proved is the following; it is called the
56:47:850Paolo Guiotto: Bernstein–Weierstrass theorem.
57:00:550Paolo Guiotto: It says: assume the function is continuous on [0, 1], just to simplify the formula; for a general interval [a, b] it is just a question of rescaling and translating the variable. So let f be continuous on [0, 1], and define
57:22:650Paolo Guiotto: P_n(x) as this polynomial: the sum for k going from 0 to n of the binomial coefficient (n choose k), times the value of the function f
57:36:210Paolo Guiotto: at the point k/n, the fraction k over n, times x to the k
57:42:450Paolo Guiotto: times (1 − x) to the n − k.
57:47:740Paolo Guiotto: So, as you can see, you have an analytical, explicit formula
57:53:400Paolo Guiotto: for these polynomials. Now, it turns out
58:01:830Paolo Guiotto: that this sequence of polynomials converges to f in the uniform norm.
58:09:940Paolo Guiotto: Now, this is a pure analysis result.
58:14:390Paolo Guiotto: You have a continuous function; you have certain polynomials defined through this function.
58:21:760Paolo Guiotto: And the claim is that you have this uniform approximation. You don't see any probability in it,
58:27:20Paolo Guiotto: well, you may notice the interval [0, 1] and suspect that there is some probability behind, but
58:34:870Paolo Guiotto: in fact I could write it for every interval [a, b] just by rescaling: you map [0, 1] onto [a, b], so you translate and rescale. So you can state this on any interval.
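Bernstein's explicit formula lends itself to a direct implementation. A minimal sketch in Python, with an illustrative continuous test function f(x) = |x − 1/2| of my own choosing, whose uniform error can be watched shrinking as n grows:

```python
from math import comb

def bernstein(f, n, x):
    """n-th Bernstein polynomial of f at x:
    P_n(x) = sum_{k=0}^n C(n, k) f(k/n) x^k (1-x)^(n-k)."""
    return sum(comb(n, k) * f(k / n) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

def sup_error(f, n, grid=200):
    """Crude estimate of ||P_n - f||_inf via a max over a grid in [0, 1]."""
    pts = [i / grid for i in range(grid + 1)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in pts)

# Illustrative continuous test function (not smooth at 1/2).
f = lambda x: abs(x - 0.5)
errs = [sup_error(f, n) for n in (10, 40, 160)]
```

For f ≡ 1 the formula reduces to the binomial theorem, Σ C(n, k) x^k (1 − x)^(n−k) = 1, which is a convenient sanity check of the implementation.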
58:48:430Paolo Guiotto: Now, let's see why this comes as a consequence, again, of the weak law of large numbers we saw above.
58:59:50Paolo Guiotto: To prove that there is this convergence, that P_n converges
59:09:730Paolo Guiotto: to f in uniform norm, we have to show that the distance between P_n and f in uniform norm goes to 0. That is, we need to estimate the maximum, for x in [0, 1],
59:30:250Paolo Guiotto: of the modulus of P_n(x) minus f(x),
59:37:100Paolo Guiotto: and show that this maximum goes to zero.
59:42:350Paolo Guiotto: Now, the crucial point is that
59:45:790Paolo Guiotto: we can see this P_n(x) as the expected value of a suitable random variable. If you look at that formula,
59:59:450Paolo Guiotto: you may imagine what this random variable is, because you see that there is f evaluated at the points k/n.
00:09:70Paolo Guiotto: So I may think that this could be an expectation of f evaluated at k/n, provided these numbers here are probabilities. This is the key point. Now, the idea is the following. Let
00:30:320Paolo Guiotto: the X_k be Bernoulli,
00:44:390Paolo Guiotto: independent, random variables, with
00:55:530Paolo Guiotto: probability that X_k equals 0 equal to 1 minus x,
01:03:30Paolo Guiotto: and probability that X_k equals 1 equal to x.
01:12:900Paolo Guiotto: The choice is indifferent, but this matches the formula above.
01:22:100Paolo Guiotto: Okay, now,
01:26:610Paolo Guiotto: the goal is to compute the expected value
01:32:830Paolo Guiotto: of f evaluated at the average of these X_k. So this is the expected value of f of (1/n) times the sum for k
01:47:950Paolo Guiotto: going from 1 to n
01:52:630Paolo Guiotto: of the X_k; call this average X̄_n.
01:55:860Paolo Guiotto: Now, you understand that since each X_k takes only the two values 0 and 1, this sum
02:04:850Paolo Guiotto: is a sum of zeros and ones.
02:07:750Paolo Guiotto: So it can take value 0 if all the X_k are 0, value 1 if exactly one of the X_k is 1 and the others are 0, and so on. So it takes the values
02:19:390Paolo Guiotto: 0, 1, 2, …, up to n, when all the X_k are equal to 1. So that quantity takes all the integer values
02:28:860Paolo Guiotto: between 0 and n. So we can split that expectation as follows.
02:41:200Paolo Guiotto: Since the random variable is discrete, we can split this into a sum for j going from 0 to n
02:51:180Paolo Guiotto: of f(j/n)
02:57:620Paolo Guiotto: times the indicator of the event that the sum for k from 1 to n of the X_k equals j.
03:12:220Paolo Guiotto: (Let me use a different letter, j, here.) When the sum of the X_k is j, in the argument of the function, instead of the sum, you can put j, no?
03:25:520Paolo Guiotto: So we can represent that function in this way.
03:31:330Paolo Guiotto: And therefore, carrying the sum outside, and carrying the numbers f(j/n) outside the expectation, because they are constants and no longer depend on the random variable, we get the sum for j going from 0 to n of f of
03:46:200Paolo Guiotto: j/n, times the expected value of the indicator of {sum for k from 1 to n of the X_k = j},
03:59:820Paolo Guiotto: which is the probability that the sum for k from 1 to n of the X_k is equal to j.
04:11:170Paolo Guiotto: Now, we can do this calculation by combinatorial arguments; or, if we don't want to, we can proceed as follows.
04:23:750Paolo Guiotto: Let's see what this random variable is.
04:27:890Paolo Guiotto: We can characterize this random variable by computing its characteristic function. Let's call S_n the sum of these variables. Then the characteristic function of S_n at a point c is the expected value of e to the i
04:48:680Paolo Guiotto: c S_n.
04:51:950Paolo Guiotto: Now, S_n is the sum of the X_k, so the exponential splits into the product of the exponentials: the product for k going from 1 to n of e^{icX_k}.
05:07:20Paolo Guiotto: But the X_k are independent, so by independence we can carry the product outside: the expectation of a product of
05:18:320Paolo Guiotto: functions of independent random variables is the product of the expectations, so we get the product for k from 1 to n of the expected value of e^{icX_k}. Now, this expectation is easy, because X_k
05:36:10Paolo Guiotto: takes only two values. When X_k is 0, the exponential is e^0 = 1, so we get 1 times the probability that X_k equals 0,
05:55:730Paolo Guiotto: plus, when X_k is 1, e^{ic} times the probability that X_k equals 1.
06:05:290Paolo Guiotto: So, in the product for k from 1 to n: the probability that X_k is 0 is 1 − x,
06:16:630Paolo Guiotto: and the probability that X_k is 1 is x, so we get 1 − x + x e^{ic},
06:26:140Paolo Guiotto: multiplied by itself n times. So this is (1
06:34:330Paolo Guiotto: − x + x e^{ic}) to the power n.
06:40:680Paolo Guiotto: This is the characteristic function of S_n.
06:50:800Paolo Guiotto: So, why did I do this?
07:13:600Paolo Guiotto: I wanted to compute those probabilities.
07:29:850Paolo Guiotto: Because, on the other side, this
07:35:380Paolo Guiotto: expectation I can compute in another way.
07:39:320Paolo Guiotto: Namely, the expectation of e^{icS_n}:
07:45:40Paolo Guiotto: I could say that this is equal to what? I know that S_n,
07:49:820Paolo Guiotto: being the sum of the X_k, takes only the values 0, 1, 2, …, n, right? So I get e^{ic·0} times the probability that S_n is 0,
08:06:180Paolo Guiotto: plus e^{ic·1}, so e^{ic}, times the probability that S_n is 1, and so on. So, in general, I get the sum
08:19:160Paolo Guiotto: for k going from 0 to n of e^{ick}
08:25:380Paolo Guiotto: times the probability that S_n
08:43:220Paolo Guiotto: is equal to k.
08:45:120Paolo Guiotto: So, now, I know that this must be equal to that.
08:50:569Paolo Guiotto: So, to get these probabilities, I should equate the coefficients of
08:59:300Paolo Guiotto: e^{ick}. In fact, if we use the binomial expansion here, we get the sum for k going from 0 to n of the binomial coefficient (n choose k)
09:14:420Paolo Guiotto: times this term to the power n − k, that is, (1 − x)^{n−k}, times this other term to the power k, so x^k e^{ick}.
09:31:40Paolo Guiotto: And so you see that this coefficient here
09:35:210Paolo Guiotto: is the same as that one, because it is the coefficient of e^{ick}. So we get
09:44:100Paolo Guiotto: that the probability that S_n equals k is the binomial coefficient (n choose k) times (1 − x)^{n−k} times x^k.
10:00:760Paolo Guiotto: Okay. Of course, you could have obtained this directly by combinatorics: since S_n is the sum,
10:14:400Paolo Guiotto: say for j from 1 to n, of the X_j, to have this sum exactly equal to k you need exactly k of the X_j to be 1 and the remaining n − k to be 0.
10:32:880Paolo Guiotto: This happens for a certain number of arrangements, counted by the binomial coefficient, and the probability of any single such arrangement, k ones and n − k zeros, is exactly that product, no?
10:47:570Paolo Guiotto: Indeed, x is the probability that a given X_j is 1, and 1 − x the probability that it is 0; to have the value k, you need k of them equal to 1 and the rest equal to 0.
11:04:940Paolo Guiotto: So, if you fix one sequence of indices
11:08:780Paolo Guiotto: with k values equal to 1 and the remaining equal to 0, the probability of that sequence is this product, because they are independent: the first is 1, the second is 0, the third is 1, and so on;
11:22:20Paolo Guiotto: you put a factor x whenever X_j is 1, and a factor 1 − x whenever
11:30:750Paolo Guiotto: it is 0. Then you count the number of ways you can do that. However, we did this using the characteristic function. So let's go back to the formula: we were computing this expectation.
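The identity just derived, P(S_n = k) = (n choose k) x^k (1 − x)^{n−k}, can also be double-checked numerically by building the exact distribution of a sum of independent Bernoulli(x) variables one convolution step at a time; the values n = 8, x = 0.3 below are illustrative choices of mine:

```python
from math import comb

def bernoulli_sum_pmf(n, x):
    """Exact pmf of S_n = X_1 + ... + X_n for i.i.d. Bernoulli(x) X_k,
    built step by step by convolution."""
    pmf = [1.0]                        # S_0 is identically 0
    for _ in range(n):
        nxt = [0.0] * (len(pmf) + 1)
        for j, p in enumerate(pmf):
            nxt[j] += p * (1 - x)      # next X_k = 0, with probability 1 - x
            nxt[j + 1] += p * x        # next X_k = 1, with probability x
        pmf = nxt
    return pmf

n, x = 8, 0.3                          # illustrative values
pmf = bernoulli_sum_pmf(n, x)
binomial = [comb(n, k) * x**k * (1 - x)**(n - k) for k in range(n + 1)]
```

The convolution output and the closed binomial formula agree term by term, which is the combinatorial argument of the lecture made mechanical.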
11:50:220Paolo Guiotto: Returning to the expectation
11:54:810Paolo Guiotto: of f of the average of the X_k, we have:
12:06:00Paolo Guiotto: the expectation of f of the average X̄_n is equal to the
12:12:410Paolo Guiotto: sum for j going from 0 to n of f(j/n)
12:18:120Paolo Guiotto: times the probability that S_n is equal to j.
12:28:810Paolo Guiotto: That probability we computed; written with the index j, it is
12:39:410Paolo Guiotto: (n choose j) times (1 − x)^{n−j} times x^j.
12:46:470Paolo Guiotto: And so we get this formula: the expectation is the sum for j going from 0 to n of f(j/n) times the binomial coefficient (n choose j)
12:59:980Paolo Guiotto: times (1 − x)^{n−j} x^j.
13:04:740Paolo Guiotto: And this is exactly what we called P_n(x).
13:12:40Paolo Guiotto: So the expected value of f(X̄_n) is P_n(x).
13:18:620Paolo Guiotto: So, if we look at the modulus of P_n(x) minus f(x),
13:30:250Paolo Guiotto: which is the quantity whose supremum we must show goes to zero: because of that formula, this is the modulus of the expectation of f(X̄_n)
13:45:560Paolo Guiotto: minus f(x).
13:50:110Paolo Guiotto: Now, the trick is to put everything under a single expectation. I can always regard f(x) as the expected value of the constant f(x),
14:02:40Paolo Guiotto: and merge the two expectations into a unique one: the modulus of the expectation of f(X̄_n) minus f(x),
14:17:440Paolo Guiotto: and then carry the modulus inside, so this is less than or equal to the expected value of the modulus of f(X̄_n) minus f(x).
14:28:660Paolo Guiotto: So we have this bound.
14:50:430Paolo Guiotto: Now, we want to show
14:53:220Paolo Guiotto: that this bound goes to 0 as n goes to infinity, and independently of x, in such a way that we can take the maximum and conclude. So let's focus on this bound.
15:06:730Paolo Guiotto: Now, the idea concerns this distance between f(X̄_n) and f(x).
15:18:160Paolo Guiotto: Well, what does X̄_n do? X̄_n is the average of the X_k, so it is (1/n) times the sum for k going from 1 to n of the X_k,
15:31:440Paolo Guiotto: and the X_k are the Bernoulli random variables.
15:36:10Paolo Guiotto: So what happens to these averages? This is an average to which we can apply the law of large numbers. And we can do it here, because the X_k, remember,
15:55:370Paolo Guiotto: are independent Bernoulli variables.
16:02:720Paolo Guiotto: And since they are Bernoulli, they take only the two values 0 and 1, so they are bounded,
16:11:260Paolo Guiotto: and they have the same mean. Indeed, remember that X_k was defined as a Bernoulli random variable with P(X_k = 0) = 1 − x and P(X_k = 1) = x. So when you take the expected value, you have 0 times the probability that X_k is 0,
16:35:210Paolo Guiotto: which gives 0,
16:38:430Paolo Guiotto: plus 1 times the probability that X_k is 1, which is x.
16:57:30Paolo Guiotto: So the expected value equals x: they all have the same mean. And the variance of X_k
17:07:460Paolo Guiotto: is easy to compute, since it is the expected value of X_k
17:14:880Paolo Guiotto: squared minus the mean value squared.
17:20:580Paolo Guiotto: What is that? Again we have two values, 0 and 1, so E[X_k²] is 0² times the probability that X_k is 0, which contributes nothing, plus 1² times the probability that X_k is 1, so it equals x.
17:39:00Paolo Guiotto: Subtracting x squared, we get x − x², which equals x(1 − x).
17:48:430Paolo Guiotto: This is clearly finite and bounded by a constant. So the weak law
17:59:270Paolo Guiotto: of large numbers applies,
18:06:340Paolo Guiotto: and the average of the X_k converges in probability to the common mean value,
18:16:220Paolo Guiotto: which is x.
18:21:230Paolo Guiotto: So, the idea now is: since the average X̄_n goes to x, and we have to estimate that expectation, f(X̄_n)
18:31:290Paolo Guiotto: minus f(x) should go to 0, you see?
18:35:200Paolo Guiotto: The problem is that the convergence we have is in probability.
18:40:310Paolo Guiotto: It is not almost sure convergence, for which we could apply dominated convergence here. So we have to work a bit. Convergence in probability means that the probability that X̄_n is away from x by
18:57:320Paolo Guiotto: more than ε goes to zero. This is the quantity that goes to zero, and we also have a quantitative bound:
19:05:300Paolo Guiotto: since the general Chebyshev bound is
19:13:100Paolo Guiotto: 1 over ε² n² times the sum for k going from 1 to n of the variances of the X_k,
19:28:990Paolo Guiotto: and the variances are constantly equal to x(1 − x),
19:37:550Paolo Guiotto: this simplifies to 1/(ε² n²) times n x(1 − x).
19:48:460Paolo Guiotto: So this is the bound we have:
19:50:900Paolo Guiotto: the probability that the average X̄_n
19:55:170Paolo Guiotto: is away from x by more than ε is controlled by x(1 − x) / (ε² n). This is the bound we will use.
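The bound x(1 − x)/(ε² n) can be compared with the empirical frequency of large deviations of the Bernoulli average. A hedged sketch; the parameters x = 0.3, ε = 0.05, n = 500, the trial count, and the seed are my own illustrative choices:

```python
import random

def deviation_frequency(x, n, eps, trials=2000, seed=1):
    """Empirical frequency of |S_n/n - x| >= eps over repeated
    experiments, where S_n is a sum of n independent Bernoulli(x) variables."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = sum(1 for _ in range(n) if rng.random() < x)
        if abs(s / n - x) >= eps:
            hits += 1
    return hits / trials

x, n, eps = 0.3, 500, 0.05              # illustrative parameters
observed = deviation_frequency(x, n, eps)
chebyshev = x * (1 - x) / (eps**2 * n)  # Chebyshev bound, = 0.168 here
```

The observed frequency stays below the Chebyshev bound, and typically far below it, consistent with the earlier remark that the 1/n bound is slow rather than sharp.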
20:10:970Paolo Guiotto: So
20:12:210Paolo Guiotto: we know that X̄_n goes to x in probability, in the sense that this probability is small. Now, this is the idea: we take the expected value of the modulus of f(X̄_n)
20:27:960Paolo Guiotto: minus f(x),
20:30:870Paolo Guiotto: which, I remind you, bounds the distance between the Bernstein polynomial P_n(x) and the function f, and we use this bound on the probability that X̄_n is away from x.
20:47:930Paolo Guiotto: So I can split this expectation into two sub-expectations. One is
20:54:890Paolo Guiotto: on the event where the average X̄_n is far from x, so where the distance is greater than or equal to ε, of the same quantity, the modulus of f of
21:07:00Paolo Guiotto: X̄_n minus f(x);
21:13:180Paolo Guiotto: plus the other, which is the expectation on the part where the distance between the two is smaller than ε,
21:23:430Paolo Guiotto: of the same modulus of f(X̄_n)
21:27:850Paolo Guiotto: minus f(x).
21:30:990Paolo Guiotto: These two quantities are now small for different reasons.
21:36:360Paolo Guiotto: Also, at the end we need a bound that is independent of x, because remember that we are bounding this quantity pointwise, and we want to take the maximum over x, okay?
21:50:550Paolo Guiotto: So we have to keep this in mind too. Let's take the first expectation: the expectation of the indicator of {|X̄_n − x| ≥ ε}, so restricted to this part of the ω's, times the modulus of this difference.
22:09:560Paolo Guiotto: Now, why should this quantity be small?
22:14:600Paolo Guiotto: Well, not because of the difference of values of f, in principle, because here we are on the ω's
22:22:840Paolo Guiotto: where the two points are far apart. So I cannot say that this difference will be small, even if f is continuous, because it is f evaluated at two points that are far from each other.
22:42:300Paolo Guiotto: So the difference of f-values is hopeless for showing that this expectation is small, and I will just throw it away,
22:49:890Paolo Guiotto: using the infinity bound: this is less than the modulus of f at the average, whatever it is, plus the modulus of f(x), whatever it is, and these two quantities together are bounded by 2 times the infinity norm of f.
23:09:940Paolo Guiotto: So I throw this factor away, since it is basically useless, and say that the whole thing is less than or equal to 2 times the
23:17:500Paolo Guiotto: infinity norm of f, times the expectation of the indicator of {|X̄_n − x| ≥ ε}.
23:26:250Paolo Guiotto: But this expectation is a probability, because the expectation of the indicator of a set is the probability of that event: the probability that |X̄_n − x| ≥ ε. And this is small because of the Chebyshev
23:48:20Paolo Guiotto: bound written above: it is bounded by x(1 − x) / (ε² n).
24:02:720Paolo Guiotto: So the first bound is this:
24:06:700Paolo Guiotto: let's give names, E1 and E2, to these two expectations. Then E1 is bounded by
24:17:380Paolo Guiotto: 2 ‖f‖_∞ times x(1 − x) / (ε² n).
24:30:440Paolo Guiotto: Now let's pass to the expectation E2.
24:33:730Paolo Guiotto: E2 is the expected value, on the event where the distance between the average and x is small, smaller than ε,
24:45:720Paolo Guiotto: of the modulus of f(X̄_n) minus f(x).
24:51:290Paolo Guiotto: This time, I am on the event where the two points are close.
24:58:640Paolo Guiotto: So here it is continuity that plays a role, because I have f at one point minus f at a nearby point, and I can expect this to be small,
25:11:50Paolo Guiotto: provided we choose ε small enough, because of continuity. Actually, what we need is a consequence of continuity.
25:27:80Paolo Guiotto: We need a consequence of continuity
25:41:120Paolo Guiotto: that mathematicians have certainly seen (I don't know about the others), which is called
25:48:560Paolo Guiotto: uniform continuity.
25:54:750Paolo Guiotto: But it is something you can understand intuitively.
25:59:250Paolo Guiotto: Let's write the property here; of course I will not prove it, it follows from the properties of continuous functions. If you have a continuous function on a closed and bounded interval, say [a, b],
26:14:90Paolo Guiotto: then f is uniformly continuous,
26:23:750Paolo Guiotto: in the sense that,
26:46:470Paolo Guiotto: well, since I am already using the letter ε with another meaning, let me use another letter: for every ρ positive
26:54:170Paolo Guiotto: there exists an ε positive such that whenever two points x and y
27:04:730Paolo Guiotto: are at distance less than ε, the corresponding values f(x) and f(y) are at distance less than your fixed ρ.
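As a concrete instance of this ε versus ρ relationship (the function f(x) = x² is my own illustrative example, not one from the lecture): on [0, 1],

```latex
|f(x) - f(y)| = |x^2 - y^2| = |x + y|\,|x - y| \le 2\,|x - y| \qquad (x, y \in [0,1]),
```

so for any given ρ > 0 the single choice ε = ρ/2 works: whenever |x − y| < ε, we get |f(x) − f(y)| ≤ 2|x − y| < ρ, with the same ε valid at every point of the interval, which is exactly the uniformity being described.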
27:18:700Paolo Guiotto: Now, what is the difference between continuity and uniform continuity? Continuity is a property of a function at a point.
27:28:810Paolo Guiotto: It says: when you are close to that point y,
27:32:960Paolo Guiotto: you have the same property, |f(x) − f(y)| < ρ for every x with |x − y| < ε. The point is that this ε
27:46:150Paolo Guiotto: depends on the point y where you check continuity. If you change the point, you do not get one ε that is the same for all points y.
27:56:520Paolo Guiotto: Uniform continuity, instead, says that there is a universal ε, the same for all points of the interval, such that whenever two points are closer than ε,
28:10:410Paolo Guiotto: no matter where they are, the corresponding values of f are close in the same way. That is the meaning of "uniform": the closeness is uniform, no matter where you take the points. This property is stronger than continuity:
28:28:180Paolo Guiotto: if you remove the assumption of a closed and bounded interval, it is no longer true. But here we can use it. So, suppose we choose
28:49:830Paolo Guiotto: ε such that the modulus of f(x) minus f(y)
29:00:10Paolo Guiotto: is smaller than ρ
29:03:20Paolo Guiotto: whenever
29:05:330Paolo Guiotto: |x − y| is less than ε. Then we can say that the term E2, which is the expectation, on the event where X̄_n and x are at distance less than ε,
29:20:800Paolo Guiotto: of the modulus of f(X̄n) minus f(x).
29:26:380Paolo Guiotto: Now, since this quantity is less than rho
29:31:640Paolo Guiotto: whenever the distance is less than epsilon, we can say that this is less than rho times the expectation of the indicator.
29:40:450Paolo Guiotto: But the expectation of an indicator is at most 1, so this will be less than or equal to rho.
29:43:530Paolo Guiotto: And since this rho is arbitrary, we can choose rho small enough to make this E2 as small as we like.
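The estimate just made on E2 can be summarized in one line (here X̄n denotes the sample mean from earlier in the lecture):

```latex
E_2
= \mathbb{E}\!\left[\mathbf{1}_{\{|\bar X_n - x| < \varepsilon\}}\,
  \bigl|f(\bar X_n)-f(x)\bigr|\right]
\le \rho\,\mathbb{E}\!\left[\mathbf{1}_{\{|\bar X_n - x| < \varepsilon\}}\right]
\le \rho .
```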
29:52:340Paolo Guiotto: So, in conclusion…
29:58:70Paolo Guiotto: Returning to the initial estimate of the distance between P_n(x) and f(x): this distance is bounded by
30:08:740Paolo Guiotto: the sum of these two expectations,
30:14:160Paolo Guiotto: E1 plus E2. E1 is bounded by 2 times the infinity norm of f
30:22:750Paolo Guiotto: times x(1 − x), divided by epsilon squared times n.
30:28:100Paolo Guiotto: E2 is bounded by rho,
30:30:650Paolo Guiotto: where rho is a positive number that can be taken arbitrarily small.
30:52:360Paolo Guiotto: And epsilon depends on rho. So the game will be: you fix rho, you determine epsilon, and this holds.
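The bound on E1 just stated comes from Chebyshev's inequality; a sketch, assuming (as earlier in the lecture) that X̄n is the average of n independent Bernoulli variables with parameter x, so that Var(X̄n) = x(1 − x)/n:

```latex
E_1
\le 2\|f\|_\infty\,\mathbb{P}\bigl(|\bar X_n - x|\ge \varepsilon\bigr)
\le 2\|f\|_\infty\,\frac{\operatorname{Var}(\bar X_n)}{\varepsilon^2}
= \frac{2\|f\|_\infty\, x(1-x)}{\varepsilon^2 n}.
```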
31:02:900Paolo Guiotto: Okay, now, the last step: we take the maximum, so the infinity norm. The infinity norm of P_n minus f is the maximum
31:13:580Paolo Guiotto: over x in [0, 1],
31:16:200Paolo Guiotto: remember that for us x is between 0 and 1, of the modulus of P_n(x) minus f(x).
31:27:170Paolo Guiotto: But this is bounded by the quantity above, and you see that x is still present there.
31:33:920Paolo Guiotto: So we have less than or equal to 2 times the infinity norm of f, divided by epsilon squared times n, times the maximum
31:43:620Paolo Guiotto: over [0, 1] of the quantity x(1 − x), which is, however, a constant. You can also determine its value exactly, because this is a parabola
31:56:540Paolo Guiotto: vanishing at 0 and 1. It is symmetric, so the maximum is attained at x equal to one half, and its value there is, by the way, one fourth. But if you want, it is enough that this is less than or equal to 1; the exact value doesn't matter. Plus rho.
32:13:370Paolo Guiotto: So you have that the distance in infinity norm between P_n
32:19:70Paolo Guiotto: and f is bounded by 2 times the infinity norm
32:24:10Paolo Guiotto: of f, keeping the bound 1 for x(1 − x), divided by epsilon squared times n, plus rho.
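In symbols, the uniform estimate just obtained reads:

```latex
\|P_n - f\|_\infty \le \frac{2\|f\|_\infty}{\varepsilon^2\, n} + \rho
\qquad \text{for every } n,\ \text{with } \varepsilon = \varepsilon(\rho).
```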
32:30:860Paolo Guiotto: And now the conclusion follows. Since rho can be taken arbitrarily small, what you do is, first, you take the limit in n
32:43:580Paolo Guiotto: of this quantity, the infinity norm of P_n
32:46:250Paolo Guiotto: minus f. When you send n to infinity, the first term goes to zero. So it remains that the limit is less than or equal to rho, and since rho
33:02:920Paolo Guiotto: is any positive real number,
33:16:460Paolo Guiotto: a quantity that is less than or equal to every rho cannot be positive, no?
33:21:590Paolo Guiotto: Otherwise, you could take rho equal to half of that quantity and get a contradiction. So this implies that this quantity, which is positive
33:29:140Paolo Guiotto: or zero, must necessarily be equal to zero. So we get this: the limit of the infinity norm of P_n minus f is 0, which is finally the conclusion.
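The convergence just proved can also be checked numerically. The sketch below (the function names and the test function are ours, not from the lecture) evaluates the Bernstein polynomial P_n of a continuous function on a grid and reports the maximum error for increasing n:

```python
# Numerical sketch of Bernstein approximation on [0, 1] (illustration only).
from math import comb

def bernstein(f, n, x):
    # P_n(x) = sum_{k=0}^{n} f(k/n) * C(n, k) * x^k * (1 - x)^(n - k)
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

# A continuous but non-smooth test function, to make the approximation nontrivial.
f = lambda x: abs(x - 0.5)
grid = [i / 100 for i in range(101)]

for n in (10, 100, 1000):
    err = max(abs(bernstein(f, n, x) - f(x)) for x in grid)
    print(n, err)  # the maximum error shrinks as n grows
```

As the proof predicts, the error decreases as n increases, though slowly: the bound above only gives a rate of order 1/n for fixed epsilon.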
33:44:30Paolo Guiotto: Okay, so this proves this second remarkable application of the weak
33:54:810Paolo Guiotto: law of large numbers.
33:57:310Paolo Guiotto: Our time is over, so we stop here.
34:01:830Paolo Guiotto: Well, I suggest you review this proof carefully, because it contains lots of ideas.
34:12:550Paolo Guiotto: So it's a good idea to spend some time thinking about where we used the different convergences, and so on.
34:28:460Paolo Guiotto: So let's stop.
34:30:630Paolo Guiotto: The recording…