Before we go into the proofs, here are a few definitions and concepts to know so you don't get confused when we talk about the proof of this formula. I am giving fairly simple explanations here, just enough for you to understand what each term means. If you want to learn more, you can find resources on these topics on the "More Sources" page.

**Standard Deviation:** measures how spread out the numbers in a sample are. Usually denoted by σ. The standard deviation is the square root of the variance.

**Variance:** the average of the squared differences from the mean. Usually denoted by σ². Variance is the square of the standard deviation. It is used in calculations more often than the standard deviation because it is much easier to manipulate algebraically without losing information. Because the differences are squared, it also weighs outliers more heavily than the standard deviation does, which is important when it is used by investors.
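
To make the relationship between the two concrete, here is a minimal Python sketch. The function names and the sample numbers are made up for illustration, and it assumes the "average of squared differences" form of the variance, which divides by the number of values (some textbooks divide by one less when working with a sample).

```python
import math

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def variance(values):
    """Average of the squared differences from the mean (divides by n)."""
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))

# Illustrative data only; the mean of these eight numbers is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(variance(data))            # 4.0
print(standard_deviation(data))  # 2.0
```

Squaring the standard deviation (2.0) gives back the variance (4.0), which is the relationship described in the two definitions above.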

**Theta:** used in statistics to represent any unknown parameter of interest. Denoted by θ. For example, θ can stand for the unknown probability that an event X occurs.

**Expected Value:** the probability-weighted average of all the values a random variable can take. Denoted as E(X). In simple terms, it is the average outcome you would see over many repetitions, which is not necessarily the single most likely outcome.
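
As a small illustration of the weighted average, the sketch below computes E(X) for a fair six-sided die. The die example and the `expected_value` helper are my own assumptions, chosen only because the arithmetic is easy to check by hand.

```python
from fractions import Fraction

def expected_value(distribution):
    """Sum of each outcome multiplied by its probability.

    `distribution` maps every possible outcome to its probability.
    """
    return sum(outcome * prob for outcome, prob in distribution.items())

# A fair die: each face from 1 to 6 has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expected_value(fair_die))  # 7/2
```

The result is 7/2, or 3.5, which is not a face of the die at all. The expected value describes the long-run average of many rolls.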

**Likelihood Function:** measures how well a statistical model with parameter θ fits the data that was actually observed. Denoted as L(θ).
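
Here is a rough sketch of a likelihood function for one simple case: a coin whose unknown probability of heads is θ. The coin model, the observed counts, and the candidate values of θ are all assumptions picked for illustration. The constant factor that counts the orderings of the flips is left out because it does not depend on θ.

```python
def likelihood(theta, heads, flips):
    """L(theta) for a coin with P(heads) = theta, given the observed flips."""
    tails = flips - heads
    return theta ** heads * (1 - theta) ** tails

# Hypothetical observation: 7 heads in 10 flips.
heads, flips = 7, 10
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]

for theta in candidates:
    print(theta, likelihood(theta, heads, flips))

# The candidate under which the observed data is most plausible.
best = max(candidates, key=lambda t: likelihood(t, heads, flips))
print(best)  # 0.7
```

Among these candidates, L(θ) is largest at θ = 0.7, which matches the observed fraction of heads.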

**Score:** the gradient, or vector of partial derivatives, of ln(L(θ)), where L(θ) is the likelihood function of some parameter θ.
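
To show what the score looks like in practice, the sketch below uses a coin with unknown probability of heads θ, so there is only one parameter and the gradient is a single derivative. The coin model, the observed counts, and the function names are assumptions for illustration. The score from the formula is compared against a numerical derivative of ln(L(θ)) as a sanity check.

```python
import math

def log_likelihood(theta, heads, flips):
    """ln(L(theta)) for a coin with P(heads) = theta (constant term dropped)."""
    return heads * math.log(theta) + (flips - heads) * math.log(1 - theta)

def score(theta, heads, flips):
    """Derivative of ln(L(theta)) with respect to theta, worked out by hand."""
    return heads / theta - (flips - heads) / (1 - theta)

def numerical_score(theta, heads, flips, step=1e-6):
    """Central finite-difference approximation of the same derivative."""
    upper = log_likelihood(theta + step, heads, flips)
    lower = log_likelihood(theta - step, heads, flips)
    return (upper - lower) / (2 * step)

# Hypothetical observation: 7 heads in 10 flips.
heads, flips = 7, 10

for theta in (0.5, 0.7, 0.9):
    print(theta, score(theta, heads, flips), numerical_score(theta, heads, flips))
```

The score is positive at θ = 0.5, essentially zero at θ = 0.7, and negative at θ = 0.9. In other words, it tells you which direction to move θ to make the observed data more likely, and it vanishes where the likelihood peaks.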