I'm trying to make a model to predict a probability distribution conditional on a set of predictors. I've tried a Mixture Density Network, but the covariance doesn't seem to be captured - the bivariate distributions just look like two Gaussians plopped on top of each other, with no correlation. Is there a more appropriate model to use here? Or should an MDN work and I'm just implementing it wrong?
I did find this textbook that seems to suggest the answer is yes: http://www.statslab.cam.ac.uk/~nickl/Site/__files/FULLPDF.pdf
However, it's obviously quite technical; I'm currently an undergrad and I'm sure I haven't covered enough of the prerequisites to understand typical advanced treatments of the topic. I just find the topic really interesting.
But is there a simplified overview of this that's more accessible? Is there anything particularly interesting that happens when we switch from finitely many dimensions to infinitely many, for probability distributions in particular?
I am trying to better understand the conditional and marginal distributions of the normal probability distribution function: https://online.stat.psu.edu/stat505/lesson/6/6.1
"Any distribution for a subset of variables from a multivariate normal, conditional on known values for another subset of variables, is a multivariate normal distribution."
Suppose I have data corresponding to 3 variables: Var_1, Var_2, and Var_3. I am interested in predicting Var_3 using Var_1 and Var_2.
Suppose I fit a multivariate normal distribution to this data - doesn't the multivariate normal distribution have special properties such that the conditional distribution of any of the variables within the multivariate normal distribution will also form a normal distribution? Suppose I want to predict the value of Var_3 when Var_1 = a AND Var_2 = b.
Couldn't I just "fix" the values of the other two variables and construct the conditional distribution of the response variable, P(Var_3 | Var_1 = a and Var_2 = b)? Shouldn't P(Var_3 | Var_1 = a and Var_2 = b) have a normal distribution? Could I not then generate a distribution (e.g. a histogram) of acceptable values of this response variable given the "fixed" values of the other two variables? I think I should be able to sample from P(Var_3 | Var_1 = a and Var_2 = b), given that I have chosen a multivariate normal distribution. Then, I could take the expected value of P(Var_3 | Var_1 = a and Var_2 = b) to answer my question, e.g. when Var_1 = a and Var_2 = b, Var_3 is most likely to be equal to c? (For a normal distribution the mean and the mode coincide, so the expected value is also the most likely value.)
https://imgur.com/a/4aTDkR1
Would this be considered a "generative model"? Is this a correct strategy in general? Does it make mathematical sense?
Note: I know that I could just fit a regular regression model to this problem, but I am trying to better understand how probability distribution functions work.
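That reasoning is sound: conditioning a multivariate normal on a subset of its coordinates gives another (here univariate) normal. A minimal R sketch of the whole pipeline, with made-up data and hypothetical variable names:
#fit a trivariate normal, then condition Var_3 on Var_1 = a, Var_2 = b
#using the standard MVN conditioning formulas
set.seed(1)
x1 <- rnorm(500)
x2 <- rnorm(500)
x3 <- 0.5 * x1 - 0.3 * x2 + rnorm(500, sd = 0.5)
X  <- cbind(Var_1 = x1, Var_2 = x2, Var_3 = x3)
mu    <- colMeans(X)   #estimated mean vector
Sigma <- cov(X)        #estimated covariance matrix
a <- 1; b <- -0.5      #the "fixed" values
#partition: block 1 = (Var_1, Var_2), block 2 = Var_3
S11 <- Sigma[1:2, 1:2]
S21 <- Sigma[3, 1:2]
cond_mean <- drop(mu[3] + S21 %*% solve(S11, c(a, b) - mu[1:2]))
cond_var  <- drop(Sigma[3, 3] - S21 %*% solve(S11, Sigma[1:2, 3]))
#P(Var_3 | Var_1 = a, Var_2 = b) is univariate normal, so sampling and
#taking the expected value are straightforward
draws <- rnorm(10000, mean = cond_mean, sd = sqrt(cond_var))
hist(draws); cond_mean   #cond_mean is the "most likely" value c
And since the fitted MVN specifies the full joint distribution, you can sample new data from it, which is the usual sense in which this counts as a generative model.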
Thanks
In Differential Privacy, the required noise addition is often achieved by sampling values from the Laplace distribution (ie, the 'Laplace mechanism').
This means we usually think about the average relative error of a counting query as: [E|Lap(scale)|] / [number of records]. One thing to note is that if you divide both the scale and the number of records by the same amount (ie, applying a record partitioning or subsampling trick to reduce sensitivity), you shouldn't see any improvement in that error metric: [E|Lap(scale/k)|] / ([number of records]/k) = [E|Lap(scale)|] / [number of records], since E|Lap(b)| = b scales linearly in the scale parameter. (Variance would not behave this way: var(Lap(b)) = 2b^2 scales quadratically, so the variance-based ratio actually improves by a factor of k under the same trick.)
However, I've been wondering whether variance (or even mean absolute difference) is actually the correct way to think about the impact of noise addition. The Laplace distribution is a long-tailed distribution and, for small scale parameters, it's sharply peaked at the origin. Ie, lots of small noise values, a few very large noise values. This is interesting because these noisy query results feed into other algorithms that build models, post-process, etc, to produce a final privatized analytic or data product... and this post-processing may be more or less tolerant of different distributions of added noise. For example, if most noise values are very small, and only a few randomly sampled values are very large, it can be possible to use publicly known properties of the data space to do smoothing and reduce the impact of the large noise values. That same trick might not succeed with a less sharply peaked distribution of noise values, even if the average noise value stayed the same.
So hopefully that's enough interesting motivation to justify me posting a fairly mundane question to r/math: Does anyone know the equation for the median absolute difference of the Laplace distribution?
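In case it helps: if "median absolute difference" means the median absolute deviation (from the median), there is a simple closed form. For Lap(0, b), the absolute value |X| is Exponential with rate 1/b, so the median of |X| is b*ln(2). A quick Monte Carlo check in R:
#Laplace(0, b) sampled as the difference of two iid Exponential(1/b) draws
set.seed(42)
b <- 2
x <- rexp(1e6, rate = 1 / b) - rexp(1e6, rate = 1 / b)
median(abs(x - median(x)))   #simulated median absolute deviation
b * log(2)                   #closed form, ~1.3863 for b = 2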
I'm trying to expand my stats skills by predicting the basket share for a customer from a few predictor variables, and I want to do it the right way. I definitely do not want to grab the data, jam it through the algorithm without consideration (I'm using Excel), and belch the results.
The main problem I see at this point is that my data is (edit: updated to be "the residuals are", as I was imprecise in the original post) not normally distributed. When I look at the normal probability plot, it doesn't follow the 45-degree line. The distribution most closely resembles the heavy-tailed example from this site.
OK, assuming this is a heavy-tailed distribution, I don't know how to fix it. Googling how to fix a heavy-tailed distribution has led to a bunch of snarky, unhelpful "answers" on StackOverflow. I think I need to transform the variables somehow so that they are normally distributed. I did see a suggestion for a log transform.
This sorta intuitively makes sense. One variable I'm using is total sales, which has a few really large customers that may be considered outliers. Using a log transform dampens the effect of these big guys.
Aside from that, what else can I do? How do I determine if each predictor variable is normally distributed or not?
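On the last question, a sketch of the standard visual checks in R (the same plots can be built in Excel), using a made-up, right-skewed total-sales variable to show what the log transform does:
#total_sales is hypothetical: log-normal, i.e. skewed with a heavy right
#tail, like sales data with a few very large customers
set.seed(7)
total_sales <- rlnorm(500, meanlog = 10, sdlog = 1)
par(mfrow = c(2, 2))
hist(total_sales, main = "raw")
qqnorm(total_sales, main = "raw"); qqline(total_sales)
hist(log(total_sales), main = "log-transformed")
qqnorm(log(total_sales), main = "log-transformed"); qqline(log(total_sales))
One caveat: for regression, the normality assumption is about the residuals (as your edit notes), not the predictors themselves; transforming a skewed predictor is still often worthwhile for taming influential points.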
Thanks for any advice you can give.
I'm in a Bayesian modeling situation that I imagine is quite common, but I can't seem to find a mainstream distribution for this circumstance. Apologies if this is what they call a newbie question.
I want to set a prior on a random variable C which is a vector of binary-valued variables. I have a good understanding of Cov(C_i, C_j) as well as Var(C_i). I feel like there should be a pretty standard way of encoding this into a prior but the distributions I've found such as Multivariate Bernoulli seem pretty scarcely used, at least in the sense that it isn't implemented in any of the common MCMC packages.
My instinct is just to implement it myself, but I feel like by virtue of it being uncommon there is likely a better all around choice. Could be a cognitive bias tho
What would you do in this circumstance? Is there a common choice?
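One common construction here is a latent-Gaussian (multivariate probit / Gaussian copula) prior: put the correlation structure on a latent multivariate normal and threshold it to get correlated binaries. A rough R sketch, assuming the mvtnorm package; note the latent correlation is not numerically equal to the binary correlation, so it has to be calibrated against your target Cov(C_i, C_j):
library(mvtnorm)
#latent correlation matrix for two binary variables
R <- matrix(c(1, 0.6,
              0.6, 1), nrow = 2)
z <- rmvnorm(1e5, sigma = R)   #latent Gaussian draws
C <- (z > 0) * 1               #threshold at 0 to get binary variables
colMeans(C)                    #marginal P(C_i = 1), ~0.5 for threshold 0
cor(C)                         #induced binary correlation (less than 0.6)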
I am trying to teach myself about the multivariate normal distribution and I am struggling to understand some basic things about it.
To show my confusion, I use the famous Iris Flower dataset (I will use the R programming language for some basic scripts). The Iris Flower dataset has 5 columns and 150 rows. Each row contains the measurements for an individual flower (i.e. there are 150 flowers). The columns contain the measurements of the "Petal Width", the "Petal Length", the "Sepal Length" , the "Sepal Width" and the "Type of Flower" (three types of flowers, categorical variable).
Suppose I just take the Petal Length variable. I want to see if the Petal Length follows a (univariate) normal distribution. I think this can be easily done using different strategies (R code below):
#load the iris data and isolate the petal length
data(iris)
var1 = iris$Petal.Length
#visually check if the distribution of the petal length looks like a "bell curve"
plot(density(var1))
#look at the quantile-quantile plot
qqnorm(var1)
#use statistics (e.g. the Shapiro-Wilk test) to check for normality
shapiro.test(var1)
#if the data is normally distributed, we can find out the mean and the variance
mean(var1)
var(var1)
Similarly, I can repeat this for the remaining variables in the iris data. However, this task becomes a lot more complicated when you consider the multivariate distribution of the iris data : https://en.wikipedia.org/wiki/Multivariate_normal_distribution . When dealing with the multivariate distribution, there is now a "vector of means" and a "variance-covariance matrix". This means that there are more complex relationships within the data - some parts of the data might have a normal distribution whereas some parts of the data might not be normally distributed.
After spending some time researching how to determine if a dataset follows a multivariate normal distribution, I found out about something called the Mardia test, which apparently uses the "skewness" and the "kurtosis" to determine if the data is normally distributed (skewness and kurtosis that deviate significantly from those of a normal distribution indicate non-normality). I tried running the following code in R to perform the Mardia test on the iris data:
library(MVN)
data(iris)
data = iris[,-5]
result = mvn(data)
result
The results of this are confusing. I am not sure ...
I was just looking at a machine learning package for Python/R called 'Prophet' by Facebook which is making a bit of noise in the machine learning/data science world due to its simplicity, especially in Python. Here's a summary:
- Time series algorithm; automatically standardises time and the predicted variable
- Automatic de-trending on three components -- long term (over the entire data?), monthly/weekly/days of the week, and holidays (comes with holidays data for different regions)
- By default, places 25 potential changepoints for the (linear or logistic) trend over the first 80% of the data
- By default, assumes a Laplace distribution for the covariates
Prior to reading this, it always bothered me when papers don't mention what type of distribution each of the covariates has, unless, of course, it is shown graphically for each one (so something like a Poisson distribution is easier to tell). Now that I'm reading the above linked article, it took me by surprise that the package assumes a Laplace distribution for all the covariates by default.
So then my question is two-fold. Is it ok to assume a normal distribution if the paper doesn't mention anything for each of the covariates (some papers mention them for a few to many covariates, and I think some of them may be obvious so there's no need to mention them, though maybe not obvious to all)? On top of this, why would one choose to assume a Laplace instead of a normal distribution? What would be the advantages/disadvantages of such a decision, and what would be the effect on the estimates/errors/bias?
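On the shape question, a quick visual in R may help: compared to a normal with the same variance, the Laplace puts more mass both very close to 0 and far out in the tails. (This is the same contrast that makes a Laplace prior sparsity-inducing, as in the lasso, versus a normal prior, as in ridge.)
#compare Laplace and normal densities with equal variance (var = 1)
b <- 1 / sqrt(2)                       #Laplace scale: var(Lap(b)) = 2*b^2 = 1
dlap <- function(x) exp(-abs(x) / b) / (2 * b)
curve(dnorm(x), from = -5, to = 5, ylab = "density")
curve(dlap(x), add = TRUE, lty = 2)
legend("topright", c("Normal(0, 1)", "Laplace, same variance"), lty = 1:2)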
u(x,y) = 3x^(2)y - y^(3). Show that u_xx + u_yy = 0 (i.e., Δu = 0).
u(x,y) = ln(x^(2) + y^(2)). Show that Δu = 0.
Derive the Laplace operator in polar coordinates (in other words: show what Δu becomes for u(x,y) when [;x = r\cos(\theta), y = r\sin(\theta);]).
I am not planning on using this operator or any of these techniques on the test or quizzes, I am just wondering what these problems say about this operator and any interesting properties that one should know about it.
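For what it's worth, a quick check of the first one makes the pattern explicit: [;u_x = 6xy;] so [;u_{xx} = 6y;], and [;u_y = 3x^2 - 3y^2;] so [;u_{yy} = -6y;], hence [;\Delta u = u_{xx} + u_{yy} = 6y - 6y = 0;].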
Hi there!
Let's say, for instance, that we have a 3-dimensional normal distribution:
(X1, X2, X3) ~ N(mean, cov)
The pdf is:
f(x1, x2, x3), the density at the point (x1, x2, x3) (note that for a continuous distribution P(X1 = x1, X2 = x2, X3 = x3) is literally zero; the density is what plays that role)
and the cdf is:
F(x1, x2, x3) = P(X1 <= x1, X2 <= x2, X3 <= x3)
Functions to calculate the cdf and pdf of a multivariate normal distribution are readily available in most programming languages.
What I need however, is something more flexible. I need a function which I will call Flex, that can take both single possible values of a variable (like a pdf does) and a range of possible values (like a cdf does). Here is an example:
Flex(0.5 <=> 0.9, 0.2, 0.3 <=> 0.7) = P(0.5 <= X1 <= 0.9, X2 = 0.2, 0.3 <= X3 <= 0.7)
In this example, I want X1 and X3 to lie in a certain interval of possible values, and X2 to have exactly the value 0.2.
Now, I realize that the resulting value is not that interpretable, but that does not matter. I have a Gaussian mixture model, and I need to figure out which cluster an observation is most likely to belong to, when what we know about the observation is something like:
x1 is somewhere between 0.2 and 0.7, x2 is exactly 0.3 and so on.
If anyone knows whether something like this exists, or whether it is possible to implement, that would be very helpful!
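In case it helps, here is one way to build such a function for the trivariate case, assuming the mvtnorm package: condition on the exactly-observed coordinate, then integrate the conditional normal over the box for the interval-valued coordinates. (The function name flex and the example numbers are just for illustration.)
library(mvtnorm)
flex <- function(mu, Sigma, x2, lo, hi) {
  f <- c(1, 3)   #interval-valued coordinates (X1, X3)
  o <- 2         #exactly observed coordinate (X2)
  #conditional distribution of (X1, X3) given X2 = x2
  cmu <- mu[f] + Sigma[f, o] / Sigma[o, o] * (x2 - mu[o])
  cS  <- Sigma[f, f] - Sigma[f, o] %o% Sigma[o, f] / Sigma[o, o]
  #density of X2 at x2, times the conditional box probability
  dnorm(x2, mean = mu[o], sd = sqrt(Sigma[o, o])) *
    pmvnorm(lower = lo, upper = hi, mean = cmu, sigma = cS)
}
Sigma <- diag(3); Sigma[1, 3] <- Sigma[3, 1] <- 0.5
flex(mu = c(0, 0, 0), Sigma = Sigma, x2 = 0.2,
     lo = c(0.5, 0.3), hi = c(0.9, 0.7))
For the clustering use case, you would evaluate this under each component of the mixture, multiply by the component's mixing weight, and pick the cluster with the largest value.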
Hi, I want to take the partial derivative of this multivariate Gaussian cumulative distribution function with respect to beta_1 (a single element of the beta vector). X_1 is an n x z matrix, X_2 is a p x z matrix, beta is a z x 1 vector, H is a p x n matrix, F is a p x 1 vector, and T is a symmetric, positive-definite p x p matrix. In the univariate case the solution is straightforward with the chain rule, but I'm struggling a bit with the generalized chain rule in this case.
https://preview.redd.it/48vd98m2ong61.png?width=296&format=png&auto=webp&s=d2ab5a60151339408c958f2c16adfa9eec4f661b
Can someone please explain to me how this works?
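Without the exact expression from the screenshot, the generic building block is usually this: differentiating a multivariate normal CDF with respect to one coordinate peels off a univariate density times a conditional CDF, [;\partial \Phi_{\mu,\Sigma}(t_1,\dots,t_p) / \partial t_i = \phi(t_i; \mu_i, \Sigma_{ii}) \, \Phi(t_{-i} \mid t_i);], where the conditional CDF uses the usual MVN conditioning formulas (mean [;\mu_{-i} + \Sigma_{-i,i}\Sigma_{ii}^{-1}(t_i - \mu_i);], covariance [;\Sigma_{-i,-i} - \Sigma_{-i,i}\Sigma_{ii}^{-1}\Sigma_{i,-i};]). If the argument vector t depends on beta_1, the generalized chain rule then sums over coordinates: [;\partial \Phi / \partial \beta_1 = \sum_i (\partial \Phi / \partial t_i)(\partial t_i / \partial \beta_1);].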
Suppose I have 3 variables: earnings, savings, debt
I have these 3 variables recorded for 1000 people.
How do you check if this data is normally distributed? For a single variable, I could use the Kolmogorov-Smirnov test. But how would you check if these 3 variables are jointly normally distributed?
Assuming that this data is normally distributed, how do you calculate the joint multivariate normal distribution of this data? For a single variable, assuming a normal distribution, I could take all the observations:
Mu = sum(x_i) / n, summing over i = 1, ..., n
Sigma = sqrt( sum( (x_i - Mu)^2 ) / n ), summing over i = 1, ..., n
But if there are 3 variables:
Mu vector: (mu1, mu2, mu3)
Sigma (the covariance matrix):
( sig11 sig12 sig13 )
( sig21 sig22 sig23 )
( sig31 sig32 sig33 )
Is this how you would define the multivariate distribution for this example?
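Pretty much, yes. A sketch in R with made-up data (the MVN package's mvn() runs Mardia's multivariate normality test, which is one standard answer to the "jointly normal?" question; the numbers here are hypothetical):
library(MVN)
#hypothetical records for 1000 people
set.seed(3)
df <- data.frame(earnings = rnorm(1000, 50, 10),
                 savings  = rnorm(1000, 20, 5),
                 debt     = rnorm(1000, 10, 4))
mu_hat    <- colMeans(df)       #the mean vector (mu1, mu2, mu3)
Sigma_hat <- cov(df)            #the 3x3 variance-covariance matrix (n-1 denominator)
mvn(df, mvnTest = "mardia")     #joint (multivariate) normality test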
Thanks
What fraction of all points in a Euclidean space lie within (rather than outside of) the n-ball whose center is the origin and which is tangent to the point p represented by Cartesian coordinates:
vector(θ) = (θ^1, θ^2, θ^3, θ^4, θ^5, ..., θ^n)
representing sigmas in the multivariate normal distribution in n dimensions, as illustrated at the top of the link?
In hierarchical / pooled models there is the intuition that slope and intercept are often correlated within each sub-group.
So why does "simple" regression, i.e. one intercept and one slope (and no random intercepts), treat the intercept and slope as uncorrelated?
In other words, why do Bayesian approaches model regression as (for example):
y ~ normal(mean, sigma)
mean = intercept + slope * x
intercept ~ normal(0, 10)
slope ~ normal(0, 10)
sigma ~ exp(1)
instead of :
y ~ normal(mean, sigma)
mean = intercept + slope * x
[intercept, slope] ~ multivariatenormal(.....)
sigma ~ exp(1)
Thanks
Hi! So I know this question is pretty obvious to some, but suppose we know the distribution of
P(a = i, b = j, c = k) for i, j, k \in {0, 1}. Furthermore, suppose we know the marginal distribution P_a, which happens to be strictly positive. If we want to calculate the conditional distribution of b and c given a, i.e. P_{b, c | a}, we can just divide each value in our original joint distribution by the corresponding marginal value of a, right?
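Yes: that is exactly the definition of conditioning, and it is safe here since P_a is strictly positive. A quick sanity check in R with a random 2x2x2 table:
#joint P(a, b, c) as a 2x2x2 array, with dimension 1 indexing a
set.seed(5)
p    <- array(runif(8), dim = c(2, 2, 2))
p    <- p / sum(p)
p_a  <- apply(p, 1, sum)          #marginal P(a)
cond <- sweep(p, 1, p_a, "/")     #P(b, c | a) = P(a, b, c) / P(a)
apply(cond, 1, sum)               #each conditional slice sums to 1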
Hey all, is there a way to statistically interpolate between different multivariate Gaussian distributions? I think linear interpolation might work for the mean vectors, but I'm not sure about the covariance matrices. At the most basic level, given two distributions and two "weights" adding up to 1, I would like to find the "weighted mixture" of the two distributions. Can you point me to relevant research areas or papers? Thank you.
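If "weighted mixture" is meant literally (mix the two Gaussians with weights w and 1 - w), the mixture's overall mean and covariance have a closed form via moment matching, even though the mixture itself is no longer Gaussian; linearly interpolating the covariances is a different operation. A small R sketch with made-up inputs:
#first two moments of a two-component Gaussian mixture
merge_gaussians <- function(mu1, S1, mu2, S2, w) {
  mu <- w * mu1 + (1 - w) * mu2
  d  <- mu1 - mu2
  S  <- w * S1 + (1 - w) * S2 + w * (1 - w) * (d %o% d)
  list(mean = mu, cov = S)
}
merge_gaussians(mu1 = c(0, 0), S1 = diag(2),
                mu2 = c(2, 1), S2 = 0.5 * diag(2), w = 0.3)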
1. How to separate a mixture of two or more multivariate distributions? (See the sketch after this list.)
2. If a multivariate sample, which is a mixture of many distributions, also has some categorical columns, how can the components be separated in that case?
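For the Gaussian case, a common starting point is fitting a mixture by EM and reading off the component assignments; a sketch using the mclust package (for mixed categorical/continuous data, latent class style models are the usual extension):
#fit a 3-component multivariate Gaussian mixture by EM and inspect how
#the inferred components line up with a known grouping (iris species)
library(mclust)
fit <- Mclust(iris[, 1:4], G = 3)
table(fit$classification, iris$Species)   #rows: inferred components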
Hi, I'm trying to reimplement the Bayesian model from this paper. They mention in the Supplemental Information that they assume a multivariate prior on the weights -- I know how to deal with the mean vector, but they say that "The covariance matrix is defined by an Inverse-Gamma distribution with the two hyperparameters (a, b). The simulation sets the initial values of the two hyperparameters as (a0 = 1, b0 = 5)." I'm trying to do this in PyMC3, and I don't see how to define the covariance matrix with this distribution (is the inverse-Wishart really what I want?). I would also give PyStan a shot if someone knew how to do this there. This is my first foray into Bayesian modeling, so any help would be hugely appreciated.
Don't get me wrong guys, I made some progress but I need to get full marks.
https://preview.redd.it/caea33ipy0g51.png?width=677&format=png&auto=webp&s=e50a499696bacb5d2e7abca102f712b8b65415a7
https://preview.redd.it/nolfp0cqy0g51.png?width=654&format=png&auto=webp&s=6aa759469670a039e5f21d5bba2ba3e8f28cd00e
https://preview.redd.it/ulx0q69ry0g51.png?width=592&format=png&auto=webp&s=06540ab6cf5336f9b5f937a89da8f8fa9d50b48f
Can someone please explain to me how this works?
If we have X_1, X_2, ..., X_n, where all of them are univariate normally distributed and we also assume that they are independent, is the vector X = (X_1, ..., X_n) multivariate normally distributed?
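(The answer is yes: with independence, the joint density factorizes as [;\prod_{i=1}^n (2\pi\sigma_i^2)^{-1/2} \exp(-(x_i - \mu_i)^2 / (2\sigma_i^2));], which is exactly the multivariate normal density with mean vector [;(\mu_1, \dots, \mu_n);] and diagonal covariance [;\Sigma = \mathrm{diag}(\sigma_1^2, \dots, \sigma_n^2);]. Note that marginal normality alone would not be enough; independence is what makes the factorization work.)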