In hierarchical / pooled models there is the intuition that slope and intercept are often correlated within each sub-group.
So why does "simple" regression, i.e. one intercept and one slope (and no random intercepts), treat the intercept and slope as uncorrelated?
In other words, why do Bayesian approaches model regression as (for example):
y ~ normal(mean, sigma)
mean = intercept + slope * x
intercept ~ normal(0, 10)
slope ~ normal(0, 10)
sigma ~ exp(1)
instead of :
y ~ normal(mean, sigma)
mean = intercept + slope * x
[intercept, slope] ~ multivariatenormal(.....)
sigma ~ exp(1)
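The second form is easy to prototype outside any probabilistic programming language. A minimal numpy sketch (the prior scales and the -0.5 correlation are made-up illustration values) showing that a multivariate normal prior actually encodes intercept–slope correlation, whereas independent normal(0, 10) priors fix it at zero:

```python
import numpy as np

rng = np.random.default_rng(0)

# Joint prior on (intercept, slope): marginal scale 10 each, with an
# illustrative prior correlation of -0.5 baked into the off-diagonal.
mean = np.array([0.0, 0.0])
cov = np.array([[100.0, -50.0],
                [-50.0, 100.0]])

draws = rng.multivariate_normal(mean, cov, size=100_000)
intercepts, slopes = draws[:, 0], draws[:, 1]

# Independent normal(0, 10) priors would give correlation ~0;
# this joint prior gives ~-0.5 by construction.
print(np.corrcoef(intercepts, slopes)[0, 1])
```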
Thanks
1. How do you separate a mixture of two or more multivariate distributions?
2. If a multivariate sample is a mixture of many distributions and also has some categorical columns, how can the components be separated in that case?
Hey all, Is there a way to statistically interpolate between different multivariate Gaussian distributions? I think for mean vectors linear interpolation might work, but not sure for the covariance matrices. At the most basic level, given two distributions and two "weights" adding up to 1, I would like to find out the "weighted mixture" of two distributions. Can you point me relevant research areas or papers? Thank you.
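One concrete option, assuming what's wanted is a single Gaussian summarizing the weighted pair: the exact weighted mixture of two Gaussians is itself a two-component mixture (not a Gaussian), but its first two moments are available in closed form, so you can moment-match. The relevant search terms are "Gaussian mixture reduction" and, for a geometry-aware alternative on the covariances, "Wasserstein barycenter of Gaussians". A sketch of the moment-matching route:

```python
import numpy as np

def moment_matched(mu0, cov0, mu1, cov1, w0):
    """Single Gaussian matching the mean and covariance of the mixture
    w0 * N(mu0, cov0) + (1 - w0) * N(mu1, cov1)."""
    w1 = 1.0 - w0
    mu = w0 * mu0 + w1 * mu1
    d = (mu0 - mu1).reshape(-1, 1)
    # Mixture covariance = within-component part + between-means spread.
    cov = w0 * cov0 + w1 * cov1 + w0 * w1 * (d @ d.T)
    return mu, cov

# Illustrative example: equal weights, unit covariances, means 2 apart.
mu0, cov0 = np.array([0.0, 0.0]), np.eye(2)
mu1, cov1 = np.array([2.0, 0.0]), np.eye(2)
mu, cov = moment_matched(mu0, cov0, mu1, cov1, w0=0.5)
```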
Hi, I'm trying to reimplement the Bayesian model from this paper. They mention in the Supplemental Information that they assume a multivariate prior on the weights. I know how to deal with the mean vector, but they say that "The covariance matrix is defined by an Inverse-Gamma distribution with the two hyperparameters (a, b). The simulation sets the initial values of the two hyperparameters as (a0 = 1, b0 = 5)." I'm trying to do this in PyMC3, and I don't see how to define the covariance matrix with this distribution (is the inverse-Wishart really what I want?). I would also give PyStan a shot if someone knew how to do this there. This is my first foray into Bayesian modeling, so any help would be hugely appreciated.
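One common reading of that supplement (an assumption on my part, not a confirmation of the paper's intent) is that the covariance is diagonal, with each per-weight variance drawn from Inverse-Gamma(a0, b0); in PyMC3 that would be an InverseGamma prior on the diagonal entries rather than an inverse-Wishart on the full matrix. A scipy sketch of that generative step:

```python
import numpy as np
from scipy import stats

a0, b0 = 1.0, 5.0   # hyperparameters from the paper's supplement
p = 3               # assumed weight dimension (illustrative)

rng = np.random.default_rng(1)

# SciPy parameterizes Inverse-Gamma(a, b) as invgamma(a, scale=b).
variances = stats.invgamma(a0, scale=b0).rvs(size=p, random_state=rng)
cov = np.diag(variances)

# Draw the weight vector from its multivariate normal prior.
weights = rng.multivariate_normal(np.zeros(p), cov)
```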
I have a population of N individuals; some individuals are flawed (missing parts). Each individual can be seen as a list of letters like [a,b,c,d,e,...,f] of length K, and an individual is considered flawed if it doesn't contain certain letters. Suppose we know the number of flawed individuals, for example:
There are X individuals that don't contain letter b (e.g. [a,c,d,e,...,f]),
Y individuals that don't contain letter c (e.g. [a,b,d,e,...,f]),
M individuals that don't contain letter d (e.g. [a,b,c,e,...,f]),
R individuals that don't contain letter e (e.g. [a,b,c,d,...,f]),
. . .
where X, Y, M, R, ... are integers.
What is the probability of drawing a random sample of 20 individuals (lists) that are all missing one particular letter, or all missing a particular pair of letters, ..., or all missing every letter between a and f (a and f excluded)?
I'm trying to compute the CDF of the multivariate normal distribution in high dimensions (N > 1000). All the exact algorithms I know of are exponential in complexity, and the alternative is Monte Carlo. Monte Carlo isn't suitable, since you can't really trust the convergence and can't quantify the asymptotic error. I've read through all the literature I can find and can't see a reasonable way to compute the CDF in high dimensions at a known precision.
Does anyone know of any approximation technique that can compute this accurately in high dimension with reasonable runtime and error?
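For moderate dimensions the standard tool is Genz's quasi-Monte Carlo algorithm, which is what SciPy's multivariate normal CDF uses and which carries an internal error estimate, unlike naive Monte Carlo. It will not reach N > 1000 at tight precision, but it is the usual baseline. A sketch checked against the independent bivariate case, where the CDF factorizes:

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

mean = np.zeros(2)
cov = np.eye(2)

# Genz quasi-Monte Carlo under the hood; practical up to moderate dimensions.
p = multivariate_normal.cdf([0.0, 0.0], mean=mean, cov=cov)

# With independent components this factorizes: Phi(0) * Phi(0) = 0.25.
print(p, norm.cdf(0.0) ** 2)
```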
I was looking through the definition of the multivariate normal distribution. Based on the definition, is it correct to say that given n normally distributed and not necessarily independent random variables x_1, ..., x_n, if every linear combination a_1 x_1 + ... + a_n x_n is univariate normal, then (x_1, ..., x_n) is jointly normal, and the reverse is also true?
If so, is there some way to show why this is true? It's not obvious to me why these two facts imply each other.
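A sketch of why the two directions are equivalent, via characteristic functions (the reverse direction is closely related to the Cramér–Wold device):

```latex
% X is multivariate normal with mean \mu and covariance \Sigma
% exactly when its characteristic function is
\varphi_X(t) = \mathbb{E}\!\left[e^{\,i t^\top X}\right]
  = \exp\!\left(i\, t^\top \mu - \tfrac{1}{2}\, t^\top \Sigma\, t\right).

% (=>) If X is jointly normal, then for any fixed vector a,
\varphi_{a^\top X}(s) = \varphi_X(s a)
  = \exp\!\left(i s\, a^\top \mu - \tfrac{1}{2} s^2\, a^\top \Sigma\, a\right),
% which is the characteristic function of N(a^\top \mu,\ a^\top \Sigma\, a),
% so every linear combination is univariate normal.

% (<=) Conversely, if a^\top X is univariate normal for EVERY a, its mean
% and variance must be a^\top \mu and a^\top \Sigma\, a, so taking a = t, s = 1:
\varphi_X(t) = \varphi_{t^\top X}(1)
  = \exp\!\left(i\, t^\top \mu - \tfrac{1}{2}\, t^\top \Sigma\, t\right),
% which is exactly the multivariate normal characteristic function.
```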
I.e., a distribution of the form f(x1, ..., xn) where the xi are exchangeable and sum_i xi = 0 on every region of nonzero density? I guess one could take a standard distribution like a multivariate normal and restrict it to that hyperplane to get a constrained PDF, but is there an easier way?
I do not know exactly what I'm looking for, but I'm trying to figure out if something like this exists. I've taken some statistics, but this wasn't mentioned at the level I took it.
I'm looking for something that would do something along the lines of...
Say you have a matrix.. you know the sum of all cells is 100.. the matrix is 5x5. If every single cell had the value 4, it would be perfectly distributed, and the 'coefficient' I'm thinking about would be equal to 1.. like 100% evenly distributed.
If all 100 was in one single cell, then it would be 0, or approaching 0, like 0% distribution.
Does something like this exist?.. I found multivariate normal distribution but I'm not sure that's what I'm looking for, and if I am, I am not sure how it works.
If it s the correct thing, would someone please explain the steps on how to use it?.. say for a 2x2 matrix with max value 16
Thank you!
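What's described isn't the multivariate normal distribution; it sounds like an evenness index. One standard choice (an assumption on my part that it matches the intent) is Pielou's evenness: the Shannon entropy of the cell proportions divided by its maximum possible value, log(number of cells). It is 1 for a perfectly uniform matrix and approaches 0 as all mass concentrates in one cell. A sketch, including the 2x2 matrix summing to 16:

```python
import numpy as np

def evenness(matrix):
    """Pielou's evenness: normalized Shannon entropy of cell proportions.
    ~1.0 for a perfectly uniform matrix, 0.0 when one cell holds everything."""
    p = np.asarray(matrix, dtype=float).ravel()
    p = p / p.sum()
    nz = p[p > 0]                       # convention: 0 * log(0) = 0
    h = -(nz * np.log(nz)).sum()        # Shannon entropy
    return h / np.log(p.size)           # divide by max entropy, log(#cells)

print(evenness([[4, 4], [4, 4]]))       # uniform 2x2 summing to 16 -> ~1.0
print(evenness([[16, 0], [0, 0]]))      # all mass in one cell -> 0.0
```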
Hello fellow math nerds,
I am interested in extending the folded normal distribution to multivariate applications. I found a paper by Chakraborty and Chatterjee from 2013 which details the derivation of the PDF and mean for a multivariate folded normal distribution. However, there are a few details in the derivation that make me think it would not work for multivariate normal distributions with covariance (i.e. non-zero off-diagonal elements in the covariance matrix).
Suppose Σ is the covariance matrix for a p-dimensional multivariate normal distribution. On page 4, Notation 2.2 defines:
Σ_s = Λ_s Σ Λ_s^T
where
Λ_s = diag(s_1, s_2, ..., s_p), with s_i = ±1 for all 1 ≤ i ≤ p
I think this ends up being mathematically the same as simply multiplying the covariance matrix by the identity matrix, which would result in only the variances along the diagonals remaining.
My concern is that using the methods in this paper would result in calculating the same mean (equation 3.7) for two different multivariate distributions: one with no covariance, and one with covariance. However, my intuition tells me that the true means of the resulting multivariate folded normal distribution would differ.
Can anyone confirm if this is the case? Or am I missing something?
Thank you!
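One quick way to probe this is Monte Carlo. Note that the i-th component of the mean of |X| depends only on the marginal of X_i, which is unaffected by the off-diagonal terms, so for the mean vector specifically the covariance may genuinely not matter; whether that fully resolves the concern about equation 3.7 is worth checking against the paper. A sketch with illustrative zero-mean, unit-variance bivariate normals, where each componentwise mean should be near sqrt(2/pi) ~ 0.7979 either way:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1_000_000
mu = np.zeros(2)
cov_indep = np.eye(2)                          # no covariance
cov_corr = np.array([[1.0, 0.8],
                     [0.8, 1.0]])              # strong covariance

# Componentwise means of the folded (absolute-value) samples.
m_indep = np.abs(rng.multivariate_normal(mu, cov_indep, n)).mean(axis=0)
m_corr = np.abs(rng.multivariate_normal(mu, cov_corr, n)).mean(axis=0)

# Each |X_i| is marginally folded normal with mean sqrt(2/pi),
# regardless of the off-diagonal terms.
print(m_indep, m_corr, np.sqrt(2 / np.pi))
```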
So, I have two labeled datasets, both have very similar means, very similar medians, but I can tell by visualizing (2D) that their distributions are not identical. Initially I wanted to use a KS-test, but read it isn't ideal for multivariate analysis because there are 2^d - 1 independent ways of ordering a cumulative distribution function in d dimensions. So I was wondering if there were any implementations you guys have worked with?
Thanks!
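One implementation-friendly option is the energy distance two-sample test with a permutation p-value (the MMD test is the other common choice). SciPy's built-in energy_distance is univariate only, so this is a hand-rolled multivariate sketch rather than a library call:

```python
import numpy as np
from scipy.spatial.distance import cdist

def energy_stat(x, y):
    """Energy distance statistic between two multivariate samples."""
    a = cdist(x, y).mean()      # mean cross-sample distance
    b = cdist(x, x).mean()      # mean within-x distance
    c = cdist(y, y).mean()      # mean within-y distance
    return 2 * a - b - c

def energy_perm_test(x, y, n_perm=200, seed=0):
    """Permutation p-value for H0: x and y come from the same distribution."""
    rng = np.random.default_rng(seed)
    obs = energy_stat(x, y)
    z = np.vstack([x, y])
    n = len(x)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(z)                          # shuffle rows in place
        if energy_stat(z[:n], z[n:]) >= obs:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Illustrative data: same covariance, clearly shifted mean.
rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=(100, 2))
y = rng.normal(2.0, 1.0, size=(100, 2))
p_value = energy_perm_test(x, y)
```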
I have a dataset where 3 separate (but correlated) measurements are taken at every location. I'm trying to find a good way to use this information to estimate a multivariate distribution to describe the probability that a given combination of these 3 measurements will occur.
I know that I can fit a 3-dimensional multivariate normal distribution without too much trouble, but the measurements don't quite follow normal distributions, so I would like something more general.
Any thoughts on how I could accomplish this?
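Two standard routes: a copula (fit each marginal separately, then model the dependence with a copula) or a kernel density estimate, which drops the distributional assumption entirely. A sketch of the KDE route; the lognormal-ish synthetic data here is just a stand-in for the real three measurements:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Stand-in data: 500 locations x 3 correlated, non-normal measurements
# (exponentiated correlated normals, purely illustrative).
raw = rng.multivariate_normal([0, 0, 0],
                              [[1.0, 0.6, 0.3],
                               [0.6, 1.0, 0.5],
                               [0.3, 0.5, 1.0]], size=500)
data = np.exp(raw)

# gaussian_kde expects shape (n_dims, n_samples).
kde = gaussian_kde(data.T)

# Estimated joint density at the measurement combination (1, 1, 1).
density = kde([[1.0], [1.0], [1.0]])
```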
Hi Folks,
I am trying to implement PPO in a continuous action space where there are two possible actions to take. I want to model the actions as a Gaussian distribution, and because they are correlated I am using a multivariate normal distribution, with a vector of means of size (actions × 1) and a covariance matrix of size (actions × actions) that my NNs need to produce. The main issue is the implementation: I am using the PyTorch distributions package, in particular MultivariateNormal (https://pytorch.org/docs/stable/distributions.html#multivariatenormal). The issue I am facing is batch training; I am not able to go through the batch. For getting just a single sample, the following line works:
MultivariateNormal(self.l3(x).squeeze(), scale_tril = torch.diag((F.elu(self.l4(x))+1).squeeze()))
But iterating over a batch seems unfeasible. Has anyone seen an implementation using this function, or any other PPO implementation with continuous actions in PyTorch? I've seen implementations using the categorical distribution, where you can plug the logits in directly, which makes everything a lot easier.
Many thanks!
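torch.distributions handles batching natively: pass a (batch, actions) mean and a (batch, actions, actions) scale_tril and you get one batched distribution whose sample() and log_prob() are vectorized over the batch, no Python loop needed. A sketch using diag_embed to build the batched lower-triangular factor (the layer shapes and random stand-ins for self.l3/self.l4 outputs are assumptions based on the snippet above):

```python
import torch
import torch.nn.functional as F
from torch.distributions import MultivariateNormal

batch, n_actions = 32, 2
means = torch.randn(batch, n_actions)        # stand-in for self.l3(x)
raw_scale = torch.randn(batch, n_actions)    # stand-in for self.l4(x)

# diag_embed turns (batch, n_actions) into (batch, n_actions, n_actions);
# elu + 1 keeps the diagonal positive, with a small epsilon for safety.
scale_tril = torch.diag_embed(F.elu(raw_scale) + 1 + 1e-6)

dist = MultivariateNormal(means, scale_tril=scale_tril)  # one batched dist
actions = dist.sample()            # shape (batch, n_actions)
log_probs = dist.log_prob(actions) # shape (batch,)
```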
I'm looking to create negatively skewed distributions, up to 5 between -6,6 (skewness=2) that correlate with one another. I'm largely unconcerned with the means of the distributions. I've found the SN package but that might be more in-depth than I am looking for. Any suggestions or help would be greatly appreciated!
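One lightweight route that avoids the SN package's full machinery is a Gaussian copula with skew-normal marginals: draw correlated standard normals, push them through the normal CDF to get correlated uniforms, then through a skew-normal quantile function with a negative shape parameter. All specific numbers below (the 0.5 correlation, shape a = -5, scale 2) are illustrative assumptions; note also that the skew-normal's skewness is bounded near ±1, so a literal skewness of 2 would need a different marginal family (e.g. a reflected gamma) plugged into the same copula recipe:

```python
import numpy as np
from scipy.stats import norm, skewnorm

rng = np.random.default_rng(0)
n, k = 10_000, 5                     # 5 correlated variables

# Exchangeable target correlation of 0.5 (illustrative).
corr = np.full((k, k), 0.5)
np.fill_diagonal(corr, 1.0)

# Gaussian copula: correlated normals -> uniforms -> skew-normal quantiles.
z = rng.multivariate_normal(np.zeros(k), corr, size=n)
u = norm.cdf(z)
x = skewnorm.ppf(u, a=-5, loc=0, scale=2)   # a < 0 gives negative skew
```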
So f(y1, y2) = 3y1, for 0 <= y2 <= y1 <= 1.
To find F(1/2, 1/3), I integrated 3y1 from 0 to 1/3, and then from 0 to 1/2. The first integration gave me 1/6, and the second gave 1/12.
But the answer says it is 0.1065. Someone please help, I'm dying.
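The catch is the support: the density lives on 0 <= y2 <= y1 <= 1, so when computing F(1/2, 1/3) = P(Y1 <= 1/2, Y2 <= 1/3), the inner y1 integral must start at y2, not at 0 (and the 1/3 bound belongs to y2, the 1/2 bound to y1). A sympy check of the corrected region, which reproduces the book's 0.1065:

```python
from sympy import symbols, integrate, Rational

y1, y2 = symbols("y1 y2", positive=True)

# P(Y1 <= 1/2, Y2 <= 1/3): inner integral over y1 runs from y2 to 1/2
# because the density is only nonzero where y1 >= y2.
F = integrate(3 * y1, (y1, y2, Rational(1, 2)), (y2, 0, Rational(1, 3)))
print(F, float(F))   # 23/216, approximately 0.1065
```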
I found this post that seems to describe what I'm looking for, but there was no answer.
I'm building an occupancy model and I want to allow the detection probability (p.d) to vary among species. Ideally, p.d for each species would be drawn from one distribution. From my understanding of the Dirichlet distribution, it's multivariate, but for any given draw the components sum to 1. Instead, I want each component of the draw (i.e., each p.d) to lie in [0,1] on its own. So like the poster in the link said, I'm looking for essentially something that behaves like a multivariate normal distribution (complete with covariance, even), but bounded between 0 and 1.
Anyone have any ideas? Or do I have the Dirichlet distribution wrong in my head?
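The Dirichlet does constrain the components to sum to 1, so it's the wrong shape for this. What's described sounds like a logistic-normal (logit-normal) random vector: draw from a multivariate normal with whatever covariance you like, then apply the inverse-logit componentwise, giving correlated values each individually in (0, 1). A sketch (the -1 logit-scale mean and 0.3 correlation are illustrative assumptions):

```python
import numpy as np
from scipy.special import expit   # inverse-logit

rng = np.random.default_rng(7)
n_species = 6

mu = np.full(n_species, -1.0)     # logit-scale mean; expit(-1) ~ 0.27
cov = np.full((n_species, n_species), 0.3)
np.fill_diagonal(cov, 1.0)        # exchangeable correlation 0.3

# Correlated latent normals -> componentwise inverse-logit.
latent = rng.multivariate_normal(mu, cov, size=1000)
p_d = expit(latent)               # every entry in (0, 1), still correlated
```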
I was wondering if anyone knew how to calculate the cumulative probability density of a multivariate normal distribution across the space containing only probability density greater than a given parameter. Essentially, I want the integral over all densities greater than d. I believe this is easy for a univariate distribution, and relatively easy for a bivariate distribution with no covariance, but I can't think of how one would do it for a bivariate distribution with covariance. Is this known? Any help would be greatly appreciated - this has been bothering me quite a bit.
Thanks!
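For the multivariate normal specifically this is tractable, covariance and all: the region {x : f(x) > d} is exactly the ellipsoid where the Mahalanobis quadratic form (x - mu)' Sigma^{-1} (x - mu) falls below a threshold determined by d, and that quadratic form follows a chi-squared distribution with p degrees of freedom. A sketch, sanity-checked in one dimension:

```python
import numpy as np
from scipy.stats import chi2, norm

def prob_density_above(d, mean, cov):
    """P(f(X) > d) for X ~ N(mean, cov).

    f(x) > d  <=>  (x - mu)' Sigma^{-1} (x - mu) < c,
    where c = -2*log(d) - p*log(2*pi) - log|Sigma|, and the quadratic
    form is chi-squared with p degrees of freedom.
    """
    p = len(mean)
    _, logdet = np.linalg.slogdet(np.asarray(cov))
    c = -2.0 * np.log(d) - p * np.log(2 * np.pi) - logdet
    return chi2.cdf(c, df=p) if c > 0 else 0.0   # c <= 0: d exceeds the peak

# 1-D sanity check: densities above pdf(1) correspond to |x| < 1,
# so the answer should equal 2*Phi(1) - 1 ~ 0.6827.
prob = prob_density_above(norm.pdf(1.0), mean=[0.0], cov=[[1.0]])
print(prob, 2 * norm.cdf(1.0) - 1)
```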
I don't get it. Can't I just draw from separate univariate normals?
> Suppose X is distributed N_n(μ, Σ). Let X̄ = n^(-1) Σ_{i=1}^n X_i.
I am asked to find the distribution of X̄, if all of its component random variables X_i have the same mean μ.
I'm not quite sure what exactly I'm supposed to be doing. Can anyone nudge me in the right direction?
edit: I can't seem to get the LaTeX to display properly. This is what the first line is supposed to look like: http://imgur.com/n0CDz
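Nudge: X̄ = (1/n)·1ᵀX is a single linear functional of a multivariate normal, so it is univariate normal; its mean is (1/n)·1ᵀμ, which equals μ when all components share the same mean, and its variance is 1ᵀΣ1/n². A Monte Carlo sketch to check that claim (the particular μ and Σ values are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
mu = np.full(n, 2.0)                  # common mean for every component
Sigma = np.array([[1.0, 0.3, 0.1],
                  [0.3, 2.0, 0.4],
                  [0.1, 0.4, 1.5]])

draws = rng.multivariate_normal(mu, Sigma, size=500_000)
xbar = draws.mean(axis=1)             # sample mean of the components

ones = np.ones(n)
print(xbar.mean(), mu.mean())                     # both ~ 2.0
print(xbar.var(), ones @ Sigma @ ones / n**2)     # both ~ 1' Sigma 1 / n^2
```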
Hi,
I'm trying to prove or disprove,
First of all, I tried disproving by finding 2 paths that will give different limits, but without success.
If I'm trying to prove the statement, the cosine suggests that the squeeze lemma might be helpful here, but then I'll have to show that,
and I have no idea how to prove that. (Tried squeeze lemma again but couldn't find tight bounds.)
I would really appreciate every help from you guys :))
** BTW, English is not my native language, so I'm sorry for any grammar errors in the post.
I have learned the following hyperparameters are recommended for XGBoost tuning: max depth, subsample, col sample by level, col sample by tree, min child weight, lambda, alpha, n estimators and learning rate.
If I am dealing with a multivariate time series classification with 300k samples and 4, 50, 400, and 900 extracted features, are there special considerations I have to make regarding hyperparameter tuning? Should I still tune the same hyperparameters as mentioned above? Please advise on any recommendations you may have. Thank you!
This is something that bothers me a little. On the one hand, statistics classes tell us to remove multivariate outliers from our regression models. On the other, I hardly ever see this in practice in empirical papers (outside statistics papers). There's usually no mention of it in papers.
Edit: Thanks to all who responded. This is turning out to be a really insightful thread and I am learning lots.