I've been peripherally aware of these adjectives 'contravariant' and 'covariant' for a while now, and I recently decided to have a stab at figuring out what they mean.
They have been explained to me by means of this diagram: http://i.imgur.com/l6spv.jpg
Here's what I get from it.
The covariant components of v have lower indices and are the distances, in the directions of the basis vectors, to meet lines perpendicular to the basis vectors and going through the tip of v.
The contravariant components of v have upper indices and are the distances obtained by travelling in directions parallel to both basis vectors.
But, /r/physics, what the bloody hell is going on? What's the point of all this? Why is it necessary to be able to talk about the components in these two different ways?
I have been told that ||v||^2 = v^j v_j (with summation implied), where v_j = g_(ij) v^i and g_(ij) = e_i . e_j, but I don't really see why.
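For anyone else stuck on that step, the identity follows directly from expanding the squared norm in the given basis, using only the definitions quoted above:

||v||^2 = v . v = (v^i e_i) . (v^j e_j) = v^i v^j (e_i . e_j) = g_(ij) v^i v^j = v^j (g_(ij) v^i) = v^j v_j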
I'm just starting graduate school and I've never had any experience with covariance and contravariance, and I've had very limited exposure to tensor analysis. I've tried multiple times reading up on these subjects on my own but have had no personal breakthroughs in understanding what these concepts are all about. Any good analogies or online resources that would help?
I have used both in (undergraduate) math and physics classes and I can recite you the formal definitions (variance with change of basis transformations or their inverses), but I am struggling a bit with how to think of covariant and contravariant vectors on an intuitive level, or how to think of them geometrically. Does anybody have a good explanation or a good model for thinking about the distinction between covariant and contravariant vectors?
Thanks.
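For what it's worth, the formal definitions alluded to above can be written out explicitly. Under a change of basis e'_i = A^j_i e_j (summation implied), the two kinds of components transform oppositely:

v'^i = (A^{-1})^i_j v^j   (contravariant components transform with the inverse of A)
v'_i = A^j_i v_j          (covariant components transform with A itself, i.e. they "co"-vary with the basis)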
How do I find the limits on this example
The joint density is the continuous function f(x,y) = e^{-x} on the region 0 < y < x, and 0 otherwise. I found the marginal densities as follows:
f_X(x) = integral of f(x,y) from y=0 to x = xe^{-x}
f_Y(y) = integral of f(x,y) from x=y to infinity = e^{-y} (I am not completely sure the limits are correct, but the result is)
Then, in the next step, I try to find E[X] and E[Y] in order to compute the covariance Cov(X,Y) = E(XY) - E(X)E(Y), which according to the book should equal 1.
I have tried computing the expectations with different limits, but I am not sure where I am making a mistake.
E(X)= integral of x*xe^(-x) from x=y to infinity = (y^2+2y+2)e^(-y)
E(Y)= integral of y*e^(-y) from y=0 to x = 1-e^(-x)(x+1)
E(XY)= int_0^x{ int_y^infty { x*y*e^(-x) dx } dy} = 3-e^(-x)(x^2+3x+3)
I have also tried changing the limits of integration for x to (0,1) and (y,1), but without success. Is the formula I am applying wrong?
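For comparison, here is what the expectations look like when each one is taken over the full range of its own variable, using the marginal densities found above (an expectation is a single number, so no x or y should survive in the result):

E(X) = integral from 0 to infinity of x * x e^{-x} dx = 2
E(Y) = integral from 0 to infinity of y * e^{-y} dy = 1
E(XY) = integral from 0 to infinity of { integral from 0 to x of x*y*e^{-x} dy } dx = integral from 0 to infinity of (x^3 / 2) e^{-x} dx = 3

so Cov(X,Y) = E(XY) - E(X)E(Y) = 3 - 2*1 = 1, which matches the book.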
Hi. It would be great to see a video on covariant and contravariant tensors for better visualization and understanding. Can you please give it a go? Thanks a ton in advance.
Got my psychometrics midterm coming up and I just need a five-year-old explanation of why this is.
This is my vague and quite likely incorrect understanding so far: covariance just looks at how X varies with Y (and vice versa), and not the variance from the rest of the datasets of X and Y. So the variance of X that varies along with Y is just a part of the entire variance of X and Y and so it would be smaller.
I also see that the largest the covariance of X and Y could possibly be is SD(X) * SD(Y) (for a perfect correlation of r = 1), so I'm also trying to find some reasoning along those lines.
Am I on the right track? Completely on the wrong track? Somewhere in between?
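The bound mentioned in passing above is, in fact, the key ingredient: it is the Cauchy-Schwarz inequality applied to the centred variables,

|Cov(X, Y)| <= sqrt(Var(X)) * sqrt(Var(Y)) = SD(X) * SD(Y),

with equality exactly when one variable is a linear function of the other (r = 1 or r = -1). Dividing the covariance by SD(X) * SD(Y) is what keeps the correlation between -1 and 1.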
I have n vectors of dimension m where m ~= 10^4 and n ~= m^2. Calculating the covariance matrix in the usual way is too slow. If I only want to know approximately the first couple of eigenvectors, can I speed this up?
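A sketch of one standard approach (block power / subspace iteration), with made-up names and a made-up batching convention, assuming the data can at least be streamed in batches of rows, since at n ~ m^2 ~ 10^8 vectors the full data matrix will not fit in memory. It only ever applies the covariance to a thin block of k vectors, so the m x m covariance is never formed and the cost per pass is O(n*m*k) instead of O(n*m^2):

import numpy as np

def top_cov_eigvecs(batches, m, k=2, n_iter=5, seed=0):
    """Approximate the top-k eigenpairs of the sample covariance by
    block power iteration, streaming over the data in batches.

    batches: a zero-argument callable returning a fresh iterable of
             (batch_size, m) arrays each time it is called
    m:       dimension of the vectors
    """
    rng = np.random.default_rng(seed)

    # Pass 1: the mean, so the data can be centred on the fly.
    total, count = np.zeros(m), 0
    for B in batches():
        total += B.sum(axis=0)
        count += B.shape[0]
    mean = total / count

    def apply_cov(Q):
        # Compute C @ Q = Xc.T @ (Xc @ Q) / (n - 1) without forming C or Xc.
        Z = np.zeros_like(Q)
        for B in batches():
            Bc = B - mean
            Z += Bc.T @ (Bc @ Q)
        return Z / (count - 1)

    # Random orthonormal starting block, then a few power iterations.
    Q, _ = np.linalg.qr(rng.standard_normal((m, k)))
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(apply_cov(Q))

    # Rayleigh-Ritz step: diagonalise the small k x k projected matrix.
    S = Q.T @ apply_cov(Q)
    evals, W = np.linalg.eigh((S + S.T) / 2)
    order = np.argsort(evals)[::-1]
    return evals[order], Q @ W[:, order]

With a handful of iterations this usually captures the leading eigenvectors well when there is a clear spectral gap; more iterations or a larger block help when the top eigenvalues are close together.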
When studying a concept in stats, I really enjoy picking apart the equation that defines it. Thinking about a statistic mathematically really helps me understand exactly what that number actually represents, and how it relates both quantitatively and conceptually to other statistics.
I've recently been trying to do this with covariance, defined (for samples) as:
cov(x, y) = sum[ (x - x_mean) * (y - y_mean) ] / (n - 1)
I was struck to see that this is pretty much exactly the same formula as variance, except that rather than multiplying an observation's deviation from the mean by itself, you multiply it by the deviation of the second variable.
My question(s?) is, why do we multiply here? How should I interpret this relationship? What would it mean if we added instead of multiplied?
Understanding why we multiply in the case of covariance should, I think, also help me understand more deeply what the variance statistic (sigma^2) represents.
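One way to see what the multiplication is doing is to work through a tiny made-up data set (a Python sketch with invented numbers, not anything from a text):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 1.0, 4.0, 5.0])

dx = x - x.mean()                        # each observation's deviation from the x mean
dy = y - y.mean()                        # each observation's deviation from the y mean

# Each product dx*dy is positive when x and y deviate on the SAME side of
# their means and negative when they deviate on OPPOSITE sides, so the
# average of the products measures how consistently the two move together.
cov_by_hand = np.sum(dx * dy) / (len(x) - 1)
print(cov_by_hand, np.cov(x, y)[0, 1])   # both give the same number

Adding the deviations instead of multiplying them would not work: the sum of (x - x_mean) is exactly zero by definition of the mean, and likewise for y, so an additive version would always come out to zero and carry no information about co-movement.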
I am using this example from the book "Linear Algebra and Its Applications" to produce the sample covariance matrix.
https://imgur.com/a/oceFrtw
This is my R-code
M = cbind(c(1,2,1), c(4,2,13), c(7,8,1), c(8,4,5))  # observation vectors as columns
m = apply(M, 1, mean)                               # sample mean of each variable (each row)
X = M - m                                           # centre every observation vector
X %*% t(X) / 3                                      # S = (1/(N-1)) * B %*% t(B), with N = 4 observations
> [,1] [,2] [,3]
>[1,] 10 6 0
>[2,] 6 8 -8
>[3,] 0 -8 32
This produces the same output as the book. But when I use the built-in cov() to calculate the sample covariance, I get something else.
cov(M)
> [,1] [,2] [,3] [,4]
>[1,] 0.3333333 -2.166667 1.333333 -0.8333333
>[2,] -2.1666667 34.333333 -22.166667 -1.3333333
>[3,] 1.3333333 -22.166667 14.333333 1.1666667
>[4,] -0.8333333 -1.333333 1.166667 4.3333333
How come?
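The difference is the convention for which axis holds the observations. In the book's example the observation vectors are the columns of M and the variables are the rows, which is exactly what X %*% t(X) / 3 computes; R's cov() instead treats each row as an observation and each column as a variable, so cov(M) returns a 4 x 4 matrix of covariances between the four columns. cov(t(M)) should reproduce the book's 3 x 3 matrix. A quick numpy illustration of the two conventions, using the same numbers purely for comparison:

import numpy as np

M = np.array([[1, 4, 7, 8],
              [2, 2, 8, 4],
              [1, 13, 1, 5]], dtype=float)

# Rows are variables, columns are observations (the book's convention).
# numpy's default rowvar=True uses this convention, with divisor N - 1 = 3.
print(np.cov(M))                 # 3 x 3: [[10, 6, 0], [6, 8, -8], [0, -8, 32]]

# Columns are variables, rows are observations (what R's cov(M) does).
print(np.cov(M, rowvar=False))   # 4 x 4, covariances between the four columns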
In a VAE, the reparameterization trick (z = \mu + epsilon * \sigma) is an affine transformation of a standard Gaussian variable with diagonal covariance and as such, it should still result in a latent variable that has diagonal covariance. The KL term should serve to drive the density of this latent variable close to the standard Gaussian prior.
Now suppose we turn off the KL term by multiplying it by zero. All that would be left is the conditional likelihood. Below is an image of the density of the latent variable, trained in such a regime. The covariance here is clearly not diagonal.
My question, then, is the following:
The latent variable is still produced as an affine transformation of a standard Gaussian variable with diagonal covariance, which would simply scale and shift the original variable. How is it that it exhibits such structure as below then?
https://preview.redd.it/6vv6sh2wc5f21.png?width=640&format=png&auto=webp&s=1611e343902279467bea88782e36de3164e1d121
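One thing that may resolve the puzzle (a toy sketch, not the model in the image): the affine transformation is different for every input, because mu and sigma come out of the encoder. Each individual q(z|x) is a diagonal Gaussian, but the aggregate distribution of z over the whole data set is a mixture of those Gaussians, and a mixture of diagonal Gaussians can have essentially any covariance structure. A minimal illustration with a made-up "encoder":

import numpy as np

rng = np.random.default_rng(0)

# Made-up encoder: the per-input mean lies along a line, the per-input
# covariance is diagonal (in fact isotropic and small).
x = rng.uniform(-1.0, 1.0, size=10000)
mu = np.stack([x, x], axis=1)                    # input-dependent means
sigma = 0.05
z = mu + sigma * rng.standard_normal(mu.shape)   # reparameterisation, per input

# Each conditional q(z|x) has diagonal covariance, but the aggregate does not:
print(np.cov(z, rowvar=False))                   # strong off-diagonal term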
I know that Cov(X, Y) describes how variables X and Y vary together, but what does the covariance matrix of a 1-D array show us?
Thanks a lot
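If it helps: for a single 1-D array the "covariance matrix" collapses to a single number, the variable's own sample variance (a quick numpy check, assuming that is the situation you mean):

import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])
print(np.cov(a))        # 1.666..., just the sample variance of a
print(a.var(ddof=1))    # same number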
Why is [; \mathrm{cov} \bigg(\frac{X-E(X)}{\sqrt{var(X)}},\frac{Y-E(Y)}{\sqrt{var(Y)}}\bigg) = \mathrm{cov} \bigg(\frac{X}{\sqrt{var(X)}},\frac{Y}{\sqrt{var(Y)}}\bigg);] ?
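The underlying fact is that covariance is unchanged by adding constants, because the constants cancel inside the centred terms:

Cov(U + a, V + b) = E[ (U + a - E[U] - a)(V + b - E[V] - b) ] = E[ (U - E[U])(V - E[V]) ] = Cov(U, V).

Taking U = X/sqrt(var(X)), V = Y/sqrt(var(Y)), a = -E(X)/sqrt(var(X)) and b = -E(Y)/sqrt(var(Y)) gives exactly the equality in question.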
I was watching a YouTube video to understand what covariance and standard deviation mean.
The guy in the video divided by (n - 1), but my book says the denominator is n.
Why is it different????
Can you explain with some examples???
Thanks in advance
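For reference, the two formulas side by side (a standard fact, independent of any particular book or video):

Population variance (divide by n; the mean mu is known, or the whole population is observed):
sigma^2 = (1/n) * sum of (x_i - mu)^2

Sample variance (divide by n - 1, Bessel's correction; the sample mean x_bar is itself estimated from the same data, and the raw average of squared deviations about x_bar would be biased low):
s^2 = (1/(n - 1)) * sum of (x_i - x_bar)^2

Tiny example: for the data 2, 4, 6 the deviations from the mean 4 are -2, 0, 2, so the sum of squares is 8; dividing by n = 3 gives about 2.67, while dividing by n - 1 = 2 gives 4.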
Edit: title should say multilinear functions
Hi! I've been working with multilinear machine learning models recently, and trying to construct the correlation between items, given that you let other items vary.
In other words, given a multilinear function f, I want to calculate the correlation between f(a, X, Y, ...) and f(b, X, Y, ...) where X and Y are discrete distributions of known values, for several values of b (I want to find the most correlated items b, given an a and the function f).
To do this efficiently, I construct a covariance matrix over the parts of the function that varies.
For example, if f(a, x) is the dot product between vectors a and x, it boils down to computing the covariance matrix of X, and if f(a, x, y) is the sum of the elementwise products of vectors a, x, and y, the corresponding covariance matrix can be computed from the individual covariance matrices of X and Y.
So far I've been doing this "by hand" on a function to function basis, and now I'm wondering if there's a general way to go from multilinear function formulation to covariance computation (where you let certain inputs be fixed).
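In case it is useful as a starting point, the simplest case described above can be written as one identity (standard bilinearity of covariance, with Sigma_X the covariance matrix of X):

Cov( f(a, X), f(b, X) ) = Cov( a^T X, b^T X ) = a^T Sigma_X b

Correlation then follows by normalising with sqrt(a^T Sigma_X a) and sqrt(b^T Sigma_X b). I don't know a fully general recipe for arbitrary multilinear f, but bilinearity of covariance in each fixed slot is the property that makes these case-by-case derivations work, so writing f in terms of its arguments one slot at a time is a reasonable place to start.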
Hi everyone,
We have 4 numbered balls (1 to 4). After each draw, the ball is put back in.
Let X be the number of draws required to get a ball that was already drawn
Let Y = X2 - X1, where X1 and X2 are the numbers of draws required for the first and the second time we get a ball that was already drawn (so X2 is the second time it happens).
For example : 1121
X=2
Y = 4-2 = 2
I need to prove that Cov(X, Y) = -753/2048
So far, I've calculated the probability of every possible value of X, Y and (X,Y). X takes values in {2,3,4,5} and Y in {1,2,3,4}.
My question is how do we calculate the expectation of a pair (X,Y)?
I know that E[X] = sum of (x*P(x))
But how do we do E[(X,Y)]? Is it the sum of (x,y)*P[(x,y)], with the product taken over each pair (x,y)?
Thanks!
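For the record, the expectation needed for the covariance is that of the product XY, taken against the joint distribution:

E[XY] = sum over all pairs (x, y) of x * y * P(X = x, Y = y)

and then Cov(X, Y) = E[XY] - E[X] * E[Y], where E[X] and E[Y] come from the marginal distributions in the usual way (E[X] = sum of x * P(X = x), as written above).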
Hey guys, I'm trying to implement a 2D Parzen window on a cluster of data to estimate the pdf. I'm doing this for school and one of the requirements is to use a Gaussian window with covariance sigma^2 = 400.
I decided to use the gaussian_kde class provided by scipy.stats. However, I'm not sure what value of bandwidth to provide. I see documentation about Scott's rule and Silverman's rule, but I was wondering how to incorporate the sigma^2 = 400 requirement into this parameter.
In other words, what is the relationship between the covariance of the Gaussian parzen window and the bandwidth parameter of the gaussian_kde class?
Any insight would be great, thank you!!
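In case it helps while sorting out the bandwidth question: as far as I can tell, gaussian_kde builds its kernel covariance by scaling the covariance of the data itself by the square of the bandwidth factor, so a scalar bw_method is a relative scaling rather than an absolute sigma^2. One way to guarantee the kernel covariance is exactly 400*I is to skip gaussian_kde and write the Parzen estimate directly, which is only a few lines (a sketch with made-up names):

import numpy as np

def parzen_gaussian_pdf(grid_points, samples, sigma2=400.0):
    """Parzen-window estimate of a 2-D pdf with a fixed isotropic
    Gaussian kernel of covariance sigma2 * I.

    grid_points: (G, 2) array of locations where the density is evaluated
    samples:     (N, 2) array of observed data points
    """
    diff = grid_points[:, None, :] - samples[None, :, :]   # (G, N, 2)
    sq_dist = np.sum(diff ** 2, axis=-1)                    # squared distances, (G, N)
    norm = 1.0 / (2.0 * np.pi * sigma2)                     # 2-D Gaussian normaliser
    kernel_vals = norm * np.exp(-sq_dist / (2.0 * sigma2))
    return kernel_vals.mean(axis=1)                         # average the kernel over samples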
Hi all, self-teaching and not a strong math background so please bear with me.
Have carefully watched videos on using least squares method to fit a line, and I can get to r this way. I think I've got my head wrapped around the least squares method pretty well -- like I get the way it is used to partition model error. (I've used Brandon Foltz for this.)
However in my text book (Andy Field), r is derived using "covariance":
r = covariance_(xy) / (s_x * s_y)
I can tell these two approaches are related, but I'm having trouble connecting them.
I've watched a couple of videos on the relationship between covariance and correlation, but I'm still not getting it.
Any ideas/resources?
Thank you.
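The bridge between the two presentations is the least-squares slope itself. Writing s_x and s_y for the standard deviations:

slope of the least-squares line: b = cov(x, y) / s_x^2
correlation: r = cov(x, y) / (s_x * s_y)

so b = r * (s_y / s_x): the slope is the correlation rescaled by the spread of y relative to the spread of x. The least-squares fit and the covariance formula measure the same co-movement, just normalised differently.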
So far my favourite class has been Linear Algebra; it was linear algebra for math majors, so the focus wasn't learning how to operate on matrices, and we worked over fields other than R and C.
My question is, are there any interesting applications of linear algebra that make extensive use of fields other than R, or vector spaces other than R^n and matrices over the real numbers?
I'm having trouble understanding how covariance measures the relatedness between two variables. What is covariance and how does it relate two variables?
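For concreteness, the definition and how to read it:

Cov(X, Y) = E[ (X - E[X]) (Y - E[Y]) ]

Each outcome contributes a positive amount when X and Y fall on the same side of their respective means and a negative amount when they fall on opposite sides. A large positive covariance therefore means the variables tend to move together, a large negative one means they tend to move oppositely, and a value near zero means no consistent linear co-movement. The number is in units of X times units of Y, which is why it is usually rescaled to the correlation.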