Ok, so I've got a lognormal distribution with mean = e and sd = 19.9.
I'm trying to find P(X > 5) and I'm using R's plnorm function like so:
> plnorm(5, meanlog=exp(1), sdlog=19.9, lower.tail=FALSE)
[1] 0.5222179
But apparently the correct value I'm looking for should be close to 0.09 so I'm WAY off.
Can anyone help?
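One thing worth checking (a guess, since the post doesn't say which scale the figures refer to): plnorm's meanlog and sdlog are the mean and sd of log(X), not of X itself. If mean = e and sd = 19.9 describe X directly, they have to be converted first, along these lines:
m <- exp(1)                      # mean of X (not of log X)
s <- 19.9                        # sd of X
sdlog2  <- log(1 + (s / m)^2)    # variance of log(X)
meanlog <- log(m) - sdlog2 / 2   # mean of log(X)
plnorm(5, meanlog = meanlog, sdlog = sqrt(sdlog2), lower.tail = FALSE)
With these values the result comes out around 0.096 rather than 0.52.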
Hello Friends
I have started working with Python recently for a personal project. I am not a statistician, nor a computer scientist. I have collected the data for an entire population. I take the log of that data and plot it as a normal distribution. (The log data fits a normal distribution quite well!) So the data follows a log-normal distribution.
Every day I get a new value. I take the log of that value, and Python gives me the CDF of that value relative to the log values of the population, treated as a normal distribution.
Example:
If the value for today is 12 (ln(12) = 2.4849066498), it gives me a CDF of 0.54. Can I take this cumulative probability at face value (0.54, i.e. 54%), or do I have to do some sort of conversion (because my data has been converted to log values) to get the correct CDF, or is this already the correct CDF?
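For what it's worth, a CDF is unchanged by applying the same monotone transformation to both the data and the value being evaluated, so P(X <= 12) and P(log X <= log 12) are the same number; a quick check in R (the log-scale mean and sd here are made up):
mu <- 2.0; sig <- 0.9                   # hypothetical mean/sd fitted to the log values
pnorm(log(12), mean = mu, sd = sig)     # CDF of log(12) under the fitted normal
plnorm(12, meanlog = mu, sdlog = sig)   # CDF of 12 under the matching log-normal: identical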
Mean is 3300, standard deviation is 1074.
How would I go about working this or any percentile out?
Thanks in advance :)
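If those figures (mean 3300, sd 1074) describe an approximately normal quantity, percentiles can be read straight off the fitted distribution; a rough sketch (the 4000 and 0.95 are just example numbers):
pnorm(4000, mean = 3300, sd = 1074)   # percentile of an observed value of 4000
qnorm(0.95, mean = 3300, sd = 1074)   # value sitting at the 95th percentile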
I can't find the answer to the following. "A lognormal distribution is being used to model the pollution of SO2. The mean concentration is 1.9 and the standard deviation 0.9. What's the probability that the SO2 concentration lies between 5 and 10?"
I don't know how I can bring it to the standard normal distribution. All I got to was P(5<Y<10) = P([ln(5)-1.9]/0.9 < Z < [ln(10)-1.9]/0.9) and I know it's incorrect because the numbers 1.9 and 0.9 are wrong there.
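A sketch of the usual approach, assuming 1.9 and 0.9 are the mean and standard deviation of the concentration Y itself (not of ln Y), so they have to be converted to log-scale parameters first:
m <- 1.9; s <- 0.9
sig2 <- log(1 + (s / m)^2)    # variance of ln(Y)
mu   <- log(m) - sig2 / 2     # mean of ln(Y)
# P(5 < Y < 10) = Phi((ln 10 - mu)/sig) - Phi((ln 5 - mu)/sig)
plnorm(10, mu, sqrt(sig2)) - plnorm(5, mu, sqrt(sig2))
Those converted mu and sig are what belong in the standardisation in place of 1.9 and 0.9.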
Hi. I'm currently writing my thesis and I'm trying to figure out a way to draw log-normal distributions with ranges of medians and variances. Context: I'm evaluating different powder production techniques. Different process parameters result in different particle-size distributions (log-normal). Because the processes are physically different, I can't compare the particle distributions using a single set of process parameters.
So I need a way to illustrate the log-normal distribution of a process with varying variance and median.
Does anyone know how to do this in Excel?
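One possible route (sketched in R with made-up medians and variances; the same densities can be tabulated in recent Excel versions with LOGNORM.DIST once meanlog and sdlog are known) is to convert each (median, variance) pair into the log-scale parameters and plot the corresponding density:
median_vals   <- c(10, 20)        # hypothetical medians
variance_vals <- c(25, 100)       # hypothetical variances
x <- seq(0.01, 80, length.out = 500)
first <- TRUE
for (m in median_vals) {
  for (v in variance_vals) {
    sdlog   <- sqrt(log((1 + sqrt(1 + 4 * v / m^2)) / 2))  # solve var = (e^s2 - 1) * e^(2*mu + s2)
    meanlog <- log(m)                                      # the median of a log-normal is exp(meanlog)
    if (first) {
      plot(x, dlnorm(x, meanlog, sdlog), type = "l", xlab = "particle size", ylab = "density")
      first <- FALSE
    } else {
      lines(x, dlnorm(x, meanlog, sdlog), lty = 2)
    }
  }
}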
I have a data set, say x, and I have found that the natural log transformation of the data, log(x), follows a normal distribution. Would this suggest that x follows a lognormal distribution or something else entirely?
I am trying to prove this following equation for a log-normal distribution:
E(Rt) = exp(μ + σ^2/2) − 1
The text book gives the hint to use the expression:
E(X^n) = exp(nμ + (σ^2 * n^2)/2), n > 0
I understand that the expected value, or the first moment, just sets n = 1. But in the first equation, where does the "−1" at the end of the equation come from?
Furthermore, how would you derive or prove the second equation?
Thanks!
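A sketch of where both pieces usually come from, assuming (the post doesn't say) that R_t is a net return with 1 + R_t = e^r and r ~ N(μ, σ^2):
E(X^n) = E(e^(n·r)) = ∫ e^(n·r) · (1/(σ√(2π))) · e^(−(r−μ)^2/(2σ^2)) dr
       = exp(nμ + n^2·σ^2/2)        (complete the square in the exponent; this is the normal MGF evaluated at n)
Setting n = 1 gives E(e^r) = exp(μ + σ^2/2), and since R_t = e^r − 1,
E(R_t) = E(e^r) − 1 = exp(μ + σ^2/2) − 1,
which is where the trailing "−1" comes from.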
I was doing some research for my probability and statistics class, and I stumbled upon the log-normal distribution. Since our professor recommended finding the mean, variance and median of some common transformations of the normal distribution on our own, I went ahead and tried.
I have shown the mean and variance, but I'm struggling with the median. Intuitively it would make sense that it is just the inverse transformation applied to the median of the transformed variable.
So if X has a log-normal distribution, Y = log(X) has a normal distribution, Y ~ N(0,1). To find the median of X you take the exponential of the median of log(X), which would be exp(0) = 1 in this case.
But I feel like I'm missing an argument, because this is not a proof, only an idea which seems to be correct.
Could anybody lead me to a paper or give me some hint as to how I would go about proving this?
(Sorry about the rambling, this is all very new material to me, and English isn't my first language.)
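A sketch of the argument (it only uses the fact that exp is strictly increasing, so no paper should be needed):
Let Y = log(X) ~ N(μ, σ^2) and let m be the median of Y (so m = μ, and m = 0 when Y ~ N(0,1)).
Because exp is strictly increasing, the events {X ≤ e^m} and {Y ≤ m} are identical, hence
P(X ≤ e^m) = P(log(X) ≤ m) = P(Y ≤ m) = 1/2,
which is exactly the definition of e^m being the median of X. With μ = 0 this gives median(X) = e^0 = 1.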
So I plotted a log-likelihood function in R for a random normal sample I pulled and noticed that whenever I do it, there seems to be a clear maximum of the LL along the sigma^2 dimension, but along the mu dimension the LL stays relatively constant. I'm pretty sure I've written the code correctly, so I wanted to get your opinion on this. Is this what an LL surface is supposed to look like for a normal distribution? It just seems weird that near the maximum the LL barely varies along the mu dimension, whereas it varies a lot along the sigma^2 dimension (because of the steep curve).
Here is a link with pictures of the curve: http://imgur.com/a/Qr9wq
Just to be thorough and in case any of you are useRs, here is the code I used (note, you will need the package rgl to run the plot):
#Define LL function
LL <- function(X, theta)
{
  mu     <- theta[1]
  sigma2 <- theta[2]
  log.likelihood <- 0
  n <- length(X)
  # sum the log of the Normal(mu, sigma2) density over all observations
  for (i in 1:n)
  {
    log.likelihood <- log.likelihood - (((X[i] - mu)^2) / (2 * sigma2)) - log(sqrt(2 * pi * sigma2))
  }
  return(log.likelihood)
}
#Parameters
Mu <- 100
Sigma2 <- 50
#Sample
N <- 100
set.seed(1)
IQs <- rnorm(N, mean=Mu, sd=sqrt(Sigma2))
#Possible values to test
x <- posMu <- seq(80, 120, length.out=200)
y <- posSig <- seq(20, 60, length.out=200)
#x1 <- sort(x, decreasing=T)
#Produce LLs for plotting
LLlist <- NULL
for (m in 1:length(posMu)){
  LLs <- NULL
  for(s in 1:length(posSig)){
    # evaluate the log-likelihood at (mu = posMu[m], sigma2 = posSig[s])
    posTheta <- cbind(posMu[m], posSig[s])
    LLs <- c(LLs, LL(IQs, posTheta))
  }
  # each column of LLlist corresponds to one mu value; rows index the sigma2 values
  LLlist <- cbind(LLlist, LLs, deparse.level=0)
}
z <- LLlist
#Find the approximate MLE
mLL <- which(LLlist == max(LLlist), arr.ind=TRUE)
cbind(posMu[mLL[2]],posSig[mLL[1]],LLlist[mLL])
#Graph the LLs
library(rgl)
open3d()
plot3d(mean(x),mean(y),mean(z), xlab="Mu", ylab="Sigma2", zlab="log L", xlim=c(min(x),max(x)), ylim=c(min(y),max(y)), zlim=c(min(z),max(z)))
surface3d(x, y, z, color=rainbow(length(x)))
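Not part of the original code, but a quick cross-section check (reusing the LL function and grids defined above) makes it easier to see how fast the log-likelihood actually falls off along each axis; the fixed values are just the true parameters used to simulate the sample:
# slice through the surface, varying one parameter while holding the other at its true value
ll_mu  <- sapply(posMu,  function(m) LL(IQs, c(m, Sigma2)))   # vary mu,     hold sigma2 = 50
ll_sig <- sapply(posSig, function(s) LL(IQs, c(Mu, s)))       # vary sigma2, hold mu = 100
range(ll_mu)    # spread of the log-likelihood along the mu axis
range(ll_sig)   # spread of the log-likelihood along the sigma2 axis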
Hello, could someone explain why the log-normal distribution's mean is e^(u + sigma^2/2) and the variance is (e^(sigma^2) - 1) * e^(2u + sigma^2)?
I'm just not sure how these are derived.
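A sketch of the standard derivation, writing X = e^Y with Y ~ N(u, sigma^2) (the key step is the normal moment generating function):
E(e^(tY)) = ∫ e^(ty) · (1/(sigma·√(2π))) · e^(−(y−u)^2/(2·sigma^2)) dy = exp(t·u + t^2·sigma^2/2)   (complete the square)
E(X)   = E(e^Y)    = exp(u + sigma^2/2)                      (take t = 1)
E(X^2) = E(e^(2Y)) = exp(2u + 2·sigma^2)                     (take t = 2)
Var(X) = E(X^2) − E(X)^2 = exp(2u + 2·sigma^2) − exp(2u + sigma^2) = (e^(sigma^2) − 1) · e^(2u + sigma^2)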
So basically I have to create 1000 random values with log-normal distribution. I know how to create this by using LOG10(LOGINV(RAND(),mean,sd)). This is all fine, but I want to know how to skew the distribution to the left/right. I've been scouring the interwebs for answers, and I've found a couple of solutions but they are inelegant and too convoluted. Can anyone help? I'm using Excel 2007.
When I try to do it myself with the formulas I have I always end up with ln (E+R), where E is the expectation. I have no idea what I'm doing wrong. Any help is appreciated!
https://preview.redd.it/cfsxemgbu2u61.png?width=677&format=png&auto=webp&s=63147e43cb99658a7aa926d5238d400a0f5ba46a
Okay, so I understand the sampling distribution of the mean approaches a normal distribution with a large enough sample, even if the population itself is not normal. Okay.
Since I don't understand the logic of why this is true, only that it is true if you actually try it with a computer program, I am left with a lot of questions. Like does this principle also apply to a sampling distribution of trimeans, medians, ranges, modes, correlations, etc.? Also, is there a quantifiable difference between how many samples you need to get a near normal distribution depending on the original population (e.g., depending on skew, size)?
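A quick simulation is an easy way to poke at this; here is a rough R sketch (population, statistic and sample size all made up) that draws repeated samples from a strongly skewed population and looks at the sampling distribution of the median instead of the mean:
set.seed(1)
population <- rlnorm(1e6, meanlog = 0, sdlog = 1)          # a strongly right-skewed population
sample_medians <- replicate(5000, median(sample(population, 50)))
hist(sample_medians, breaks = 50)                          # already fairly bell-shaped at n = 50
qqnorm(sample_medians); qqline(sample_medians)             # a more formal look at normality
Swapping the median for the range, mode, a trimmed mean, etc., and varying the sample size and the skewness of the population, gives a feel for which statistics normalise quickly and which do not.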
Hi, would it make sense to run a log-transformed gamma GLM?
My model is:
log(y) ~ log(x) + z, family = Gamma(link = "identity")
I chose the gamma distribution because my data is always positive and skewed, and my model selection showed it to be much better than the linear model. I haven't seen much about log-transformed gamma models, so I am not sure if the transformation is doing something that invalidates it.
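Not the model from the post, but a small sketch (with simulated data, all coefficients made up) of the two specifications that usually get compared in this situation: a Gamma GLM with a log link, which keeps y on its original scale, versus an ordinary regression on log(y) (the log-normal route):
# simulate positive, right-skewed data
set.seed(1)
n <- 200
x <- runif(n, 1, 10)
z <- rnorm(n)
mu <- exp(0.2 + 0.8 * log(x) + 0.3 * z)     # mean on the original scale
y  <- rgamma(n, shape = 5, rate = 5 / mu)   # Gamma-distributed response with mean mu
# Gamma GLM with a log link: models log(E[y]) without transforming y
fit_gamma_log <- glm(y ~ log(x) + z, family = Gamma(link = "log"))
# log-normal alternative: ordinary least squares on log(y)
fit_lognormal <- lm(log(y) ~ log(x) + z)
summary(fit_gamma_log)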
Hey guys, sorry if this might be a stupid question for you all. I work in finance and for a model I'm working on, we are working with a bivariate normal distribution, so I need to understand it.
I tried reading wikipedia and surfing the web, but all the mathematical explanations kinda go over my head. I understood the gist of it, but I'm looking for an explanation for dummies, an ELI5 if you will.
Willing to give out gold if someone makes me understand it.
Edit: Thanks so much guys, all your explanations have been super helpful!
I think what did the trick for me was making up an example in my head with two variables (e.g. IQ and height of a population) and wrapping my head around how for example the height of every person with 110 IQ is normally distributed and vice versa.
But every comment helped putting the pieces of the puzzle together, so thanks again!
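To put a sketch under that picture (all numbers invented): simulate a correlated bivariate normal for IQ and height, then look only at the heights of people whose IQ is close to 110; the slice is again approximately normal, with its mean shifted by the correlation:
library(MASS)
set.seed(1)
Sigma <- matrix(c(15^2, 0.4 * 15 * 10,
                  0.4 * 15 * 10, 10^2), nrow = 2)   # sds 15 and 10, correlation 0.4
xy <- mvrnorm(100000, mu = c(100, 175), Sigma = Sigma)
iq <- xy[, 1]; height <- xy[, 2]
hist(height[abs(iq - 110) < 1], breaks = 40)        # conditional slice: roughly normal again
mean(height[abs(iq - 110) < 1])                     # pulled above 175 by the positive correlation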
Hi,
I'm currently doing some portfolio evaluation, with measures such as Expected Shortfall, Value at Risk and skewness/kurtosis. I have read that normality is important for these risk measures, but have not been able to find any specific cases. Will ES and VaR understate the real risk if the returns are negatively skewed?
I would be glad if you had some graphs/articles that could show what happens in situations with non-normally distributed returns.
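A small simulation makes the point visible; here is a rough R sketch (the return process is a made-up mixture with occasional large losses, i.e. negative skew) comparing normal-based VaR/ES at the 99% level with their empirical counterparts:
set.seed(1)
returns <- c(rnorm(9500, mean = 0.001, sd = 0.01),   # calm days
             rnorm(500,  mean = -0.05, sd = 0.02))   # occasional large losses -> negative skew
alpha <- 0.01
var_normal    <- -(mean(returns) + sd(returns) * qnorm(alpha))
var_empirical <- -quantile(returns, alpha)
es_normal     <- -(mean(returns) - sd(returns) * dnorm(qnorm(alpha)) / alpha)
es_empirical  <- -mean(returns[returns <= quantile(returns, alpha)])
c(VaR_normal = var_normal, VaR_empirical = unname(var_empirical),
  ES_normal = es_normal, ES_empirical = es_empirical)
With this kind of left tail the normal-based figures come out noticeably smaller than the empirical ones, i.e. they understate the risk.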
Looking for a resource on how to transform data in R so that it better fits a normal distribution.
Any help?
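Not a resource list, but a minimal sketch of the two transformations people usually try first in R (made-up skewed data): a straight log transform, and Box-Cox via MASS to pick a power more systematically:
set.seed(1)
y <- rlnorm(500, meanlog = 0, sdlog = 0.8)   # hypothetical right-skewed data
hist(log(y))                                 # the log often does the job for right-skewed data
library(MASS)
bc <- boxcox(lm(y ~ 1), plotit = FALSE)      # profile likelihood over candidate power transforms
bc$x[which.max(bc$y)]                        # best lambda; a value near 0 points to the log transform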
I am working with some election data using pandas. I would like to know how votes from party A would transfer to parties B and C in each of the 650 seats if party A did not exist.
We assume that we know that nationally:
I am looking to generate a normal distribution of numbers between 0 and 1 for each seat, where:
As an example with completely separate numbers:
seat | to_B | to_C | to_dnv |
---|---|---|---|
1 | 0.5 | 0.3 | 0.2 |
2 | 0.1 | 0.6 | 0.3 |
3 | 0.3 | 0.3 | 0.4 |
... | ... | ... | ... |
650 | etc | etc | etc |
Here in this manual example:
The motivation is that later I will zip this table together with another table I have already built, which contains the election results for each seat. Then I will use these normally distributed numbers to redistribute party A's votes into party B, C and DNV.
What is the best way to go about generating such a matrix? Preferably in Pandas.
Currently I am using a Dirichlet distribution, but this has its problems. Note that below, I am using more precise numbers for the actual averages expected.
Code so far:
# Create Dirichlet-distributed transfer shares, one row per seat
import numpy
import pandas
average_to_labour = 0.514921161
average_to_conservative = 0.29652857
average_to_dnv = 0.188550269
seats = 650
# each row is a (to_lab, to_con, to_dnv) triple that sums to 1
redistribution_matrix = numpy.random.dirichlet((average_to_labour, average_to_conservative, average_to_dnv), size=seats)
redistribution_matrix_df = pandas.DataFrame(redistribution_matrix, columns=['to_lab_share', 'to_con_share', 'to_dnv_share'])
redistribution_matrix_df = redistribution_matrix_df.rename_axis('seat')
redistribution_matrix_df[['to_con_share']].plot(kind='hist', bins=[0, 0.2, 0.4, 0.6, 0.8, 1], rwidth=0.8)
These fulfil the expectations. Each row sums to 1, and the average of each column roughly matches the expected averages.
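One common fix for the spread (not from the post): keep the Dirichlet, but multiply all three alphas by a large concentration constant. The column means stay the same while each marginal tightens around its target and starts to look roughly normal; in numpy that is just numpy.random.dirichlet(conc * numpy.array([...]), size=650). A base-R sketch of the same idea via normalised gamma draws (conc is a made-up tuning value):
conc   <- 500                              # larger -> tighter shares around the targets
target <- c(0.515, 0.297, 0.188)           # approximate target shares
seats  <- 650
g <- matrix(rgamma(seats * length(target), shape = rep(conc * target, each = seats)), nrow = seats)
shares <- g / rowSums(g)                   # each row is a Dirichlet(conc * target) draw, summing to 1
colMeans(shares)                           # close to the targets
apply(shares, 2, sd)                       # much smaller spread than with alphas below 1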
Let's say I am given a distribution of vectors x in R^d with ||x|| < s and pdf
p(x) = (1/Z) exp(-x^T H x)
If H is positive definite, this is a variant of the multivariate truncated normal distribution. However, unlike in the unconstrained case, the distribution is still normalizable if H has negative eigenvalues.
Is anything known about distributions of this type? I was not able to find anything, but maybe I just don't know the right term. I am especially interested in whether it is possible to sample from this distribution or whether the normalization constant Z can still be computed somewhat analytically.
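Not from the post, but since the support is a bounded ball the density is bounded whatever the eigenvalues of H, so plain rejection sampling against the uniform distribution on the ball works; a rough R sketch with made-up d, s and H:
set.seed(1)
d <- 3; s <- 1
H <- diag(c(2, 1, -0.5))                       # indefinite: one negative eigenvalue
lam_min <- min(eigen(H, symmetric = TRUE)$values)
q_min   <- min(0, lam_min) * s^2               # minimum of x' H x over the ball ||x|| <= s
draws   <- matrix(NA_real_, nrow = 1000, ncol = d)
n_keep  <- 0
while (n_keep < 1000) {
  u <- rnorm(d)
  x <- s * runif(1)^(1 / d) * u / sqrt(sum(u^2))           # a uniform draw from the ball
  if (runif(1) < exp(-(drop(t(x) %*% H %*% x) - q_min))) { # accept with prob density / envelope
    n_keep <- n_keep + 1
    draws[n_keep, ] <- x
  }
}
The acceptance rate degrades as the spread of x' H x over the ball grows (large s or large eigenvalues), and Z can always be estimated by Monte Carlo as the volume of the ball times the mean of exp(-x' H x) over uniform draws, though I am not aware of a closed form in the indefinite case.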