A list of puns related to "Gaussian"
I came across this quote from Karl Pearson: "Many years ago [in 1893] I called the Laplace-Gaussian curve the normal curve, which name, while it avoids the international question of priority, has the disadvantage of leading people to believe that all other distributions of frequency are in one sense or another abnormal."
And recently I also came across this tweet: "Originally, Gauss (in 1823) used the term 'normal' (in the sense of 'orthogonal') referring to the geometric interpretation of a system of linear equations from which the distribution bearing his name is derived."
So my question is twofold:
Did Gauss originally use the term 'normal' in reference to the geometric interpretation of a system of linear equations?
Was Pearson influenced by Gauss's original usage of the word 'normal', and is that why he used the terms 'normal curve' and 'normal distribution'?
I have a set of 90 highly autocorrelated time series and I am interested in whitening them in such a way as to remove as much of the autocorrelation as possible.
My initial thought was to replace each series with a transformed series in which every value x_t is replaced by h(x_t | x_{t-1}, x_{t-2}, ..., x_{t-τ}), where h(·|·) is the local conditional entropy and τ is the embedding dimension.
It then occurred to me that, for Gaussian variables, this might be equivalent to serially regressing out increasing lags: first take the residuals from regressing x_t on x_{t-1} and re-z-score them, then take the residuals from regressing that residual series on x_{t-2}, and so on for τ steps.
Is there a reason this would work?
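For what it's worth, here is a minimal R sketch of that lag-by-lag residualization, assuming a single numeric series x and a lag depth tau; the function name and loop structure are just illustrative, not an established method:

whiten_by_lags <- function(x, tau) {
  n   <- length(x)
  res <- as.numeric(scale(x))                    # start from the z-scored series
  for (k in seq_len(tau)) {
    lag_k <- c(rep(NA, k), x[seq_len(n - k)])    # the original series shifted by k steps
    ok    <- !is.na(lag_k)
    fit   <- lm(res[ok] ~ lag_k[ok])             # regress the current residual series on lag k
    res[ok]  <- as.numeric(scale(resid(fit)))    # keep the re-z-scored residuals
    res[!ok] <- NA                               # the first k values have no lag-k partner
  }
  res
}

For Gaussian series this is close in spirit to fitting an AR(tau) model and keeping its residuals, which could also be done with ar() or arima() in base R.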
Can I say that if I have a variable which is distributed like a Gaussian, then it must be the result of the sum of a lot of independent random variables? The question could be rephrased as: "The CLT implies a Gaussian, but does a Gaussian imply the CLT?"
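For concreteness, one standard fact bears on the "can it be written as a sum" half of this: a Gaussian is infinitely divisible, so for any n,

$$X \sim \mathcal{N}(\mu, \sigma^2) \;\stackrel{d}{=}\; \sum_{i=1}^{n} X_i, \qquad X_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}\!\left(\frac{\mu}{n}, \frac{\sigma^2}{n}\right),$$

which shows a Gaussian can always be represented as such a sum; whether it must have arisen that way is the part the question is really asking.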
Say I'm doing a simple multiple regression on the following data (R):
n <- 40
x1 <- rnorm(n, mean=3, sd=1)
x2 <- rnorm(n, mean=4, sd=1.25)
y <- 2*x1 + 3*x2 + rnorm(n, mean=2, sd=1)
mydata <- data.frame(x1, x2, y)
mod <- lm(y ~ x1 + x2, data=mydata)
I don't get the statistical difference between:
tmp <- predict(mod) + rnorm(length(predict(mod)), 0, summary(mod)$sigma)

(the R function simulate) and:

tmp <- rnorm(length(predict(mod)), mean(y), sd(y))

What is the proper way to resample the dependent variable, assuming a Gaussian GLM?
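To make the comparison concrete, here is a small sketch reusing mod and y from the snippet above, with what simulate() returns for an lm fit placed next to the two schemes; the object names are mine, not from the post:

set.seed(1)
sim_conditional <- predict(mod) +
  rnorm(length(predict(mod)), 0, summary(mod)$sigma)  # noise around each fitted mean
sim_marginal <- rnorm(length(y), mean(y), sd(y))      # noise around the overall mean of y
sim_builtin  <- simulate(mod, nsim = 1)[[1]]          # one simulated response vector from the lm fit

The first expression keeps each simulated value tied to its own fitted mean, while the second ignores the covariates entirely and only uses the marginal mean and spread of y.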
I ran an HF calculation in Gaussian with the option formchk, and I'd like to take a look at the output orbitals. Usually I would use Gaussview, but our HPC system's interactive desktop is down right now.
The Gaussian documentation states:
" formchk converts the data in a Gaussian checkpoint file into formatted forms that are suitable for input into a variety of visualization software."
So... what visualization software is it talking about? Any suggestions of useful ones for big systems?
Rstudio stats ape here. I've been seeing some toilet paper stats surrounding the DRS'd share count. If you really want to figure out the distribution of shares owned, you need an Inverse Gaussian Distribution. This type of distribution is heavily weighted towards low x values, in this case the number of shares owned. We would expect there to be many thousands of Computershare accounts with only a few shares, and only one or two outliers far out on the x axis in the millions of shares, creating a distribution with a large head and a long tail:
https://en.m.wikipedia.org/wiki/Inverse_Gaussian_distribution
https://aosmith.rbind.io/2018/11/16/plot-fitted-lines/
https://www.statmethods.net/advstats/glm.html
https://bookdown.org/ndphillips/YaRrr/linear-regression-with-lm.html
This is how you might analyze Computershare account data in R with this distribution, if it actually mattered what the average shares per account is, which it doesn't, because we don't have enough data and the data we have is biased towards large values.
# This code is untested
library(ggplot2)
library(readr)
library(stats)
# Many accounts have only 1 share, more have two, some have three,........,DFV, Ryan Cohen are last with the most shares
RC_shares <- 1000  # hypothetical placeholder for the largest single Computershare account (Ryan Cohen's); replace with the real value
# Make a numerical vector as the x variable
number_of_shares <- 1:RC_shares
# Read in the data you collected on number of shares per account, binned and ordered.
# Assumed columns: shares_owned (shares in the bin) and num_accounts (accounts in that bin).
account_data <- read.csv("path_to_data.csv")
# Note: gaussian(link = "inverse") is a normal model with an inverse link; for a true
# inverse-Gaussian GLM, R's stats package also provides family = inverse.gaussian().
fit_model <- glm(num_accounts ~ shares_owned, data = account_data, family = gaussian(link = "inverse"))
summary(fit_model)
# Make a column of predicted values based on the fitted model
account_data$predlm <- predict(fit_model, type = "response")
# Plot the histogram with the fitted line
ggplot(account_data, aes(x = shares_owned)) + geom_histogram(bins = RC_shares - 1) + geom_line(aes(y = predlm), size = 1)
Question: Shouldn't this be a Poisson distribution, as a Poisson distribution measures discrete values?
Response: The Poisson distribution gives:
> "...the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event."
Okay, so English isn't my first language, so I don't know how coherently I can express this question. There's Gaussian elimination, which says you can invert a 3x3 matrix by turning the entries off the main diagonal into 0s and the main diagonal into 1s, and the matrix you carry alongside (the identity matrix, I think it's called in English) will end up as the inverse of the first. You can do this either by interchanging rows or by multiplying a row by any number that isn't zero. So how do I know which to use in which situation? I have looked all over the internet and there's no proper explanation. Let's say there's a matrix:
1  2  3
0 13  5
4  2  4
And I'm trying to turn the 13 into a 1. I can either multiply that row by a fraction, or multiply the first row by -6 and add it to the second. Are there actual rules inside the Gauss system that I should follow?
I'd like to clarify that this is a completely random matrix and this isn't a homework question. I'm going to take some exams and have to be extremely specific when answering, and the teacher also scores your method of solving.
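For concreteness, here is what those two operations look like on that matrix, written out in LaTeX; this only illustrates the two routes mentioned above, not a rule for choosing between them:

$$R_2 \mapsto \tfrac{1}{13} R_2:\quad \begin{pmatrix} 1 & 2 & 3 \\ 0 & 13 & 5 \\ 4 & 2 & 4 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 5/13 \\ 4 & 2 & 4 \end{pmatrix}$$

$$R_2 \mapsto R_2 - 6R_1:\quad \begin{pmatrix} 1 & 2 & 3 \\ 0 & 13 & 5 \\ 4 & 2 & 4 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 2 & 3 \\ -6 & 1 & -13 \\ 4 & 2 & 4 \end{pmatrix}$$

Both produce a 1 in the (2,2) position; the first keeps the zero in the (2,1) position, while the second introduces a -6 there.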
Just met the normal distribution today and worked really hard trying to derive the normalization constant of its pdf. I was surprised to later read that no such closed-form solution (an elementary antiderivative of the density) exists.
I'd like to understand why this is the case at a deeper level.
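For what it's worth, the normalization constant itself can still be computed even though e^{-x^2/2} has no elementary antiderivative; the standard trick squares the integral and switches to polar coordinates:

$$\left(\int_{-\infty}^{\infty} e^{-x^2/2}\,dx\right)^{2} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-(x^2+y^2)/2}\,dx\,dy = \int_{0}^{2\pi}\!\int_{0}^{\infty} e^{-r^2/2}\, r\,dr\,d\theta = 2\pi,$$

so the constant for the standard normal pdf is 1/\sqrt{2\pi}.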
Hello,
Just a small question on principal curvatures and Gaussian curvature. I know that for regular surfaces we cannot have complex principal curvatures; this is a consequence of the shape operator of a regular surface being self-adjoint.
But what would it mean for a surface to not be regular/have complex principal curvatures?
I tried to Google it, but I couldn't even find a good example of a non-regular surface.
Thanks!
So I've been posting loads on here recently and it's been really helpful; apologies for the number of posts in just a few days. Anyway, I've been simulating the Bit Error Rate for different coding schemes, and I'm now simulating a convolutional code with a code rate of 1/3, decoded using a Viterbi decoder which I think I've finally got working.
I need to assume the encoded signal is modulated by BPSK and is subjected to AWGN, simulated by adding Gaussian noise with a variance of:
σ^2 = N0/(2*Eb)
where N0 is the noise power spectral density and Eb is the energy per bit.
Now, previously I've done this by doing:
SNRdb = 1:1:12;
SNR = 10.^(SNRdb/10);
corrupted_signal = (sqrt(SNR(j)) * codeword) + randn(1, length(codeword));
So I define the SNR values I want to use, in decibel form, and convert them to linear form. Then, for each SNR value, the noisy received signal is the square root of the current SNR value multiplied by the incoming signal, plus a randn vector of the same length as the incoming signal, with each value drawn from the standard normal distribution.
In this case, I assume that sqrt(SNR) is the standard deviation, so in the case of:
σ^2 = N0/(2*Eb)
We'd instead have:
corrupted_signal = (sqrt(0.5*(1/SNR(j))) * codeword) + randn(1, length(codeword));
But surely that can't be right, as it has the opposite effect to the one intended? The output of the whole sqrt term decreases as SNR increases, meaning there's a higher chance that bits get flipped when decoded. How do I go about incorporating this variance into my algorithm when adding noise?
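One equivalence that may be relevant here, assuming unit-amplitude BPSK symbols x ∈ {±1} and writing σ for the noise standard deviation: scaling the noise and scaling the signal are two views of the same channel, since the bit decision only depends on the ratio of signal amplitude to noise standard deviation:

$$y = x + \sigma z,\; z \sim \mathcal{N}(0,1) \quad\Longleftrightarrow\quad \frac{y}{\sigma} = \frac{x}{\sigma} + z, \qquad \frac{1}{\sigma} = \sqrt{\frac{2 E_b}{N_0}}.$$

So adding noise of standard deviation σ to a ±1 codeword gives the same error behaviour as keeping unit-variance noise and scaling the codeword up by 1/σ.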
Context: I am trying to get the coefficients for a natural interpolation spline function. The matrix for this function is rather large (112 by 113) since there are 28 data points. I'm using an emulator to do my calculations since my TI-84 Plus doesn't have enough memory to hold the matrix.
Using the rref matrix function, I have been trying to get the coefficients. The ones I have been getting are not giving me the proper function and are larger than the coefficients that a different online calculator gave me. The reason I can't just use the values that the online solver gave me is that I need to show my work and understand the steps.
I've checked over the matrix multiple times and there's nothing wrong with it. What could possibly be going wrong?
I'm using the method from the website below: https://timodenk.com/blog/cubic-spline-interpolation/
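A small cross-check one could do in R (outside the calculator workflow), assuming the data points are in vectors xs and ys; the knot values below and the my_spline_eval name are purely hypothetical placeholders:

xs <- c(0, 1, 2, 4, 7)                       # hypothetical knots; use your 28 x-values
ys <- c(1, 3, 2, 5, 4)                       # hypothetical values; use your 28 y-values
f  <- splinefun(xs, ys, method = "natural")  # base R natural cubic spline
grid <- seq(min(xs), max(xs), length.out = 200)
reference <- f(grid)                         # values of R's natural spline on a fine grid
# Compare `reference` against your own piecewise polynomial built from the rref
# coefficients, e.g. max(abs(reference - my_spline_eval(grid))) should be near 0.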
I was reading about the maximum likelihood principle. A Gaussian distribution was used in the calculation to show that the negative log likelihood is the same as the squared loss.
Why is only the Gaussian used for this calculation, and if it's just for the sake of explanation, why do we use squared loss as a metric for so many datasets?
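For reference, here is the short derivation that statement usually refers to, assuming i.i.d. observations y_i with Gaussian noise of fixed variance σ^2 around a prediction f(x_i):

$$-\log \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(y_i - f(x_i))^2}{2\sigma^2}\right) = \frac{n}{2}\log\!\left(2\pi\sigma^2\right) + \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - f(x_i)\bigr)^2,$$

so, with σ fixed, minimizing the negative log likelihood is the same as minimizing the sum of squared errors.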
Hello, I am using Gaussian 16 to calculate the energies, frequencies, orbitals, etc., of several molecules I work with. I have a Ryzen 9 3950X and 16 GB of 3600 MHz 16-16-16-36 (DOCP profile) RAM. I have read in multiple places that Ryzen CPUs work best with memory with tighter timings. I have rigorously tested another profile for these sticks that is stable over several days of Memtest86, Prime95 and TestMem5; however, this profile has a lower clock speed: 3200 MHz 14-14-14-28. I have actually tested the normal DOCP profile against my second profile while mining RandomX, just to see if it would make any difference, and the second profile (3200 MHz 14-14-14-28) actually had a 22% increase in hashrate, which translates to a 22% increase in performance for that task. When I tested it in Cinebench R20, there was actually a performance decrease. Since mining is pretty calculation heavy, I was wondering if that might translate into better performance when using Gaussian. Sorry if this is a stupid question; I am still relatively new to Gaussian, which is why I came here to get some more insight into the matter.