A list of posts related to "Statistical Inference"
I posted this on an econometrics subreddit, but I want to also ask the more general statistics community. I hope that's okay! Note, I'm not an instructor, but I'm starting my Ph.D. soon in hopes of teaching econometrics!
For Students: When you took your first statistical inference course, did you REALLY understand statistical inference by the end of the course?
And for professors: Did you think the average student truly understands statistical inference?
By "really" and "truly", I'm not talking about being able to robotically calculate statistics: t-stat, p-value, CI and knowing when to reject and fail to reject the null... the stuff that gets you an A+ in Stats 101 if you can repeat them on the test. I got an A+ in intro statistics because we had past finals/midterms and I just "memorized" how to do it on the exam. The exam was literally the same as a practice midterm just different numbers/scenarios.
I'm talking about the bigger picture of statistical inference. If you asked your average introductory statistics student (or even just the A+ students) the more philosophical and epistemological questions, would they be able to give a confident answer?
In my experience, I feel like most students get lost in learning the methods of calculation and neglect to appreciate the bigger picture. It's like a student getting an A+ in Calculus because they can apply all the derivative rules perfectly but somehow never really understanding that a derivative is a rate of change.
Most of the stuff intro stats students are evaluated on is stuff a computer can do for them. They are never really tested on whether they understand the epistemological value of knowing how statistical inference works.
Hi everyone,
I have recently been learning a lot about time series regression, primarily through this book by RJ Hyndman and co. (https://otexts.com/fpp3/). In the time series regression section, the authors primarily give examples where the target and predictors are at the same interval (e.g., monthly, quarterly, or yearly)...
However, I was curious whether you can fit a model where the target variable is a monthly series and the predictors are a combination of both monthly and quarterly data. The data could look something like this:
| Period | Target | x1 (monthly) | x2 (monthly) | x3 (quarterly) |
|---|---|---|---|---|
| Jan 2018 | 10.25 | 50 | 5 | 100 |
| Feb 2018 | 9.5 | 42 | 6 | 100 |
| Mar 2018 | 10.8 | 47 | 8 | 100 |
| Apr 2018 | 12.75 | 50 | 12 | 250 |
| May 2018 | 13 | 49 | 13.5 | 250 |
| ... | ... | ... | ... | ... |
In this case, the time series linear model (TSLM) would be: target_t = b0 + b1*x1_t + b2*x2_t + b3*x3_t + error_t
Notice that the quarterly predictor will have repeated values for months that belong to the same quarter. Is this a cause for concern from an inference perspective (beta estimation, confidence intervals, and p-values) and/or a forecasting perspective?
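For concreteness, here is a minimal sketch of the setup I mean, in Python with statsmodels rather than the fable/TSLM toolchain from fpp3, and with made-up numbers: the quarterly predictor is simply repeated across the three months of its quarter, and HAC (Newey-West) standard errors are one common guard against the serial correlation that this repetition (and time-series errors in general) can induce.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical monthly data; x3 is observed quarterly and repeated within each quarter.
rng = np.random.default_rng(1)
idx = pd.period_range("2018-01", periods=36, freq="M")
x1 = rng.normal(48, 3, size=36)
x2 = rng.normal(9, 3, size=36)
x3_quarterly = rng.normal(180, 60, size=12)   # one value per quarter
x3 = np.repeat(x3_quarterly, 3)               # spread each quarterly value over its 3 months
target = 2 + 0.1 * x1 + 0.3 * x2 + 0.01 * x3 + rng.normal(0, 1, size=36)

df = pd.DataFrame({"target": target, "x1": x1, "x2": x2, "x3": x3}, index=idx)

X = sm.add_constant(df[["x1", "x2", "x3"]])
# HAC standard errors are robust to the serial correlation that repeating the
# quarterly values (and autocorrelated errors generally) can introduce.
fit = sm.OLS(df["target"], X).fit(cov_type="HAC", cov_kwds={"maxlags": 3})
print(fit.summary())
```

I'm treating the simple repeat-within-quarter approach as the baseline here; whether something smarter (e.g., disaggregating the quarterly series to monthly) is needed is part of my question.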
I have yet to learn about dynamic regression and ARIMA models, so maybe the answers are there somewhere, but so far I couldn't really find satisfactory resources.
Would love to hear your opinions!
Cheers
I plan on majoring in either chemical engineering or environmental engineering. Which would be more appropriate to meet my math requirement for these majors?
I have the option between Statistical Inference I (covers probability and calculus-based statistics) and Applied Partial Differential Equations I. Both are unfortunately proof-based, so I don't plan on taking both. Which is more important in the field?
I know PDEs are used more in future engineering courses (fluid mechanics), but I hear statistics is incredibly useful in industry. Thoughts on the matter?
Anyone take this class? 01:960:291
How was it?
I'm a statistics major at my university, and I really enjoyed the statistical inference class I took last semester. We had a big emphasis on estimators, hypothesis testing, and some computation. We learned how to derive estimators, maximum likelihood estimation, Rao-Blackwell, sufficiency, bias, variance, computing MSE, consistency, computing confidence intervals, bootstrapping, different parametric tests, and Monte Carlo, and we did all of this in R. Upcoming and later statistical inference classes in my major include more of a computational side: coding up simulations, resampling methods, the jackknife, permutation tests, nonparametric hypothesis testing, etc.
I have one statistical learning class, which is an elective. My question is, with such a rigorous emphasis on statistical inference and hypothesis testing, how much can I apply the material I've learned in industry and in data science? Can the concepts I listed above be applied anywhere in machine learning? The linear models and GLMs class I take will cover most of the basics of statistical learning. But where are classical hypothesis testing and statistical inference applied in today's world of data science and machine learning?
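To make the question concrete, here is one small made-up example of where I imagine the resampling ideas showing up in day-to-day ML work: a nonparametric bootstrap confidence interval for the test-set accuracy gap between two classifiers (the labels and predictions below are simulated, not from a real model).

```python
import numpy as np

# Sketch: bootstrap CI for the accuracy difference between two classifiers.
rng = np.random.default_rng(0)
n = 500
y_true = rng.integers(0, 2, size=n)                           # made-up test labels
pred_a = np.where(rng.random(n) < 0.85, y_true, 1 - y_true)   # ~85% accurate "model A"
pred_b = np.where(rng.random(n) < 0.82, y_true, 1 - y_true)   # ~82% accurate "model B"

observed_gap = (pred_a == y_true).mean() - (pred_b == y_true).mean()

gaps = []
for _ in range(5000):
    idx = rng.integers(0, n, size=n)                          # resample test cases with replacement
    gaps.append((pred_a[idx] == y_true[idx]).mean()
                - (pred_b[idx] == y_true[idx]).mean())
lo, hi = np.percentile(gaps, [2.5, 97.5])
print(f"accuracy gap {observed_gap:.3f}, 95% bootstrap CI [{lo:.3f}, {hi:.3f}]")
```

A/B tests, permutation tests for model comparison, and uncertainty around offline evaluation metrics are the kinds of places I'd expect this material to apply, but I'd love to hear from people in industry.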
I'm considering pursuing a major in statistics and so I need to complete either a course on stochastic processes or statistical inference (using Casella and Berger). Is either one of these courses obviously more useful than the other if my goal is to become a machine learning engineer or data scientist?
The stochastic processes course uses the book 'Introduction to Probability Models' by Sheldon Ross.
Would one of these books be much more difficult/time consuming than the other?
Hi folks,
I am having trouble wrapping my head around the taxonomy of missing data methods. I know missing data methods range from complete-case analyses, to weighting methods (probably suitable for survey data), to imputation-based methods.
I understand that single imputation means replacing a missing value with a single plausible value, whereas multiple imputation means replacing it with multiple plausible values. However, not a lot of articles seem to talk about inference with single imputation methods.
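As far as I can tell, the reason is that valid inference is exactly what single imputation struggles with: once a missing value is filled in, standard software treats it as if it had been observed, so standard errors come out too small. Multiple imputation produces m completed datasets, analyses each one, and then pools the results with Rubin's rules so that the between-imputation variance is carried into the final standard error. A minimal sketch of the pooling step, with made-up numbers:

```python
import numpy as np

# Rubin's rules: pool one parameter estimate across m imputed datasets.
# estimates[i] and variances[i] would come from analysing the i-th completed dataset.
estimates = np.array([2.31, 2.45, 2.28, 2.52, 2.40])        # made-up point estimates
variances = np.array([0.040, 0.038, 0.043, 0.041, 0.039])   # their squared standard errors

m = len(estimates)
q_bar = estimates.mean()                 # pooled point estimate
u_bar = variances.mean()                 # within-imputation variance
b = estimates.var(ddof=1)                # between-imputation variance
total_var = u_bar + (1 + 1 / m) * b      # total variance (Rubin, 1987)

print(f"pooled estimate {q_bar:.3f}, pooled SE {np.sqrt(total_var):.3f}")
# Single imputation would effectively report only sqrt(u_bar), dropping the
# between-imputation term that reflects the uncertainty due to missingness.
```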
I have been reading articles by Donald Rubin, Roderick Little, Stef van Buuren, etc. If anyone has a list of other authors I should be looking into, please let me know!
Thank you, and I wish you all a nice Friday!
Last year I took my Statistics 2 (inference) course in my BS in Economics. During the semester, I studied from the teacher's notes and from the Berenson, Levine & Krehbiel book. For me, this last one was quite concise and easy to understand, but even so, in my opinion it is lacking in mathematical proofs and explanations. So, I would like you to recommend statistical inference books that you consider better for understanding not only the inference concepts as a whole, but also the mathematical proofs and derivations.
I'm trying to perform a simulation to detect the effects of a non-normal error distribution on things like the percentage of times a test of H0: B1 = 2 (when B1 is actually 2) is rejected. That rejection rate should tend to alpha, but I expected it to be huge in small samples with skewed error distributions; however, the results have been disappointing.
Can someone explain to me why statistical inference is supposed to fail if the errors aren't normal? Maybe if I understand this, I can trace the problem in my simulation.
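In case it helps to compare notes, here is a minimal sketch of the kind of simulation described above (my own assumptions: a single-regressor OLS model, mean-zero exponential errors as the "skewed" case, and a two-sided t-test of H0: b1 = 2 at the true value b1 = 2):

```python
import numpy as np
from scipy import stats

# Monte Carlo sketch: rejection rate of H0: b1 = 2 when b1 is truly 2,
# comparing normal errors with skewed (mean-zero exponential) errors.
rng = np.random.default_rng(0)

def rejection_rate(n, n_sims=5000, skewed=False, alpha=0.05, b0=1.0, b1=2.0):
    rejections = 0
    for _ in range(n_sims):
        x = rng.uniform(0, 10, size=n)
        if skewed:
            e = rng.exponential(scale=1.0, size=n) - 1.0   # mean-zero, right-skewed
        else:
            e = rng.normal(0.0, 1.0, size=n)
        y = b0 + b1 * x + e
        # OLS by hand: slope estimate and its standard error
        x_c = x - x.mean()
        b1_hat = (x_c * (y - y.mean())).sum() / (x_c ** 2).sum()
        b0_hat = y.mean() - b1_hat * x.mean()
        resid = y - b0_hat - b1_hat * x
        s2 = (resid ** 2).sum() / (n - 2)
        se_b1 = np.sqrt(s2 / (x_c ** 2).sum())
        t = (b1_hat - b1) / se_b1                          # test H0: b1 = 2 (the true value)
        if abs(t) > stats.t.ppf(1 - alpha / 2, df=n - 2):
            rejections += 1
    return rejections / n_sims

for n in (10, 30, 200):
    print(n, rejection_rate(n, skewed=False), rejection_rate(n, skewed=True))
```

For what it's worth, the slope estimator is approximately normal even in fairly small samples thanks to the central limit theorem, so rejection rates typically stay close to alpha unless n is very small or the errors are extremely heavy-tailed; "disappointing" (i.e., unremarkable) results may be exactly what the theory predicts. Normality matters more for the exact finite-sample validity of t and F tests and for prediction intervals than for the consistency of the estimates themselves.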
In undergrad/grad school, stats was taught with the assumption that getting data for the population of interest is usually not possible, so you must use inference on a sample to draw conclusions about the population. But in a business context, you often have access to all the data you need, because the population is usually just everything within the scope of the business (sales, manufacturing, supply chain, etc.).
As a hypothetical example, let's say a given product is sold in a number of regions. The business has all data available regarding the quantity of this product sold in these regions for a given time period.
When comparing the quantities sold between regions, does statistical significance and inference matter? Does the question being asked affect whether inference is important? (e.g., ad-hoc analysis of past sales to determine which products should be discontinued, forecasting sales volume, etc.)
Hey,
Looking to find someone who is good at this subject to help me out. DM me so we can talk more.
I'm a PhD student in Aerospace, so I don't know many people in the math department. I got a B.A. in math, so I've got the background and have taken all the necessary pre-reqs. But I wanted to talk to someone who's taken the class before to get a good sense of whether it's a good fit for me.
My research is in optimal estimation of nonlinear systems. I work a lot with Kalman/unscented/sigma-point filters, nonlinear least squares, and modeling of dynamic systems, but I'm looking to build a better foundation in the fundamentals of Bayesian estimation, hypothesis testing, post-fit residual testing, etc.
It LOOKS like this course should be a good fit, but I wanted to reach out to people who've taken it before to see if it's worth it. I've only got 2 more courses I need to take, and I'd like to make the most of them. Ideally I'd take this course as my only course next spring (2022).
So has anyone taken it before? Any advice or insight into how it was?
Hello everyone! I'm having a bit of trouble with different paradigms regarding statistical inference.
When I'm talking about paradigms, I'm referring to frequentist inference, Bayesian inference, likelihood-based inference, and AIC-based inference. I believe there are more, but let us limit ourselves to these four. Let me sum up my concerns with the following three questions:
1. A basic question, but I'm not sure if I've got it right: how am I supposed to look at these paradigms? I thought they were different ways of looking at inference, so they look at the same thing but see it differently? They are not looking at different parts of inference, correct?
2. If the paradigm is not specified, what am I supposed to do, and could not knowing the paradigm affect me negatively?
In a recent course, we covered maximum likelihood estimation. If I'm not mistaken, this would by default be likelihood-based inference. But other paradigms have their own way of looking at it, and judging by the examples given in the course, it seems to have been presented from a frequentist point of view.
3. When covering topics like MLE, least squares estimation, or other theory, are they being presented from a "general statistical" point of view, or do they always have a specific paradigm in mind?
I recently got a book about MLE, where they specify at the beginning:
"The properties of maximum likelihood inference that are presented herein are from the point of view of the classical frequentist approach to statistical inference ."
Does this mean that the theory should be learned from every vantage point that exists for it?
If anyone has a good source (book, articles..) for learning more about the different paradigms I would love to hear it!
Thank you all for the help!
I need help with an exam. It covers mostly MLE, method of moments, confidence intervals, and the parametric & nonparametric bootstrap. The test is tomorrow and can be taken at any time.
Hi everyone, I hope all is well! I have a new open-access ecology article, and this is another one from Ecology Letters.
You can find the open access link here: https://onlinelibrary.wiley.com/doi/full/10.1111/ele.13728
As always, please feel free to discuss this article in the comments below if you like. Questions, comments, or anything remotely relevant is fair game!
Abstract: Ecologists increasingly rely on complex computer simulations to forecast ecological systems. To make such forecasts precise, uncertainties in model parameters and structure must be reduced and correctly propagated to model outputs. Naively using standard statistical techniques for this task, however, can lead to bias and underestimation of uncertainties in parameters and predictions. Here, we explain why these problems occur and propose a framework for robust inference with complex computer simulations. After having identified that model error is more consequential in complex computer simulations, due to their more pronounced nonlinearity and interconnectedness, we discuss as possible solutions data rebalancing and adding bias corrections on model outputs or processes during or after the calibration procedure. We illustrate the methods in a case study, using a dynamic vegetation model. We conclude that developing better methods for robust inference of complex computer simulations is vital for generating reliable predictions of ecosystem responses.
A team from University of Michigan, MIT-IBM Watson AI Lab and ShanghaiTech University publishes two papers on individual fairness for ML models, introducing a scale-free and interpretable statistically principled approach for assessing individual fairness and a method for enforcing individual fairness in gradient boosting suitable for non-smooth ML models.
Here is a quick read: Improving ML Fairness: IBM, UMich & ShanghaiTech Papers Focus on Statistical Inference and Gradient-Boosting
The papers Statistical Inference for Individual Fairness and Individually Fair Gradient Boosting are on arXiv.
For Students: When you took your first econometrics course, did you understand statistical inference?
And for professors: Did you think the average student understands statistical inference?
By "really," I'm not talking about being able to robotically calculate statistics: t-stat, p-value, CI and knowing when to reject and fail to reject the null... the stuff that gets you an A+ in Stats 101 if you can repeat them on the test. I'm talking about the big picture.
If you asked your average introductory econometrics student the more philosophical and epistemological questions, would they be able to give a confident answer?
Describe the big picture of statistical inference: what is it, and what is it trying to do?
What does it mean that the sample estimate has a distribution? What does it mean that the estimator is itself a random variable? Why is this a problem? (A small simulation of this idea is sketched below.)
What is a hypothesis test doing? Why are you undertaking it? Why do people do it? What information does it give you that you didn't know before?
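To make that second question concrete, here is a minimal sketch (with arbitrary population values of my own choosing): draw many samples of the same size from the same population and watch the sample mean itself vary from sample to sample.

```python
import numpy as np

# The estimator is a random variable: repeat the sampling and the estimate changes.
rng = np.random.default_rng(42)
population_mean, population_sd, n = 50.0, 10.0, 25

sample_means = [rng.normal(population_mean, population_sd, size=n).mean()
                for _ in range(10_000)]

print(np.mean(sample_means))   # close to 50: the sample mean is centred on the truth
print(np.std(sample_means))    # close to 10 / sqrt(25) = 2: its own standard error
```

The spread of those 10,000 means is exactly what a standard error is estimating from a single sample.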
In my experience, I feel like most students get lost in learning the methods of calculation and neglect to appreciate the bigger picture. Imagine a student getting an A+ in Calculus because they can apply all the derivative rules perfectly but somehow never really appreciating that a derivative is a rate of change.
Most of the stuff intro stats students are evaluated on is stuff a computer can do for them. They never come to understand the epistemological value of knowing how statistical inference works in the big picture, because that's not what they are tested on. And without a full appreciation of statistical inference, learning the intuition of applied econometrics becomes difficult.
Is it just me, or can people sort of relate?