[Discussion] Do undergrads really understand statistical inference?

I posted this on an econometrics subreddit, but I want to also ask the more general statistics community. I hope that's okay! Note: I'm not an instructor, but I'm starting my Ph.D. soon in hopes of teaching econometrics!

For Students: When you took your first statistical inference course, did you REALLY understand statistical inference by the end of the course?

And for professors: Do you think the average student truly understands statistical inference?

By "really" and "truly", I'm not talking about being able to robotically calculate statistics: t-stat, p-value, CI and knowing when to reject and fail to reject the null... the stuff that gets you an A+ in Stats 101 if you can repeat them on the test. I got an A+ in intro statistics because we had past finals/midterms and I just "memorized" how to do it on the exam. The exam was literally the same as a practice midterm just different numbers/scenarios.

I'm talking about the bigger picture of statistical inference. If you asked your average introductory statistics student (or even just the A+ students) the more philosophical and epistemological questions, would they be able to give a confident answer?

  1. Describe the big picture of statistical inference: what is it, and what is it trying to do?
  2. What does it mean that the sample estimate has a distribution? What does it mean that the estimator is itself a random variable? Why is this a problem? (See the small simulation sketch after this list.)
  3. What is a hypothesis test doing? Why are you undertaking it? Why do people do it? What information does it give you that you didn't know before? What does it mean that you're assuming something to be true? What does it mean about your assumption when you reject/fail-to-reject the null?
  4. What does it mean that you do not know the parameter and can never truly know what it is? How does this affect your ability to understand how some random process works?
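
To make question 2 concrete, here is a minimal simulation sketch in Python (purely hypothetical numbers, not tied to any particular course): each sample from the same population gives a different estimate, and the spread of those estimates is exactly the sampling distribution that standard errors try to describe.

```python
# Minimal sketch: the sample mean is itself a random variable.
# Draw many samples from the same population and watch the estimate vary.
import numpy as np

rng = np.random.default_rng(0)
true_mean, true_sd, n = 5.0, 2.0, 30          # made-up population and sample size

sample_means = np.array([
    rng.normal(true_mean, true_sd, n).mean()   # one estimate per simulated sample
    for _ in range(10_000)
])

print("average of the estimates:", sample_means.mean())       # close to 5
print("spread of the estimates :", sample_means.std(ddof=1))  # close to 2/sqrt(30)
```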

In my experience, most students get lost in learning the methods of calculation and neglect to appreciate the bigger picture. It's like a student getting an A+ in Calculus because they can apply all the derivative rules perfectly but somehow never really understood that a derivative is a rate of change.

Most of the things intro stats students are evaluated on are things a computer can do for them. They are never truly tested on whether they understand the epistemological value of knowing how statistical inference works.

πŸ‘︎ 139
πŸ’¬︎
πŸ‘€︎ u/OptimizedCatholic
πŸ“…︎ Dec 17 2021
🚨︎ report
[Q] Time Series Regression: Does modeling a monthly target variable using a couple of quarterly (Q1, Q2, Q3, Q4) time series predictors along with monthly predictors violate any statistical assumptions? Would the beta coefficients make sense, or are inferences void in this case?

Hi everyone,

I have recently been learning a lot about time series regressions primarily through this book by RJ Hyndman and co. (https://otexts.com/fpp3/). In the Time Series regression section, the author primarily gives examples where the target and predictors are both at the same intervals (i.e. monthly, quarterly, or yearly)...

However, I was curious whether you can fit a model where the target variable is a monthly series and the predictors are a combination of both monthly and quarterly data. The data could look something like this:

| Period | Target | x1 (monthly) | x2 (monthly) | x3 (quarterly) |
|---|---|---|---|---|
| Jan 2018 | 10.25 | 50 | 5 | 100 |
| Feb 2018 | 9.5 | 42 | 6 | 100 |
| March 2018 | 10.8 | 47 | 8 | 100 |
| April 2018 | 12.75 | 50 | 12 | 250 |
| May 2018 | 13 | 49 | 13.5 | 250 |
| ... | ... | ... | ... | ... |

In this case, the time series linear model (TSLM) would be: target = b0 + b1x1 + b2x2 + b3x3 + error_term

Notice that the quarterly predictor will have repeated values for months that belong to the same quarter. Is this a cause for concern from an inference perspective (beta estimation, confidence intervals, and p-values) and/or a forecasting perspective?
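
Not an answer, but if you want to experiment, here is a minimal sketch in Python (statsmodels rather than the fable/TSLM workflow from fpp3, with simulated data and made-up coefficients) of fitting exactly this layout: monthly predictors plus a quarterly predictor held constant within each quarter, with Newey-West (HAC) standard errors as a hedge in case the step-like quarterly series leaves serial correlation in the residuals.

```python
# Sketch only: monthly target, two monthly predictors, one quarterly predictor
# repeated within each quarter.  All data and coefficients are made up.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n_months = 120
quarter_idx = np.repeat(np.arange(n_months // 3), 3)       # quarter index per month

x1 = rng.normal(50, 5, n_months)                           # monthly predictor
x2 = rng.normal(10, 3, n_months)                           # monthly predictor
x3_quarterly = rng.normal(200, 50, n_months // 3)          # one value per quarter
x3 = x3_quarterly[quarter_idx]                             # repeated for each month

y = 2.0 + 0.1 * x1 + 0.5 * x2 + 0.01 * x3 + rng.normal(0, 1, n_months)

X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))
fit = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 3})  # Newey-West SEs
print(fit.summary())
```

Mechanically this is still ordinary least squares; the repeated quarterly values do not violate anything by themselves, but checking the residuals for autocorrelation (and reaching for HAC standard errors or, later, dynamic regression) is the usual precaution.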

I have yet to learn about dynamic regression and ARIMA models, so maybe the answers are there somewhere. But so far I couldn't really find satisfactory resources.

Would love to hear your opinions!
Cheers

πŸ‘︎ 12
πŸ’¬︎
πŸ“…︎ Jan 14 2022
🚨︎ report
Which would be a more useful class: Statistical Inference or Applied PDEs?

I plan on majoring in either chemical engineering or environmental engineering. Which would be more appropriate to meet my math requirement for these majors?

I have the option between Statistical Inference I (covers probability and calculus-based statistics) and Applied Partial Differential Equations I. Both are unfortunately proof-based, so I don't plan on taking both. Which is more important in the field?

I know PDEs are used more in future engineering courses (fluid mechanics), but I hear statistics is incredibly useful in industry. Thoughts on the matter?

πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/RxnPlumber
πŸ“…︎ Dec 25 2021
🚨︎ report
Statistical Inference for Data Science

Anyone take this class? 01:960:291

How was it?

πŸ‘︎ 3
πŸ’¬︎
πŸ‘€︎ u/ilovespaghetti1
πŸ“…︎ Nov 16 2021
🚨︎ report
Workshop: An Intro to Statistical Inference
πŸ‘︎ 9
πŸ’¬︎
πŸ‘€︎ u/daslu
πŸ“…︎ Nov 20 2021
🚨︎ report
Workshop: An Intro to Statistical Inference
πŸ‘︎ 9
πŸ’¬︎
πŸ‘€︎ u/daslu
πŸ“…︎ Nov 14 2021
🚨︎ report
[Q] With the rise of machine learning and prediction, where does statistical inference knowledge become useful?

I'm a statistics major at my university, and I really enjoyed the statistical inference class I took last semester. We had a big emphasis on estimators, hypothesis testing, and some computation. We learned how to derive estimators, maximum likelihood estimation, Rao-Blackwell, sufficiency, bias, variance, computing MSE, consistency, computing confidence intervals, bootstrapping, different parametric tests, and Monte Carlo, and we did all of this in R. Upcoming and later statistical inference classes in my major include more of a computational side: coding up simulations, resampling methods, the jackknife, permutation tests, nonparametric hypothesis testing, etc.

I have one statistical learning class, which is an elective. My question is, with such a rigorous emphasis on statistical inference and hypothesis testing, how much can I apply the material I've learned in industry and in data science? Can the concepts I listed above be applied anywhere in machine learning? The linear models and GLMs class I take will cover most of the basics of statistical learning. But the classical hypothesis testing and statistical inference, where is that applied in today's world of data science and machine learning?
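
Not a full answer, but one concrete place the tools listed above show up in everyday ML work is comparing models. A hedged sketch (simulated predictions and made-up accuracy levels) of a paired bootstrap confidence interval for the difference in accuracy between two classifiers on the same test set:

```python
# Sketch: paired bootstrap CI for accuracy(A) - accuracy(B) on one test set.
# Labels and predictions are simulated here; in practice they come from real models.
import numpy as np

rng = np.random.default_rng(1)
n = 500
y_true = rng.integers(0, 2, n)                                   # hypothetical labels
pred_a = np.where(rng.random(n) < 0.85, y_true, 1 - y_true)      # ~85% accurate model
pred_b = np.where(rng.random(n) < 0.80, y_true, 1 - y_true)      # ~80% accurate model

idx = np.arange(n)
diffs = []
for _ in range(5000):
    b = rng.choice(idx, size=n, replace=True)                    # resample test cases
    diffs.append(np.mean(pred_a[b] == y_true[b]) - np.mean(pred_b[b] == y_true[b]))

lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"95% bootstrap CI for the accuracy difference: [{lo:.3f}, {hi:.3f}]")
```

If that interval excludes zero, you have roughly the same kind of evidence a classical two-sample test would give, just obtained by resampling.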

πŸ‘︎ 83
πŸ’¬︎
πŸ‘€︎ u/veeeerain
πŸ“…︎ Jul 14 2021
🚨︎ report
Kamera event: what is expected to happen based on statistical inference.
πŸ‘︎ 2k
πŸ’¬︎
πŸ‘€︎ u/sulanthy
πŸ“…︎ Feb 06 2021
🚨︎ report
[Q] Stochastic Processes vs Statistical Inference

I'm considering pursuing a major in statistics and so I need to complete either a course on stochastic processes or statistical inference (using Casella and Berger). Is either one of these courses obviously more useful than the other if my goal is to become a machine learning engineer or data scientist?

The stochastic processes course uses the book 'Introduction to Probability Models' by Sheldon Ross.

Would one of these books be much more difficult/time consuming than the other?

πŸ‘︎ 60
πŸ’¬︎
πŸ‘€︎ u/tail-recursion
πŸ“…︎ May 17 2021
🚨︎ report
Paper: Many published findings using the RD design in top political science journals are exaggerated, if not entirely spurious. Estimates tend to bunch just above the conventional level of statistical significance. Researchers tend to use inappropriate methods for inference arxiv.org/abs/2109.14526
πŸ‘︎ 5
πŸ’¬︎
πŸ‘€︎ u/smurfyjenkins
πŸ“…︎ Sep 30 2021
🚨︎ report
[Question] Single imputation: can it lead to valid statistical inferences from incomplete data?

Hi folks,

I am having trouble wrapping my head around the taxonomy of missing data methods. I know missing data methods range from complete-case analyses, to weighting methods (probably more suitable for survey data), to imputation-based methods.

I understand that single imputation means replacing a missing value with one plausible value, whereas multiple imputation means replacement with multiple plausible values. However, not a lot of articles seem to talk about inference with single imputation methods.

  1. Are single imputation methods really discouraged if, down the line, I would want to use my imputed datasets to make predictions? (The small sketch after these questions illustrates the usual concern about uncertainty.)
  2. Are likelihood-based approaches less common than multiple imputation? I see the bulk of missing data methods centered on and built upon multiple imputation, so I was just wondering...
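
For what it's worth, here is a tiny sketch (simulated data, MCAR missingness) of the usual concern with single imputation for inference: treating imputed values as if they were observed shrinks the sample variance and makes the standard error look smaller than it should be, which is exactly the kind of understatement multiple imputation is designed to avoid.

```python
# Sketch: mean (single) imputation understates uncertainty about the mean.
import numpy as np

rng = np.random.default_rng(2)
n, n_missing = 200, 60
x = rng.normal(10, 2, n)

x_obs = x.copy()
miss = rng.choice(n, size=n_missing, replace=False)        # MCAR missingness
x_obs[miss] = np.nan

complete = x_obs[~np.isnan(x_obs)]
se_complete_case = complete.std(ddof=1) / np.sqrt(len(complete))

x_imputed = np.where(np.isnan(x_obs), complete.mean(), x_obs)   # single mean imputation
se_naive = x_imputed.std(ddof=1) / np.sqrt(n)                   # pretends all n observed

print(f"complete-case SE : {se_complete_case:.4f}")
print(f"mean-imputed SE  : {se_naive:.4f}  (misleadingly small)")
```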

I have been reading articles from Donald Rubin, Roderick Little, Stef van Buuren, etc. If anyone has a list of other authors I should be looking into, please let me know!

  3. Does anyone by any chance know of any missing data research groups, or of any upcoming online missing data conferences? There isn't a group at my school, and this is not my supervisor's main area, so I have difficulty bouncing ideas off anyone.

Thank you, and I wish you all a nice Friday!

πŸ‘︎ 11
πŸ’¬︎
πŸ‘€︎ u/vanhoutens
πŸ“…︎ Jul 30 2021
🚨︎ report
[E] Statistical inference book recommendation

Last year I took my Statistics 2 (inference) course in my BS in Economics. During the semester, I used to study from the teacher's notes and from the Berenson, Levine & Krehbiel book. For me, the latter was quite concise and easy to understand but, even so, lacking in mathematical proofs and explanations, in my opinion. So I would like you to recommend statistical inference books that you consider better for understanding not only the inference concepts but also the mathematical proofs and derivations.

πŸ‘︎ 8
πŸ’¬︎
πŸ‘€︎ u/torcazaxx
πŸ“…︎ Jul 06 2021
🚨︎ report
Why is statistical inference supposed to fail if the errors aren't normal?

I'm trying to perform a simulation to detect the effects of a non-normal error distribution on things like the percentage of times a test of H0: B1=2 (when B1 is actually 2) is rejected. That rejection rate should tend to alpha but, supposedly, should be much larger in small samples with skewed error distributions. However, the results have been disappointing.

Can someone explain to me why statistical inference is supposed to fail if the errors aren't normal? Maybe if I understand this I can trace my mistake.
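
In case it helps with debugging, here is a rough sketch (made-up settings) of the kind of simulation described above: OLS with strongly skewed, mean-zero errors, a true slope of B1 = 2, and the empirical rejection rate of the t-test of H0: B1 = 2 at alpha = 0.05.

```python
# Sketch: empirical size of the t-test for the slope under skewed errors.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def rejection_rate(n, n_sims=5000, alpha=0.05):
    crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    rejections = 0
    for _ in range(n_sims):
        x = rng.normal(0.0, 1.0, n)
        eps = rng.exponential(1.0, n) - 1.0            # skewed errors with mean zero
        y = 1.0 + 2.0 * x + eps                        # true intercept 1, true slope 2
        X = np.column_stack([np.ones(n), x])
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        resid = y - X @ beta
        sigma2 = resid @ resid / (n - 2)
        se_b1 = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
        if abs((beta[1] - 2.0) / se_b1) > crit:
            rejections += 1
    return rejections / n_sims

for n in (10, 30, 200):
    print(f"n = {n:>3}: rejection rate ~ {rejection_rate(n):.3f}")
```

If the rate sits near 0.05 even for modest n, that is the central limit theorem at work: non-normal errors mostly threaten the exact finite-sample t distribution (and prediction intervals), not the large-sample validity of the usual tests, which may be why the simulation results look "disappointing".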

πŸ‘︎ 4
πŸ’¬︎
πŸ‘€︎ u/its-notmyrealname
πŸ“…︎ May 19 2021
🚨︎ report
Statistical Significance and Inference for Business: Does it matter?

In undergrad/grad school, stats was taught with the assumption that getting data for the population of interest is usually not possible, so you must use inference on a sample to draw conclusions about the population. But in a business context, you often have access to all the data you need, because the population is usually just everything within the scope of the business (sales, manufacturing, supply chain, etc.).

As a hypothetical example, let's say a given product is sold in a number of regions. The business has all data available regarding the quantity of this product sold in these regions for a given time period.

When comparing the quantities sold between regions, do statistical significance and inference matter? Does the question being asked affect whether inference is important? (e.g., ad-hoc analysis of past sales to determine which products should be discontinued, forecasting sales volume, etc.)

πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/rweez409
πŸ“…︎ Jul 29 2021
🚨︎ report
Hiring help for statistical inference exam

Hey,

Looking to find someone who is good in this subject to help me out. Dm me so we can talk more

πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/r28r7dk
πŸ“…︎ May 24 2021
🚨︎ report
Has anyone taken MTH 412/512 (Intro to Statistical Inference)?

I'm a PhD student in Aerospace, so I don't know many people in the math department. I have a B.A. in math, so I've got the background and have taken all the necessary prerequisites. But I wanted to talk to someone who's taken the class before to get a good sense of whether it's a good fit for me.

My research is in optimal estimation of nonlinear systems. I work a lot with Kalman/unscented/sigma-point filters, nonlinear least squares, and modeling dynamic systems, but I'm looking to build a better foundation in the fundamentals of Bayesian estimation, hypothesis testing, post-fit residual testing, etc.

It LOOKS like this course should be a good fit, but I wanted to reach out to people who've taken it before to see whether it's worth it. I've only got 2 more courses I need to take, and I'd like to make the most of them. Ideally I'd take this course as my only course next spring (2022).

So has anyone taken it before? Any advice or insight into how it was?

πŸ‘︎ 4
πŸ’¬︎
πŸ‘€︎ u/ChrisGnam
πŸ“…︎ May 21 2021
🚨︎ report
Paradigms for statistical inference

Hello everyone! I'm having a bit of trouble with different paradigms regarding statistical inference.

When I'm talking about paradigms, I'm referring to frequentist inference, Bayesian inference, likelihood-based inference, and AIC-based inference. I believe there are more, but let us limit ourselves to these four. Let me sum up my concerns with the following three questions:

1. Basic question, but I'm not sure if I've got it right: how am I supposed to look at these paradigms? I thought they were different ways of looking at inference, so they look at the same thing but see it differently? They are not looking at different parts of inference, correct?

2. If the paradigm is not specified, what am I supposed to do, and could not knowing the paradigm affect me negatively?

In a recent course, we covered maximum likelihood estimation. If I'm not mistaken, this would by default refer to likelihood-based inference. But other paradigms have their own way of looking at it, and judging by the examples given in the course, it would seem to come from frequentist inference. (The small code sketch further down tries to make this concrete.)

3. When covering topics like MLE, least squares estimation, or other theory, are they being presented from a "general statistical" point of view, or do they always have a specific paradigm in mind?

I recently got a book about MLE, where they specify at the beginning:

"The properties of maximum likelihood inference that are presented herein are from the point of view of the classical frequentist approach to statistical inference ."

Does this mean that theory should be learned from every vantage point that exists for it?
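
To make the MLE example from question 2 concrete, here is a minimal sketch (made-up counts, Python/SciPy) that fits the same binomial likelihood two ways: the MLE with a normal-approximation confidence interval reflects the frequentist/likelihood framing, while a Beta posterior reflects the Bayesian one. The likelihood function is the same object; the paradigms differ in how uncertainty about the parameter is interpreted.

```python
# Sketch: one binomial likelihood, two paradigms (made-up data: 37 successes in 50 trials).
import numpy as np
from scipy import stats

successes, trials = 37, 50

# Frequentist / likelihood-based: the MLE and a normal-approximation 95% CI.
p_mle = successes / trials
se = np.sqrt(p_mle * (1 - p_mle) / trials)
conf_int = (p_mle - 1.96 * se, p_mle + 1.96 * se)

# Bayesian: Beta(1, 1) prior + binomial likelihood -> Beta posterior.
posterior = stats.beta(1 + successes, 1 + trials - successes)
p_post_mean = posterior.mean()
cred_int = posterior.ppf([0.025, 0.975])               # 95% credible interval

print(f"MLE: {p_mle:.3f}, 95% confidence interval: ({conf_int[0]:.3f}, {conf_int[1]:.3f})")
print(f"Posterior mean: {p_post_mean:.3f}, 95% credible interval: {cred_int.round(3)}")
```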

If anyone has a good source (book, articles..) for learning more about the different paradigms I would love to hear it!

Thank you all for the help!

πŸ‘︎ 7
πŸ’¬︎
πŸ‘€︎ u/AnkanTV
πŸ“…︎ Mar 26 2021
🚨︎ report
Statistical Inference Midterm

I need help with an exam. It covers mostly MLE, Method of Moments, confidence intervals, and parametric & nonparametric bootstrap. The test is tomorrow at any time.

πŸ‘︎ 3
πŸ’¬︎
πŸ‘€︎ u/jowhee12
πŸ“…︎ Mar 11 2021
🚨︎ report
Impressions of differential privacy for supreme court justices. Jessica Hullman « Statistical Modeling, Causal Inference, and Social Science statmodeling.stat.columbi…
πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/QueeLinx
πŸ“…︎ Jun 04 2021
🚨︎ report
Open ecology article of the week: Towards robust statistical inference for complex computer models

Hi everyone, I hope all is well! I have a new open ecology article, and this is another one from Ecology Letters

You can find the open access link here: https://onlinelibrary.wiley.com/doi/full/10.1111/ele.13728

As always, please feel free to discuss this article in the comments below if you like. Questions, comments, or anything remotely relevant is fair game!



Abstract: Ecologists increasingly rely on complex computer simulations to forecast ecological systems. To make such forecasts precise, uncertainties in model parameters and structure must be reduced and correctly propagated to model outputs. Naively using standard statistical techniques for this task, however, can lead to bias and underestimation of uncertainties in parameters and predictions. Here, we explain why these problems occur and propose a framework for robust inference with complex computer simulations. After having identified that model error is more consequential in complex computer simulations, due to their more pronounced nonlinearity and interconnectedness, we discuss as possible solutions data rebalancing and adding bias corrections on model outputs or processes during or after the calibration procedure. We illustrate the methods in a case study, using a dynamic vegetation model. We conclude that developing better methods for robust inference of complex computer simulations is vital for generating reliable predictions of ecosystem responses.

πŸ‘︎ 4
πŸ’¬︎
πŸ‘€︎ u/Eist
πŸ“…︎ May 03 2021
🚨︎ report
[N] IBM, UMich & ShanghaiTech Papers Focus on Statistical Inference and Gradient-Boosting

A team from University of Michigan, MIT-IBM Watson AI Lab and ShanghaiTech University publishes two papers on individual fairness for ML models, introducing a scale-free and interpretable statistically principled approach for assessing individual fairness and a method for enforcing individual fairness in gradient boosting suitable for non-smooth ML models.

Here is a quick read: Improving ML Fairness: IBM, UMich & ShanghaiTech Papers Focus on Statistical Inference and Gradient-Boosting

The papers Statistical Inference for Individual Fairness and Individually Fair Gradient Boosting are on arXiv.

πŸ‘︎ 6
πŸ’¬︎
πŸ‘€︎ u/Yuqing7
πŸ“…︎ Apr 06 2021
🚨︎ report
Do students REALLY understand Statistical Inference?

For Students: When you took your first econometrics course, did you understand statistical inference?

And for professors: Do you think the average student understands statistical inference?

By "really," I'm not talking about being able to robotically calculate statistics: t-stat, p-value, CI and knowing when to reject and fail to reject the null... the stuff that gets you an A+ in Stats 101 if you can repeat them on the test. I'm talking about the big picture.

If you asked your average introductory econometrics students the more philosophical and epistemological questions, would they be able to give a confident answer?

  1. Describe the big picture of statistical inference: what is it, and what is it trying to do?

  2. What does it mean that the sample estimate has a distribution? What does it mean that the estimator is itself a random variable? Why is this a problem?

  3. What is a hypothesis test doing? Why are you undertaking it? Why do people do it? What information does it give you that you didn't know before?

In my experience, most students get lost in learning the methods of calculation and neglect to appreciate the bigger picture. Imagine a student who got an A+ in Calculus because they could apply all the derivative rules perfectly but somehow never really appreciated that a derivative is a rate of change.

Most of the things intro stats students are evaluated on are things a computer can do for them. They never come to appreciate the epistemological value of knowing how statistical inference works in the big picture, because that's not what they are tested on. That makes learning the intuition of applied econometrics difficult without a full appreciation of statistical inference.

Is it just me, or can people sort of relate?

πŸ‘︎ 36
πŸ’¬︎
πŸ‘€︎ u/OptimizedCatholic
πŸ“…︎ Dec 13 2021
🚨︎ report
Which would be a more useful class in the field: Statistical Inference or Applied PDEs?

I posted this in r/ChemicalEngineering but am also posting it here for a broader perspective since I haven't declared my major yet.

I plan on majoring in either chemical engineering or environmental engineering. Which would be more appropriate to meet my math requirement for these majors?

I have the option between Statistical Inference I (covers probability and calculus-based statistics) and Applied Partial Differential Equations I. Both are unfortunately proof-based, so I don't plan on taking both. Which is more important in the field?

I know PDEs are used more in future engineering courses (fluid mechanics), but I hear statistics is incredibly useful in industry. Thoughts on the matter?

πŸ‘︎ 3
πŸ’¬︎
πŸ‘€︎ u/RxnPlumber
πŸ“…︎ Dec 25 2021
🚨︎ report
Statistical Inference for Data Science

Anyone take this class? 01:960:291

How was it?

πŸ‘︎ 5
πŸ’¬︎
πŸ‘€︎ u/ilovespaghetti1
πŸ“…︎ Nov 10 2021
🚨︎ report
