A list of puns related to "Bayesian Inference"
I'm just now learning Bayesian inference as applied to engineering experimentation. This post is going to reveal my ignorance of the topic.
Here's a paper from NASA Langley about regression and the challenge of deciding what to do about potential variables for which we have prior information. The summary of the paper is that if you have a p-value from a prior experiment, you can update it with the information from a subsequent experiment. A first experiment has a regression analysis; a second experiment has its own regression analysis. We combine the results of the two experiments to update the mean and standard deviation of the regression coefficients. The approach is not pooling the data of the two experiments into one analysis, but using the two stand-alone estimates to create a third one. It sounds like a powerful way to use previous experimental data.
My challenge is to come to terms with believing the statement, "I have prior knowledge that this variable must be in the model." If the p-value for a variable's coefficient is too high for my risk level, I'll not include that variable in a predictive model even if Isaac Newton himself were to tell me it must be there. But, guessing what Bayes would say, if the prior knowledge is that the factor is certainly influential, I need to include it, no matter how small its effect (ignoring practical significance for the moment). How does prior knowledge of certainty play a role in making a statistical model and estimating a coefficient?
I've still got a lot of learning ahead, but this is my immediate question. Is this the recipe:
We must include a known variable in the model (the prior existence of the term in the physics has probability 1.0, the prior value of the coefficient is unknown).
We run an experiment to characterize the effect of the variable; we report the value of the coefficient as estimated from the data; we ignore its variance (and p-value) except insofar as that variance contributes to the model's residual error, and hence to the confidence interval and prediction interval.
When we get more data, we use the mean and variance of the prior estimate of the effect and of the new data's estimate to update the value of the coefficient (a rough sketch of this combination step follows below).
Do we also use the new data to update the estimate of the residual error?
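If that is the recipe, the combination step would look something like the precision-weighted update sketched here. This is a minimal sketch, assuming each experiment's coefficient estimate is approximately normal and the two experiments are independent; the numbers are made up, and it is my reading of the idea rather than the paper's exact procedure.

```python
# Minimal sketch: combine two stand-alone estimates of the same regression
# coefficient, assuming each is approximately normal and independent.
def combine_estimates(mean1, se1, mean2, se2):
    """Precision-weighted (conjugate normal) combination of two estimates."""
    w1, w2 = 1.0 / se1**2, 1.0 / se2**2            # precisions
    mean = (w1 * mean1 + w2 * mean2) / (w1 + w2)   # combined mean
    se = (w1 + w2) ** -0.5                         # combined standard error
    return mean, se

# Hypothetical numbers: experiment 1 gives 2.0 +/- 0.5, experiment 2 gives 1.4 +/- 0.4.
print(combine_estimates(2.0, 0.5, 1.4, 0.4))       # -> (about 1.63, about 0.31)
```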
I have been trying to understand the concepts related to Bayesian inference, but I just can't seem to get how it works exactly. More specifically, what if my likelihood function follows a normal distribution and my new data is a single measurement whose error also follows a normal distribution? I can't find any sources that address that particular case.
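For what it's worth, that particular case (a normal prior and a single new measurement with normally distributed error of known standard deviation) is the textbook conjugate normal-normal update; this is the standard result rather than anything tied to a specific source. With prior $\theta \sim N(\mu_0, \sigma_0^2)$ and measurement $y \sim N(\theta, \sigma^2)$:

$$
\theta \mid y \;\sim\; N\!\left(\frac{\mu_0/\sigma_0^2 + y/\sigma^2}{1/\sigma_0^2 + 1/\sigma^2},\;\; \frac{1}{1/\sigma_0^2 + 1/\sigma^2}\right)
$$

The posterior mean is a precision-weighted average of the prior mean and the measurement, and the posterior precision is the sum of the prior and measurement precisions.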
Been slogging on and off for years to try and finish a PhD dissertation that ended up having an applied stats theme. The start of the dissertation involved inference on differential equation models with Monte Carlo sampling Bayesian approaches (namely HMC), but more recently I've ended up turning to variational inference for some necessary computational efficiency gains that come at the cost of accuracy. I'd been bumbling around exact methods for a while without ever hearing about VI until some collaborators kindly referred this village idiot to the requisite papers, so I was wondering what other families of techniques I've been missing out on. What other broad families of inference methods are being researched in addition to the exact, variational, and hybrid approaches?
>At any rate, these results were interesting to us because they suggest that visualizations don't necessarily make it easy for people to assess causal explanations, despite how much we might like to tout the value of exploratory visual analysis and interactivity for building intuitions about causal relationships ... It's possible people in our experiment sought disconfirming evidence as the easiest way to test a hypothesis that some process generated the data. But which form do existing visual analysis tools tend to better support?
Author: Jessica Hullman
Hello
I am taking a grad-level, theory-based Bayesian inference class that has a reputation for being challenging. I have taken undergrad classes with bad reputations before, and the reputation is often overblown, but this class is supposed to be hard even for the first-year PhD students, and I want to know how to prepare.
What are some study strategies for such a class? Recently my classes have been programming-project based, so I'm a little out of practice. Can I treat this class the same as an intro probability or calculus class, where I just do lots of practice problems?
Given
A) At least 17% of 16-24 year olds have used cannabis in the last year (seems like a massive underestimate but those are the official figures)
B) The average footballer in England is tested 3.7 times a year
C) The Football Association claim to test for social drugs like Cannabis.
D) No Premier League footballer has been caught doing drugs or avoiding a test in the last six years.
E) They will be told to avoid drugs, but they are wealthy young men living in cities.
Assuming cannabis is the only drug used, the warnings are 50% effective at reducing drug use, the average cannabis user uses the drug four times a year, and it is detectable for three days, you would expect about six players to fail a drug test for cannabis alone (a rough calculation is sketched below), yet there hasn't been a single drug test failure in the top league for six years.
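A rough sketch of that calculation. The squad size of roughly 600 Premier League players is my own assumption, not a figure from the sources; everything else follows the assumptions above.

```python
# Back-of-the-envelope check of the "about six failures" figure.
n_players = 600            # assumed: ~20 clubs x ~30 players (my assumption)
use_rate = 0.17 * 0.5      # 17% background use, warnings halve it
uses_per_year = 4
detectable_days = 3
tests_per_year = 3.7

p_detectable_on_test_day = uses_per_year * detectable_days / 365
p_caught_per_year = 1 - (1 - p_detectable_on_test_day) ** tests_per_year
expected_failures = n_players * use_rate * p_caught_per_year
print(expected_failures)   # roughly 6 per year under these assumptions
```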
Can I confidently say there is a cover-up? Either the tests don't work, evasion is rife or they are protecting players who fail.
I am posting this on the day the Luxembourgish Prime Minister has admitted that 54 out of 56 pages of his PhD thesis were plagiarised, including 20 pages taken straight from the website of the European Parliament, and after a whole series of academic scandals, because I am trying to determine how common blatant fraud is, even fraud so transparent that it can be detected by anyone with a calculator, in fields where hundreds of journalists monitor every move.
Sources:
https://www.gov.uk/government/publications/united-kingdom-drug-situation-focal-point-annual-report/united-kingdom-drug-situation-focal-point-annual-report-2019#cannabis
"In total, 4,495 samples were collected on behalf of the FA without a single failed test in that time - although Peterborough United defender Josh Yorwerth has since been given a four-year ban for evading a test."
https://www.bbc.co.uk/sport/47748275
https://www.thefa.com/football-rules-governance/anti-doping/the-drug-testing-process
https://amp.theguardian.com/education/2021/oct/27/luxembourg-xavier-bettel-university-thesis-was-mostly-plagiarised?__twitter_impression=true
The question is in the first image:
I don't understand how to set up this question. Is the prior the probability of a sound being presented?
As far as I can tell, we need the whole table to find the expected probabilities of experiment 2.
A nice summary paper, at the cutting edge of Bayesian causal inference.
Link: https://arxiv.org/abs/2111.03897
Abstract: "Spurred on by recent successes in causal inference competitions, Bayesian nonparametric (and high-dimensional) methods have recently seen increased attention in the causal inference literature. In this paper, we present a comprehensive overview of Bayesian nonparametric applications to causal inference. Our aims are to (i) introduce the fundamental Bayesian nonparametric toolkit; (ii) discuss how to determine which tool is most appropriate for a given problem; and (iii) show how to avoid common pitfalls in applying Bayesian nonparametric methods in high-dimensional settings. Unlike standard fixed-dimensional parametric problems, where outcome modeling alone can sometimes be effective, we argue that most of the time it is necessary to model both the selection and outcome processes."
Looking into possible electives and wanted to get some feedback on the courses from anyone who has taken them.
How difficult are they, and how much mathematical knowledge do they demand?
I'm trying to see how effective parameter estimation is for a class of models. The parameter(s) I'd like to constrain appear in a function. When that function is integrated twice the result is a rate. This rate can be used to define a Poisson distribution and I can see how the observation of 0, 1, 2, 3... events affects the constraints on my free parameters. I have some basic experience with Pymc3/inference/ML, but I haven't had to do anything this complicated. How would I go about solving this problem?
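Not knowing the actual function, here is a minimal PyMC3 sketch of the general pattern, using a toy stand-in where the double integral has a closed form; the function, the prior, the time horizon, and the observed counts are all placeholders to swap for your own.

```python
import numpy as np
import pymc3 as pm

# Toy stand-in for the real model: assume the underlying function is
# f(t) = a * t, where 'a' is the free parameter to constrain. Integrating
# twice over [0, T] gives rate = a * T**3 / 6. Replace this closed form
# with your own (numerical) double integral.
T = 10.0
observed_counts = np.array([0, 1, 2])   # hypothetical observed event counts

with pm.Model() as model:
    a = pm.HalfNormal("a", sigma=1.0)                 # placeholder prior on the free parameter
    rate = pm.Deterministic("rate", a * T**3 / 6.0)   # the doubly-integrated rate
    counts = pm.Poisson("counts", mu=rate, observed=observed_counts)
    trace = pm.sample(2000, tune=1000, return_inferencedata=True)

# The posterior on 'a' then shows how strongly the observed counts constrain it.
```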
Hi.
The ml-study sessions this coming weekend will mostly be used for explorations preparing for the November workshops.
In the coming weekend, we will explore at least one topic: probabilistic modeling through Bayesian inference using the inferme library by generateme.
Please follow the thread at Zulip for more details.
Hi all!
Wanted to post this solution that our firm used to deal with a problem caused by an incomplete dataset. Our solution uses a priors-driven inference approach. Although this particular example is unique to our dataset, the general approach has merit across a broad range of problems.
Would love to hear any feedback on our solution, in particular any suggestions for how to simplify it or make it more robust.
The article is here.
Thanks!
(cross-posted in r/MachineLearning since it touches both fields)
As a Data Scientist, Bayesian Inference/Stochastic Modeling, you'll utilize advanced quantitative & statistical techniques to drive business model and product innovation for Via. What You'll Do:
Use Bayesian modeling and stochastic simulations …
Read more / apply: https://ai-jobs.net/job/10126-data-scientist-bayesian-inferencestochastic-modeling/
Hi all,
A newbie when it comes to Bayesian inference here (with a background in psychology and currently working in neuroscience and clinical neurology)! I only got started with Bayesian inference a week ago for a university project that I'm currently doing at UCL, UK. I will be using the bayesreg toolbox to interrogate clinical contributors to the occurrence of cerebral microbleeds. A few questions that I've been battling with these past few days (some of which are probably very basic, but I'd benefit from hearing a Bayesian's informed perspective!):
- Has anyone used the bayesreg toolbox and would you recommend using it?
- Can anyone recommend some (established) 'gold standard' paper/guidelines for interpreting and reporting the results of a Bayesian inference model?
- How does model evaluation work in the Bayesian framework?
- Are all parameters in the prior distribution fixed? Do they all have to take integer values between 0 and 9?
- If one's parameters of interest are mixed (both discrete and continuous variables), what is the best way to approach the selection of the prior distribution?
- Does one need to set the same prior distribution for all parameters (i.e., a joint prior distribution), or can different parameters (within the same model) have different priors? Is that 'Bayesian' at all?
Thanks a million in advance!
https://docs.google.com/spreadsheets/d/1bvfj5_8rNz-tUpc6kHPVkBdIOG5lSpEdbjwujUyVFUU/edit?usp=sharing
"My WR is 7 for 10k hands - am I a winning player?"
Some form of this question is asked often with a couple of standard responses:
50/50.
You need 50k hands.
Check out Primedope (https://www.primedope.com/poker-variance-calculator/)
I was hoping to provide you all with another option. I think Primedope is a cool-ass website, but the way we're using it when we direct people there has a massive flaw that my spreadsheet addresses.
Stats be complicated
In the world of stats, things are rarely if ever black and white, wrong or right. We mostly try to explicitly state our assumptions and make them as reasonable as we can. When you use the Primedope variance calculator, something jumps out pretty quickly: the probabilities that you're going to win more than expected and less than expected, given a provided win rate, are symmetric (equal). This is 'ok' if we assume that there is no lower or upper bound on WR (although there is) AND as long as we recognize that this simulation assumes a known TRUE WR.
If someone posts their 2 bb/100 WR for 30k hands and asks if it's their true WR - most people are going to respond with "it's tough to say given your sample size"
If someone posts their 25 bb/100 WR for 30k hands and asks if it's their true WR - they're going to get slapped with "lol, no. You're on a sun run - enjoy it"
Why is that?
The results are so far outside of our expectations that we intuitively recognize that the expectation is for their WR to come down given more samples. Is there a way for us to formalize this process? YUP. Enter Bayesian Inference.
I'm going to skip defining how exactly BI works (anticlimactic!), to the chagrin of some and the cheers of most, and move directly into the spreadsheet and how you use it; a rough sketch of the update it performs follows the list of inputs below.
I've highlighted, in orange, the values you can change to get your desired results.
They are:
Population winrate
Population winrate standard deviation
Your observed winrate
Your number of hands
Your SD in bb/100 (available in PT and I presume all other trackers, 100 is a good default for 6 max)
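Given those inputs, here is a rough sketch of the update I believe the spreadsheet is performing (my reading of it, assuming a normal population prior and an approximately normal sampling distribution for the observed win rate; all numbers are placeholders):

```python
# Posterior for a player's true win rate, combining a population prior with
# their observed results. Assumes both are (approximately) normal.
pop_mean, pop_sd = 0.0, 4.0       # population WR prior, bb/100 (placeholder)
obs_wr = 7.0                      # observed win rate, bb/100
hands = 10_000
sd_per_100 = 100.0                # per-100-hands standard deviation

se_obs = sd_per_100 / (hands / 100) ** 0.5      # standard error of observed WR
w_prior, w_obs = 1 / pop_sd**2, 1 / se_obs**2   # precisions
post_mean = (w_prior * pop_mean + w_obs * obs_wr) / (w_prior + w_obs)
post_sd = (w_prior + w_obs) ** -0.5
print(post_mean, post_sd)   # the shrunken estimate of the true WR
```

With these placeholder numbers, a 7 bb/100 observed win rate over 10k hands shrinks to roughly 1 bb/100, which matches the intuition in the examples above.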
What I have assumed and do not allow you to adjust:
The shape of the population WR distribution (Gaussian - I actually think fatte...
https://www.youtube.com/c/KapilSachdeva
Hi Guys,
Here is the information about the YouTube series I am working on.
As such, my content is based on and inspired by the teachings of many books, papers, articles, and other YouTube videos, but in particular, I use examples from Pattern Recognition & Machine Learning by Dr. Bishop.
My humble attempt is to try to explain both intuition and the underlying mathematics. Whether I am successful or not is for you to decide. Any feedback to improve is highly appreciated.
The link to the YouTube playlist (5 videos so far at the time of writing this post)
https://www.youtube.com/playlist?list=PLivJwLo9VCUISiuiRsbm5xalMbIwOHOOn
Many thanks.
Happy Learning
Kapil
What do you think of turning continuous observations into binary outcomes (e.g. via thresholding) as a way of 'proving' the existence of an effect between 2 groups with a less questionable model?
For example: imagine you are assessing the effectiveness of a drug for increasing heart rate: 2 groups (control and treatment), one metric predicted variable (the after-treatment heart rate of each subject). I see 2 possible approaches:
The problem I see with Approach 1 is that the many modeling choices to be made ("Same variance? t-distribution vs normal? Truncated normal? Or maybe Gaussian mixture? etc.") make the whole analysis more questionable. Skeptical readers might challenge the conclusion by saying "Why should we have faith in your model?" and I wouldn't blame them.
OTOH, with Approach 2, there are hardly any modeling choices to make for binary outcomes: binary outcomes unquestionably are Bernoulli trials, they have a Bernoulli parameter, we infer it, end of story. So the tradeoff seems to be more robust conclusions, at the expense of poorer predictive applicability (which might be fine).
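As a concrete illustration of what Approach 2 boils down to (my own sketch, with made-up counts and flat Beta(1,1) priors, not a claim about any particular study design):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: number of subjects whose heart rate exceeded the chosen
# threshold, out of the total in each group.
exceed_control, n_control = 12, 50
exceed_treat, n_treat = 23, 50

# Beta(1,1) priors give Beta posteriors for each group's Bernoulli parameter.
p_control = rng.beta(1 + exceed_control, 1 + n_control - exceed_control, 100_000)
p_treat = rng.beta(1 + exceed_treat, 1 + n_treat - exceed_treat, 100_000)

# Posterior probability that the treatment group exceeds the threshold more often.
print((p_treat > p_control).mean())
```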
I also don't see any problem with using both approaches in the same analysis. I guess that would be frowned upon in a frequentist analysis, as it might seem like p-hacking, but in a Bayesian analysis I see no issue with doing that.
EDIT: please make sure to have read the above paragraph before answering.
EDIT 2: some people seem to have issues with the general idea of simplifying observations. That makes little sense to me. If instead of average heart rate, I give you as experimental observations HD videos depicting the beating heart for an entire minute with millisecond preci...
Open to feedback and PRs. Bear in mind it's hosted on a distributed platform so changes will take some time to propagate.
For those curious how it works: I've reverse engineered the serial number encoding on these items to associate them with a manufacturing batch. This is then correlated with known defective production rounds and so on.
At this point I just need some data to help train my model and confirm my hypothesis on the serial number encoder.
Give it a look!
My first post here. I learned about this case in 2019 and am fascinated by it (to the point of remembering what I was doing on April 1, 2014). I am very pleased that someone has created a sub focused just on this case, so I don't need to keep going to other subreddits about mysteries in general to see a few posts.
Bayesian inference is a statistical technique that has been used to find lost things; it was successfully used to find the wreckage of the Air France 447 flight. If there were qualified scientists who were informed about what the trail is like, estimated the average speed at which a person travels, and had scientific data on how people move when they are lost in mazes (is there any study on that? I'd appreciate it if someone could show me!), they could estimate a maximum radius within which the person could be. So, in this scenario, calculating the possibilities, would it be possible to use Bayesian inference with the same precision as for Air France 447?
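For a flavour of how the search-theory side of this works, here is a toy sketch of the core update (my own illustration, not anything from the AF447 search; the grid, prior, and detection probability are all made up):

```python
import numpy as np

# Toy Bayesian search: divide the area into cells, put a prior probability on
# the person being in each cell, and update after each unsuccessful search.
prior = np.array([0.10, 0.30, 0.40, 0.15, 0.05])   # made-up prior over 5 cells
p_detect = 0.8   # chance of finding them if we search the right cell

def update_after_failed_search(prob, cell, p_detect):
    """Posterior over cells after searching `cell` and finding nothing."""
    prob = prob.copy()
    prob[cell] *= (1 - p_detect)    # 'they were there but we missed them'
    return prob / prob.sum()        # renormalise

posterior = update_after_failed_search(prior, cell=2, p_detect=p_detect)
print(posterior)   # probability mass shifts away from the searched cell
```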