Owning the libs by not understanding how random sampling works v.redd.it/jfdor7ykppm71

👍︎ 17k

👤︎ u/ArthursFist

📅︎ Sep 10 2021

Random non-crossing walks on an Urquhart mesh of a Poisson disk sampling

👍︎ 295

👤︎ u/jphsd

📅︎ Nov 25 2021

Random sampling when buying vegetables. We randomly sample vegetables to check if they are alright before we buy them.

👍︎ 6

👤︎ u/AgreeablePassenger32

📅︎ Jan 06 2022

Stratified random sampling

I’m a data analyst by trade who’s been asked by work to do a more data sciencey thing. I can tell my boss I’m not suited for this with zero repercussions, but it seemed like a good way to learn something new. My skill level with SQL is moderate-advanced, but I’m very much a beginner to R. I’m requesting some advice on where to start with a type of sampling I’ll explain below.

I have population A at an individual level (~20k people) along with their age, gender, and an estimate of income for their zip (I’m aware that this isn’t the ideal way to estimate income - it was requested we don’t go deeper than that because it’s not that important). I have the same variables for population B (but with ~50m potential people).

What I need to do is pull a random sample (of equal size) from B that approximates population A in terms of age, gender, and estimated income. SQL is a horrible tool for this. I’m inexperienced with R and don’t know where to start. Is this stratified random sampling? And if so do I need way more experience to pull something of this caliber off? Or is it not nearly as complex as I’d imagine due to the few fields I’m trying to group by?

Really just looking to get the right name for the type of sampling I’m looking to do so I can do some further research, but also fine with hearing “this is so far beyond your experience that you should just tell your boss this needs to be done by a data scientist.” Thanks!
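One way to read the request above is as quota-matched stratified sampling: bin both populations into strata, count how many people population A has in each stratum, then draw that many at random from the same stratum of population B. The sketch below is only an illustration in Python rather than R; the field names (`age`, `gender`, `income`), the band widths, and the function names are assumptions, not anything from the original data.

```python
import random
from collections import Counter, defaultdict


def stratum(person):
    """Hypothetical stratum key: 10-year age band, gender, $25k income band."""
    return (person["age"] // 10, person["gender"], person["income"] // 25_000)


def matched_sample(pop_a, pop_b, key=stratum, seed=0):
    """Draw from pop_b so that stratum counts match those of pop_a."""
    rng = random.Random(seed)
    quota = Counter(key(p) for p in pop_a)  # how many A has per stratum

    # Group B by stratum, keeping only strata that actually occur in A.
    by_stratum = defaultdict(list)
    for p in pop_b:
        k = key(p)
        if k in quota:
            by_stratum[k].append(p)

    sample = []
    for k, n in quota.items():
        pool = by_stratum.get(k, [])
        # If B has too few people in a stratum, take everyone available.
        sample.extend(rng.sample(pool, min(n, len(pool))))
    return sample
```

With around 50m rows in B, holding everything in memory like this may not be practical; the same idea can be applied after assigning strata in SQL and sampling within each stratum. Coarser bands make it more likely that every stratum in A has enough candidates in B.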

👍︎ 3

👤︎ u/GodKamnitDenny

📅︎ Dec 10 2021

Just a random sampling for today from my shelves

👍︎ 23

👤︎ u/ZappaBeefheartBungle

📅︎ Nov 20 2021

Is there some math and English genius who can explain to me what "strict random probability sampling" is?

Help, I'm a humanities person

👍︎ 4

👤︎ u/joiceale

📅︎ Dec 11 2021

Random Sampling of Classical Luhvi Characters

👍︎ 45

👤︎ u/XTMVA_existing

📅︎ Nov 23 2021

The Importance of Random Hemp and Cannabis Sampling for Accurate Results

https://www.cannabisbusinesstimes.com/article/importance-random-hemp-cannabis-sampling-accurate-results/

👍︎ 2

👤︎ u/tweedledum13

📅︎ Dec 12 2021

Sampling random stuff in my van on a rainy day 😌 youtu.be/vovDCBNBtb4

👍︎ 5

👤︎ u/WeHateSimon

📅︎ Nov 17 2021

Odd behavior: ZModeler assigning wrong random polygroups after sampling.

ZModeler > PolyGroup > A Single Poly. Click and hold, press SHIFT to sample, then release and click on other polys. The expected behavior is for the subsequent polys to inherit the sampled poly's ID, but every click simply changes the poly to a random polygroup instead of the sampled one.

The subtool has 7 subdiv levels, is partially hidden, and has layers. I am attempting to change specific face polygroups at subdiv 1.

👍︎ 2

👤︎ u/STK_23

📅︎ Nov 17 2021

Monetary Unit Sampling vs Random Sampling

I first posted this in /r/auditing, but then noticed that subreddit isn't really active, so I'm posting here as well.

I have so many questions that I don't even know where to begin, but I'll limit it to 5 for now.

As English is my third language, and I really don't know the correct statistical terms, the questions below may look a bit weird, but hopefully they are understandable.

I work in auditing and am currently part of a small project developing a monetary unit sampling (MUS) model in VBA for my office. I know how the 2 models technically work, as in how they pick the samples after inputting a confidence level, material error etc.

However, long story short: I'm having problems seeing the difference between the "conclusion" of monetary unit sampling and random sampling in auditing. I'll start by describing how I understand them, and then someone can hopefully correct me if I'm wrong.

1. With random sampling, if you find X misstatements/errors in the sample, you can extrapolate the errors to the full population. With MUS, you set a material error threshold, and if the sample errors exceed that threshold you can't extrapolate the results; the conclusion is then "do something else to cover the risk". Is this understood correctly?
2. Again, as I understand it, in MUS we don't have to work with a 95% confidence level if we estimate that the firm has low control risk and low inherent risk. If both are low, does it theoretically make sense to use a 50% confidence level?
3. Assuming the above is correct, can we do the same for random sampling? Or do we always have to go with a 90-99% confidence level? (I guess what I'm asking is: do control risk and inherent risk affect the confidence level in the random sampling method as well?)
4. As I understand it, MUS is more efficient than random sampling, since it leads to fewer samples. But if the material error in MUS is exceeded, could this lead to using random sampling instead?
5. As an auditor, what confidence levels do you normally work with (in both MUS and random sampling, and do they differ between the models), what material and expected error values, and how many samples do you typically get (minimum and maximum, for both models)?
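For readers unfamiliar with how MUS picks items, here is a minimal Python sketch (not the poster's VBA model). It assumes zero expected misstatement, so the reliability factor is the Poisson value -ln(1 - confidence), which is about 3.0 at 95% and about 0.69 at 50%. The function name `mus_select` and its parameters are hypothetical, and all amounts are assumed to be positive.

```python
import math
import random


def mus_select(amounts, tolerable_misstatement, confidence=0.95, seed=0):
    """Systematic monetary unit selection over positive book amounts.

    Returns (sample_size, sampling_interval, indices_of_selected_items).
    Assumes zero expected misstatement (an assumption of this sketch).
    """
    total = sum(amounts)
    factor = -math.log(1.0 - confidence)  # Poisson reliability factor
    n = math.ceil(total * factor / tolerable_misstatement)
    interval = total / n

    rng = random.Random(seed)
    target = rng.uniform(0, interval)  # random start within first interval

    selected = []
    cumulative = 0.0
    for i, amount in enumerate(amounts):
        cumulative += amount
        hit = False
        # An item is selected if any sampled monetary unit falls inside it;
        # large items can absorb several hits but are listed once.
        while target < cumulative:
            hit = True
            target += interval
        if hit:
            selected.append(i)
    return n, interval, selected
```

Because every monetary unit has the same chance of selection, any item at least as large as the sampling interval is always selected, which is where the focus on large balances comes from. Lowering `confidence` shrinks the factor and therefore the sample size.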

I'm hopelessly lost and hope someone can clarify this a bit for me. I am aware that my assumptions above, like, MUS being more efficient than random sampling could be wrong, thu

👍︎ 4

👤︎ u/Wild_Load296

📅︎ Oct 27 2021

How many sides does an unknown die have? Finding out by random sampling

There is a set of size N (each element has a unique ID, so all values are equally likely to be picked).

I look at M random samples (with replacement, i.e. without removing them from the set). From those samples I can see that I get some unique picks, some duplicates and some triples; from that I should be able to predict N.

The problem is: I don't know N. I can take samples from the set and see whether I get duplicates or more unique values. All values are equally probable.

MATH: Find a formula for the approximate N and another formula for a confidence range. In other words: How many sides does my die have?

The formula should predict 6 for a standard die:

ID, Frequency
1, 3
2, 6
3, 4
4, 4
5, 2
6, 1

----------

total: 20 throws, 6 distinct values -> There could still be a 7 which we have not picked once, but that is very unlikely.

MATH: Sample size of 970440: 959780 unique, 5000 duplicates, 220 triples. How big is my source set?

Things that are related but not exactly the answer:

Coupon Collector Problem

German Tank Problem

Things that should be the solution, but for which I didn't find a good formula:

Animal Tag - Recapture
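One possible approach, offered as a sketch rather than a definitive answer, is a collision-counting ("birthday problem") estimator. Each pair of draws hits the same element with probability 1/N, so the expected number of matching pairs among M draws is M(M-1)/(2N); solving for N gives the estimate below. The function names are hypothetical, and the interval relies on a Poisson approximation for the pair count, which is an assumption that is only reasonable when M is much smaller than N.

```python
from collections import Counter
from math import comb, sqrt


def estimate_set_size(multiplicities):
    """multiplicities maps k -> number of IDs that were seen exactly k times."""
    draws = sum(k * c for k, c in multiplicities.items())
    pairs = sum(comb(k, 2) * c for k, c in multiplicities.items())
    if pairs == 0:
        raise ValueError("no repeats observed; N cannot be estimated this way")
    return draws * (draws - 1) / (2 * pairs)


def rough_interval(multiplicities, z=1.96):
    """Rough range for N, treating the pair count as Poisson (an assumption)."""
    draws = sum(k * c for k, c in multiplicities.items())
    pairs = sum(comb(k, 2) * c for k, c in multiplicities.items())
    spread = z * sqrt(pairs)
    if pairs - spread <= 0:
        raise ValueError("too few repeats for this approximation")
    scale = draws * (draws - 1) / 2
    return scale / (pairs + spread), scale / (pairs - spread)


# Die example from the post: frequencies 3, 6, 4, 4, 2, 1 over 20 throws.
die = Counter([3, 6, 4, 4, 2, 1])          # multiplicity -> how many IDs
big = {1: 959780, 2: 5000, 3: 220}         # the 970440-draw example

print(estimate_set_size(die))
print(estimate_set_size(big), rough_interval(big))
```

For the die, the 20 throws contain 31 matching pairs, so the estimate is 20·19/(2·31), a little above 6. For the large example the point estimate comes out at roughly 83 million. The interval function should not be trusted for the die, where M is larger than N.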

👍︎ 6

👤︎ u/dangi12012

📅︎ Aug 25 2021

Y'all were so right about the glass dip pen for sampling, thank you! Even better, I got a random "irregular" one off etsy and I could not possibly have lucked out more!

👍︎ 153

👤︎ u/Knitsune

📅︎ Jul 22 2021

A random sampling from my local Drive In. Notice anyone missing? I think this is pretty telling of the fandom at large.

👍︎ 680

👤︎ u/Lord_Ruler

📅︎ May 05 2021

Insights on random sampling for population estimation!

Do you think random sampling just means picking items at random and doing inference? You are probably wrong. Check out this article to understand the methods for random sampling and why random sampling is the best way to estimate a population parameter!

https://ainxt.co.in/how-to-do-sampling-through-random-sampling-process/

👍︎ 2

👤︎ u/balajivenky06

📅︎ Oct 21 2021

LPT: Ask 2-3 random friends in your life what series/shows they have been watching on Netflix, HULU, Prime, etc. and start sampling a few of them until you get hooked. It will help you check out and replace your daily addiction to the toxic news, social media, work drama, etc. You deserve it!

👍︎ 12

👤︎ u/Iudiehard11

📅︎ Aug 07 2021

Improving representativeness for survey with non-random sampling

Greetings, r/AskStatistics. I hope everyone is okay.

I need help with a question, but first a little context.

The State Government where I work aims to conduct a survey to obtain estimates about certain behaviors (smoking, sedentary lifestyle, etc.) of its population.

It turns out that, to carry out this research, some possible Municipalities (11 of 184) were selected a priori, in a non-random way, for issues beyond my control. Data collection is still in the planning stages.

The problem is: I would like this survey to provide estimates that are representative for the entire population of the State, even with this limitation of pre-selected Municipalities.

I believe that, if all Municipalities could be chosen, a three-stage cluster sampling (census sector, households and individuals) would be an adequate choice.

However, given the limitation on Municipalities, is there any technique that I can use to improve the estimates and make inferences for the entire State?

Some options occurred to me, but I couldn't say how they could be applied, such as: (1) some specific technique in the choice of the sample, (2) application of a goodness-of-fit test to check if the chosen Municipalities have proportions of certain variables of interest similar to the whole state, or (3) some calculation for the inference of probabilistic weights for the respondents.

Do these options make any sense? Could you give suggestions or references on how to carry out these or other possibilities?
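As an illustration of option (3), one common form of respondent weighting is post-stratification: each respondent's weight is the stratum's known share of the State population divided by its share of the sample. The Python sketch below is only a minimal example under that assumption; the stratum labels, the shares, and the function names are hypothetical, and weighting on observed variables cannot correct for ways in which the 11 pre-selected Municipalities differ on things that were not measured.

```python
from collections import Counter


def poststratification_weights(sample_strata, population_shares):
    """Weight per respondent = population share / sample share of its stratum."""
    n = len(sample_strata)
    counts = Counter(sample_strata)
    return [population_shares[s] / (counts[s] / n) for s in sample_strata]


def weighted_mean(values, weights):
    """Weighted estimate of a proportion or mean."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical example: the sample over-represents urban respondents.
strata = ["urban"] * 8 + ["rural"] * 2
smokes = [0] * 8 + [1] * 2                  # 1 = smoker
shares = {"urban": 0.5, "rural": 0.5}       # assumed census shares

weights = poststratification_weights(strata, shares)
print(weighted_mean(smokes, weights))
```

In this toy data the unweighted smoking rate is 0.2, while the weighted estimate moves to 0.5 because rural respondents are scaled up to their assumed population share.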

Grateful in advance and, please, excuse me for any writing errors.

👍︎ 2

👤︎ u/Italo_Aguiar

📅︎ Aug 15 2021

I tried sampling one of my old tracks and threw random sounds on top of it, with some Death Grips sounding drums I found from one of the new drum packs. What do you guys think? v.redd.it/3p7yphv2abf71

👍︎ 6

👤︎ u/PossessionDapper609

📅︎ Aug 04 2021

A random sampling of high-income, fatFIRE careers in LCOL/MCOL areas (Midwest)

DISCLAIMER: Likely irrelevant thread found through usual nerdy research gathered based on owners of homes in the $1-2M range (Zillow, Whitepages etc.). I know this is poor data gathering given that you wouldn't expect a linear correlation between $1-2M houses and fatFIRE but it's an interesting data picture.

Unsurprisingly, the bulk were business owners who ran fairly boring, niche businesses (in line with Thomas Stanley's Millionaire Mind research on decamillionaires). Slightly more surprising for me -- and maybe this is just an oversight because of the category of person who would tend to use reddit -- is the lack of professionals, e.g. FAANG/big tech employees as well as physicians. (EDIT: most of my original research was in IL and Minnesota. After adding Iowa, the proportion of physicians increased significantly.) Additionally, the executive category, which I assumed would account for more typical VPs of firms, was primarily just made up of part-owners.

The original list: $1-2M

Business Owners - 40% of $1-2M homes are business owners. 27% of the business owners are in finance/real estate companies. 18% of the business owners are in IT/software. 27% are generally brick-and-mortar type and/or manufacturing businesses of some sort. 23% are in some form of consulting/other professional services.

-insurance agency president

-financial planning firm president

-vendor financing company president

-real estate consulting business

-it consulting firm president

-CEO management consulting firm

-owner executive search firm

-personal injury law firm owner

-enterprise technology development company president

-pharmacy benefit manager firm owner

-plastic surgeon/practice owner

-healthy foods bar sold at major grocer owner/partner

-retired lumber company owner turned chairman of investment firm

-owner, home builder/contractor

-owner, software company/former consultant

-owner, wine distributorship

-owner, orthopedic/medical sales company

-owner, disaster/workplace recovery company

-owner, telecommunications company

-owner, pallet/shipping packaging company

-owner, electrical contracting company

-owner, wireless manufacturer

-owner, concrete contractor

Sr. Executives - 20% of total for $1-2M group are sr. executives.

-wealth management firm partner/principal

-VP Sales (3x)

-hedge fund partner

-president and physician,100+ location multi-specialty physician group

-CMO/SVP and physician, major academic hospit

👍︎ 232

👤︎ u/careerthrowaway10

📅︎ Dec 01 2020
