Hi all.
Trigger warning: suicide is discussed as a research variable below.
I'm in my first thesis research unit for my honours degree in psychology. I have been allocated my topic and supervisor, and don't have flexibility in the variables themselves, but how I fashion them is up to me.
My initial research question looked at the impact of suicide literacy and secrecy on perceived stigma in those bereaved by suicide.
There is existing literature on the relationship between literacy and stigma (it decreases stigma), and existing literature on secrecy and stigma (it increases stigma), so I thought seeing how they overlap would be really interesting, and the literature points in that direction.
I went to break it down and "make it simple" to see how/why it's important and find the "so what" factors, and arrived at these points:
stigma > increased suicidality and mental health problems
secrecy > increases stigma
literacy > decreases stigma
This led me to think, okay well if secrecy leads to an increase in stigma, and literacy leads to a decrease in stigma, I wonder how literacy might moderate the relationship between secrecy and stigma. Maybe increasing suicide literacy will reduce the impact of secrecy on perceived stigma?
So I arrived at a second research question option, which is to look at the effect of suicide literacy on the relationship between secrecy and self-stigma in those bereaved by suicide. The only problem is that there is no existing literature I can find that links suicide literacy to secrecy. It would be completely new research, which makes things very difficult when it comes to providing evidence for why it's important.
My supervisor has explained that the existence of a gap isn't enough to justify undertaking the research, and neither is the fact that I can justify it to myself.
At the moment I'd like to pursue the second research question; perhaps the way around it is to combine the two justifications: there's a bunch of research on literacy and stigma, and a bunch of research on secrecy and stigma. With that in mind, alongside the gap in the literature between literacy and secrecy, it's a worthwhile study. But in the end it feels more like: it COULD be important, so maybe we should look at it.
Can anyone point me in a direction that might help me justify why this research is important, aside from the gap in literature? Or suggest reasons why a gap in literature IS enough of a reason? I'm floundering a bit with this.
Thanks so much!
Hey everyone, I'm trying to compare several groups on a set of dependent variables. For example, groups A), B), C), and D) all took a test that examines academic performance in categories E), F), G), H), I), J), and K). How do I evaluate the differences between each group's scores in each academic category? I'm assuming an ANOVA or a MANOVA, but I'm not entirely sure which one and what type.
Thanks in advance, guys! My brain melts on anything stats-related and I'm trying to get my head around it.
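As a rough illustration only: if each participant has one score per academic category, a one-way MANOVA (group as the factor, category scores as the dependent variables) could be run in Python with statsmodels like this. All column names and values below are made up.

```python
# Hypothetical one-way MANOVA: group (A-D) as the factor, category scores as DVs.
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

df = pd.DataFrame({
    "group":   ["A", "A", "A", "B", "B", "B", "C", "C", "C", "D", "D", "D"],
    "score_E": [70, 72, 68, 65, 66, 64, 80, 82, 79, 75, 74, 77],
    "score_F": [60, 63, 61, 58, 55, 57, 70, 71, 69, 66, 64, 65],
    "score_G": [88, 85, 86, 80, 79, 81, 90, 92, 91, 84, 83, 85],
})

# All dependent variables on the left, the grouping factor on the right.
mv = MANOVA.from_formula("score_E + score_F + score_G ~ group", data=df)
print(mv.mv_test())  # Wilks' lambda, Pillai's trace, etc.
```

Follow-up univariate ANOVAs per category (or a mixed ANOVA if category is treated as a within-subject factor) would then narrow down where the differences are.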
I have conducted an experiment in which participants experienced a frustrating task, i.e., they were systematically prevented from achieving their goal. A central research question is whether an individual's facial expressions can predict whether that participant experienced frustration.
Facial expressions can be coded as Action Units. Subjective ratings of frustration were done post-hoc by the participants themselves. For the sake of my statistics questions, let us assume frustration induction was successful, and subjective rating and action unit-coding reliable.
I would like to show that the action units a participant displays predict that participant's subjective frustration rating.
A regression framework seems appropriate, especially HLM (since AUs as well as frustration ratings are specific to individuals). As the predicted variable, we can use the count of an AU. Predictors would be subject, frustration rating, and AU. The null hypothesis is that knowing subject/frustration rating/AU does not help predict the count, which I would like to reject.
The issue: There are 89 action units. Treating AUs naively as categories in a regression framework is out of the question. The number of AUs can be brought down to roughly 40-50 when excluding very low frequency AUs, and doing overall dimension reduction (PCA, clustering). The underlying problems are a) the great variance in AUs shown across participants, and b) the importance of an AU even if only shown very briefly and rarely.
The question: How do I deal with the many levels of the categorical data? Is there any way to make the regression approach work?
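One common way to deal with a categorical predictor that has dozens of levels is to treat it as a random effect (partial pooling) instead of dozens of dummy-coded fixed effects. Below is only a rough sketch of that idea with a Gaussian mixed model in statsmodels, assuming a long-format table with one row per subject x AU; all column names and data are invented, and a count outcome would ideally get a Poisson mixed model rather than the log-transformed Gaussian shown here.

```python
# Sketch only: AU-specific effects as variance components within subjects,
# rather than ~50 dummy-coded fixed effects. Column names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n_subj, n_au = 20, 8
df = pd.DataFrame({
    "subject": np.repeat(np.arange(n_subj), n_au),
    "au": np.tile(np.arange(n_au), n_subj),
    "frustration": np.repeat(rng.normal(size=n_subj), n_au),
})
df["log_count"] = np.log1p(rng.poisson(3, size=len(df)))  # stand-in outcome

model = sm.MixedLM.from_formula(
    "log_count ~ frustration",        # fixed effect of interest
    groups="subject",                 # random intercept per participant
    vc_formula={"au": "0 + C(au)"},   # AU levels as a variance component
    data=df,
)
print(model.fit().summary())
```

The design choice here is that rare-but-important AUs are shrunk toward the overall mean rather than dropped, which partly addresses problem b) without excluding low-frequency AUs outright.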
I'm analyzing a dataset for academic research, but can't seem to find a proper technique / set of techniques. First off, I will describe my data:
I would like to find out whether some factors correlate with skill level, and which factors are the most significant. This is done to explore the data and find trends related to skill level. However, since the data are nominal and the group sizes vary, I believe non-parametric techniques are the only ones that can be applied. So the criteria for an analysis technique would be:
If you can suggest a technique or a set of techniques that would help me tackle the problem, it would be greatly appreciated. I believe something along the lines of Kruskal-Wallis / discriminant analysis / logistic regression might be good, but I'm not sure how these could be applied to multiple dependent groups with multiple independent variables. Also, Mann-Whitney…
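For the one-factor-at-a-time piece of this, a Kruskal-Wallis test in SciPy looks roughly like the sketch below (it assumes the skill measure is at least ordinal; the column names and values are made up).

```python
# Kruskal-Wallis: does skill level differ across the categories of one factor?
# Hypothetical columns "factor" and "skill"; repeat per factor of interest.
import pandas as pd
from scipy.stats import kruskal

df = pd.DataFrame({
    "factor": ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
    "skill":  [1, 2, 2, 3, 3, 4, 2, 5, 4],
})

groups = [g["skill"].values for _, g in df.groupby("factor")]
stat, p = kruskal(*groups)
print(stat, p)
```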
Hey all,
I've used the Metafor package several times before to run a single-paper meta-analysis, where I basically find the effect size across the studies in my paper.
However, I've always done this when the predictor variable was categorical (e.g. https://rstudio-pubs-static.s3.amazonaws.com/10913_5858762ec84b458d89b0f4a4e6dd5e81.html )
I'm trying to do the same thing but with a continuous predictor variable, and I'm having some trouble figuring out how to do this.
The set-up is very straightforward: I have a predictor variable X that is continuous (values between 0 and 200) and an outcome variable Y that is also continuous (values between 0 and 10) for N studies.
I've tried searching for this, but have surprisingly been unable to find a tutorial or instructions (maybe I'm looking for the wrong search terms, or it could be because I'm not well-versed in meta-analysis).
Are there any example codes that I could use or some webpage that has an example tutorial for something like this?
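In metafor itself, a continuous predictor can be entered through the `mods` argument of `rma()`, which turns the model into a meta-regression. As a conceptual sketch only, and in Python to match the code used elsewhere on this page, the fixed-effect version amounts to an inverse-variance-weighted regression of the study effect sizes on X; all names and numbers below are invented.

```python
# Conceptual fixed-effect meta-regression: weight each study's effect size (yi)
# by 1/vi and regress on the continuous moderator X. Hypothetical data.
import pandas as pd
import statsmodels.formula.api as smf

dat = pd.DataFrame({
    "yi": [0.20, 0.50, 0.35, 0.60],   # per-study effect sizes
    "vi": [0.04, 0.09, 0.05, 0.10],   # per-study sampling variances
    "x":  [10, 80, 45, 150],          # continuous predictor (0-200)
})

fit = smf.wls("yi ~ x", data=dat, weights=1.0 / dat["vi"]).fit()
print(fit.summary())  # slope = change in effect size per unit of X
```

A random-effects meta-regression additionally estimates between-study heterogeneity, which is exactly what `rma(yi, vi, mods = ~ x)` handles in metafor.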
Thank you very much for your time!
I have daily income vs time (days). The income values are very large; would taking the log transform of the income values somehow "compromise" any further time series analysis?
I don't think so, but just wanted to make sure.
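A monotonic transform like the log just rescales the series, and it can always be inverted with exp when values are needed back on the original scale. A tiny sketch with stand-in numbers:

```python
import numpy as np
import pandas as pd

income = pd.Series([12_500_000, 13_100_000, 11_900_000])  # stand-in daily income
log_income = np.log(income)    # model/inspect the series on this scale
original = np.exp(log_income)  # invert whenever raw-scale values are needed
```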
Thank you!
Hello everyone,
I am currently working on a project on relationship beliefs and conflict resolution styles. We are doing a daily diary study, assessing relationship beliefs at baseline and follow-up, and conflict styles in the daily diaries (5 days, which will be averaged). We want to know whether baseline beliefs could predict these conflict styles in the daily diaries, but we also want to know whether the daily-diary conflict styles could predict beliefs at follow-up. I've been looking at a path analysis but I'm not sure if that's the right way to go.
I hope I made sense - still trying to improve my stats knowledge and skill. Any advice or resources would be greatly appreciated!
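Short of a full path/SEM model, one simple starting point is a pair of regressions that mirror the two directions described above; the sketch below assumes one row per participant and made-up column names.

```python
# Two hypothetical cross-lagged-style regressions:
#   (1) baseline beliefs -> averaged daily conflict style
#   (2) averaged daily conflict style -> follow-up beliefs, controlling baseline
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "beliefs_t1":         [3.2, 2.8, 4.1, 3.7, 2.5],
    "conflict_daily_avg": [2.1, 2.9, 1.8, 2.4, 3.0],
    "beliefs_t2":         [3.0, 2.6, 4.3, 3.5, 2.7],
})

m1 = smf.ols("conflict_daily_avg ~ beliefs_t1", data=df).fit()
m2 = smf.ols("beliefs_t2 ~ conflict_daily_avg + beliefs_t1", data=df).fit()
print(m1.params, m2.params, sep="\n")
```

A proper path analysis estimates both equations simultaneously and gives fit indices, but the regression version is a quick way to see whether the cross-lagged effects are plausibly there.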
I would like to use PROC LCA in SAS to do a latent class analysis using complex survey data. Weight and cluster variables can be included in the procedure, but including a strata variable is not an option. For a sample that was designed with strata, and weights that were calculated using strata, how would not using a strata variable impact the LCA results? Is there a way to include the strata variable in the analysis, similar to the inclusion of strata in the proc surveyfreq procedure? Thank you for your help.
Hi there,
I am analysing some data where I have two groups, an experimental group and a control group. I have tried to make sure that the groups do not differ in terms of baseline measures, but for two variables they do differ, e.g. years of education.
Should I control for those variables in my analyses? I initially did (ANCOVA, regression, and partial correlation), but I have since been advised against it; I didn't get a very good explanation for why not, other than that it's not recommended. I also know there are different opinions about it.
So, why should I control for those variables in the analyses, or why should I not? :)
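For reference, the "control for it" version is just the group comparison with the covariate added to the model; whether that adjustment is appropriate when the groups were not randomized on that covariate is the contested part. A minimal sketch with hypothetical column names and values:

```python
# ANCOVA-style model: group effect on the outcome, adjusting for years of
# education, compared with the unadjusted group effect.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "outcome":   [10.2, 11.5, 9.8, 14.1, 13.2, 15.0],
    "group":     ["control", "control", "control", "exp", "exp", "exp"],
    "education": [12, 14, 11, 16, 15, 17],
})

adjusted   = smf.ols("outcome ~ C(group) + education", data=df).fit()
unadjusted = smf.ols("outcome ~ C(group)", data=df).fit()
print(adjusted.params["C(group)[T.exp]"], unadjusted.params["C(group)[T.exp]"])
```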
Thanks!
Howdy y'all
I'm trying to find the best statistical measure for the correlation between two binary variables.
I've tried calculating a chi-squared contingency table (both values present n11, both absent n00), but I'm not sure this is the right analysis, since the large number of n00 cases outweighs n11 and makes it look like there is a positive correlation when, for most observations, both variables are simply absent.
What statistical analysis can I run on my data to get a good measure for the significance of the correlation between two binary variables?
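Two measures worth comparing here: the phi coefficient (for two 0/1 variables it equals the Pearson correlation and is directly tied to the 2x2 chi-squared), and the Jaccard index, which deliberately ignores the 0/0 cell that is swamping the table. A small sketch with toy vectors:

```python
# Phi coefficient and Jaccard index for two binary vectors (toy data).
import numpy as np
from scipy.stats import pearsonr

x = np.array([0, 0, 0, 0, 1, 1, 0, 1, 0, 0])
y = np.array([0, 0, 0, 0, 1, 0, 0, 1, 0, 0])

phi, p_value = pearsonr(x, y)      # phi == Pearson r for 0/1 variables
n11 = np.sum((x == 1) & (y == 1))
n10 = np.sum((x == 1) & (y == 0))
n01 = np.sum((x == 0) & (y == 1))
jaccard = n11 / (n11 + n10 + n01)  # ignores the dominant 0/0 cell
print(phi, p_value, jaccard)
```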
I am looking at the impact that COVID-19 lockdowns have had on public transportation usage in a number of cities around the world. I was told that difference-in-differences would be the way to go about this, but I'm confused as to how to actually implement it. Any guidance would be greatly appreciated.
My data is:
date | (city1) | (city1_lockdown) | (city2) | (city2_lockdown) |
---|---|---|---|---|
Mar 1 | 50 | 0 | 40 | 0 |
Mar 2 | 51 | 0 | 45 | 1 |
... | ... | .... | ... | ... |
Mar 30 | 55 | 0 | 47 | 1 |
where (cityx) is something like daily ridership or % capacity or revenue, etc., and cityx_lockdown is a dummy indicating whether that city was under lockdown on that date.
I tried the following with Python (I don't have access to Stata):

    import statsmodels.formula.api as smf

    model = smf.ols(formula='city2 ~ city2_lockdown + city1 + city2_lockdown*city1', data=combo).fit()
    print(model.summary())
but this gives me a coeff of 0 for city2_lockdown (and city1*city2_lockdown by extension)
here I guess I am writing (variable) = f(intercept + control + control*variable_dummy).
So...questions:
is this the correct way to format my model to fit? It's not exactly clear from the DiD examples I've found online
how do I extend this to include multiple control and treatment cities? I have a few cities that underwent lockdown and a few that didn't. (I understand the assumption I'm making with DiD: that the treated cities would have followed the same trends as the controls if there had been no interventions [lockdowns].)
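For what it's worth, the usual setup is long format (one row per city per day) with a single time-varying lockdown dummy plus city and date fixed effects; that generalizes directly to many treated and control cities. A rough sketch with invented numbers (not the model above):

```python
# Generalized difference-in-differences sketch: two-way fixed effects with a
# single time-varying lockdown dummy. Toy long-format data.
import pandas as pd
import statsmodels.formula.api as smf

long = pd.DataFrame({
    "date":      ["Mar 1", "Mar 2", "Mar 3", "Mar 4"] * 2,
    "city":      ["city1"] * 4 + ["city2"] * 4,
    "ridership": [50, 51, 52, 55, 40, 36, 35, 33],
    "lockdown":  [0, 0, 0, 0, 0, 1, 1, 1],
})

# In practice, cluster the standard errors by city; omitted here for brevity.
model = smf.ols("ridership ~ C(city) + C(date) + lockdown", data=long).fit()
print(model.params["lockdown"])  # DiD estimate of the lockdown effect
```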
Thanks!
I have a hypothesis that one particular variable (x1) plays a significant role in determining 'y'. How can I use linear regression analysis to identify the best explanatory variable?
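Two common routes are comparing standardized coefficients or testing whether adding x1 to a model containing the other predictors significantly improves fit. A small sketch of the nested-model comparison, with entirely made-up data and names:

```python
# Does x1 add explanatory power beyond the other predictors? Toy data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = 2.0 * df["x1"] + 0.3 * df["x2"] + rng.normal(size=100)

full       = smf.ols("y ~ x1 + x2", data=df).fit()
restricted = smf.ols("y ~ x2", data=df).fit()
f_stat, p_value, df_diff = full.compare_f_test(restricted)
print(f_stat, p_value)  # small p-value: x1 explains variance that x2 does not
```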
I have a data set comprising voter turnout across countries, along with certain other variables. Since election years do not necessarily coincide across countries, how do I change this year variable into a time index?
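If the goal is just to put each country's elections on a common ordinal scale (1st election in the panel, 2nd, and so on), a within-country rank does it; the column names below are assumptions.

```python
# Turn country-specific election years into a within-country election index.
import pandas as pd

df = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B"],
    "year":    [1998, 2002, 2006, 1999, 2004],
    "turnout": [0.71, 0.68, 0.66, 0.80, 0.77],
})

df["election_idx"] = df.groupby("country")["year"].rank(method="dense").astype(int)
print(df)
```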
I have data on 1,000 different PC games and I need to use the number of owners as a variable in a regression. The problem is that owners is an estimated range written as, for example, "10,000,000 .. 20,000,000" in each cell of that column. Is there a way to parse or edit this so that it's an actual range that Excel recognizes and not a string (if that makes sense)? I would hate to have to go one by one and change the values directly...
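If the column can be pulled into Python/pandas at any point, splitting each cell on the ".." separator and stripping the commas gives numeric bounds (and a midpoint, if a single value is needed for the regression); a sketch with two example cells:

```python
# Parse "10,000,000 .. 20,000,000"-style strings into numeric bounds + midpoint.
import pandas as pd

owners = pd.Series(["10,000,000 .. 20,000,000", "500,000 .. 1,000,000"])

def parse_range(cell: str) -> pd.Series:
    low, high = cell.replace(",", "").split("..")
    return pd.Series({"low": float(low), "high": float(high)})

bounds = owners.apply(parse_range)
bounds["mid"] = (bounds["low"] + bounds["high"]) / 2
print(bounds)
```

Excel's own Text to Columns (delimited by "..") plus a find-and-replace on the commas achieves the same thing without code.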
Hi y'all!
I have a question: if I have a binary variable (e.g., outgoing vs. introvert) and one ordinal variable (e.g., comfort level of doing something, on a Likert scale of 1-5), is there a way to test the correlation between them? I don't think either Pearson or Spearman works here.
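One rank-based option is to run a Mann-Whitney U on the Likert ratings split by the binary variable and convert U into a rank-biserial correlation. A sketch with toy ratings (group labels and values are made up):

```python
# Rank-biserial correlation between a binary grouping and Likert ratings.
import numpy as np
from scipy.stats import mannwhitneyu

outgoing  = np.array([4, 5, 3, 4, 5])  # comfort ratings, group 1 (toy data)
introvert = np.array([2, 3, 2, 4, 1])  # comfort ratings, group 2 (toy data)

u, p = mannwhitneyu(outgoing, introvert, alternative="two-sided")
rank_biserial = 1 - 2 * u / (len(outgoing) * len(introvert))
print(u, p, rank_biserial)
```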
Any help is appreciated. Thanks!
Hello everybody!
I need some help with a statistics assignment regarding variables and linear correlation.
My first question is which index would you choose if you wanted to create a policy that supports the financially weaker social classes: the standard deviation of income or the 15th percentile of income? (I think it is the second one but I am not sure)
In the link (http://www.the-crises.com/wp-content/uploads/2010/12/gini-index-usa.jpg) there is a chart that shows the evolution of the Gini index in the US. Which correlation coefficient do you think better describes the correlation between the values of the Gini index and the years after World War Two: a positive one, a negative one, one close to zero, or two coefficients? The reason I am having trouble with this is that it doesn't specify the number of years after the war, so I am assuming it is until 2009, and I chose the last option.
Do you think my answers are correct or would you choose something different?
Thanks for the help!
I have 3 categorical variables, 2 of which are binary and one of which has 4 categories. What kind of test can I do to tell whether the distribution of one binary variable differs (or not) across the levels of the 4-level variable, separately by the other binary variable? Is a log-linear model the answer?
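A log-linear model does fit this framing: build the 2 x 2 x 4 table of counts and compare Poisson models with and without the three-way interaction; if the three-way term is needed, the association between the two binary variables differs across the 4-level variable. A sketch with randomly generated stand-in data:

```python
# Log-linear sketch: does the A-B association change across levels of C?
import numpy as np
import pandas as pd
from scipy.stats import chi2
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
raw = pd.DataFrame({
    "A": rng.integers(0, 2, 400),
    "B": rng.integers(0, 2, 400),
    "C": rng.integers(0, 4, 400),
})
counts = raw.value_counts().rename("n").reset_index()  # 2 x 2 x 4 cell counts

full    = smf.glm("n ~ C(A) * C(B) * C(C)", data=counts,
                  family=sm.families.Poisson()).fit()
reduced = smf.glm("n ~ (C(A) + C(B) + C(C))**2", data=counts,
                  family=sm.families.Poisson()).fit()

lr = 2 * (full.llf - reduced.llf)              # likelihood-ratio statistic
df_diff = reduced.df_resid - full.df_resid     # df of the three-way term
print(lr, df_diff, chi2.sf(lr, df_diff))       # p-value for the 3-way interaction
```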