A list of puns related to "Time Series Analysis"
https://preview.redd.it/j4nqkwevmx741.png?width=411&format=png&auto=webp&s=5031697c83800d27a0722f35b45fd0fe4c03e7d0
Version 1.3.0 was just released, now with multi-GPU support, and it is available to install with:
conda install -c conda-forge stumpy
or
python -m pip install stumpy
This time series analysis package has over 13K downloads/installs on GitHub and provides a blazing-fast implementation of something called the matrix profile, which can be used to find patterns, anomalies, time series chains, semantic segmentation, and much more!
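To give a flavour of the API, a minimal sketch with stumpy.stump looks roughly like this (the series and window length below are stand-ins, not real data, and the window length is something you tune for your own problem):

import numpy as np
import stumpy

# Toy stand-in for a real time series -- replace with your own data.
ts = np.random.rand(10_000)
m = 50  # subsequence (window) length; an assumption to tune for your data

# Matrix profile: for every length-m subsequence, the z-normalized distance
# to its nearest neighbour elsewhere in the series.
mp = stumpy.stump(ts, m)

motif_idx = np.argmin(mp[:, 0])    # most-repeated pattern (motif) candidate
discord_idx = np.argmax(mp[:, 0])  # most anomalous subsequence (discord)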
Check it out and let us know what you think!
Hey everyone!
I'm premiering a new video series today where I look back at gameplay episodes from the channel and take an in-depth look at individual plays, politics, and becoming a better pilot!
Comments, shares, and likes are all greatly appreciated. If you've got any requests for specific episodes to look back at, or ways to make the series better, I would love to hear them!
https://www.youtube.com/watch?v=GDCbpLfwYE4&lc=UgzwhSMjwcoDU2Hv7ax4AaABAg
I have this ongoing debate with my colleagues. I work in a market research firm where we collect surveys, and on many occasions we repeat the same survey at various intervals in time to track any changes in mindsets or opinions on the topics covered in the survey. For some projects, we have been tracking on a yearly basis, or thereabouts, for about two decades. Each sample is cross-sectional rather than longitudinal. That is, each sample taken over time does not consist of the same respondents.
When it comes to determining, or modeling, changes over time, one colleague argues that it should be viewed as a time series analysis, where we would create a summarized data frame with each row being the sample's aggregated result for the year, and model it from there with something like an ARIMA.
Another colleague views the "year" variable of the raw dataset essentially as just another variable, like region, gender, etc., and would simply use a more typical type of analysis where year is another predictor in the model. The argument there is that there are no seasonal or temporal effects in the data that justify the use of time series analysis. While some measures are stable over time, we do see a shift over the years for some others. The shift is often linear and stable in nature (take attitudes toward same-sex marriage over time, as an example). In some cases we see a drop and then a recovery, but this is likely due to sampling noise more than anything else.
I myself have very little knowledge about time-series analysis, and was hoping for others' advice. Which method makes more sense? Or does it even matter at all?
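If it helps to make the two positions concrete, here is a rough sketch of what each would look like in Python with statsmodels (column names such as outcome, year, region, and gender are hypothetical placeholders, not our actual survey variables):

import statsmodels.formula.api as smf
from statsmodels.tsa.arima.model import ARIMA

# df: hypothetical respondent-level survey data in a pandas DataFrame,
# one row per respondent, pooled across all survey waves.

# Colleague 1: aggregate to one row per year, then model the yearly series.
yearly = df.groupby("year")["outcome"].mean()
arima_fit = ARIMA(yearly, order=(1, 0, 0)).fit()

# Colleague 2: keep respondent-level data and treat year as another predictor.
ols_fit = smf.ols("outcome ~ year + C(region) + C(gender)", data=df).fit()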
I am fairly new to time series and have played a small role on a team where we used ARIMA/ARIMAX methods for forecasting. We have been assigned to do some research on modern machine learning methods for time series (TS).
From what I've gathered, deep learning seems to be good. The issue with deep learning is that we lose interpretability, which makes our superiors very nervous. So we would prefer to maintain as much interpretability as possible.
Any recommendations would be great!
I'm an econ major and I'm now learning statistical time series analysis methods like ARIMA, VAR, and GARCH.
Machine learning seems to be very popular in this sub, but I don't see many traditional statistics topics here.
They are really challenging and really advanced topics in statistics (at least for me), but the more I learn, the more I doubt whether they are actually used in financial trading, because they are very restrictive, come with a bunch of assumptions, and were developed a long time ago - decades ago - compared to bleeding-edge ML techniques like DNNs.
Time series analysis seems to work for GDP or inflation rates, but does it also work for the S&P 500 or derivatives?
From what I've learned, time series analysis is all about forecasting, but a lot of forecasts are already provided by big finance firms and you can see them anytime on the economic calendars on investings.com, so I don't know if I can really make use of it.
Why is it not as popular as ML in this sub? Is it outdated?
Tell me what you think. :)
What are some MOOCs or online problem sets with solutions for graduate-level time series analysis? I am going through the Brockwell-Davis book and I am not sure I am picking up the material at the right depth, because I am not sure whether I am solving the problems correctly. Any recommendations? By way of background, I have a PhD in mathematics.
I am doing a retrospective observational analysis of how a new hospital intervention has impacted the number of patients receiving a certain medication. The data are presented as the number of patients per month on a timeline. I would like to be able to show overall trends, as well as whether the trend changes in the specific month when the intervention started. I have been using time series analysis (the Mann-Kendall and Pettitt tests) and a basic t-test to compare data before and after the intervention date, but I'm wondering if you all had any ideas for a better type of analysis (such as an ANOVA comparing events to time).
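To make the setup concrete, a segmented (interrupted time series) regression of the kind often used for this question estimates both a level change and a slope change at the intervention month. A minimal sketch in Python, assuming a monthly data frame with hypothetical columns patients and month and a known intervention_month:

import statsmodels.api as sm
import statsmodels.formula.api as smf

# df: one row per month with a 'patients' count and a running 'month' index.
df["post"] = (df["month"] >= intervention_month).astype(int)         # level change
df["time_since"] = (df["month"] - intervention_month).clip(lower=0)  # slope change

# Poisson regression is a common choice for monthly counts; OLS also works
# when counts are large and roughly normal.
fit = smf.glm("patients ~ month + post + time_since",
              data=df,
              family=sm.families.Poisson()).fit()
print(fit.summary())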
What error are we correcting for by using this model?
Also, this model is sometimes called an equilibrium correction model. What equilibrium does this refer to?
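For reference, a textbook single-equation error correction model for two series Y_t and X_t is usually written something like this (notation varies by source):

ΔY_t = β_0 + β_1 ΔX_t + γ (Y_{t-1} − θ X_{t-1}) + ε_t

Here Y_{t-1} − θ X_{t-1} is last period's deviation from the long-run relationship Y_t = θ X_t: that deviation is the "error" (or disequilibrium) being corrected, and γ (expected to be negative) governs how quickly Y_t moves back toward that equilibrium.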
I'm facing a little crisis right now. I studied economics and am now going for an MSc in statistics at a local university, and I'm planning to take all the time series analysis courses.
Even though I love them, I don't know if they're useful at all, since many forecasts are required over really long horizons (linear models won't do well unless, for example, we use a lag length equal to the forecasting horizon).
I've worked a little; I'm in my second job, my first time in a bank (analytics department), and I don't know if these methods are useful for the business.
Do you know what TS applications are or could be useful in industry, in banking, or wherever TS data is available or could be constructed?
I often find videos that simply explain the concepts to you. They just tell you about stationarity and non-stationarity, BIC, AIC, etc., but they don't really show you how to do it. To me, it feels like going into the kitchen and having someone say "here are the ingredients and here are the pots and pans" without actually giving me the recipe or even showing me how to make the food.
I'd like to see an example where they pull raw data from some source, run an analysis on it, and do all the things like testing for stationarity, checking for autocorrelation, choosing between an AR(1) and an ARMA model, and explaining the actual results and what they mean.
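For what it's worth, the bare-bones version of that recipe in Python might look roughly like the sketch below (the file and column names are placeholders; the same steps exist in R):

import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import adfuller

# 1. Pull raw data (placeholder file/column names).
series = pd.read_csv("some_series.csv", index_col=0, parse_dates=True)["value"]

# 2. Test for stationarity (ADF null hypothesis: a unit root is present).
adf_stat, p_value, *_ = adfuller(series)

# 3. Check the autocorrelation structure to guide the model order.
sm.graphics.tsa.plot_acf(series, lags=24)
sm.graphics.tsa.plot_pacf(series, lags=24)

# 4. Fit candidate models and compare them with AIC/BIC.
ar1 = ARIMA(series, order=(1, 0, 0)).fit()
arma = ARIMA(series, order=(2, 0, 1)).fit()
print(ar1.aic, arma.aic)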
In this example, an LSTM neural network is used to forecast the energy consumption of the Dublin City Council Civic Offices in Ireland, using data from April 2011 to February 2013.
An LSTM model was generated and run on the data, and the mean percentage error was 6.1%.
The methodology and findings can be found here. Would be grateful for any feedback.
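The write-up has the details; for anyone curious what the core of such a model looks like, a stripped-down Keras version might be something like this (not necessarily the exact architecture or preprocessing used in the linked analysis, and the lookback window is an assumption):

from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.models import Sequential

lookback = 24  # hours of history fed to the network; an assumption

# X_train/X_test: shape (samples, lookback, 1); y_*: the next hour's value.
# Scaling and windowing of the raw consumption series are omitted here.
model = Sequential([
    LSTM(50, input_shape=(lookback, 1)),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X_train, y_train, epochs=20, batch_size=32, validation_split=0.1)
predictions = model.predict(X_test)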
I have a (physical) machine that can be tuned by adjusting the values of some parameters A_1, β¦, A_n (n is around 10). This tuning affects some secondary parameters B_1, β¦, B_m that cannot be tuned by hand. The machine continuously produces an output X, and by looking at X in a time window, it is possible to decide if the machine was running stable or unstable.
All this information has been logged for the past ~10 years, which amounts to roughly 25M data points.
The tuning of the machine is really complicated, as it can react very sensitively to parameter adjustments and their influence is not well understood, so specialist interventions are needed to keep the machine running at a reasonable performance. The goal is to train an ML model that can support these interventions and generate some insights into how the parameters are related to stability. For example, we would be interested in something like "if you raise A_1, you need to lower A_2 in order for the machine to remain stable" or "raising A_1 will increase B_1 in a few hours".
Up until now we ignored the time component and only ran some clustering to find out which settings were used when the machine was running stable and which were used when it was running unstable. Sadly, the settings used varied greatly (it could have been stable with A_1=100 and with A_1=300), and usually a single setting could lead to a stable as well as an unstable machine, so the time information is crucial.
I am looking for ideas on how to approach this task. I was thinking about sub-dimensional motif discovery to find typical patterns, but I'm unsure how to link these patterns together.
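As one possible entry point for the motif-discovery idea, the stumpy package offers a multi-dimensional matrix profile. A minimal sketch, where the array shape and window length are assumptions about the logging setup rather than your real data:

import numpy as np
import stumpy

# T: one row per logged signal (the A's, the B's, and a stability indicator),
# one column per time step. Toy random data stands in for the real logs.
T = np.random.rand(12, 100_000)
m = 500  # motif length in samples; tune to your sampling rate

# Multi-dimensional matrix profile: low values mark recurring joint patterns.
P, I = stumpy.mstump(T, m)
motif_starts = np.argmin(P, axis=1)  # candidate motif location per row of P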
We have a time series data like:
[1 2 3 4 5 6 7 8 9 10]
The task is to forecast the 11th number based on the first 10 elements. First we divide the sequence into multiple input/output samples, where several time steps (in our case, 3) are used as input and one time step is used as output for one-step prediction:
X = [1 2 3], y = [4]
X = [2 3 4], y = [5]
How do we determine the number of time steps used in the input? I used 3 in this case. In econometrics, we use AIC or BIC to determine the "lag", which is 3 here. Do we use AIC/BIC here as well?
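For reference, the windowing step described above is often written as a small helper like this (a sketch; the function name is just a convention, not a library call):

import numpy as np

def split_sequence(sequence, n_steps):
    """Split a univariate sequence into (input window, next value) samples."""
    X, y = [], []
    for i in range(len(sequence) - n_steps):
        X.append(sequence[i:i + n_steps])
        y.append(sequence[i + n_steps])
    return np.array(X), np.array(y)

X, y = split_sequence([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], n_steps=3)
# X[0] = [1 2 3], y[0] = 4;  X[1] = [2 3 4], y[1] = 5;  and so on.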
Hello again. Sorry if the question is not phrased correctly, but I don't know what the option I need is called.
input float(test id) int year float(v1 v2 v3)
0 1004 2012 743.4 866 2195.7
0 1004 2013 1108.5 919.5 2136.9
1 1004 2014 1105.4 1000.7 2194
0 1004 2015 925.6 845.1 1454.1
0 1004 2016 1121.4 865.8 1456
0 1050 2012 146.4 62 94.1
0 1050 2013 413.2 170.4 349.2
0 1050 2014 402 181.2 412.1
0 1050 2015 261.5 245 598.8
0 1050 2016 476.6 190.1 498.6
0 1076 2012 2138.1 1136.1 1812.9
0 1076 2013 2139.4 1140 1827.2
0 1076 2014 2215.5 1223.5 2456.8
0 1076 2015 1625.4 1366.6 2698.5
0 1076 2016 2284.6 1481.6 2615.7
1 1078 2012 103533.7 26813 67235
1 1078 2013 59265.3 25267 42953
1 1078 2014 67790.7 21639 41207
1 1078 2015 66993.1 21326 41247
1 1078 2016 56551.4 20717 52666
Consider a dataset like this. What I would like is to have 4 different samples to run tests with.
First sample = test is 0 for year N and 0 for year N+1.
In this example that would be id = 1004 in 2012 till 2013 and in 2015 till 2016, as well as id = 1050 in 2012 till 2013, 2013 till 2014, 2014 till 2015, and 2015 till 2016, as well as id = 1076 in 2012 till 2013, 2013 till 2014, 2014 till 2015, and 2015 till 2016.
Second sample = test is 0 for year N and 1 for year N+1.
In this example that would be id = 1004 in 2013 till 2014
Third sample = test is 1 for year N and 0 for year N+1.
In this example that would be id = 1004 in 2014 till 2015
Fourth sample = test is 1 for year N and 1 for year N+1.
In this example that would be id = 1078 in 2012 till 2013, 2013 till 2014, 2014 till 2015, and 2015 till 2016.
What I want to look at is the difference in v1, v2, and v3 when the variable test falls into one of these categories, if that makes sense.
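If it helps to make the grouping concrete, here is one way the four samples could be built, sketched in pandas (the listing above looks like Stata input syntax, so treat this purely as an illustration of the logic rather than your actual workflow):

import pandas as pd

# df: the table above as a DataFrame with columns 'test', 'id', 'year', 'v1', 'v2', 'v3'.
df = df.sort_values(["id", "year"])

# Pair each row (year N) with the same id's value of 'test' in year N+1.
df["test_next"] = df.groupby("id")["test"].shift(-1)

# The four samples are then simple filters on the (test, test_next) pair.
sample_00 = df[(df["test"] == 0) & (df["test_next"] == 0)]
sample_01 = df[(df["test"] == 0) & (df["test_next"] == 1)]
sample_10 = df[(df["test"] == 1) & (df["test_next"] == 0)]
sample_11 = df[(df["test"] == 1) & (df["test_next"] == 1)]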
Thank you in advance.
Say we have two I(1) variables, Y_t and X_t. In order to identify a non-spurious relationship between them, we have to induce stationarity. One way to do this is differencing, of course. We could regress ΔY_t on ΔX_t.
However, this model is considered a "short-term" model, and if we want to identify a long-term relationship, we have to use other models such as an error correction model.
My question is: why exactly is the simple differenced regression a short-term model? How are we defining long term and short term here? Any response will be greatly appreciated.
Edit: The highlighted text is what I am struggling with: https://i.imgur.com/tv46cRu.png
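One way to see the distinction in symbols (a sketch of the usual textbook argument, not the highlighted passage itself) is to compare

Differences only:  ΔY_t = β_0 + β_1 ΔX_t + u_t
ECM:               ΔY_t = β_0 + β_1 ΔX_t + γ (Y_{t-1} − θ X_{t-1}) + u_t

The first regression relates only period-to-period changes, so nothing in it pins down where the levels of Y_t and X_t end up relative to each other; that is the sense in which it is "short run". The ECM keeps the lagged levels through the term Y_{t-1} − θ X_{t-1}, which ties the short-run dynamics back to the long-run (cointegrating) relationship Y = θ X.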
Just wanted to know what I should expect from Prof. Neslehova and Prof. Steele as profs for GLM and Time Series, respectively, i.e. the workload and the difficulty of the exams.
Thanks and happy holidays!
Hi, I'm running an interrupted time series analysis to look at the level change and slope change in England.
However, I now need to fit the model in a multi-level framework to look at the time series at the regional level. Can I simply add region as a random effect and keep the model as it is, or do I need to change my dummy and time variables and/or add additional interaction terms?
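In case a concrete formulation helps, one way a region-level random effect on an interrupted time series model might be sketched with Python's statsmodels (the variable names are placeholders, and the same structure exists in R's lme4/nlme):

import statsmodels.formula.api as smf

# df: one row per region-month, with hypothetical columns 'outcome', 'time',
# 'post' (0/1 after the interruption), 'time_since' (0 before it), 'region'.
fit = smf.mixedlm(
    "outcome ~ time + post + time_since",
    data=df,
    groups=df["region"],        # random intercept for each region
    re_formula="~time_since",   # optionally, a region-specific slope change
).fit()
print(fit.summary())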
I am very new to time series analysis and was wondering if there are any papers you would recommend reading on this topic. I'm very much interested in the feature engineering aspect as well, specifically how to deal with the dates (do we split dates into individual columns: year, month, day, etc.?). Any recommendations would be great!
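On the date question specifically, a common first pass is to split the timestamp into calendar components, for example with pandas (a sketch with a hypothetical 'date' column; whether these features actually help depends on the model):

import pandas as pd

df["date"] = pd.to_datetime(df["date"])
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
df["day"] = df["date"].dt.day
df["dayofweek"] = df["date"].dt.dayofweek  # 0 = Monday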
I'm currently enrolled in Optimization (ISyE 6669) and Financial Modeling (MGT 8813). While the Financial Modeling course might be more relevant to my day to day job, I feel like spending the time and money to take a course where maybe 10% of the info is new to me might not be worth my while. I am wondering if I'd be better off taking a course like Time Series Analysis instead (especially since I took Regression last semester and R is fresh in my mind).
Has anyone taken Optimization and/or Time Series that would advise against taking them together due to the workload (considering a full time job in parallel)?
Thanks!
Hello guys,
Recently I've been working with macroeconomic data, with many variables being indexed (like 100 for base month/year).
Does one need to keep anything in mind when working with such variables while performing econometric analysis? Especially if other variables are not similarly indexed?**
I'm mostly interested in time series analysis but if anyone has tips regarding the use of such variables in panel data, that would be helpful. Thanks!
** Actually I have a bunch of questions regarding such 'indexed' data, so if you have any source I can refer to, that'd be awesome.
Hey everyone, long time lurker here, thanks for all the tips I've gleaned from other posts. For my master's thesis I'm looking into extreme value prediction for electricity market prices. I have a strong background in the non-NN ML models (both in the stats theory side and Python implementation) but from the literature I've been reading I've come to the conclusion something like an LSTM model would be the best approach for this task, and would give me an opportunity to learn more about NNs.
I'm interested in any suggested resources for learning the theory behind NNs and how to implement them. I've been recommended PyTorch due to its more Pythonic API but am open to what you guys think would be most relevant.
Thanks for all responses!
Looking for a good python library for time series analysis, particularly GARCH fitting.
I've tried arch, but when I try import it, I get:
AttributeError: type object 'arch.univariate.recursions.array' has no attribute '__reduce_cython__'
Any suggestions?
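That __reduce_cython__ error usually points at a mismatch between the compiled extensions and the installed Cython/NumPy, so reinstalling or upgrading arch in a clean environment is worth a try. Once it imports, a minimal GARCH(1,1) fit with arch looks roughly like this (the returns series is a placeholder):

from arch import arch_model

# returns: a pandas Series of (percentage) returns.
am = arch_model(returns, vol="GARCH", p=1, q=1, dist="normal")
res = am.fit(disp="off")
print(res.summary())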
Both ISYE 6420 Bayesian Statistics and ISYE 6402 Time-Series Analysis are available for OMSCS students this Fall 2019 semester, per the orientation/registration emails sent out by OMSCS advising.
I've referred to OMSCentral reviews for both courses, but I haven't been able to find any resources on the syllabi or lecture material for either one. I'm particularly interested in the lecture material for Bayesian Stats.
Where might I find this content?
Hey all!
I'm currently a masters student in a statistics program. My interest blossomed in the recent AI spring. I started the program after surfing the hype of Neural Networks, Machine Learning, and general artificial intelligence.
As I am planning the remainder of my program schedule, I am faced with a decision. Between these four classes, I have to drop one to make room for a required course (Statistical Consulting/Practice, where we go through the soft skills of practicing statistics). Here are the choices, from which I have to drop one:
My intuition tells me to drop Applied Time Series, as it seems most AI technologies utilize Bayesian statistics and statistical learning techniques. An intro to data science course seems obvious in the value it offers for students ready to hit the job market.
So 3 questions.
I'm working with an hourly time series with 8760 data points.
I'm testing the series' stationarity with the ADF test in R as follows:
adf.test(series, alternative = "explosive", k=730)
(in case you're wondering, the lag up to which stationarity is tested is 730 because that's the number of hours in a month).
The p-value (0.09131) "tells" me I have no reason to reject the null hypothesis (with a confidence level of 5%) that my time series is stationary.
However, when I analyze the series' ACF, I'm presented with a slow and "wavy" decay, as you can see here.
To me, the ADF test is wrong. This test - like pretty much all the other stationarity tests I know - is filled with assumptions, and it didn't capture something important in the seasonality of my time series. Yet it's mind-blowing to me that the ADF test fails to confirm something the ACF shows so explicitly.
Is my conclusion right/adequate, or am I missing something?
Thank you.
Cross-posting to both r/OMSCS and r/OMSA
I'm planning to take Time Series and Bayesian Stats together in Spring. I've read the reviews on omscentral for the Time Series course, and they are horrible. Is the course really that bad? Can someone taking it in Fall 19 comment? Is it doable together with Bayesian by someone working full time?
I am registering for classes and I need help choosing them. I need advice on which statistics courses I should take, given that I want a career in data science. My choices are Bayesian Statistics, Time Series Analysis, Stochastic Processes, Categorical Data Analysis, Survival Analysis, and Advanced Probability.
How would you rank these courses in terms of usefulness to a data science/data analyst/business analyst career?