A list of puns related to "Time series database"
I have a Davis Pro2 weather station and I am having trouble connecting weewx on a Raspberry Pi 4 to my InfluxDB database with this GitHub code: https://github.com/matthewwall/weewx-influx
Do I need to install the setup.py version of weewx rather than the DEB Debian package for Raspbian? https://weewx.com/docs/usersguide.htm
I have 10 days of recordings across 150 sensors saved in 5-minute HDF5 files. Each file is about 1 million rows (at the lowest sample rate; there are higher ones) by 150 columns, one of which is the timestamp. There are about 3k of these files (so 3 billion rows x 150 columns at a minimum).

I need to be able to query by time (often disjoint, e.g., 10am-2pm each day) and apply various transformations (z-score, PCA/ICA, clustering, wavelets, etc.) to the resulting data before returning it. Currently I am getting all files and their write times (bash and Python), using [write_time - 5 minutes, write_time] as the window associated with each file, pulling in every file that overlaps the larger window I'm looking for (often hours), then concatenating the data and applying the transformations.

I would like to just query a database for some time period and get the associated data, or even better, run the transformations somewhere in the database layer before returning. What type of database or data storage system would be best suited for my data?
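If staying file-based is acceptable, one pattern that would get most of the way there is to repack the raw files once into a single indexed HDF5 table and push the time predicate into the read, so a query touches only the requested rows rather than whole files. A rough sketch with pandas, assuming each raw file loads cleanly as a DataFrame; the store path "sensors.h5" and the key "sensors" are placeholder names:

```python
# Sketch only: repack the 5-minute files once into a single indexed HDF5 table.
# "sensors.h5" and the key "sensors" are placeholder names.
import pandas as pd

def build_store(raw_files, store_path="sensors.h5"):
    """One-off conversion: append every raw file into one queryable table."""
    with pd.HDFStore(store_path, complevel=5, complib="blosc") as store:
        for path in raw_files:
            df = pd.read_hdf(path)                       # ~1M rows x 150 cols per file
            df.index = pd.to_datetime(df["timestamp"])
            store.append("sensors", df, format="table")  # table format is queryable

def query_window(start, end, store_path="sensors.h5"):
    """Read only the rows in [start, end); start/end are date strings."""
    with pd.HDFStore(store_path, mode="r") as store:
        return store.select(
            "sensors",
            where=f"index >= pd.Timestamp('{start}') & index < pd.Timestamp('{end}')",
        )
```

A time-series or columnar store partitioned on time (TimescaleDB, InfluxDB, Parquet plus a query engine) would give the same time-range selection and can push simple aggregations server-side, but the heavier transformations (PCA/ICA, wavelets, clustering) would most likely still run in Python either way.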
Has anyone seen any performance data that compares something like InfluxDB with Splunk metrics (https://docs.splunk.com/Documentation/Splunk/8.0.0/Metrics/Overview)?
I haven't seen any formal performance comparison specifically for Splunk metrics.
My company is interested in building a large-scale metrics collection database and is comparing contenders. I'm leaning toward something like Timescale, but we already have Splunk and we don't have a formal DB admin, so I'm interested in what your recommendation would be if you were building this.
Hi All,
Time series databases are becoming more and more common these days, but I couldn't find an easy, deployable solution for building one using DynamoDB, so I created one here: https://coderecipe.ai/architectures/24198611
Just thought I would share, let me know if it's helpful or if you have any suggestions!
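For anyone wondering what the storage layout for this usually looks like, here is a rough boto3 sketch of the common partition-key/sort-key arrangement for time series on DynamoDB; the table name, attribute names and key schema below are my own illustration and not necessarily what the linked recipe uses:

```python
# Sketch only: a common DynamoDB time-series layout, not necessarily the one
# the linked recipe uses. The table "metrics" is assumed to already exist
# with metric (partition key) + ts (sort key).
from datetime import datetime, timezone
from decimal import Decimal

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("metrics")

def put_point(metric, value, ts=None):
    """Write one data point; timestamps are sortable ISO-8601 strings."""
    ts = ts or datetime.now(timezone.utc).isoformat()
    table.put_item(Item={
        "metric": metric,              # partition key
        "ts": ts,                      # sort key
        "value": Decimal(str(value)),  # DynamoDB numbers must be Decimal, not float
    })

def query_range(metric, start_iso, end_iso):
    """Fetch all points for one metric inside a time window."""
    resp = table.query(
        KeyConditionExpression=Key("metric").eq(metric) & Key("ts").between(start_iso, end_iso)
    )
    return resp["Items"]
```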
First time poster. I know only a little about databases, but would appreciate any suggestions that anyone can give to point me in the right direction.
I've been working on an application to calibrate weather data. I want to store the results of my calibration in a database which can be accessed by my program and eventually to be displayed in a web application. All I need to store is one parameter in 5-minute increments for a 0.5x0.5km grid across the entire United States. This is something like 1.6 trillion rows generated per year. 90% of them would be zeros. The rest of the values would be real numbers, 2 bytes of precision would be plenty. It's possible I would add a couple other parameters in the future.
The most recent couple days of data will need to be updated relatively frequently (every hour or so) as new data comes in. After this it will be unlikely to change but could still be queried.
Data will generally be updated and queried for a contiguous area over a certain time domain.
I'm trying to figure out what kind of database I should think about using for this sort of problem. I have a little experience with SQLite and could imagine doing this with a column for x, a column for y, a column for datetime, and a column for the value. But I'm not sure that this is the optimal solution for such a large database, and I'd prefer not to have to start from scratch down the road.
Right now this is my personal project, but if/when it gets to this point hopefully my company will be on board, so an enterprise solution could be appropriate. I'm leaning toward running the application on an AWS instance.
Thanks in advance for any helpful suggestions! :)
Edit: The other thing I realized is significant is I would be willing to sacrifice being able to efficiently look up values in the database (such as, give me all the times the parameter was greater than such and such a value) as long as I can look it up by (x,y,t) location. Looking into array databases right now.
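Given that roughly 90% of the values are zero, one option that fits the (x, y, t) access pattern is to store only the non-zero values and treat a missing row as zero. A minimal sketch of that layout, using sqlite3 purely to illustrate the idea; the file, table, and column names are made up, and the same scheme carries over to PostgreSQL/TimescaleDB or an array database:

```python
# Sketch only: store non-zero values keyed by (x, y, t) and treat missing rows
# as zero. File/table/column names are made up; sqlite3 is used just to
# illustrate the layout.
import sqlite3

conn = sqlite3.connect("calibration.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS calib (
        x INTEGER NOT NULL,   -- grid column index
        y INTEGER NOT NULL,   -- grid row index
        t INTEGER NOT NULL,   -- epoch seconds, aligned to 5-minute steps
        v REAL    NOT NULL,   -- only non-zero values are stored
        PRIMARY KEY (x, y, t)
    ) WITHOUT ROWID
""")

def lookup(x, y, t):
    """Return the value at (x, y, t); a missing row means zero."""
    row = conn.execute(
        "SELECT v FROM calib WHERE x = ? AND y = ? AND t = ?", (x, y, t)
    ).fetchone()
    return row[0] if row else 0.0
```

Chunking or partitioning by time (and possibly by spatial tile) is what would keep the "contiguous area over a time domain" updates and queries cheap as the table grows.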
Suppose that I want to pull information from various sources once every hour, store it in my database, and write error logs if any of them fail. The app itself can read and write to this DB at will in response to user requests.
The idea I have right now is just to run the scraper script as a cron job every hour separate from the flask backend itself.
It seems ok but I feel like there's a more formal way of doing this.
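The cron approach is perfectly workable; the main thing a more formal setup adds is making sure each source fails independently and that failures end up in a log. A minimal sketch of the hourly script, where SOURCES, the log file name, and the body of fetch_and_store() are placeholders for the real endpoints and DB writes:

```python
# Sketch only: meant to be run by cron once an hour. SOURCES, the log file
# name, and the body of fetch_and_store() are placeholders.
import logging
import requests

logging.basicConfig(
    filename="scraper_errors.log",
    level=logging.ERROR,
    format="%(asctime)s %(levelname)s %(message)s",
)

SOURCES = ["https://example.com/feed1", "https://example.com/feed2"]

def fetch_and_store(url):
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    # ... parse resp and write the result to the same DB the Flask app reads ...

if __name__ == "__main__":
    for url in SOURCES:
        try:
            fetch_and_store(url)
        except Exception:
            # one failing source should not stop the others
            logging.exception("failed to pull %s", url)
```

If the scheduling should live inside the app instead of cron, a library such as APScheduler (or Celery beat, if a task queue is already around) can run the same function on an hourly interval.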
Hello, I need some help designing a database for time series. Here is my problem:
So it's about 500k values/year for a 1-minute time series and about 350/year for a 1-day time series. The sensors have been running for 5 years, so we're looking at 5 * 20 = 100 time series of about 2.5 million values each, and so on for the other time series. Of course, the stations and sensors are not synchronized (that's a shame).
Here is the pseudo-design I have in mind (I'm not a programmer; I tried my best):
First, station_table:
Then, sensor_type_table:
Then, timestamp_table:
Finally, measurement_table:
Then I create one station_table and one sensor_type_table and populate them with my stations and sensor types. Then, for each time series (20 stations * 20 sensors * 4 rates = 400), I create 400 timestamp_tables and 400 corresponding measurement_tables.
It feels very odd to me to create so many tables; am I right to be worried?
Then I would like to perform some analysis (FT, MA... around 20 or more indicators). Would it be crazy to create a new column per indicator inside the measurement_table, to be able to retrieve them quickly? Or do I need to create dedicated tables?
Oh, and the data will keep coming: the system is still running. I'll perform the analysis with Python (numpy) and will interact with the DB from Python. I was planning on using PostgreSQL; I've used it a few times for simple things (nothing involving time series).
In case you're wondering, it's a study of the 2D deformation of a solid (temperature is also needed to isolate some effects), and there are 20 experiments running in parallel.
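Rather than 400 timestamp/measurement table pairs, the usual relational layout is a single "long" measurement table keyed by (station, sensor, timestamp), with the sampling rate implied by the sensor type. A rough sketch of that design from Python with psycopg2; the table and column names are illustrative and the connection string is made up. Derived indicators (FT, MA, ...) can either be computed on the fly in numpy or stored in a separate table keyed the same way, rather than as extra columns on the raw measurements:

```python
# Sketch only: one "long" measurement table instead of 400 per-series tables.
# Table/column names are illustrative; the connection string is made up.
import psycopg2

DDL = """
CREATE TABLE IF NOT EXISTS station (
    station_id SERIAL PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS sensor_type (
    sensor_id  SERIAL PRIMARY KEY,
    name       TEXT NOT NULL,
    unit       TEXT
);
CREATE TABLE IF NOT EXISTS measurement (
    station_id INTEGER NOT NULL REFERENCES station,
    sensor_id  INTEGER NOT NULL REFERENCES sensor_type,
    ts         TIMESTAMPTZ NOT NULL,
    value      DOUBLE PRECISION NOT NULL,
    PRIMARY KEY (station_id, sensor_id, ts)
);
"""

with psycopg2.connect("dbname=deformation") as conn:
    with conn.cursor() as cur:
        cur.execute(DDL)
```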
Ever wondered how far your monitoring database could scale on a single node? Me too. So I ran performance tests for InfluxDB, TimescaleDB and VictoriaMetrics on standard Google Cloud machine types, with CPU counts varying from 1 to 64 and RAM sizes varying from 3.75GB to 260GB. Read the resulting article.
After obtaining a list of 50 clusters with K-Shape, is there a method that can compare individual time-series samples from a different database with each of the 50 original clusters and get the same result as if they had been clustered with K-Shape? With reasonable accuracy, of course.
Sorry if it's something that can't be done. I'm trying to learn this from scratch for a school project.
(Additional info: The 2 databases consist of Forex price samples from random points in time.)
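If the original clustering was done with tslearn's KShape (an assumption on my part), the fitted model keeps the 50 shape centroids, so series from the second database can be assigned to them with predict() after the same per-series z-normalisation. A minimal sketch with random placeholder arrays standing in for the real Forex samples:

```python
# Sketch only: X_train_raw / X_new_raw are random placeholders for the real
# Forex samples; in practice both sets must have the same length and be
# preprocessed identically.
import numpy as np
from tslearn.clustering import KShape
from tslearn.preprocessing import TimeSeriesScalerMeanVariance

rng = np.random.default_rng(0)
X_train_raw = rng.normal(size=(500, 128))   # original samples used for clustering
X_new_raw = rng.normal(size=(40, 128))      # samples from the other database

# K-Shape works on z-normalised series; the scaler normalises each series
# independently, so applying it to the new data separately is fine.
scaler = TimeSeriesScalerMeanVariance()
X_train = scaler.fit_transform(X_train_raw)
X_new = scaler.fit_transform(X_new_raw)

ks = KShape(n_clusters=50, random_state=0)
ks.fit(X_train)                             # learns the 50 shape centroids

labels_new = ks.predict(X_new)              # nearest centroid for each new series
print(labels_new)
```

This assigns each new series to its nearest shape centroid; it will not be identical to re-running the clustering on the combined data, but it is the usual way to get consistent cluster labels for new samples.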
So I'm building a network monitor at ISP scale. Let's say 10 metrics * 26 interfaces * 50,000 switches.
I'm aiming for a 10-minute polling interval; I was going for 1 minute with Influx.
= 13M metrics per 10 minutes = 1.3M metrics per minute ≈ 22k metrics per second
If 10 metrics per interface is too much, I can move 5 of them to hourly polling.
Since we don't have SSDs in our servers we can't use InfluxDB, so I'm looking for a viable TSDB that works on spinning-disk storage.
I can order pretty much any number of CPU cores, any HDD size, and any amount of RAM (within reason).
Another question that comes to mind: do I have to cluster for this?
I've got a Node.js app (hate me for it, but it's blazing fast) for SNMP polling, with stats aggregation depending on the DB (statsD for Influx).
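For what it's worth, a quick back-of-envelope check of the numbers above (a sketch only, using the figures from the post):

```python
# Back-of-envelope ingest rate for 10 metrics x 26 interfaces x 50,000 switches
# polled every 10 minutes (numbers taken straight from the post).
metrics_per_interface = 10
interfaces_per_switch = 26
switches = 50_000
poll_interval_s = 10 * 60

points_per_poll = metrics_per_interface * interfaces_per_switch * switches
print(points_per_poll)                    # 13,000,000 points every 10 minutes
print(points_per_poll / poll_interval_s)  # ~21,700 points per second sustained
```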