High-frequency time series database

SUMMARY

The discussion centers on selecting a database for high-frequency time series data, specifically comparing MongoDB, Kyoto Cabinet, and HDF5. The user plans to insert 1200 rows of 8 entries per second, resulting in approximately 5 GB of data daily. HDF5 is identified as a storage scheme rather than a traditional database, while MongoDB is a full-fledged database requiring a server. Kyoto Cabinet is also described as a storage scheme, with a focus on speed, although its popularity and capabilities are questioned.

PREREQUISITES
  • Understanding of high-frequency time series data management
  • Familiarity with HDF5 storage scheme
  • Knowledge of MongoDB database architecture
  • Awareness of Kyoto Cabinet's key-value storage model
NEXT STEPS
  • Research HDF5 performance optimization techniques
  • Explore MongoDB indexing strategies for time series data
  • Investigate Kyoto Cabinet's data retrieval speed benchmarks
  • Learn about data modeling best practices for time series databases
USEFUL FOR

This discussion is beneficial for data engineers, software developers, and researchers involved in high-frequency data collection and analysis, particularly those evaluating database options for time series applications.

meanrev
I'm choosing a database to write high-frequency time series data to, and I have narrowed it down to MongoDB, Kyoto Cabinet, or HDF5.

I will be inserting 1200 rows of 8 entries each per second, which I estimate will accumulate to about 5 GB of data per day.
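
As a rough sanity check on that estimate, the arithmetic can be written out. This is only a sketch: the post gives the row rate and the number of entries per row, but not the size of an entry or how many hours per day data is collected, so `BYTES_PER_ENTRY` and `SECONDS_PER_DAY` below are assumptions (64-bit values, round-the-clock collection).

```python
# Back-of-envelope estimate of raw data volume, before any file format
# overhead, indexing, or compression.
ROWS_PER_SECOND = 1200           # from the post
ENTRIES_PER_ROW = 8              # from the post
BYTES_PER_ENTRY = 8              # assumption: each entry is a 64-bit value
SECONDS_PER_DAY = 24 * 60 * 60   # assumption: collection runs 24 hours a day


def daily_bytes(rows_per_second, entries_per_row, bytes_per_entry, seconds):
    """Return the raw number of bytes produced over the given duration."""
    return rows_per_second * entries_per_row * bytes_per_entry * seconds


bytes_per_second = daily_bytes(ROWS_PER_SECOND, ENTRIES_PER_ROW, BYTES_PER_ENTRY, 1)
full_day = daily_bytes(ROWS_PER_SECOND, ENTRIES_PER_ROW, BYTES_PER_ENTRY, SECONDS_PER_DAY)

print(f"{bytes_per_second} bytes per second")
print(f"{full_day / 1e9:.2f} GB per 24-hour day")
```

Under these assumptions the raw volume lands in the same range as the estimate in the post; smaller entries or a shorter collection window would bring it down, and per-row timestamps or storage overhead would push it up.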

Does anyone have experience with these three and could help me make the decision?

Thanks!
 
Well, I don't know much about databases; but I will give one silly opinion...

The thing is, I have been trying to learn various things available within Python...one of the things I ran into was, precisely, HDF5. When I read about HDF5, I understood that it was a storage scheme and not necessarily a database (i.e., there is no database server running with its own intelligence to answer queries or return sets or anything like that).

Of the other two choices that you mention, I just quickly read the main webpages, and it looks like MongoDB is a real database (requires a server) and Kyoto Cabinet is not; this last one, again, is just a storage scheme.

So, my first opinion, if you need speed, is to forget about using a real database and stick to a storage scheme...so, Kyoto or HDF5.

It seems Kyoto talks about one key/value pair per record...does not seem too impressive as a storage scheme...but maybe that's where the speed comes from.
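
To make the key/value idea concrete, here is a minimal sketch of how one row of 8 entries could be stored under a timestamp key in an embedded (serverless) store. It does not use Kyoto Cabinet itself: Python's standard-library `dbm.dumb` is swapped in as a stand-in, and the key layout, the `ROW_FORMAT`, and the helper names are all hypothetical choices made for illustration.

```python
import dbm.dumb
import os
import struct
import tempfile

# Assumption: each row is eight little-endian 64-bit floats.
ROW_FORMAT = "<8d"


def encode_key(timestamp_ns):
    """Zero-padded decimal key, so keys sort by time when compared as bytes."""
    return f"{timestamp_ns:020d}".encode("ascii")


def write_row(db, timestamp_ns, entries):
    """Pack the eight entries into one binary value stored under the key."""
    db[encode_key(timestamp_ns)] = struct.pack(ROW_FORMAT, *entries)


def read_row(db, timestamp_ns):
    """Look up one row by its timestamp and unpack it into a tuple."""
    return struct.unpack(ROW_FORMAT, db[encode_key(timestamp_ns)])


# The store lives in ordinary files; no server process is involved.
path = os.path.join(tempfile.mkdtemp(), "ticks")
with dbm.dumb.open(path, "c") as db:
    # One second's worth of rows at the rate given in the question.
    for i in range(1200):
        write_row(db, 1_000_000_000 + i, [float(i + j) for j in range(8)])
    stored_rows = len(db)
    sample = read_row(db, 1_000_000_005)

print(stored_rows, sample)
```

The point of the sketch is only the shape of the model: one opaque value per key, with any querying beyond a key lookup left to the application. `dbm.dumb` is slow and says nothing about how fast Kyoto Cabinet would be.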

HDF5, from what I remember, is actually rather versatile as far as what it can store.

The Kyoto site does not look like much...how popular is this?

Just because I learned about HDF5 before I ever heard about Kyoto, it sounds like HDF5 is more popular within the scientific/engineering community...

Anyway, that's my un-educated opinion.

gsal
 
