
What is the best open source solution for storing time series data?

Problem description

I am interested in monitoring some objects. I expect to get about 10000 data points every 15 minutes. (Maybe not at first, but this is the 'general ballpark'). I would also like to be able to get daily, weekly, monthly and yearly statistics. It is not critical to keep the data in the highest resolution (15 minutes) for more than two months.

I am considering various ways to store this data, and have been looking at a classic relational database, or at a schemaless database (such as SimpleDB).

My question is, what is the best way to go about doing this? I would much prefer an open-source (and free) solution to a costly proprietary one.

Small note: I am writing this application in Python.

Recommended answer

HDF5, which can be accessed through h5py or PyTables, is designed for dealing with very large data sets. Both interfaces work well. For example, both h5py and PyTables offer automatic compression and support NumPy.
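As a rough illustration of the h5py route, here is a minimal sketch of the questioner's workload: an extendable, gzip-compressed HDF5 dataset that grows by one row per 15-minute interval, with one column per monitored object. The file name, the `append` helper, and the 10,000-object figure are assumptions taken from the question, not a prescribed schema.

```python
# Minimal sketch (assumes h5py is installed): append 15-minute readings
# to an extendable, gzip-compressed HDF5 dataset.
import h5py
import numpy as np

N_OBJECTS = 10_000  # one column per monitored object (figure from the question)

with h5py.File("readings.h5", "w") as f:
    # First axis is extendable: one row per 15-minute sampling interval.
    dset = f.create_dataset(
        "readings",
        shape=(0, N_OBJECTS),
        maxshape=(None, N_OBJECTS),
        dtype="f4",
        chunks=True,
        compression="gzip",  # the automatic compression mentioned above
    )

    def append(rows: np.ndarray) -> None:
        """Hypothetical helper: grow the dataset along axis 0 and write rows."""
        n = dset.shape[0]
        dset.resize(n + rows.shape[0], axis=0)
        dset[n:] = rows

    # Simulate one hour of data (4 intervals of 15 minutes each).
    append(np.random.rand(4, N_OBJECTS).astype("f4"))

    # Daily/weekly/monthly statistics then reduce to NumPy operations
    # over row slices of the dataset.
    hourly_mean = dset[:4].mean(axis=0)
```

Because HDF5 datasets slice like NumPy arrays, the coarser statistics (and the two-month retention policy) can be handled by aggregating row ranges into smaller summary datasets and truncating or rewriting the high-resolution one.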

