Ask HN: Recommended way to store financial time-series based data for trading?
2 by dfischer | 0 comments on Hacker News.
I've been studying and experimenting with this as a hobby and want to get more serious now. I've been storing data in flat files, but I now want to scale out my ideas and infrastructure to collect raw data 24/7 across the globe. I see various time-series databases, but no clear winner: I looked at InfluxDB, TimescaleDB, and various others, and most of their material is geared towards IoT rather than finance.

I've been considering a stack built entirely on GCP that looks roughly like:

regional ingestor (compute) -> Pub/Sub -> Dataflow -> Pub/Sub -> Firestore and BigQuery

The idea is to let clients subscribe to prebuilt aggregation metrics from Dataflow/Beam and to optimize for cross-regional latency. The automated rules would at most need to react in seconds, not milliseconds; I'd be more than happy with a guaranteed rolling window of 5-15 seconds for my most time-sensitive decisions.

Basic aggregations: OHLC, stdev. Advanced aggregations: values based on custom strategies that would be injected into the feed for a client (an automated trading app) to consume and act on.

Is it crazy to do all the rolling-window / strategy calculations in the Dataflow piece of the architecture, or does it make more sense to compute them per client? Visually, I imagine the various signals/strategies as separate Dataflow templates, with a client subscribing to whatever strategy it wants to use. Thanks.
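For concreteness, the basic aggregations (OHLC and stdev over a 5-15 second rolling window) could be sketched like this. This is a minimal, stdlib-only illustration assuming ticks arrive in time order; `Tick` and `RollingWindow` are made-up names, not part of any GCP or Beam API, though in Beam the same logic would live in a sliding-window CombineFn.

```python
import statistics
from collections import deque
from dataclasses import dataclass


@dataclass
class Tick:
    ts: float     # event time, epoch seconds
    price: float


def window_aggregates(ticks):
    """OHLC plus population stdev for a time-ordered window of ticks."""
    prices = [t.price for t in ticks]
    return {
        "open": prices[0],
        "high": max(prices),
        "low": min(prices),
        "close": prices[-1],
        "stdev": statistics.pstdev(prices),
    }


class RollingWindow:
    """Keeps only the last `seconds` worth of ticks, evicting on each append."""

    def __init__(self, seconds=15.0):
        self.seconds = seconds
        self.ticks = deque()

    def add(self, tick):
        self.ticks.append(tick)
        # Drop ticks that have fallen out of the rolling window.
        while self.ticks and self.ticks[0].ts < tick.ts - self.seconds:
            self.ticks.popleft()
        return window_aggregates(self.ticks)
```

Computing this once in the pipeline (rather than per client) means every subscriber sees identical, already-windowed values, which is the usual argument for doing it centrally.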