This and Materialize seemed like great tools. I met some of the RisingWave team at the Kafka conference in London last year and was impressed by their work. It may be great if you need such a tool.
In the end, I went with ClickHouse and its materialized views feature. It might not be quite as powerful as what these other tools are doing, but it works for us, and it's really easy to set up. Before that we were using Timescale's continuous aggregates, which had good performance but required some domain knowledge to set up. ClickHouse materialized views are great because you don't need to be an expert to use them. And even so, performance is still very good.
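For a sense of what that looks like, here is a minimal sketch of the kind of ClickHouse materialized view being described; the events table, its columns, and the hourly rollup are invented for illustration, and it assumes a local ClickHouse server plus the clickhouse-driver Python package.

```python
# Hedged sketch only: the schema and rollup below are made up for illustration.
from clickhouse_driver import Client

client = Client(host="localhost")

# Raw events land in an ordinary MergeTree table.
client.execute("""
    CREATE TABLE IF NOT EXISTS events (
        ts      DateTime,
        user_id UInt64,
        amount  Float64
    ) ENGINE = MergeTree ORDER BY ts
""")

# The materialized view rolls events up per hour as rows are inserted,
# so there is no separate batch job to schedule.
client.execute("""
    CREATE MATERIALIZED VIEW IF NOT EXISTS events_hourly
    ENGINE = SummingMergeTree
    ORDER BY (hour, user_id)
    AS SELECT
        toStartOfHour(ts) AS hour,
        user_id,
        sum(amount)       AS total
    FROM events
    GROUP BY hour, user_id
""")

# SummingMergeTree merges partial sums lazily, so re-aggregate at query time.
rows = client.execute("""
    SELECT hour, user_id, sum(total) AS total
    FROM events_hourly
    GROUP BY hour, user_id
    ORDER BY hour
""")
```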
We wrote about it briefly here: https://blog.picnic.nl/building-a-real-time-analytics-platfo...
Does anyone have experience with RisingWave in production? It seems like an interesting product but I can't find any experience reports.
I've been running this in prod, self-hosted, for around 6 months (Podman with Docker Compose, MinIO for S3, streaming with Pulsar). We have built position calculations for risk monitoring and booking-enrichment pipelines. RisingWave is a much better alternative to Kafka Streams, primarily around consistency, SQL-first development, easy state querying, and deployment.
The RisingWave team are pretty responsive on Slack, and the Ask AI feature also helps answer questions. They have coverage from Singapore, China, and California.
The issues we have seen have mostly been related to the reliability of our on-prem MinIO cluster, which is used to store the data. Other bugs do appear from release to release, but once raised they get attention quickly.
Looking at the contributor list, I doubt they speak English or frequent HN, so you'll only get the engineers' perspective. It looks new, with the cloud offering as the way to sell it.
A. Some of the team members are in the Bay Area, including the founder, who writes well. B. I've used it for streaming SQL on a Citus cluster and am planning to use it more.
A) I assumed the team members are English speaking and amazing at what they do (look at it!); my point was more that they might have more customers on the eastern side of the world, and that's totally my bias from the past. B) Glad to see I'm wrong and that there are people using it for real things. It looks awesome.
Alex Chi was on the project. He is now writing TinyLLM.
Oh, I wasn't talking about the engineers behind the project; they're good. I was asking more whether there are companies using this already…
Don't take it the wrong way; it's just that the East and West tend to only share things when it's profound, like DeepSeek.
But I could be wrong; sometimes things stay under the radar until they're ready.
This seems very good. I'm always wondering what the use cases for this are, apart from observability/real-time analytics. Do people use this for incremental view maintenance in Postgres?
I'm thinking of using it to replace an analytics pipeline at my job, which now uses expensive batch jobs. If the tech is solid, we would have instant and incremental updates, instead of recomputing everything every X hours. This would simplify things a lot.
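To make that concrete, here is a rough, untested sketch of what the incremental version could look like with RisingWave, which speaks the Postgres wire protocol, so an ordinary Postgres driver works; the connection defaults and the orders schema are assumptions for illustration.

```python
# Hedged sketch: assumes RisingWave's default local endpoint (port 4566,
# user "root", database "dev") and an existing `orders` table or source.
import psycopg2

conn = psycopg2.connect(host="localhost", port=4566, user="root", dbname="dev")
conn.autocommit = True
cur = conn.cursor()

# Instead of a batch job recomputing revenue every X hours, declare the result
# once; the engine maintains it incrementally as new orders arrive.
cur.execute("""
    CREATE MATERIALIZED VIEW IF NOT EXISTS revenue_by_day AS
    SELECT date_trunc('day', created_at) AS day,
           sum(amount)                   AS revenue
    FROM orders
    GROUP BY 1
""")

# Consumers just read the view; it is always up to date.
cur.execute("SELECT day, revenue FROM revenue_by_day ORDER BY day DESC LIMIT 7")
for day, revenue in cur.fetchall():
    print(day, revenue)
```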
I think Materialize offers a similar product, but last I checked it was only available as a SaaS solution.
I hope to do a proof of concept soon to compare both solutions.
The use cases section also mentions event-driven applications, which is quite broad. Like another comment on the thread, I'd be curious to hear from anyone with experience using RisingWave in this area.
Replied above: I used it for triggering actions based on the absence or presence of data points.
I apologize for this stupid question, but whenever I see products like this or Kafka, I can't help but wonder: when exactly do you need a system like this compared to a traditional Redis pub/sub?
It's very useful any time the input to some system is a stream of events, potentially from a whole bunch of different sources, but you want the output to be a unified relational data model.
I used to work in insurance, and we had a whole bunch of systems of record for different functions of the business -- CRM, policy management, billing, claims, etc. Some were our own tech, many were SaaS. It's great to be able to keep these systems decoupled operationally. That way, you can replace pieces and let your business areas run fairly independent IT stacks.
But many back-office tasks, like finance, accounting, and servicing, need a holistic view of what's going on. It's helpful to ingest all the data into a centralized warehouse and build up a unified model of the state of the business. A lot of analysts like to write these data transformations in SQL.
Insurance is not a fast-paced business, so we largely ingested the data in structured form. But you can imagine that for faster businesses, like advertising, monitoring, IoT, or trading, the data from the systems of record might be an event stream, rather than a data model. These stream processing databases are designed for this type of situation, where you may want real-time ETL, event-by-event.
EDIT: Also, their website has a use cases section: https://risingwave.com/use-cases/
Solutions like this can help with complex transformations that rely on intermediate state, like streaming aggregations.
They often have watermarking, windowing, and all that good stuff built in, whereas with Redis you would have to build all of that into your application yourself.
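As a rough illustration of the difference, here is a sketch of a windowed count with a watermark in a streaming SQL engine such as RisingWave; the Kafka connector options, the clickstream schema, and the exact syntax are recalled from the docs and should be treated as assumptions to verify rather than copy-paste material.

```python
# Hedged sketch: connector options and syntax may differ between versions;
# the clickstream schema is invented for illustration.
import psycopg2

conn = psycopg2.connect(host="localhost", port=4566, user="root", dbname="dev")
conn.autocommit = True
cur = conn.cursor()

# Declare the event stream; the watermark bounds how late events may arrive.
cur.execute("""
    CREATE SOURCE IF NOT EXISTS clicks (
        user_id    BIGINT,
        url        VARCHAR,
        event_time TIMESTAMP,
        WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
    ) WITH (
        connector = 'kafka',
        topic = 'clicks',
        properties.bootstrap.server = 'localhost:9092'
    ) FORMAT PLAIN ENCODE JSON
""")

# One-minute tumbling windows, kept up to date by the engine; no hand-rolled
# buffering, timers, or expiry logic in the application code.
cur.execute("""
    CREATE MATERIALIZED VIEW IF NOT EXISTS clicks_per_minute AS
    SELECT window_start, count(*) AS clicks
    FROM TUMBLE(clicks, event_time, INTERVAL '1 MINUTE')
    GROUP BY window_start
""")
```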
What about the transformation in the middle, like a moving average?
I really like their architecture diagram; it uses colors very well to contrast the different technologies. Does anyone know what they used to make it?
A Rust-based Flink? Is it simpler?
Nope. SQL is the lingua franca of the world.
Flink does SQL. https://www.confluent.io/blog/getting-started-with-apache-fl...
How does this compare to Materialize?
As of 2022: https://github.com/orgs/risingwavelabs/discussions/1736