Maybe I don't understand, but I thought the whole point of datomic (and XTDB by extension) was to avoid denormalization.
I am surprised the author says:
> "Old Yakread" has a lot of slow queries. For example, loading the subscriptions page on my account takes more than 10 seconds: for each of my hundreds of subscriptions, it has to run a query to figure out how many unread posts there are and when the most recent post was published.
I would have thought you would grab all this data in a single query roughly like this:
Datomic offers the ability to declare a "composite index" which can help to accelerate some kinds of access patterns but can't solve 6NF join overheads entirely. If you want guaranteed read performance then denormalized views are the way to go, and perhaps even an IVM engine like Materialize - or this looked promising at one time: https://github.com/sixthnormal/clj-3df
You can do that, it's just slow if there are a lot of results.
Agreed you want to keep data in your main database normalized since it's easier to reason about and avoid bugs/inconsistencies in the data. The inherent trade-off is just that it's more computationally expensive to get the denormalized data.
The idea of materialized views is to get the best of both worlds: your main database stays normalized, and you have a secondary data store (or certain tables/whatever inside your main database, depends on the implementation) that get automatically precomputed from your normalized data. So you can get fast queries without needing to introduce a bunch of logic for maintaining the denormalized data.
The hard part is how do you actually keep those materialized views up to date. e.g. if you're ok with stale data, you can do a daily batch job to update your views. If you want to the materialized views to be always up-to-date then things get harder; the solution described in the article is one attempt at addressing that problem.
I am surprised the author says:
> "Old Yakread" has a lot of slow queries. For example, loading the subscriptions page on my account takes more than 10 seconds: for each of my hundreds of subscriptions, it has to run a query to figure out how many unread posts there are and when the most recent post was published.
I would have thought you would grab all this data in a single query roughly like this:
^ AI disclaimer, but I think it gets the gist, you do your logical joins right in the queryAgreed you want to keep data in your main database normalized since it's easier to reason about and avoid bugs/inconsistencies in the data. The inherent trade-off is just that it's more computationally expensive to get the denormalized data.
The idea of materialized views is to get the best of both worlds: your main database stays normalized, and you have a secondary data store (or certain tables/whatever inside your main database, depends on the implementation) that get automatically precomputed from your normalized data. So you can get fast queries without needing to introduce a bunch of logic for maintaining the denormalized data.
The hard part is how do you actually keep those materialized views up to date. e.g. if you're ok with stale data, you can do a daily batch job to update your views. If you want to the materialized views to be always up-to-date then things get harder; the solution described in the article is one attempt at addressing that problem.