In collaboration with folks at the Microsoft Gray Systems Lab (formerly known as CISL), we designed and implemented Kaskade, a query optimization framework that exploits materialized graph views. Kaskade employs a novel inference-based view enumeration technique that significantly reduces the search space of views that need to be considered. Moreover, it introduces a graph view size estimator, which it uses both to pick the most beneficial views to materialize for a given query set and to select the best query evaluation plan given a set of materialized views. We evaluated Kaskade over real-world graphs, including a large-scale provenance graph maintained at Microsoft that enables auditing, service analytics, and advanced system optimizations. Our results showed that Kaskade substantially reduces the effective graph size and yields significant performance speedups (up to 50x), in some cases making otherwise intractable queries possible.


Joana M. F. da Trindade, Konstantinos Karanasos (Microsoft), Carlo Curino (Microsoft), Sam Madden, Julian Shun