Great news! Our paper on efficient multi-dimensional high-velocity stream analytics with arbitrary predicates got accepted at VLDB 2024. A preliminary version can be read here.
For many applications, it is useful to pose queries with an arbitrary number of predicates, where the predicates are only chosen at query time. For a stream s with attributes <Location, LAI, Wind, Temperature>, an example SQL query that considers only the first two attributes looks as follows:
SELECT COUNT(*) FROM s WHERE s.Location = 'Eindhoven' AND s.LAI = '2'
We want users to be able to vary the number of filters as they like. Current approaches to this problem are too expensive: the amount of resources they need grows exponentially with the number of attributes in the stream. To address this, we developed a new sketch named OmniSketch. With it, the required resources grow linearly with the number of attributes instead of exponentially. Our experiments indicate that OmniSketch outperforms the state of the art, especially as the number of attributes grows. The method also supports range queries, for example:
SELECT COUNT(*) FROM s WHERE s.Location = 'Eindhoven' AND s.LAI = '2' AND s.Temperature BETWEEN 30 AND 35
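To make the query semantics concrete, here is a minimal Python sketch of an exact, brute-force baseline that answers both example queries by scanning stored records. This is not OmniSketch and not the method from the paper: it keeps every record, which is exactly what a sketch is meant to avoid. The helper names (`equals`, `between`, `count_matching`) and the sample records are hypothetical and only serve as illustration.

```python
from typing import Any, Callable, Dict, Iterable

Record = Dict[str, Any]
Predicate = Callable[[Any], bool]


def equals(value: Any) -> Predicate:
    """Equality predicate, as in `attr = value`."""
    return lambda x: x == value


def between(low: Any, high: Any) -> Predicate:
    """Range predicate; like SQL BETWEEN, both bounds are inclusive."""
    return lambda x: low <= x <= high


def count_matching(records: Iterable[Record],
                   predicates: Dict[str, Predicate]) -> int:
    """Exact COUNT(*) over records that satisfy every given predicate.

    `predicates` maps an attribute name to a predicate on its value, so the
    caller can filter on any subset of attributes at query time.
    """
    return sum(
        1
        for record in records
        if all(pred(record[attr]) for attr, pred in predicates.items())
    )


# Hypothetical sample records for the stream s (made-up values).
stream = [
    {"Location": "Eindhoven", "LAI": "2", "Wind": 12, "Temperature": 31},
    {"Location": "Eindhoven", "LAI": "2", "Wind": 8, "Temperature": 22},
    {"Location": "Eindhoven", "LAI": "3", "Wind": 5, "Temperature": 33},
    {"Location": "Utrecht", "LAI": "2", "Wind": 20, "Temperature": 34},
]

# Two equality predicates (the first example query).
two_filters = count_matching(
    stream, {"Location": equals("Eindhoven"), "LAI": equals("2")}
)

# The same filters plus a range predicate (the second example query).
three_filters = count_matching(
    stream,
    {
        "Location": equals("Eindhoven"),
        "LAI": equals("2"),
        "Temperature": between(30, 35),
    },
)

print(two_filters, three_filters)  # prints: 2 1
```

The scan gives exact answers for any combination of predicates, but its memory use grows with the length of the stream, which is not feasible for high-velocity data. That trade-off is the motivation for summarizing the stream in a sketch instead.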