Using query compiler output to guide query resource allocation

A performance suggestion:

Within a consumer group, a long-running query gets roughly half of the resources when a continuous stream of very short, simple queries is running alongside it. This can leave CPU and other resources idle, because the simple queries don’t need the entire other half.

It would be great if the resource manager took into account the cost estimates from the query compiler/optimizer when allocating resources so that simple “SELECT 1”s or queries that read from replicated tables do not get allocated a lot of unnecessary CPU/Network/IO.


You are correct. A constant stream of short, inexpensive queries running alongside a few long-running queries can lead to inefficient CPU utilization.

This is generally not a problem in a fully utilized system where the number of running queries equals or exceeds the number of available cores. Such systems usually have a slight default overcommit of CPU resources and built-in mechanisms to handle short queries efficiently.

It is also important to note that even without the short queries, a single long-running query might not use the CPU fully, as it could be limited by I/O or rely on algorithms that are not completely parallelizable.

Your suggestion is a good one, though. Taking the estimated cost of short, inexpensive queries into account in the consumer group scheduler is a valuable idea for improving resource allocation.
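To make the idea concrete, here is a minimal sketch of the difference between an equal split and a cost-aware split. It is not how the actual resource manager works. The function names are hypothetical, and the per-query "demand" (the number of cores a query could actually keep busy) is an assumed stand-in for whatever cost estimate the optimizer produces. The cost-aware variant uses plain max-min fair sharing: no query receives more than its estimated demand, and the leftover goes to the queries that can use it.

```python
def equal_split(total_cores, demands):
    """Give every running query the same share, regardless of need."""
    share = total_cores / len(demands)
    return {name: share for name in demands}


def cost_weighted_split(total_cores, demands):
    """Max-min fair split, capped by each query's estimated demand.

    `demands` maps a query name to the number of cores it could keep busy.
    This value is a hypothetical proxy for an optimizer cost estimate.
    """
    allocation = {}
    remaining = float(total_cores)
    # Serve the smallest demands first so their unused share flows onward.
    pending = sorted(demands.items(), key=lambda item: item[1])
    while pending:
        fair_share = remaining / len(pending)
        name, demand = pending.pop(0)
        granted = min(demand, fair_share)
        allocation[name] = granted
        remaining -= granted
    return allocation


def used_cores(allocation, demands):
    """Cores that actually do work: a query cannot use more than it needs."""
    return sum(min(allocation[name], demands[name]) for name in demands)


# Assumed example: 16 cores, one heavy scan and a stream of trivial queries.
demands = {"long_scan": 16.0, "select_1_stream": 0.5}

equal = equal_split(16, demands)
weighted = cost_weighted_split(16, demands)

print("equal split busy cores:   ", used_cores(equal, demands))
print("cost-aware split busy cores:", used_cores(weighted, demands))
```

With these assumed numbers, the equal split leaves most of the short queries' half idle, while the cost-aware split hands almost all of it to the long-running query. A real scheduler would also have to cope with inaccurate estimates, which this sketch ignores.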