After a cluster enlargement, performance may not increase as expected, or may even decrease.
The profile of affected queries shows strong data imbalance (NODE SYNC) and possibly disk access.
This may be caused by the semantics of the cluster enlargement (REORGANIZE TABLE). Specifically, the step that causes the problem is the enlargement of the cluster to N+X nodes, including the REORGANIZE of the fact tables.
Database running on N nodes
Fact tables have no distribution keys
Data is inserted in a mostly sorted way (i.e. daily data with daily timestamps)
Data is queried using strong date filters
Reorganize is content-agnostic and tries to move as little data as possible. On each of the N original nodes, the most recently inserted data is taken and transmitted to the X new nodes until data is balanced across all nodes. With chronologically sorted data, the data inserted last is also the latest data. As a result, the data is split across the cluster by age, with the N original nodes storing 'old data' and the X new nodes storing 'new data'.
In the worst case, X==1 and any query asking for the latest data is executed on one node only.
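The effect can be illustrated with a small simulation. This is only a sketch of the behaviour described above, not the database's actual algorithm: the function `reorganize`, the row counts, and the representation of a row as a plain day number are all assumptions made for illustration.

```python
def reorganize(nodes, extra):
    """Simplified model of a content-agnostic reorganize.

    Each old node keeps its oldest rows and hands its most recently
    inserted rows to the new nodes until all nodes hold the same count.
    """
    total = sum(len(rows) for rows in nodes)
    target = total // (len(nodes) + extra)
    kept, moved = [], []
    for rows in nodes:
        kept.append(rows[:target])     # oldest rows stay in place
        moved.extend(rows[target:])    # newest rows are shipped away
    # Deal the moved rows round-robin onto the new nodes.
    new_nodes = [moved[i::extra] for i in range(extra)]
    return kept + new_nodes


N, X = 4, 1                      # original nodes, added nodes
DAYS, ROWS_PER_DAY = 100, 10     # rows per day on each original node

# Every original node holds rows for all days, in insertion (date) order.
nodes = [[day for day in range(DAYS) for _ in range(ROWS_PER_DAY)]
         for _ in range(N)]

after = reorganize(nodes, X)
latest = DAYS - 1
rows_with_latest_day = [rows.count(latest) for rows in after]

print([len(rows) for rows in after])  # [800, 800, 800, 800, 800]
print(rows_with_latest_day)           # [0, 0, 0, 0, 40]
```

Row counts are perfectly balanced, yet every row of the latest day sits on the single new node, so a query filtering on that day keeps one node busy while the others wait.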
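To avoid this, always put a distribution key on your fact tables, but not on a date column: distributing by date would place all rows of one day on a single node and recreate the same imbalance for date-filtered queries.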
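The difference between a date key and a non-date key can be sketched in the same simplified way. The modulo placement below stands in for the database's real hash function, and the `customer` column is a hypothetical non-date key chosen for illustration.

```python
def distribute(rows, key_index, node_count):
    """Place each row on a node chosen from its distribution key.

    A plain modulo is used as a stand-in for a real hash function.
    """
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[row[key_index] % node_count].append(row)
    return nodes


def latest_per_node(nodes, latest):
    """Count the rows belonging to the latest day on every node."""
    return [sum(1 for day, _ in part if day == latest) for part in nodes]


NODES = 5
# Fact rows as (day, customer): 100 days, 40 customers per day.
rows = [(day, customer) for day in range(100) for customer in range(40)]
latest = 99

by_customer = latest_per_node(distribute(rows, 1, NODES), latest)
by_day = latest_per_node(distribute(rows, 0, NODES), latest)

print(by_customer)  # [8, 8, 8, 8, 8]
print(by_day)       # [0, 0, 0, 0, 40]
```

With the non-date key, the latest day is spread evenly over all nodes regardless of insertion order. With the date key, all rows of that day land on one node, which is the situation the recommendation is meant to avoid.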