Skew factor describes how unevenly the rows of a table
are distributed among the AMPs.
With a non-unique primary index (NUPI) we generally get duplicate values.
Rows with the same PI value get the same row hash, so they all go to the
same AMP. The more duplicate values there are, the more uneven the data
distribution becomes: one AMP stores a lot of data and the others store much
less. When we access the full table, the AMP holding the most data takes the
longest, and the other AMPs wait for it, which wastes processing capacity.
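The effect described above can be sketched with a small simulation. This is only an illustration under stated assumptions: the 4-AMP system, the `amp_for` and `rows_per_amp` helpers, and the sample values are hypothetical, and SHA-256 stands in for the real row hash algorithm, which is not reproduced here.

```python
import hashlib
from collections import Counter

NUM_AMPS = 4  # assumed AMP count, for illustration only


def amp_for(pi_value, num_amps=NUM_AMPS):
    """Map a primary-index value to an AMP number.

    SHA-256 is a stand-in for the real row hash; the point is only
    that equal values always produce the same hash and the same AMP.
    """
    digest = hashlib.sha256(str(pi_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


def rows_per_amp(pi_values, num_amps=NUM_AMPS):
    """Count how many rows each AMP would receive."""
    counts = Counter(amp_for(v, num_amps) for v in pi_values)
    return [counts.get(amp, 0) for amp in range(num_amps)]


# Unique PI: every row has a different value, so rows spread out.
unique_pi = list(range(10_000))

# Non-unique PI: 9,000 of the 10,000 rows share a single value.
skewed_pi = ["IN"] * 9_000 + list(range(1_000))

print("unique PI:    ", rows_per_amp(unique_pi))
print("non-unique PI:", rows_per_amp(skewed_pi))
```

With the non-unique values, all 9,000 duplicate rows end up on one AMP, while the unique values spread roughly evenly.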
In this situation (unequal distribution of data) the skew factor is high.
For tables like this we should avoid full table scans.
Ex:
AMP0             AMP1
1000000 (10%)    9000000 (90%)
In this situation the skew factor is very high: 90% of the rows sit on AMP1.
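The arithmetic behind a 10%/90% split like the one above can be checked with a short sketch. The `amp_shares` and `skew_factor` helpers and the row counts are hypothetical. The formula in `skew_factor` (100 × (1 − average / maximum)) is an assumption: it is a commonly quoted convention for putting a single number on skew, not something stated in this post, and it gives a different figure from the 90% share of the largest AMP.

```python
def amp_shares(rows_per_amp):
    """Return each AMP's share of the table's rows, in percent."""
    total = sum(rows_per_amp)
    return [100.0 * n / total for n in rows_per_amp]


def skew_factor(rows_per_amp):
    """Assumed convention: 100 * (1 - average rows / maximum rows).

    0 means a perfectly even distribution; values near 100 mean that
    almost all rows sit on a single AMP.
    """
    avg = sum(rows_per_amp) / len(rows_per_amp)
    return 100.0 * (1 - avg / max(rows_per_amp))


# Hypothetical two-AMP table split 10% / 90%.
counts = [1_000, 9_000]

print("share per AMP (%):", [round(s, 1) for s in amp_shares(counts)])
print("skew factor (%):  ", round(skew_factor(counts), 1))
print("even table (%):   ", round(skew_factor([5_000, 5_000]), 1))
```

Whichever measure is used, the reading is the same: the further the busiest AMP is from the average, the longer every all-AMP operation on the table waits for it.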