On this page:
Summary: This page provides an explanation of the Quality Rule 1101000 "Never use SQL queries with a cartesian product" that is triggered by the SQL Analyzer during an analysis.
What is searched by this rule ?
The algorithm is the following for each SELECT:
- all table references of FROM clauses are considered
- if there is a 'joining' between two table references then they are considered as 'connected'
- if all table references are 'connected' then there is no violation
You can view this Quality Rule as a detector of 'Nested Loops' in query plan. Explicit CROSS JOINS are not excluded from violations.
Example of violations
Trivial
SELECT * FROM A, B;
Less trivial
This is less trivial, and may appear as an incorrect violation at first sight:
Select * From TBL1 a, TBL2 b Where a.COL3 = 12.5 AND b.COL5 = '10102';
Here each table is filtered but independently. So the query can be rewritten as follows:
Select * From (select * TBL1 Where COL3 = 12.5) a, (select * FROM TBL2 b Where b.COL5 = '10102') b;
This can be easily proven by using query plan display. Here are some examples conducted with postgresql:
create table t1 (col1 integer); create table t2 (col1 integer); SELECT * FROM t1, t2; /* ==> classical cartesian product "Nested Loop (cost=0.00..72074.00 rows=5760000 width=8)" " -> Seq Scan on t1 (cost=0.00..34.00 rows=2400 width=4)" " -> Materialize (cost=0.00..46.00 rows=2400 width=4)" " -> Seq Scan on t2 (cost=0.00..34.00 rows=2400 width=4)" */ SELECT * FROM t1, t2 WHERE t1.col1 = 1 and t2.col1 =1; /* ==> almost same query plan as classical cartesian product : Nested Loop "Nested Loop (cost=0.00..81.83 rows=144 width=8)" " -> Seq Scan on t1 (cost=0.00..40.00 rows=12 width=4)" " Filter: (col1 = 1)" " -> Materialize (cost=0.00..40.06 rows=12 width=4)" " -> Seq Scan on t2 (cost=0.00..40.00 rows=12 width=4)" " Filter: (col1 = 1)" */ SELECT * FROM t1, t2 WHERE t1.col1 = t2.col1; /* ==> no cartesian product "Merge Join (cost=337.49..781.49 rows=28800 width=8)" " Merge Cond: (t1.col1 = t2.col1)" " -> Sort (cost=168.75..174.75 rows=2400 width=4)" " Sort Key: t1.col1" " -> Seq Scan on t1 (cost=0.00..34.00 rows=2400 width=4)" " -> Sort (cost=168.75..174.75 rows=2400 width=4)" " Sort Key: t2.col1" " -> Seq Scan on t2 (cost=0.00..34.00 rows=2400 width=4)" */