実行時間を短縮するために、このクエリを最適化するか、テーブル構造を変更できますか?私はEXPLAIN
の出力を実際に理解していません。私はいくつかの指標がないのですか?で更新私のpostgresクエリを最適化する
EXPLAIN SELECT COUNT(*) AS count,
q.query_str
FROM click_fact cf,
query q,
date_dim dd,
queries_p_day_mv qpd
WHERE dd.date_dim_id = qpd.date_dim_id
AND qpd.query_id = q.query_id
AND type = 'S'
AND cf.query_id = q.query_id *emphasized text*
AND dd.pg_date BETWEEN '2010-12-29' AND '2011-01-28'
AND qpd.interface_id IN (SELECT DISTINCT interface_id from interface WHERE lang = 'sv')
GROUP BY q.query_str
ORDER BY count DESC;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=19170.15..19188.80 rows=7460 width=12)
Sort Key: (count(*))
-> HashAggregate (cost=18597.03..18690.28 rows=7460 width=12)
-> Nested Loop (cost=10.20..18559.73 rows=7460 width=12)
-> Nested Loop (cost=10.20..14975.36 rows=2452 width=20)
Join Filter: (qpd.interface_id = interface.interface_id)
-> Unique (cost=1.03..1.04 rows=1 width=4)
-> Sort (cost=1.03..1.04 rows=1 width=4)
Sort Key: interface.interface_id
-> Seq Scan on interface (cost=0.00..1.02 rows=1 width=4)
Filter: (lang = 'sv'::text)
-> Nested Loop (cost=9.16..14943.65 rows=2452 width=24)
-> Hash Join (cost=9.16..14133.58 rows=2452 width=8)
Hash Cond: (qpd.date_dim_id = dd.date_dim_id)
-> Seq Scan on queries_p_day_mv qpd (cost=0.00..11471.93 rows=700793 width=12)
-> Hash (cost=8.81..8.81 rows=28 width=4)
-> Index Scan using date_dim_pg_date_index on date_dim dd (cost=0.00..8.81 rows=28 width=4)
Index Cond: ((pg_date >= '2010-12-29'::date) AND (pg_date <= '2011-01-28'::date))
-> Index Scan using query_pkey on query q (cost=0.00..0.32 rows=1 width=16)
Index Cond: (q.query_id = qpd.query_id)
-> Index Scan using click_fact_query_id_index on click_fact cf (cost=0.00..1.01 rows=36 width=4)
Index Cond: (cf.query_id = qpd.query_id)
Filter: (cf.type = 'S'::bpchar)
ANALYZE EXPLAIN:
EXPLAIN ANALYZE SELECT COUNT(*) AS count,
q.query_str
FROM click_fact cf,
query q,
date_dim dd,
queries_p_day_mv qpd
WHERE dd.date_dim_id = qpd.date_dim_id
AND qpd.query_id = q.query_id
AND type = 'S'
AND cf.query_id = q.query_id
AND dd.pg_date BETWEEN '2010-12-29' AND '2011-01-28'
AND qpd.interface_id IN (SELECT DISTINCT interface_id from interface WHERE lang = 'sv')
GROUP BY q.query_str
ORDER BY count DESC;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=19201.06..19220.52 rows=7784 width=12) (actual time=51017.162..51046.102 rows=17586 loops=1)
Sort Key: (count(*))
Sort Method: external merge Disk: 632kB
-> HashAggregate (cost=18600.67..18697.97 rows=7784 width=12) (actual time=50935.411..50968.678 rows=17586 loops=1)
-> Nested Loop (cost=10.20..18561.75 rows=7784 width=12) (actual time=42.079..43666.404 rows=3868592 loops=1)
-> Nested Loop (cost=10.20..14975.91 rows=2453 width=20) (actual time=23.678..14609.282 rows=700803 loops=1)
Join Filter: (qpd.interface_id = interface.interface_id)
-> Unique (cost=1.03..1.04 rows=1 width=4) (actual time=0.104..0.110 rows=1 loops=1)
-> Sort (cost=1.03..1.04 rows=1 width=4) (actual time=0.100..0.102 rows=1 loops=1)
Sort Key: interface.interface_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on interface (cost=0.00..1.02 rows=1 width=4) (actual time=0.038..0.041 rows=1 loops=1)
Filter: (lang = 'sv'::text)
-> Nested Loop (cost=9.16..14944.20 rows=2453 width=24) (actual time=23.550..12553.786 rows=700808 loops=1)
-> Hash Join (cost=9.16..14133.80 rows=2453 width=8) (actual time=18.283..3885.700 rows=700808 loops=1)
Hash Cond: (qpd.date_dim_id = dd.date_dim_id)
-> Seq Scan on queries_p_day_mv qpd (cost=0.00..11472.08 rows=700808 width=12) (actual time=0.014..1587.106 rows=700808 loops=1)
-> Hash (cost=8.81..8.81 rows=28 width=4) (actual time=18.221..18.221 rows=31 loops=1)
-> Index Scan using date_dim_pg_date_index on date_dim dd (cost=0.00..8.81 rows=28 width=4) (actual time=14.388..18.152 rows=31 loops=1)
Index Cond: ((pg_date >= '2010-12-29'::date) AND (pg_date <= '2011-01-28'::date))
-> Index Scan using query_pkey on query q (cost=0.00..0.32 rows=1 width=16) (actual time=0.005..0.006 rows=1 loops=700808)
Index Cond: (q.query_id = qpd.query_id)
-> Index Scan using click_fact_query_id_index on click_fact cf (cost=0.00..1.01 rows=36 width=4) (actual time=0.005..0.022 rows=6 loops=700803)
Index Cond: (cf.query_id = qpd.query_id)
Filter: (cf.type = 'S'::bpchar)
sql92スタイルの代わりにJOIN構文を使用して、同じプランを取得するかどうかを確認することをお勧めします。私はそれが起こるべきではないことを知っていますが、私は時々2つのスタイル間の速度の大きな変化を見ました - おそらくあなたの意図で明確になっているので、クエリオプティマイザも役立ちますか? – iain
@iain: 'ANSI'スタイルを書き直すと' PostgreSQL'でプランを変更するサンプルクエリを投稿できますか? – Quassnoi
@Quassinoi - いいえ、私はそれが起こったのを見たことのないクエリにアクセスすることはできません。私はそれが公正であることをmysqlで最もよく見てきました。あなたはちょうど逸話と一緒に住んでいなければなりません:) – iain