Partial index not being used

Why do I get a Seq Scan when a partial index is present?

\d+ call_records; 

id     | integer      | not null default nextval('call_records_id_seq'::regclass) | plain |    | 

plain_crn   | bigint      | 
active    | boolean      | default true 
timestamp   | bigint      | default 0 


Indexes: 
    "index_call_records_on_plain_crn" UNIQUE, btree (plain_crn) 
    "index_call_records_on_active" btree (active) WHERE active = true

As expected, a lookup by id uses an index scan.

EXPLAIN select * from call_records where id=1; 
             QUERY PLAN          
---------------------------------------------------------------------------------------- 
Index Scan using call_records_pkey on call_records (cost=0.14..8.16 rows=1 width=373) 
    Index Cond: (id = 1) 
(2 rows)

The same goes for plain_crn:

EXPLAIN select * from call_records where plain_crn=1; 
               QUERY PLAN            
------------------------------------------------------------------------------------------------------ 
Index Scan using index_call_records_on_plain_crn on call_records (cost=0.14..8.16 rows=1 width=373) 
    Index Cond: (plain_crn = 1) 
(2 rows)

But the same does not happen for active:

EXPLAIN select * from call_records where active=true; 
                           QUERY PLAN       
----------------------------------------------------------------- 
Seq Scan on call_records (cost=0.00..12.00 rows=100 width=373) 
    Filter: active 
(2 rows)

Source

2016-07-03 Viren

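Please post the output of EXPLAIN (VERBOSE, ANALYZE) for all of them. –

Whether PostgreSQL uses the index on "active" depends on the ratio of true to false. At some point, once there are more true rows than false rows, the query planner decides that a table scan is probably faster.

I built a table to test with and loaded a million rows of random(ish) data.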
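The answer does not show how the test table was built. The following is a minimal sketch of one way to get a comparable data set. The column list is copied from the question; the use of `generate_series` and `random()`, and the choice of a partial index (rather than a plain one) named `call_records_active_idx`, are assumptions made for illustration.

```sql
-- Hypothetical reproduction of the test table; not the answerer's actual script.
CREATE TABLE call_records (
    id          serial PRIMARY KEY,
    plain_crn   bigint,
    active      boolean DEFAULT true,
    "timestamp" bigint DEFAULT 0
);

-- One million rows, with "active" true for roughly half of them.
INSERT INTO call_records (plain_crn, active)
SELECT g, random() < 0.5
FROM generate_series(1, 1000000) AS g;

-- Assumption: a partial index like the one in the question.
CREATE INDEX call_records_active_idx
    ON call_records (active)
    WHERE active = true;

-- Refresh planner statistics so the estimates reflect the loaded data.
ANALYZE call_records;
```

Because the data is random, the exact counts will differ from run to run.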

select active, count(*) 
from call_records 
group by active;

 
active count 
-- 
f  499983 
t  500017

That gives roughly equal numbers of true and false rows. Here is the execution plan.

explain analyze 
select * from call_records where active=true;

 
"Bitmap Heap Scan on call_records (cost=5484.82..15344.49 rows=500567 width=21) (actual time=56.542..172.084 rows=500017 loops=1)" 
" Filter: active" 
" Heap Blocks: exact=7354" 
" -> Bitmap Index Scan on call_records_active_idx (cost=0.00..5359.67 rows=250567 width=0) (actual time=55.040..55.040 rows=500023 loops=1)" 
"  Index Cond: (active = true)" 
"Planning time: 0.105 ms" 
"Execution time: 204.209 ms"

Then I set "active" to true for more rows, updated the statistics, and checked again.

update call_records 
set active = true 
where id < 750000; 

analyze call_records; 
explain analyze 
select * from call_records where active=true;

 
"Seq Scan on call_records (cost=0.00..22868.00 rows=874100 width=21) (actual time=0.032..280.506 rows=874780 loops=1)" 
" Filter: active" 
" Rows Removed by Filter: 125220" 
"Planning time: 0.316 ms" 
"Execution time: 337.400 ms"

Turning off sequential scans shows that, in my case, PostgreSQL made the right decision. The table scan (sequential scan) was about 10 ms faster.

set enable_seqscan = off; 
explain analyze 
select * from call_records where active=true;

 
"Index Scan using call_records_active_idx on call_records (cost=0.42..39071.14 rows=874100 width=21) (actual time=0.031..293.295 rows=874780 loops=1)" 
" Index Cond: (active = true)" 
"Planning time: 0.343 ms" 
"Execution time: 349.403 ms"

Source

2016-07-03 12:40:43

Great explanation. – Viren

Is there any source where this is documented? – Viren

@Viren: Query planning is very complex. But if you break it down, it is basic logic. Start with *["Statistics Used by the Planner"](https://www.postgresql.org/docs/current/static/planner-stats.html)* in the manual. For more detail, see *[How the Planner Uses Statistics](https://www.postgresql.org/docs/current/static/planner-stats-details.html)*. It depends on many factors, but typically an index scan only pays off when about 5 % of the table or less is selected. It carries considerably more overhead than a simple, dumb sequential scan. –
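To see the statistics the planner works from for a single column, the `pg_stats` view can be queried. This is a sketch that assumes the table and column names from the question; the values returned depend entirely on your data and on when `ANALYZE` last ran.

```sql
-- Refresh the statistics first; pg_stats has no row for a column
-- that has never been analyzed.
ANALYZE call_records;

-- most_common_vals / most_common_freqs show the estimated share of
-- true and false rows that the planner uses for its row estimates.
SELECT attname, null_frac, n_distinct,
       most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'call_records'
  AND attname = 'active';
```

If the frequency reported for `t` is far above the few-percent range mentioned above, a sequential scan for `active = true` is the expected plan.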

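You should start by testing with

SET enable_seqscan = OFF;

and you will see that the cost of the index scan is much higher than that of the seq scan. The table probably has very few rows in total. Because you are selecting *, Postgres has to look up every matching row in the table anyway. So it is much cheaper to scan all rows sequentially than to check the index and then fetch most of the pages.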
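Both points can be checked with a short session. This is a sketch that assumes the table name from the question; wrapping the test in a transaction with `SET LOCAL` is just one way to keep the planner setting from leaking into the rest of the session.

```sql
-- Table size as the planner sees it (refreshed by ANALYZE or VACUUM).
-- A handful of pages means a sequential scan is almost always cheapest.
SELECT relpages, reltuples
FROM pg_class
WHERE relname = 'call_records';

BEGIN;
-- Plan chosen with default settings.
EXPLAIN SELECT * FROM call_records WHERE active = true;

-- Discourage sequential scans for this transaction only,
-- then compare the estimated cost of the alternative plan.
SET LOCAL enable_seqscan = off;
EXPLAIN SELECT * FROM call_records WHERE active = true;
ROLLBACK;
```

`enable_seqscan = off` does not forbid sequential scans; it only makes the planner avoid them whenever another plan is available, which is why it is useful for comparing costs and not meant as a permanent setting.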

Source

2016-07-03 12:20:40
