MySQL 8.0.14: SELECT COUNT(*) does not get faster! Testing "innodb_parallel_read_threads", Part 2

A title that is the exact opposite of Part 1

This is a continuation of the previous post:
atsuizo.hatenadiary.jp

In short, this post is about the cases where innodb_parallel_read_threads has no effect.

And honestly, it helps even less than I expected...

So we are off to a gloomy start, but picking up where the previous post left off, this time I will tackle
"3. Does innodb_parallel_read_threads help PK scans other than a full scan?"

Environment

Same as in Part 1.

The row count is also unchanged at 2^24 = 16,777,216 rows.

This time I am not comparing versions. I only check, within MySQL 8.0.14, whether each query benefits from parallelization.

Conditions

Following the documentation's statement that it "only works on the PK clustered index," I prepared a set of condition patterns that target only the PK.

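1. Count of all rows (the result from the previous post)
2. Count over a middle range of the PK (BETWEEN)
3. Count of everything above a PK value (>)
4. Count of everything below a PK value (<)
5. Count of 10 random rows selected by PK with an IN clause ... it is the PK, so naturally the count is 10.
6. No filtering, and fetching the PK column values instead of counting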

None of these queries needs any information other than the PK column.
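The post does not show the actual SQL, so the following is only a rough sketch of what the six condition patterns might look like. The table name `test_table`, the integer PK column `id`, the range bounds, and the helper names `build_queries` and `set_threads_sql` are all assumptions made for illustration. Only `innodb_parallel_read_threads` itself comes from the post.

```python
import random

# Hypothetical names: the post shows neither the table definition nor the exact SQL.
TABLE = "test_table"    # assumed table name
PK = "id"               # assumed integer primary-key column
TOTAL_ROWS = 2 ** 24    # 16,777,216 rows, as stated in the post


def set_threads_sql(n):
    """Statement that switches the thread count for the current session."""
    return f"SET SESSION innodb_parallel_read_threads = {n}"


def build_queries(lo, hi, ids):
    """Return one SQL string per condition number (1 to 6).

    lo, hi and ids are illustrative inputs; the post does not give
    the actual bounds or the IDs it used.
    """
    in_list = ", ".join(str(i) for i in ids)
    return {
        1: f"SELECT COUNT(*) FROM {TABLE}",
        2: f"SELECT COUNT(*) FROM {TABLE} WHERE {PK} BETWEEN {lo} AND {hi}",
        3: f"SELECT COUNT(*) FROM {TABLE} WHERE {PK} > {lo}",
        4: f"SELECT COUNT(*) FROM {TABLE} WHERE {PK} < {hi}",
        5: f"SELECT COUNT(*) FROM {TABLE} WHERE {PK} IN ({in_list})",
        6: f"SELECT {PK} FROM {TABLE}",
    }


# Example: assumes the PK values run from 1 to TOTAL_ROWS (not confirmed by the post).
ids = random.sample(range(1, TOTAL_ROWS + 1), 10)
for n in (4, 1):
    print(set_threads_sql(n))
    for cond, sql in build_queries(TOTAL_ROWS // 4, TOTAL_ROWS // 2, ids).items():
        print(cond, sql)
```

Each statement touches only the assumed PK column, which matches the point that the queries need nothing beyond the PK.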

Results

Condition	innodb_parallel_read_threads	Run 1 (s)	Run 2 (s)	Run 3 (s)
1	4	13.16	12.16	12.55
1	1	33.07	30.18	29.51
2	4	16.11	15.10	16.67
2	1	17.06	15.52	15.50
3	4	16.91	14.70	14.60
3	1	16.47	14.40	15.22
4	4	21.95	22.00	21.61
4	1	21.94	19.85	19.94
5	4	0.01	0.00	0.00
5	1	0.01	0.00	0.00
6	4	42.44	42.02	41.80
6	1	44.37	42.65	40.64

First, the only query that got faster with parallelization was the "count of all rows" from the previous post.

The 10 random rows via the IN clause are an ordinary PK lookup, as it has always been, so they are blazing fast with or without parallelization.
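To make the comparison easier to read, here is a small sketch that averages the three runs from the table above and computes the ratio between 1 thread and 4 threads. The timings are copied from the table. The `speedup` helper is my own illustrative name, and condition 5 is left out because its timings round to 0.00 seconds.

```python
from statistics import mean

# (condition, innodb_parallel_read_threads) -> three timings in seconds,
# copied from the results table.
timings = {
    (1, 4): [13.16, 12.16, 12.55],
    (1, 1): [33.07, 30.18, 29.51],
    (2, 4): [16.11, 15.10, 16.67],
    (2, 1): [17.06, 15.52, 15.50],
    (3, 4): [16.91, 14.70, 14.60],
    (3, 1): [16.47, 14.40, 15.22],
    (4, 4): [21.95, 22.00, 21.61],
    (4, 1): [21.94, 19.85, 19.94],
    (6, 4): [42.44, 42.02, 41.80],
    (6, 1): [44.37, 42.65, 40.64],
}


def speedup(condition):
    """Mean time with 1 thread divided by mean time with 4 threads."""
    return mean(timings[(condition, 1)]) / mean(timings[(condition, 4)])


for cond in (1, 2, 3, 4, 6):
    print(f"condition {cond}: {speedup(cond):.2f}x")
```

With these numbers, only condition 1 comes out well above 1.0x (a bit over 2.4x), while the others stay close to 1.0x.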

Still, what separates the queries that benefit from parallelization from the ones that do not?
I pulled the execution plans, and they look like this:

Condition	select_type	type	possible_keys	key	Extra
1	SIMPLE	index	NULL	PRIMARY	Using index
2	SIMPLE	range	PRIMARY	PRIMARY	Using where; Using index
3	SIMPLE	range	PRIMARY	PRIMARY	Using where; Using index
4	SIMPLE	range	PRIMARY	PRIMARY	Using where; Using index
5	SIMPLE	range	PRIMARY	PRIMARY	Using where; Using index
6	SIMPLE	index	NULL	PRIMARY	Using index

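Conditions 2 through 5 do have a different plan from condition 1, fair enough, but condition 6 has exactly the same plan as condition 1 and still gets no benefit from parallelization.
Hmm. I don't get it.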

Just as I was about to give up, someone let me know that it is written in the worklog!

The worklog says "only if the request is a non-locking SELECT COUNT(*)." https://t.co/qCAbRHkAIC
— kentarokitagawa (@keny_lala) January 23, 2019

Read the sub trees of an index in parallel only if the request is a non-locking SELECT COUNT(*).
Allow for "physical" read (a.k.a read uncommitted) and logical read using MVCC rules.
Additionally refactor the persistent cursor code.
Current scope is limited to providing sufficient infrastructure for DDL operations to read the data in parallel. Making the second phase of CHECK TABLE parallel is an added bonus for now. This speeds up CHECK TABLE a little.
The parallel SELECT COUNT(*) ...; is required because we would like to test the changes independent of any other external component.
We will not implement any locking by the parallel read threads. That is a bigger project and can be done as follow up work. This is because some assumptions inside InnoDB have to change and we will have to handle lock waits and coordinate the rollback in the reader threads.
MySQL :: WL#11720: InnoDB: Parallel read of index

Hm?