mysql - slow query due to high index cardinality

mysql - slow query due to high index cardinality



I've got the following mysql query taking approx 55 seconds to complete


SELECT this_.id AS y0_ FROM event this_
INNER JOIN member m1_ ON this_.member_id=m1_.id
INNER JOIN event_type et2_ ON this_.type_id=et2_.id
WHERE m1_.submission_id=40646 AND et2_.name IN ('Salary')
ORDER BY m1_.ni_number ASC, m1_.ident1 ASC, m1_.ident2 ASC, m1_.ident3 ASC, m1_.id ASC, et2_.name ASC LIMIT 15;



If I remove the join/where/order to the 'event_type' table, then the query runs in under 1 second.



So something clearly up with my join to the 'event_type' table, but a similar query in another database with similar database volumes runs absolutely fine. So my suspicion is something wrong with this 1 database.



The 'show create table' of the 'event' table is:


Create Table: CREATE TABLE `event` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`version` bigint(20) NOT NULL,
`data_size` bigint(20) DEFAULT NULL,
`encoded_data` mediumblob,
`last_updated` datetime NOT NULL,
`member_id` bigint(20) NOT NULL,
`parent_event_id` bigint(20) DEFAULT NULL,
`status` varchar(255) DEFAULT NULL,
`type_id` bigint(20) NOT NULL,
`updated_by` varchar(255) NOT NULL,
`failed_workflow_case` varchar(255) DEFAULT NULL,
`failed_workflow_task` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `FK5C6729A2434DA80` (`member_id`),
KEY `FK5C6729AE4E22C6E` (`type_id`),
KEY `IND_parent_event_id` (`parent_event_id`),
CONSTRAINT `FK5C6729A2434DA80` FOREIGN KEY (`member_id`) REFERENCES `member` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `FK5C6729AE4E22C6E` FOREIGN KEY (`type_id`) REFERENCES `event_type` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=46241198 DEFAULT CHARSET=latin1



The EXPLAIN of the query is:


+----+-------------+-------+------------+--------+-------------------------------------+-------------------+---------+--------------------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+-------------------------------------+-------------------+---------+--------------------------+------+----------+----------------------------------------------+
| 1 | SIMPLE | et2_ | NULL | ref | PRIMARY,IND_name | IND_name | 257 | const | 1 | 100.00 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | this_ | NULL | ref | FK5C6729A2434DA80,FK5C6729AE4E22C6E | FK5C6729AE4E22C6E | 8 | iconnect.et2_.id | 3303 | 100.00 | NULL |
| 1 | SIMPLE | m1_ | NULL | eq_ref | PRIMARY,IND_submission_id | PRIMARY | 8 | iconnect.this_.member_id | 1 | 5.00 | Using where |
+----+-------------+-------+------------+--------+-------------------------------------+-------------------+---------+--------------------------+------+----------+----------------------------------------------+



The indexes from the 'event' table are:


+-------+------------+---------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+---------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| event | 0 | PRIMARY | 1 | id | A | 14307622 | NULL | NULL | | BTREE | | |
| event | 1 | FK5C6729A2434DA80 | 1 | member_id | A | 4680601 | NULL | NULL | | BTREE | | |
| event | 1 | FK5C6729AE4E22C6E | 1 | type_id | A | 4360 | NULL | NULL | | BTREE | | |
| event | 1 | IND_parent_event_id | 1 | parent_event_id | A | 114404 | NULL | NULL | YES | BTREE | | |
+-------+------------+---------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+



What jumps out at me from this is:
1. why is the EXPLAIN going through 3303 rows and the FK5C6729AE4E22C6E index?
2. why does the FK5C6729AE4E22C6E index have a cardinality of 4360 when there are only 17 rows in the 'event_type' table? could this incorrect cardinality be affecting the query optimizer?



I've done an ANALYZE TABLE on both 'event' and 'event_type' and this has made no difference.



Any suggestions?



execute plan from other server with the same data (loaded from a dump file):


+----+-------------+-------+------------+------+-------------------------------------+-------------------+---------+-----------------+-------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-------------------------------------+-------------------+---------+-----------------+-------+----------+----------------------------------------------+
| 1 | SIMPLE | et2_ | NULL | ALL | PRIMARY | NULL | NULL | NULL | 17 | 10.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | m1_ | NULL | ref | PRIMARY,IND_submission_id | IND_submission_id | 8 | const | 27992 | 100.00 | NULL |
| 1 | SIMPLE | this_ | NULL | ref | FK5C6729A2434DA80,FK5C6729AE4E22C6E | FK5C6729A2434DA80 | 8 | iconnect.m1_.id | 3 | 11.11 | Using where |
+----+-------------+-------+------------+------+-------------------------------------+-------------------+---------+-----------------+-------+----------+----------------------------------------------+





stackoverflow.com/questions/13633406/…
– juergen d
Aug 21 at 7:40





a) The statistic is an estimate. MySQL samples random parts, (depending by your configuration), and extrapolates; a factor 200 can indeed have an impact on index choice though b) the most relevant table seems to be member (and specifically indexes on submission_id or ni_number), not event. Please add that table, and make sure they are identical on both servers c) since you already seem to have the correct execution plan on your other server: please add it. Then try it (by e.g. forcing the indexes). Figuring out why MySQL doesn't find that plan on its own would be the 2nd step.
– Solarflare
Aug 21 at 9:31



member


submission_id


ni_number


event





Thanks for the update; that execution plan is about what I expected; is the number of rows per submission id correct? In that case, the incorrect statistics seem to be the problem (3k < 27k, while it actually should be (700k > 27k), and I assume they are (more) correct on the other server. Check innodb_stats_persistent_sample_pages (assuming innodb_stats_persistent is enabled); I'd assume the default (20) should be fine, but make sure noone lowered or disabled it. In any case, the current value seems to be too low, so increase it and try analyze again.
– Solarflare
Aug 21 at 14:42


innodb_stats_persistent_sample_pages


innodb_stats_persistent


analyze





There are 43k event records for 14k members with the submission_id specified, 14k of those event records are for the event_type record specified by the name arg. So my query should be finding 14k results but returning only the first 15. I'm failing to grasp how that is relating to the rows returns in either of the execution plans.
– shuttsy
Aug 22 at 12:51





The innodb stats are persisted and the sample_pages is 20 on both servers.
– shuttsy
Aug 22 at 12:52




1 Answer
1



Increasing innodb_stats_persistent_sample_pages from 20 to 100, then running ANALYZE TABLE on event/member tables changed the cardinality of the indexes and the execution plan, then the query ran in under 1 second. Thanks to Solarflare for the suggestion.






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

𛂒𛀶,𛀽𛀑𛂀𛃧𛂓𛀙𛃆𛃑𛃷𛂟𛁡𛀢𛀟𛁤𛂽𛁕𛁪𛂟𛂯,𛁞𛂧𛀴𛁄𛁠𛁼𛂿𛀤 𛂘,𛁺𛂾𛃭𛃭𛃵𛀺,𛂣𛃍𛂖𛃶 𛀸𛃀𛂖𛁶𛁏𛁚 𛂢𛂞 𛁰𛂆𛀔,𛁸𛀽𛁓𛃋𛂇𛃧𛀧𛃣𛂐𛃇,𛂂𛃻𛃲𛁬𛃞𛀧𛃃𛀅 𛂭𛁠𛁡𛃇𛀷𛃓𛁥,𛁙𛁘𛁞𛃸𛁸𛃣𛁜,𛂛,𛃿,𛁯𛂘𛂌𛃛𛁱𛃌𛂈𛂇 𛁊𛃲,𛀕𛃴𛀜 𛀶𛂆𛀶𛃟𛂉𛀣,𛂐𛁞𛁾 𛁷𛂑𛁳𛂯𛀬𛃅,𛃶𛁼

ャフサォクコ ケウ,コ,ワ メ,ロスョノ゙,クネ,フムカヤヲニ,エコ゚ツ ウイオン゙ケワサネォキモュキォウイノンコチ゚メヌナイゥフュ,カヒウネェ ネ,ホノケ,ムュキ ッボーミュハ,チ ツス ィ メウイマヤ,゙ウチ ヅ ロ,ォジヌェ ャヌット ェ,マャ,チナエヒネソキツテ トホヲヲミーァ

Node.js puppeteer - Use values from array in a loop to cycle through pages