MySQL Large-Data Query Optimization: An Analysis by Example

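With a small MySQL data set, optimization is unnecessary; with a large one, it is unavoidable. Left unoptimized, a query can take 10 seconds; optimized well, the same query takes 10 milliseconds.

What a painful lesson!

MySQL optimization, in programmer terms, comes down to two things: index optimization and WHERE-condition optimization.

Test environment: MacBook Pro MJLQ2CH/A, MySQL 5.7, data volume: 2.12 million+ rows

ONE: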

 SELECT * FROM article
 INNER JOIN (
   SELECT id
   FROM article
   WHERE
     LENGTH(content_url) > 0 AND
     (SELECT status FROM source WHERE id = article.source_id) = 1 AND
     (SELECT status FROM category WHERE id = article.category_id) = 1 AND
     -- the comparison operator after id was lost in the source; "<" is assumed
     status = 1 AND id < 2164931
   ORDER BY stick DESC, pub_time DESC
   LIMIT 240, 15
 ) AS t
 USING (id);

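At first glance, the experts will want to kill me: why a pointless self-join, and an inner join at that? "You, a few floors up, hand me my butcher's knife, I'm going to slaughter the blogger!"

Honestly, my head did not get slammed in a door on my way out this morning, and I did not want to write it this way either.

1. Once the data set is large and you need paginated queries with a large offset, this really is how you speed things up. The reason: the subquery returns only the ids of the requested page, and the join uses those ids to reach the full rows, which avoids a full table scan.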
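The pattern in item 1 is often called a deferred join. Here is a minimal sketch of the two query shapes side by side. The offset and the simplified WHERE clause are illustrative only, and how much the second form helps depends on the indexes and the data, so measure it on your own tables:

```sql
-- Plain form: the server typically has to materialize full rows for
-- offset + limit entries before throwing away the first 100000.
SELECT *
FROM article
WHERE status = 1
ORDER BY pub_time DESC
LIMIT 100000, 15;

-- Deferred join: the derived table pages over ids only, and the
-- outer query fetches full rows for just the 15 ids that survive.
SELECT a.*
FROM article AS a
INNER JOIN (
    SELECT id
    FROM article
    WHERE status = 1
    ORDER BY pub_time DESC
    LIMIT 100000, 15
) AS t USING (id);
```

Note that the outer query has no ORDER BY of its own, so if the final row order matters, repeat the ORDER BY at the outer level.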

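Look at my ORDER BY (whisper: it's just an ORDER BY, who the hell can't write one). Replace it with columns from your own table and run DESC or EXPLAIN on the query. Extra: Using filesort! Damn!

2. For an ORDER BY with several columns like this one, we usually add a separate index on each of the two columns, and Extra still shows filesort. Take a different route: create one composite index over all the ORDER BY columns, and make sure the column order in the index matches the order in the ORDER BY. After that, Extra is left with only Using where.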
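A minimal sketch of item 2, assuming the article table and the stick and pub_time columns from the query in ONE. The index name idx_stick_pub_time is hypothetical, since the original post does not name it:

```sql
-- Hypothetical index name; the column order mirrors
-- ORDER BY stick DESC, pub_time DESC.
CREATE INDEX idx_stick_pub_time ON article (stick, pub_time);

-- Inspect the plan: if the optimizer picks the composite index,
-- the Extra column should no longer contain "Using filesort".
EXPLAIN
SELECT id
FROM article
WHERE status = 1
ORDER BY stick DESC, pub_time DESC
LIMIT 240, 15;
```

MySQL 5.7 stores index entries in ascending order only, but it can scan an index backward, so this works when every ORDER BY column uses the same direction (all DESC here). A mixed ASC/DESC ordering could not be served by this index in 5.7.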

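Now look at the WHERE: (select status from source where id = article.source_id)=1 and … what kind of crappy style is that!

3. I considered the join + index approach, but in testing it turned out to be almost no different from this one. Production is written this way, so it stays, and it saves two indexes (source_id, category_id). When laziness strikes, nobody can stop it; if it bites me later, I will come back and keep optimizing.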
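The post does not show the join + index variant it tested, so the following is only an assumed reconstruction of what such a rewrite might look like. The index names are hypothetical, and the id range filter from ONE is omitted for brevity:

```sql
-- The two extra indexes the article says this variant would need
-- (names are hypothetical).
CREATE INDEX idx_source_id ON article (source_id);
CREATE INDEX idx_category_id ON article (category_id);

-- Correlated subqueries replaced by inner joins on the lookup tables.
SELECT a.id
FROM article AS a
INNER JOIN source AS s
        ON s.id = a.source_id AND s.status = 1
INNER JOIN category AS c
        ON c.id = a.category_id AND c.status = 1
WHERE LENGTH(a.content_url) > 0
  AND a.status = 1
ORDER BY a.stick DESC, a.pub_time DESC
LIMIT 240, 15;
```

The two forms return the same rows only as long as source.id and category.id are unique (for example, primary keys); otherwise the join could duplicate article rows.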

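4. I only picked this one up last night: WHERE conditions seem to be satisfied starting from the last one, from right to left. Testing with the index dropped did show an effect, with the query going from 6 seconds down to 4 seconds. After the index was optimized, a second test showed that the condition order has a negligible effect on elapsed time, 0.X ms. MySQL does not document a fixed evaluation order for WHERE predicates, so treat this as an observation about this particular query rather than a rule.

TWO: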

 SELECT * FROM article
 INNER JOIN (
   SELECT id FROM article
   WHERE INSTR(IFNULL(title, ''), '戰狼') > 0 AND status != 9
   ORDER BY pub_time DESC
   LIMIT 100, 10
 ) AS t USING (id);

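Hmm, another inner join...

INSTR(IFNULL(title, ''), '戰狼') > 0: why not use LIKE...

1. This is the search in the admin platform, and it does not go through the search engine, because the search engine syncs data only once an hour and its data is incomplete. The admin staff only care about getting the results they want. LIKE %XX% cannot use an index, and in my test it was 5 times slower than INSTR. I also tested regexp .*XX*., which still took a bit longer than INSTR, so I simply went with INSTR.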
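For reference, here are the three substring predicates being compared, written out as standalone queries so they can be timed one by one. This is only a sketch: the post does not show its benchmark queries, and the relative timings will depend on your data and MySQL version.

```sql
-- INSTR: position of the substring, 0 when it is absent.
-- IFNULL turns a NULL title into '' so the comparison is never NULL.
SELECT COUNT(*) FROM article
WHERE INSTR(IFNULL(title, ''), '戰狼') > 0;

-- LIKE with a leading wildcard: cannot use a B-tree index on title.
SELECT COUNT(*) FROM article
WHERE title LIKE '%戰狼%';

-- REGEXP: an unanchored pattern already matches anywhere in the string.
-- In MySQL 5.7, REGEXP compares byte-wise and is not multibyte safe.
SELECT COUNT(*) FROM article
WHERE title REGEXP '戰狼';
```

None of the three can use an ordinary index on title to find the matches, so the differences come from per-row evaluation cost.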

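Run DESC or EXPLAIN on it: filesort..... Add an index on pub_time and look again: still filesort.....

2. There is another option here: SELECT id FROM article FORCE INDEX (pub_time), which explicitly tells MySQL to use that index. But that style is too inflexible, so it is out. A quick Baidu search turned up advice from an expert: create a composite index on status and pub_time (pub_time_status, with the ORDER BY column first), so that the optimizer picks this index on its own when it evaluates the WHERE.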
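The two options in item 2 can be sketched as follows. The index names pub_time and pub_time_status come from the text; the exact DDL is not shown in the post, so the CREATE INDEX statements are an assumption:

```sql
-- Option A: a single-column index plus an explicit hint.
CREATE INDEX pub_time ON article (pub_time);

SELECT id
FROM article FORCE INDEX (pub_time)
WHERE INSTR(IFNULL(title, ''), '戰狼') > 0 AND status != 9
ORDER BY pub_time DESC
LIMIT 100, 10;

-- Option B: a composite index with the ORDER BY column first,
-- leaving the choice of index to the optimizer.
CREATE INDEX pub_time_status ON article (pub_time, status);

EXPLAIN
SELECT id
FROM article
WHERE INSTR(IFNULL(title, ''), '戰狼') > 0 AND status != 9
ORDER BY pub_time DESC
LIMIT 100, 10;
```

In the EXPLAIN output, the key column shows which index was chosen and the Extra column shows whether a filesort is still needed.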

THREE:

 SELECT * FROM article WHERE status != 9 ORDER BY pub_time DESC LIMIT 100000, 25;

Run DESC or EXPLAIN: filesort again..... Didn't I just create a composite index on status and pub_time? Tell me why......

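Well, I don't know either. I created a second composite index on status and pub_time, status_pub_time, this time with the WHERE column first. EXPLAIN no longer shows filesort, yet this new index is not the one being used; instead, it somehow got pub_time_status picked. I don't get it.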
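To reproduce this observation, the second index and the plan check might look like the sketch below. The index name status_pub_time is from the text; the DDL itself is assumed:

```sql
-- Second composite index, this time with the WHERE column first.
CREATE INDEX status_pub_time ON article (status, pub_time);

-- List the indexes now defined on the table.
SHOW INDEX FROM article;

-- possible_keys lists the candidates, key shows the index actually
-- chosen, and Extra shows whether "Using filesort" is still present.
EXPLAIN
SELECT * FROM article
WHERE status != 9
ORDER BY pub_time DESC
LIMIT 100000, 25;
```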

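I also ran EXPLAIN on the SQL from TWO; both queries give the result shown in the figure:

[Figure: EXPLAIN output]

Neither of these two indexes can be dropped. Drop either one, and some SQL statement goes back to filesort!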

FOUR:

 -- Statement 1: the two branches combined with OR
 SELECT * FROM follow
 WHERE (((SELECT status FROM source WHERE id = follow.source_id) = 1 AND follow.type = 1)
     OR ((SELECT status FROM topic WHERE id = follow.source_id) = 1 AND follow.type = 2))
   AND user_id = 10054
 ORDER BY sort LIMIT 15, 15;

 -- Statement 2: the same filter, wrapped in a self-join on id
 SELECT * FROM follow INNER JOIN (
   SELECT id FROM follow
   WHERE (((SELECT status FROM source WHERE id = follow.source_id) = 1 AND follow.type = 1)
       OR ((SELECT status FROM topic WHERE id = follow.source_id) = 1 AND follow.type = 2))
     AND user_id = 10054
   ORDER BY sort LIMIT 15, 15
 ) AS t USING (id);

 -- Statement 3: the two branches combined with UNION ALL
 (SELECT id, source_id, user_id, temporary, sort, follow_time, read_time, type FROM follow
  WHERE (SELECT status FROM source WHERE id = follow.source_id) = 1 AND follow.type = 1 AND user_id = 10054)
 UNION ALL
 (SELECT id, source_id, user_id, temporary, sort, follow_time, read_time, type FROM follow
  WHERE (SELECT status FROM topic WHERE id = follow.source_id) = 1 AND follow.type = 2 AND user_id = 10054)
 ORDER BY sort LIMIT 15, 15;

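Look at these three SQL statements. Interesting, aren't they?

To keep the comparison fair, I had already optimized the index: user_id_sort(user_id, sort), so that the WHERE filter on user_id picks up this index.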
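A minimal sketch of that index and a plan check, assuming the follow table from the queries in FOUR. The index name and columns are from the text; the DDL is assumed:

```sql
-- Equality column first, sort column second.
CREATE INDEX user_id_sort ON follow (user_id, sort);

-- With user_id fixed by an equality predicate, the index entries for
-- that user are already ordered by sort, so no filesort is needed.
EXPLAIN
SELECT id
FROM follow
WHERE user_id = 10054
ORDER BY sort
LIMIT 15, 15;
```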

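Statement 1: 0.48 ms

Statement 2: 0.42 ms

Statement 3: 6 ms. It takes so much longer because, after the UNION (which queries the table twice and merges the results into a derived table), the index can no longer cover the ORDER BY on sort.

Sometimes UNION is not necessarily faster than OR.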
