follow center 优化之 postgresql 优化

March 29, 2017

select

发现速度很慢了,查message要9s,我的天

取出sql

        select * from ( select * from ( select * from (
            select m.*, c.message_id as collect, c.created_date as collect_date
                from (
    select * from all_message m
    ) m
            LEFT OUTER JOIN collect c
                ON m.id = c.message_id
                and c.user_id=491
        ) s  where created_at > '2016-12-23 22:00:43.999000' ) s ) s where
        lower(s.name) in (
            select lower(name) from god where id in(
                    select god_id from follow_who where user_id=491
                )
        )

     order by created_at  limit 50

加上 EXPLAIN ANALYZE 发现主要是慢在 order by created_at

发现没索引,么,我的错

CREATE INDEX index_message_created_at
ON message (created_at);

然后再查,550ms

不说什么了,我面壁思过去了

sync 会卡住

出在 insertIfNotExist 上

id = pg.insertIfNotExist(pg, 'message', m, "id_str='%s' and m_type='twitter'" % tweet.id_str)

建个联合idnex,看看会不会好一点

CREATE INDEX index_message_id_str_m_type
ON message (id_str, m_type);

查某个god的最近信息要29s

   EXPLAIN ANALYZE  select * from (
        select * from ( select * from (
            select m.*, c.message_id as anki, c.created_date as anki_date
                from (
            select m.*, c.message_id as collect, c.created_date as collect_date
                from (
    select * from message m where 1=1
    ) m
            LEFT OUTER JOIN collect c
                ON m.id = c.message_id
                and c.user_id=491
        ) m
            LEFT OUTER JOIN anki_save c
                ON m.id = c.message_id
                and c.user_id=491
        ) s ) s where  lower(s.user_name)=lower('ehan_bif')
    ) s where
            s.created_at < '2017-03-08 23:01:14.552000'

     order by created_at desc  limit 10

24.6s 天啊

              ->  Index Scan Backward using index_message_created_at on message m  (cost=0.42..106621.41 rows=3758 width=1240) (actual time=11102.946..24699.697 rows=10 loops=1)

如果不限定时间反而只要 8s

把这条时间限定放入内部的 sql 中, 也只要 8s

先优化 8 s 的问题吧. 拆解后, 只是一个 LEFT OUTER JOIN 就要 8 s 了

   EXPLAIN ANALYZE  
       
           
            select m.*, c.message_id as collect, c.created_date as collect_date
                from (
    select * from message m where 1=1
    ) m
            LEFT OUTER JOIN collect c
                ON m.id = c.message_id
                and c.user_id=491

这 8 s 花在了 Hash Cond 上, 因为用了 LEFT OUTER JOIN

lower(s.user_name)=lower('ehan_bif') 原本可以过滤很多数据, 发现因为用了lower, 导致无法使用索引, 取数据这里干脆一致用小写

user_name 条件启用后, 只要几ms, 看来不能乱用 lower, 会让条件过滤不可用.

搜索要花近 25s

拆解为这个sql 还是要 20s

        select * from (
        select * from ( select * from (
            select m.*, c.message_id as anki, c.created_date as anki_date
                from (
            select m.*, c.message_id as collect, c.created_date as collect_date
                from (
    select * from all_message m where 1=1
    ) m
            LEFT OUTER JOIN collect c
                ON m.id = c.message_id
                and c.user_id=491
        ) m
            LEFT OUTER JOIN anki_save c
                ON m.id = c.message_id
                and c.user_id=491
        ) s ) s where
            upper(s.text) like '%怎么快了%' or upper(s.content::text) like '%怎么快了%'

    ) s where
            s.created_at < '2017-03-28 23:53:17.427000'

     limit 10
  ->  Hash Join  (cost=116.98..168887.32 rows=349617 width=1318) (actual time=1.485..20084.332 rows=699525 loops=1)

text 和 s.content::text 没有索引, 导致慢.

建立索引时报出错误:

PostgreSQL said: index row requires 10320 bytes, maximum size is 8191

这时就考虑使用 postgresql 的全文搜索.md


comments powered by Disqus