
Resolving CPU exhaustion caused by heavy latch free waits on a production database

    Posted on 2021-5-8 16:37:49

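    Around noon, CPU usage on one of our production databases stayed persistently high.

    We used the following SQL statement to check the waits and contention in the database at that time: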

    select s.SID,
           s.SERIAL#,
           'kill -9 ' || p.SPID,
           s.MACHINE,
           s.OSUSER,
           s.PROGRAM,
           s.USERNAME,
           s.last_call_et,
           a.SQL_ID,
           s.LOGON_TIME,
           a.SQL_TEXT,
           a.SQL_FULLTEXT,
           w.EVENT,
           a.DISK_READS,
           a.BUFFER_GETS
      from v$process p, v$session s, v$sqlarea a, v$session_wait w
     where p.ADDR = s.PADDR
       and s.SQL_ID = a.sql_id
       and s.sid = w.SID
       and s.STATUS = 'ACTIVE'
     order by s.last_call_et desc;

    The EVENT column shows that latch contention is the cause.
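
For a quick overview before drilling into individual sessions, the active sessions can also be grouped by wait event. This is only a sketch; it assumes Oracle 10g or later, where v$session exposes the EVENT and WAIT_CLASS columns directly. The alias session_count is made up for this example.

```sql
-- Count active sessions per wait event, ignoring idle waits.
select s.event,
       count(*) as session_count
  from v$session s
 where s.status = 'ACTIVE'
   and s.wait_class <> 'Idle'
 group by s.event
 order by session_count desc;
```

If one event dominates the list, that is the one to investigate first.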


    The following SQL shows which latch is involved:

    select * from v$session_wait 
    where event  like 'latch free';
     

    P2 is the latch number. Looking it up by LATCH# in the v$latchname view tells us exactly which latch it is:

    1:45:55 PM SQL> select * from v$latchname where latch#=164;
     
        LATCH# NAME                                                                   HASH
    ---------- ---------------------------------------------------------------- ----------
           164 simulator hash latch                                             2233208730
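
The two lookups above can probably be combined into one query, so that the latch name appears next to each waiting session. A minimal sketch, assuming the sessions are still waiting on 'latch free' when it runs; the aliases latch_number and latch_name are made up for this example.

```sql
-- Resolve P2 (the latch number) to a latch name for every session
-- currently waiting on 'latch free'.
select w.sid,
       w.p2   as latch_number,
       n.name as latch_name,
       w.seconds_in_wait
  from v$session_wait w, v$latchname n
 where w.event = 'latch free'
   and w.p2 = n.latch#
 order by w.seconds_in_wait desc;
```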


    Check the cumulative latch statistics:

    2:11:59 PM SQL> select name,gets,misses,sleeps from v$latch where sleeps >0 order by sleeps desc;
     
    NAME                                                                   GETS     MISSES     SLEEPS
    ---------------------------------------------------------------- ---------- ---------- ----------
    simulator hash latch                                             4827860212  135426899   10890947
    cache buffers chains                                             1619822817 2850976006    4747728
    gc element                                                       4660052091   25748270     175073
    resmgr:schema config                                               91872524     153968      95708
    ges resource hash list                                            174151449    1070556      55459
    Real-time plan statistics latch                                    40953155     651496      44527
    call allocation                                                     3301878     265908      43501
    row cache objects                                                 336300485    4970324      19366


    The simulator hash latch clearly stands out as the most contended latch.
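
Raw counters are easier to compare as ratios. The sketch below adds a miss percentage and a sleeps-per-miss percentage to the same v$latch query; the aliases miss_pct and sleep_pct are made up for this example. Keep in mind that v$latch is cumulative since instance startup, so to judge the current situation it may be better to take two snapshots and compare the deltas.

```sql
-- Latch statistics with derived ratios, most sleeps first.
select name,
       gets,
       misses,
       sleeps,
       round(misses * 100 / gets, 2)   as miss_pct,   -- share of gets that missed
       round(sleeps * 100 / misses, 2) as sleep_pct   -- share of misses that slept
  from v$latch
 where gets > 0
   and misses > 0
 order by sleeps desc;
```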

    eygle has an article on his site about the simulator latches:

    http://www.eygle.com/archives/2011/11/simulator_lru_latch.html

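    "Simulator" means simulation. When Oracle processes data blocks in memory, it also records related information, such as the DBA (data block address), in a pre-allocated buffer. After a block has been aged out, if the data requested on the next read is still found in the simulator memory, Oracle concludes that keeping that block cached would have been worthwhile. By monitoring and simulating these operations, and weighting the results, Oracle can produce sizing advice for memory.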


    The simulation itself is also protected by latches. The latches involved include simulator lru latch and simulator hash latch.
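
The advice that this simulation produces for the buffer cache is exposed in v$db_cache_advice. The query below is a sketch of how it could be read, assuming the advisory is currently switched on; it is not taken from the original post.

```sql
-- Estimated physical reads for each simulated buffer cache size.
select name,
       block_size,
       size_for_estimate,          -- simulated cache size in MB
       size_factor,                -- relative to the current cache size
       estd_physical_read_factor,
       estd_physical_reads
  from v$db_cache_advice
 where advice_status = 'ON'
 order by name, block_size, size_for_estimate;
```

If nobody on the team uses this output, the cost of collecting it is harder to justify.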

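    For the buffer cache, if this kind of contention is severe, consider turning off db_cache_advice to remove the performance impact of this internal bookkeeping.
    Below is a related bug, in which enabling DB_CACHE_ADVICE caused severe simulator lru latch contention: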

    Bug 5918642  Heavy latch contention with DB_CACHE_ADVICE on

     This note gives a brief overview of bug 5918642.  
     The content was last updated on: 01-APR-2008

    Affects:

    Product (Component) Oracle Server (Rdbms)
    Range of versions believed to be affected Versions < 11.2
    Versions confirmed as being affected
    Platforms affected Generic (all / most platforms affected)

    Fixed:

    This issue is fixed in

    Symptoms:

    Related To:

    Description

    High simulator lru latch contention can occur when db_cache_advice is
    set to ON if there is a large buffer cache.
    
    
    Workaround:
      Set db_cache_advice to OFF
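
The workaround could be applied roughly as follows. This is a sketch, not the exact commands used in this incident. db_cache_advice can be changed with ALTER SYSTEM on a running instance. No SCOPE clause is given here, so the default applies; add one to suit your parameter file setup.

```sql
-- Check the current setting first.
select name, value
  from v$parameter
 where name = 'db_cache_advice';

-- Stop the buffer cache advisory and release its memory.
alter system set db_cache_advice = off;

-- Alternative: READY stops the advisory but keeps its memory allocated,
-- which makes it easier to switch back to ON later.
-- alter system set db_cache_advice = ready;
```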
    

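    Of course, this only treats the symptom. The latch contention is just the visible surface of the problem; the root cause is a problematic SQL statement.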

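    When a data block is read into the SGA, its buffer header is placed on a linked list (hash chain) attached to a hash bucket. This memory structure is protected by a set of cache buffers chains child latches (also known as hash latches or CBC latches). To select, update, insert, or delete a block in the buffer cache, a session must first acquire the relevant cache buffers chains child latch, which guarantees exclusive access to the chain.

    If contention occurs at this point, the session waits on the latch: cache buffers chains event.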
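
To see whether the contention is spread evenly or concentrated on a few child latches, the per-child statistics in v$latch_children can be checked. A minimal sketch; the limit of 10 rows is arbitrary.

```sql
-- The ten cache buffers chains child latches with the most sleeps.
select *
  from (select addr,
               child#,
               gets,
               misses,
               sleeps
          from v$latch_children
         where name = 'cache buffers chains'
         order by sleeps desc)
 where rownum <= 10;
```

A few children with far more sleeps than the rest would point toward hot blocks rather than uniformly high logical reads.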

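    Causes:

    1. Inefficient SQL statements (mainly excessive logical reads). In some environments, the application opens many concurrent sessions that all run the same inefficient SQL statement and all go after the same set of data. Statements with high BUFFER_GETS (logical reads) on every execution are the main cause. Conversely, fewer logical reads mean fewer latch gets, which reduces latch contention and improves performance. Watch for statements in v$sql with a large BUFFER_GETS/EXECUTIONS ratio.

    2. Hot blocks. A hot block arises when many sessions repeatedly access one or more blocks protected by the same cache buffers chains child latch. When many sessions contend for that child latch, this wait event appears. Sometimes, even after the SQL has been tuned, hot blocks still occur if many sessions run it at the same time, even when it scans only a small number of specific blocks.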
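
The first cause can be checked with a query along these lines. It is a sketch only; the alias gets_per_exec and the limit of 20 rows are made up for this example.

```sql
-- Top 20 cursors by logical reads per execution.
select *
  from (select sql_id,
               executions,
               buffer_gets,
               round(buffer_gets / executions) as gets_per_exec
          from v$sql
         where executions > 0
         order by buffer_gets / executions desc)
 where rownum <= 20;
```

Statements that combine a high gets_per_exec with a high execution count are the most likely to drive cache buffers chains contention.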

    The statement that caused the problem in our case:

    SELECT P935.SEQUENCEID,
           null FA_SEQUENCEID,
           P935.ORDERID,
           P935.ORGORDERID,
           P935.PRODUCTNAME,
           P935.PRODUCTNUM,
           P935.ORDERTIME,
           P935.LASTUPDATETIME,
           P935.ORDERSTATUS,
           P935.MEMO,
           935 orderCode,
           P935.PAYERACCTCODE,
           P935.PAYERACCTTYPE,
           P935.PAYEEACCTCODE PLATACCTCODE,
           P935.PAYEEACCTTYPE PLATACCTTYPE,
           P936.PAYEEACCTCODE,
           P936.PAYEEACCTTYPE,
           EXT935.PAYER_DISPLAYNAME,
           EXT935.PAYER_NAME,
           EXT935.PAYER_IDC,
           EXT935.PAYER_MEMBERTYPE,
           EXT936.PAYER_DISPLAYNAME PLAT_DISPLAYNAME,
           EXT936.SUBMITNAME PLAT_NAME,
           EXT936.PAYER_IDC PLAT_IDC,
           EXT936.PAYER_MEMBERTYPE PLAT_MEMBERTYPE,
           EXT936.PAYEE_DISPLAYNAME,
           EXT936.PAYEE_NAME,
           EXT936.PAYEE_IDC,
           EXT936.PAYEE_MEMBERTYPE,
           P935.PAYEEDISPLAYNAME WEBSITENAME,
           CASE
             WHEN (SELECT count(*)
                     FROM PAYMENTORDER P936
                    WHERE P936.Ordercode = 936
                      and P936.Orderstatus = 0
                      -- highlighted in red in the original post
                      AND P936.Relatedsequenceid = P935.SEQUENCEID) > 0 THEN
              0
             ELSE
              1
           END AS SHARINGRESULT,
           CASE D935.Dealcode
             WHEN 210 then
              14
             else
              D935.DEALTYPE
           end PAYMETHOD,
           D935.DEALAMOUNT,
           G935.EXT1,
           G935.Ext2,
           G935.PAYERCONTACTTYPE,
           G935.PAYERCONTACT,
           NVL(D935.PAYEEFEE, 0) PAYEEFEE,
           NVL(D935.PAYERFEE, 0) PAYERFEE,
           nvl(MS936.PAYEEFEE, 0) PLATFORMFEE,
           P935.VERSION
      FROM PAYMENTORDER          P935,
           PAYMENTORDER          P936,
           DEAL                  D935,
           GATEWAYORDER          G935,
           MSGATEWAYSHARINGORDER MS936,
           PAYMENTORDEREXT       EXT935,
           PAYMENTORDEREXT       EXT936
     WHERE P936.ORDERCODE = 936
       AND P935.ORDERCODE = 935
       AND P936.RELATEDSEQUENCEID = to_char(P935.SEQUENCEID)
       AND P935.SEQUENCEID = G935.SEQUENCEID(+)
       AND P935.SEQUENCEID = D935.ORDERSEQID(+)
       AND P935.SEQUENCEID = EXT935.ORDERSEQID(+)
       AND P936.SEQUENCEID = EXT936.ORDERSEQID(+)
       AND P936.SEQUENCEID = MS936.SEQUENCEID(+)
       AND MS936.SHARINGTYPE = 1
       AND P935.SEQUENCEID = :1
    UNION
    SELECT P938.SEQUENCEID,
           P935.SEQUENCEID FA_SEQUENCEID,
           P938.ORDERID,
           P938.ORGORDERID,
           P935.PRODUCTNAME,
           P935.PRODUCTNUM,
           P938.ORDERTIME,
           P938.LASTUPDATETIME,
           P938.ORDERSTATUS,
           P938.MEMO,
           938 orderCode,
           P938.PAYERACCTCODE,
           P938.PAYERACCTTYPE,
           P938.PAYEEACCTCODE PLATACCTCODE,
           P938.PAYEEACCTTYPE PLATACCTTYPE,
           P938.PAYEEACCTCODE,
           P938.PAYEEACCTTYPE,
           EXT938.PAYER_DISPLAYNAME,
           EXT938.PAYER_NAME,
           EXT938.PAYER_IDC,
           EXT938.PAYER_MEMBERTYPE,
           EXT938.PAYEE_DISPLAYNAME PLAT_DISPLAYNAME,
           EXT938.SUBMITNAME PLAT_NAME,
           EXT938.PAYEE_IDC PLAT_IDC,
           EXT938.PAYEE_MEMBERTYPE PLAT_MEMBERTYPE,
           EXT938.PAYEE_DISPLAYNAME,
           EXT938.PAYEE_NAME,
           EXT938.PAYEE_IDC,
           EXT938.PAYEE_MEMBERTYPE,
           P935.PAYEEDISPLAYNAME WEBSITENAME,
           null SHARINGRESULT,
           D938.DEALTYPE PAYMETHOD,
           D938.DEALAMOUNT,
           G935.EXT1,
           G935.Ext2,
           G935.PAYERCONTACTTYPE,
           G935.PAYERCONTACT,
           NVL(D938.PAYEEFEE, 0) PAYEEFEE,
           NVL(D938.PAYERFEE, 0) PAYERFEE,
           0 PLATFORMFEE,
           P935.VERSION
      FROM PAYMENTORDER    P935,
           PAYMENTORDER    P938,
           DEAL            D938,
           GATEWAYORDER    G935,
           PAYMENTORDEREXT EXT938
     WHERE P935.ORDERCODE = 935
       AND P938.ORDERCODE = 938
       AND P938.RELATEDSEQUENCEID = to_char(P935.SEQUENCEID)
       AND P935.SEQUENCEID = G935.SEQUENCEID(+)
       AND P938.SEQUENCEID = D938.ORDERSEQID(+)
       AND P938.SEQUENCEID = EXT938.ORDERSEQID(+)
       AND P935.SEQUENCEID = :2

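    Looking at the SQL above: in the predicate P936.Relatedsequenceid = P935.SEQUENCEID inside the scalar subquery (highlighted in red in the original post), the left side of the equals sign is VARCHAR2 and the right side is NUMBER. This causes an implicit data type conversion, which has a severe performance impact.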
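
When a character column is compared with a number, Oracle converts the character side to a number, so a normal index on that column can no longer be used to look up the value. The effect can be reproduced on a small test table. Everything in this sketch (the table t_demo, its columns, and the index) is hypothetical and is not part of the original schema.

```sql
-- Hypothetical demo table: a VARCHAR2 column that stores numeric ids.
create table t_demo (id number, related_id varchar2(32));
create index t_demo_rel_idx on t_demo (related_id);

-- Comparing the VARCHAR2 column with a NUMBER: the Predicate Information
-- section of the plan should show TO_NUMBER applied to RELATED_ID.
explain plan for
  select count(*) from t_demo where related_id = 12345;
select * from table(dbms_xplan.display);

-- Converting the NUMBER side instead leaves the column untouched,
-- so the index on RELATED_ID remains usable for the lookup.
explain plan for
  select count(*) from t_demo where related_id = to_char(12345);
select * from table(dbms_xplan.display);

drop table t_demo;
```

The outer WHERE clause of the statement above already follows the second pattern, with P936.RELATEDSEQUENCEID = to_char(P935.SEQUENCEID). Applying the same to_char to the subquery predicate is one plausible fix. The post does not say exactly what the developers changed.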

    We contacted the developers, who changed the SQL statement, and the problem was resolved.
