Q&I is very high

Last post 10-02-2020, 7:45 AM by RHor. 6 replies.
  • Q&I is very high
    Posted: 09-21-2020, 5:38 AM

    Hi,

    I have a performance issue with the DDB, but I am unable to locate its root cause. The problem is that Q&I times are getting so high that no new clients can be added to the SP. I don't suspect hardware: the MAs are built on healthy hardware and don't exceed sizing, there are no hardware issues, and there are no problems in the event log. I've tested the DDB NVMe disk with Iometer and got ~50k IOPS at 160 us avg. response time (8 workers), and up to ~126k IOPS at 380 us avg. response time (48 workers), which is far more than what is required for this setup (2-node GridStor, Extra Large config):
    https://documentation.commvault.com/commvault/v11_sp16/article?p=1648.htm
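As a quick sanity check on the Iometer figures above, Little's Law (I/Os in flight = throughput x latency) can be applied to both runs. This is only a sketch: the IOPS and latency values are the ones quoted in the post, while treating each Iometer worker as roughly one outstanding I/O is an assumption, since the test configuration is not given.

```python
# Little's Law sanity check: I/Os in flight ~= IOPS * average latency.
# IOPS and latency are the figures quoted in the post. Mapping one
# worker to one outstanding I/O is an assumption, not a stated setting.

def implied_outstanding_ios(iops: float, avg_latency_us: float) -> float:
    """Return the average number of I/Os in flight implied by Little's Law."""
    return iops * (avg_latency_us / 1_000_000)

# Keyed by the number of Iometer workers reported for each run.
results = {
    8: implied_outstanding_ios(50_000, 160),
    48: implied_outstanding_ios(126_000, 380),
}

for workers, in_flight in results.items():
    print(f"{workers} workers -> ~{in_flight:.1f} I/Os in flight")
```

Both runs come out very close to the worker count (about 8 and about 48), so the reported numbers are at least internally consistent with a disk that keeps up with the offered load.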

    There are 2 DDBs per GRID, one is around 4x the size of the other and also where Q&I performance hurts the most, so I am focusing there.

    Commcell version:                     SP16 HPK61 (LTS)
    MediaAgent setup:                     2-Node GRID
    # DDBs per MA:                        2
    # partitions per DDB:                 2
    DDB disk type:                        NVMe
    DDB disk size:                        2TB

    DDB1:
    size:                                 900 GB
    partition size:                       450 GB
    Total App size:                       1.25 PB
    Total data on disk:                   390 TB
    Total # of unique blocks:             2,200,000,000
    # of pending deletes:                 0
    Baseline size:                        50 TB [Application size 90 TB]
    Avg. Q&I last 3 days:                 1,850 us
    Avg. Q&I last 14 days:                1,645 us
    Transactional DDB:                    NO
    Garbage Collection:                   NO
    Pruning logs for reconstruction:      NO
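For anyone comparing against their own environment, a few derived ratios can be computed from the DDB1 figures above. This is a rough sketch only: it assumes PB/TB/GB are decimal units, and the variable names are illustrative rather than Commvault terminology.

```python
# Derived ratios from the DDB1 figures in the post.
# Assumption: PB/TB/GB are decimal units (powers of 1000).

TB = 1000**4
GB = 1000**3

app_size_bytes = 1250 * TB        # Total App size: 1.25 PB
data_on_disk_bytes = 390 * TB     # Total data on disk: 390 TB
ddb_size_bytes = 900 * GB         # DDB size: 900 GB
unique_blocks = 2_200_000_000     # Total # of unique blocks

# Application size divided by what is actually stored on disk.
dedup_ratio = app_size_bytes / data_on_disk_bytes
# Average amount of stored data per unique block, in KB.
avg_stored_per_block_kb = data_on_disk_bytes / unique_blocks / 1000
# Average DDB footprint per unique block, in bytes.
ddb_bytes_per_block = ddb_size_bytes / unique_blocks

print(f"Overall reduction ratio: {dedup_ratio:.2f}:1")
print(f"Avg stored data per unique block: {avg_stored_per_block_kb:.0f} KB")
print(f"DDB bytes per unique block: {ddb_bytes_per_block:.0f}")
```

Under these assumptions the figures work out to roughly 3.2:1 overall reduction, about 177 KB of stored data per unique block, and about 409 bytes of DDB per unique block.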

    So the questions are:
    1) Any idea why I would get such high Q&I times?
    2) What else can I check?
    3) Can anyone share their DDB statistics from a similar setup?



    Thanks,
    Robert
  • Re: Q&I is very high
    Posted: 09-21-2020, 8:17 AM

    Hi,

    Which service pack are you on? It depends on how many primary and secondary records are in the DDB. We have recently made a few enhancements on the DDB side. Talk to support and convert the existing DDB to V5 (Mark and Sweep). If needed, increase IDX and DAT memory.

     

    Talk to support; they can help you with these steps.

  • Re: Q&I is very high
    Posted: 09-21-2020, 8:36 AM

    Hi,

    This is SP16 HPK61.

    Conversion to DDB V5 is already planned, but I can't do it yet: the SLA is pretty strict and the conversion requires downtime.

    As for increasing IDX and DAT memory, it is something I'm aware of but have not done before, so I guess support is the way to go here.

    Is there anything else that I can do?


    Thanks,
    Robert
  • Re: Q&I is very high
    Posted: 09-22-2020, 7:07 AM

    Hi Nirav,

    I'm also interested in something you mentioned

    Nirav kapadia:

    [..]

    It depends on how many primary and secondary records are in DDB. 

    [..]

     

    I was looking into it, but was unable to find any documents in BOL mentioning limitations.

    Is there any hard limit or if not, then maybe a specific value when one should begin to worry about primary/secondary records?


    Thanks,
    Robert
  • Re: Q&I is very high
    Posted: 09-23-2020, 4:48 AM

    Judging from the Application Size, Baseline Size and the high Primary Block Count, I suspect that you have a high change rate and do not have extended retention turned on in your Storage Policy. Assuming that you are not running DDB backups at the same time as the jobs, the best thing to help is to use DDB v4 Gen2, which does a daily mark and sweep instead of having to maintain a Primary Block counter.

    That said, how large was the file used for IOMeter (if RAW isn't an option, I try to create as large a file as I can)?

    Also, is this a recent phenomenon: have the DDB graphs done a hockey-stick and trended up? Was it triggered by a recent Data Aging job that aged off a lot of jobs (the process of moving Primary & Secondary records into the Zero-Reference table can be a massive task)? Do you see much in the way of pending deletes?

    Check if you have some clients that are deduping poorly and then create a Storage Policy that just writes compressed backups instead to reduce the load on the DDB.
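One way to shortlist poorly deduping clients is to compare application size with data written per client and flag those with a low ratio. The sketch below is purely illustrative: the client names and sizes are made up, the function name is hypothetical, and the 1.5:1 cutoff is an arbitrary assumption rather than a Commvault recommendation. The real numbers would have to come from your own job or storage reports.

```python
from typing import Dict, List, Tuple

def poorly_deduping_clients(
    sizes_tb: Dict[str, Tuple[float, float]],
    min_ratio: float = 1.5,  # arbitrary illustrative cutoff
) -> List[Tuple[str, float]]:
    """Return (client, ratio) pairs whose app-size/written-size ratio is
    below min_ratio, worst first."""
    flagged = []
    for client, (app_tb, written_tb) in sizes_tb.items():
        if written_tb <= 0:
            continue  # nothing written, ratio is undefined
        ratio = app_tb / written_tb
        if ratio < min_ratio:
            flagged.append((client, ratio))
    return sorted(flagged, key=lambda item: item[1])

# Hypothetical data: client -> (application size in TB, data written in TB)
example = {
    "client-a": (40.0, 8.0),
    "client-b": (12.0, 11.0),
    "client-c": (20.0, 16.0),
}

flagged = poorly_deduping_clients(example)
for client, ratio in flagged:
    print(f"{client}: {ratio:.2f}:1")
```

With the made-up data above, client-b and client-c would be flagged as candidates for a compression-only Storage Policy, while client-a would stay on the deduplicated one.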

    You could also trial reducing the number of connections to the DDB (Control Panel/Media Management = Maximum number of parallel data transfer operations for deduplication engine)

  • Re: Q&I is very high
    Posted: 09-24-2020, 10:38 AM

    Hi Anthony,

     

    "Judging from the Application Size, Baseline Size and the high Primary Block Count - I suspect that you have a high change rate and do not have extended retention turned on in your Storage Policy."

     

    You are right.

     

    "Assuming that you are not doing DDB backups at the same time as when the jobs are running, the best thing to help is to use the DDB v4 Gen2 which does a daily mark and sweep instead of having to maintain a Primary Block counter."

    That is a correct assumption. With support's help we were able to enable Garbage Collection and pruning logs for reconstruction today. Fortunately, it turned out that the time-consuming compactfile secondary wasn't necessary, as this DDB was created on SP14. I hope we will see some results soon.

     

    "That said, how large was the file used for IOMeter (if RAW isn't an option, I try to create as large a file as I can)?"

    It was tested on a 500 GB file, which should be sufficient given that the current partition size is around 450 GB.

    "Also, is this a recent phenomenon - have the DDB graphs done a hockey-stick and trended up? Was it triggered by a recent Data Aging job that aged off a lot of jobs (the process of moving Primary & Secondary records into the Zero-Reference table can be a massive task)? Do you see much in the way of pending deletes?"

    Not exactly. Q&I has been high for some time, but it was closer to 1300-1400 us and lately it started to rise. This has happened before, but it usually went up to around 1600 us and then back to its usual 1300-1400 us. This is the first time it has gone as high as 1800-1900 us.

    "Check if you have some clients that are deduping poorly and then create a Storage Policy that just writes compressed backups instead to reduce the load on the DDB.

    You could also trial reducing the number of connections to the DDB (Control Panel/Media Management = Maximum number of parallel data transfer operations for deduplication engine)"

    These are valid points, but as we made a fairly big change today, I will hold off on additional changes until I see the results. Thanks for the hint. I think I will come back and look into this in 2 weeks, when I'll have fresh 3- and 14-day statistics for the upgraded DDB.

     

    This is all very helpful. Thank you.


    Thanks,
    Robert
  • Re: Q&I is very high
    Posted: 10-02-2020, 7:45 AM

    Hi,

     

    Just a quick follow up: It has been 1 week since we enabled Garbage Collection and Pruning Logs and I can already tell that there is a significant improvement in Q&I times.

    Avg. Q&I time over the last 3 days dropped from over 1800 us to ~700 us, which is great.

    Avg. Q&I time over the last 14 days dropped from 1650 us to ~1500 us, which is just a slight decrease, but it hasn't been 2 weeks yet, so I expect a further decrease in the next week or two.
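To illustrate why the 14-day figure lags behind the 3-day one, here is a simplified model of a 14-day window that mixes "old" and "new" days. It assumes an instant step from ~1650 us to ~700 us and a plain per-day average. Both are assumptions, since the thread does not say how the report weights its averages, and the function name is hypothetical.

```python
def window_average(old_level_us: float, new_level_us: float,
                   window_days: int, days_since_change: int) -> float:
    """Average over a trailing window, assuming Q&I stepped instantly from
    old_level_us to new_level_us and each day carries equal weight."""
    new_days = min(max(days_since_change, 0), window_days)
    old_days = window_days - new_days
    return (old_days * old_level_us + new_days * new_level_us) / window_days

# Levels taken from the figures in this thread: ~1650 us before, ~700 us after.
for day in (0, 3, 7, 10, 14):
    avg = window_average(1650, 700, 14, day)
    print(f"day {day:2d}: 14-day avg ~{avg:.0f} us")
```

In this step model the 14-day average would already be about 1175 us after one week and would reach about 700 us after two. The ~1500 us actually observed after one week is higher, which would be consistent with the improvement arriving gradually rather than all at once, so the 14-day figure still has room to fall.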

     

    I will also keep in mind other options mentioned in this thread for future use cases.

    Thank you Nirav and Anthony for your help.


    Thanks,
    Robert
Copyright © 2020 Commvault | All Rights Reserved.