CV11 SP14 SDT PIPE

Last post 05-23-2019, 2:28 AM by pasko@mkb.ru. 13 replies.
  • CV11 SP14 SDT PIPE
    Posted: 05-13-2019, 10:03 AM

    Hello !

    When backing up archive logs, we often get an SDT PIPE error. The job goes into Pending status; if you click Resume the job completes, but this error remains in the log file:

    RMAN-03009: failure of backup command on ch5 channel at 05/13/2019 15:01:15
    ORA-19502: write error on file "1285784_PROGNOZ_tmu1es8u_1_1", block number 121345 (block size=512)
    ORA-27030: skgfwrt: sbtwrite2 returned error
    ORA-19511: non RMAN, but media manager or vendor specific failure, error text:
    sbtwrite2: Job[1285784] thread[13340]: Unable to write DATA buffer.
    channel ch5 disabled, job failed on it will be run on another channel
    This error occurs on different MediaAgent servers and different database servers.
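
    To line this failure up with the MediaAgent side, it can help to pull the error codes and the job ID out of the RMAN output first. The following is only a minimal sketch: the function name and the regular expressions are illustrative assumptions based on the error text quoted above, not part of any Commvault or Oracle tooling.

```python
import re

# Patterns inferred from the error text quoted in this post (assumption:
# other sbtwrite2 messages follow the same "Job[...] thread[...]" shape).
CODE_RE = re.compile(r"\b(?:RMAN|ORA)-\d{5}\b")
SBT_RE = re.compile(r"sbtwrite2: Job\[(?P<job>\d+)\] thread\[(?P<thread>\d+)\]")

RMAN_OUTPUT = """\
RMAN-03009: failure of backup command on ch5 channel at 05/13/2019 15:01:15
ORA-19502: write error on file "1285784_PROGNOZ_tmu1es8u_1_1", block number 121345 (block size=512)
ORA-27030: skgfwrt: sbtwrite2 returned error
ORA-19511: non RMAN, but media manager or vendor specific failure, error text:
sbtwrite2: Job[1285784] thread[13340]: Unable to write DATA buffer.
"""


def summarize_rman_failure(text):
    """Return the RMAN/ORA codes plus the job and thread named by sbtwrite2."""
    match = SBT_RE.search(text)
    return {
        "codes": CODE_RE.findall(text),
        "job": match.group("job") if match else None,
        "thread": match.group("thread") if match else None,
    }


summary = summarize_rman_failure(RMAN_OUTPUT)
print(summary)
```

    The job ID it returns is the value to search for in cvd.log on the MediaAgent that handled the job.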
  • Re: CV11 SP14 SDT PIPE
    Posted: 05-13-2019, 10:08 AM

    Hi,


    Can you please check the relevant MediaAgent cvd.log for the errors? 

    If the issue persists, please reach out to support and they will help you.

    BTW, please send us ClOraAgent.log/ORASBT.log from the Oracle client and cvd.log from the MA so we can check exactly what is happening.

  • Re: CV11 SP14 SDT PIPE
    Posted: 05-13-2019, 10:18 AM

    I collected all the files and uploaded them as "SDT PIPE.zip".

    Attachment: SDT PIPE.zip
  • Re: CV11 SP14 SDT PIPE
    Posted: 05-13-2019, 10:53 AM
    • efg is not online. Last active: 06-25-2019, 9:27 AM efg
    • Top 10 Contributor
    • Joined on 02-02-2010
    • CommVault Tinton Falls NJ
    • Expert
    • Points 1,658

    Took a quick look at the uploaded logs from the MediaAgent. It appears that the services for the MediaAgent in use were restarted mid-backup:

    2692  408   05/13 15:01:00 1285767 SdtBase::setLastErr: Setting last err [98][Services on the tail side of the SDT pipe are going down.] RCId [12935314]
    2692  408   05/13 15:01:00 1285791 SdtBase::setLastErr: Setting last err [98][Services on the tail side of the SDT pipe are going down.] RCId [12935335]
    2692  408   05/13 15:01:00 1285787 SdtBase::setLastErr: Setting last err [98][Services on the tail side of the SDT pipe are going down.] RCId [12935369]
    2692  408   05/13 15:01:00 1285812 SdtBase::setLastErr: Setting last err [98][Services on the tail side of the SDT pipe are going down.] RCId [12935425]
    2692  408   05/13 15:01:00 1285800 SdtBase::setLastErr: Setting last err [98][Services on the tail side of the SDT pipe are going down.] RCId [12935413]
    2692  408   05/13 15:01:00 1285784 SdtBase::setLastErr: Setting last err [98][Services on the tail side of the SDT pipe are going down.] RCId [12935326]
    2692  408   05/13 15:01:00 1285781 SdtBase::setLastErr: Setting last err [98][Services on the tail side of the SDT pipe are going down.] RCId [12935363]
    2692  408   05/13 15:01:00 1285815 SdtBase::setLastErr: Setting last err [98][Services on the tail side of the SDT pipe are going down.] RCId [12935427]
    2692  408   05/13 15:01:00 1285782 SdtBase::setLastErr: Setting last err [98][Services on the tail side of the SDT pipe are going down.] RCId [12935398]
    2692  1ee8  05/13 15:01:00 1285784 SdtTailServer::AddToReadMask: Server quit flag is set. Client [SDTPipe_prognoz-db2_simpana-ma-04_1285784_1557748854_13340_1058001536_0x126ccbc0], Id [3660] RCId [12935326]
    2692  1ee8  05/13 15:01:00 1285784 SdtTail::onFreeBufferAvailable: Cannot add the client [SDTPipe_prognoz-db2_simpana-ma-04_1285784_1557748854_13340_1058001536_0x126ccbc0], Id [3660] to the select mask. RCId [12935326]
    2692  1ee8  05/13 15:01:00 1285784 SdtTail::onError: Going to wakeup the select thread, so that it can drop the connection. RCId [12935326]
    2692  2a30  05/13 15:01:00 1285793 12935386-# [    DM_RECEIVER] Data Mover Type = [1], DrivePoolType = [10001]
    2692  2a30  05/13 15:01:00 1285793 12935386-# [    DM_RECEIVER] Did not find a DataWriter for the media group [17860]
    2692  2a30  05/13 15:01:00 1285793 12935386-# [    DM_RECEIVER] Instantiating the REGULAR DataMover for MediaGroup Id [17860]
    2692  2a30  05/13 15:01:00 1285793 12935386-# [    DM_RECEIVER] Added a new DataWriter for the media group [17860].. will proceed to mount
    2692  2a30  05/13 15:01:00 1285793 12935386-# [    DM_RECEIVER] USE COUNT for media group [17860] is [1]
    2692  2a30  05/13 15:01:00 1285793 12935386-# [    DM_BASE    ] Initializing DataMoverBase for MediaGroupId = 17860 ...
    2692  1b68  05/13 15:01:00 1285767 SdtTailServer::DropClientWorker: Removing Client [SDTPipe_simpana-ma-04_simpana-ma-04_1285767_1557745554_10768_11068_000000DAFB4A6450], Id [6480] from the list. RCId [12935314]
    2692  1b68  05/13 15:01:00 1285767 SdtTail::onDisconnect: Sending error [98][Services on the tail side of the SDT pipe are going down.] to the head. RCId [12935314]
    2692  2f18  05/13 15:01:00 1285791 SdtTailServer::DropClientWorker: Removing Client [SDTPipe_ts-prime-db1_simpana-ma-04_1285791_1557748847_1536_16476_000000004E989AB0], Id [6204] from the list. RCId [12935335]
    2692  2f18  05/13 15:01:00 1285791 SdtTail::onDisconnect: Sending error [98][Services on the tail side of the SDT pipe are going down.] to the head. RCId [12935335]
    2692  2da4  05/13 15:01:00 1285787 SdtTailServer::DropClientWorker: Removing Client [SDTPipe_twcms-db2_simpana-ma-04_1285787_1557748849_41730_577192544_0x11647470], Id [5956] from the list. RCId [12935369]
    2692  2da4  05/13 15:01:00 1285787 SdtTail::onDisconnect: Sending error [98][Services on the tail side of the SDT pipe are going down.] to the head. RCId [12935369]
    2692  2f18  05/13 15:01:00 1285791 SdtTail::onDisconnect: Sent error code to the head RCId [12935335]
    2692  2864  05/13 15:01:00 1285812 SdtTailServer::DropClientWorker: Removing Client [SDTPipe_cft-db-arch_simpana-ma-04_1285812_1557748851_58226_-758967616_0x12f93e30], Id [4296] from the list. RCId [12935425]

    You may want to check why the services were restarted on this MA. From this log, several jobs appear to have been impacted when the services stopped. Do you have updates installed on a schedule? That would explain why the issue is intermittent. Looking at the overall job, it appears to have completed successfully (from what I could tell in the logs).
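
    To see at a glance which jobs were hit, the cvd.log excerpt above can be grouped by job ID. This is a rough sketch only: the column layout (process ID, thread ID, date, time, job ID, message) is inferred from the lines quoted in this post, and the function and pattern names are hypothetical.

```python
import re
from collections import defaultdict

# Column layout inferred from the excerpt: pid, hex thread id, MM/DD,
# HH:MM:SS, job id, then free-form message text (an assumption).
LINE_RE = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<tid>[0-9a-fA-F]+)\s+"
    r"(?P<date>\d{2}/\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<job>\d+)\s+(?P<msg>.*)$"
)
ERR_RE = re.compile(r"Setting last err \[(?P<code>\d+)\]\[(?P<text>[^\]]*)\]")

SAMPLE_LOG = """
2692  408   05/13 15:01:00 1285767 SdtBase::setLastErr: Setting last err [98][Services on the tail side of the SDT pipe are going down.] RCId [12935314]
2692  408   05/13 15:01:00 1285784 SdtBase::setLastErr: Setting last err [98][Services on the tail side of the SDT pipe are going down.] RCId [12935326]
2692  2a30  05/13 15:01:00 1285793 12935386-# [    DM_RECEIVER] Data Mover Type = [1], DrivePoolType = [10001]
2692  2f18  05/13 15:01:00 1285791 SdtTail::onDisconnect: Sent error code to the head RCId [12935335]
"""


def jobs_with_sdt_errors(lines):
    """Map job id -> list of (date, time, error code, error text) tuples."""
    hits = defaultdict(list)
    for line in lines:
        parsed = LINE_RE.match(line)
        if not parsed:
            continue  # blank or differently formatted line
        err = ERR_RE.search(parsed.group("msg"))
        if err:
            hits[parsed.group("job")].append(
                (parsed.group("date"), parsed.group("time"),
                 int(err.group("code")), err.group("text"))
            )
    return dict(hits)


hits = jobs_with_sdt_errors(SAMPLE_LOG.splitlines())
for job, errors in sorted(hits.items()):
    print(job, errors[0][1], errors[0][2], errors[0][3])
```

    Running it over the full cvd.log (for example with `open(path)` in place of the sample string) would show whether every affected job failed at the same timestamp, which points at one event on the MA rather than per-job problems.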


    Ernst F. Graeler
    Senior Engineer III
    Development
  • Re: CV11 SP14 SDT PIPE
    Posted: 05-13-2019, 10:58 AM

    This error began to appear after we installed the most recent hotfix for V11 SP14.

    It occurs on different media servers (we have 12 of them) and on different database servers.

    All media servers are working normally; no updates or other maintenance were being performed at that time.

  • Re: CV11 SP14 SDT PIPE
    Posted: 05-13-2019, 11:10 AM

    Hmm...  this is odd then.   You may want to open a support ticket, as there was definitely an interruption in the services on the media agent, which should be investigated.

    The jobs are completing successfully, and due to the way RMAN confirms the success of each piece that gets backed up, these are fully recoverable.  But it would still be good to get to the bottom of the service interruption.


    Ernst F. Graeler
    Senior Engineer III
    Development
  • Re: CV11 SP14 SDT PIPE
    Posted: 05-13-2019, 1:17 PM

    After taking a closer look at the logs, there is a service interruption, but the services are not restarting as I had earlier thought. Nonetheless, something is causing the connection to break and shut down on the MediaAgent. This in turn is causing the error that I highlighted in my earlier reply. Apparently it's affecting several jobs when it occurs. I would still recommend getting someone from support to take a closer look at this.


    Ernst F. Graeler
    Senior Engineer III
    Development
  • Re: CV11 SP14 SDT PIPE
    Posted: 05-14-2019, 2:38 AM

    I have now created incident # 190514-107.

    This error occurs only for Oracle backups.

    When backing up file resources, SQL databases, or virtual machines, this error does not occur.

  • Re: CV11 SP14 SDT PIPE
    Posted: 05-21-2019, 3:49 PM

    Try updating the firewall on the MA and the client, then test. We had a similar issue in SP14, and a firewall update fixed it.

  • Re: CV11 SP14 SDT PIPE
    Posted: 05-21-2019, 4:41 PM

    Hey, I just heard there was a similar issue that was fixed not too long ago. That being the case, this should be fixed in the latest hotfix pack. When did you last install a hotfix pack? Looking at the docs, the latest available one is HPK23. Not sure what your change control is like, but would it be possible to try updating to that version?


    Ernst F. Graeler
    Senior Engineer III
    Development
  • Re: CV11 SP14 SDT PIPE
    Posted: 05-22-2019, 6:00 AM

    Can anyone confirm that HPK23 solves this problem? I have two environments with the same errors after updating from SP12 to SP14.


  • Re: CV11 SP14 SDT PIPE
    Posted: 05-22-2019, 6:09 AM

    Hello

    I plan to install hotfix pack HPK23 today or tomorrow. I will report the result.

  • Re: CV11 SP14 SDT PIPE
    Posted: 05-22-2019, 7:26 PM

    Hi,

    This is a timing issue which can happen if multiple network agents are configured for the data pipeline. We have fixed the issue on both SP14 and SP15. The update should be available in a hotfix pack soon.

    Meanwhile you can set the number of network agents to 1. For "File System Agent" it is available under subclient properties->Advanced->Performance->Resource Tuning.

    For other agents like Oracle its location may vary.

    Thanks,

    -Saurabh

  • Re: CV11 SP14 SDT PIPE
    Posted: 05-23-2019, 2:28 AM

    Hello !

    Yesterday I installed the new hotfix pack HPK23 (V11 SP14) on the CommCell server and all MediaAgents. I rebooted all the servers (I also installed Windows updates). Oracle DB backups ran all night and the error did not happen again. I am glad.

    Thanks to everyone who helped deal with the error !!! 
