This queue is for tickets about the DBD-Oracle CPAN distribution.

Report information
The Basics
Id: 83766
Status: open
Priority: 0/
Queue: DBD-Oracle

People
Owner: Nobody in particular
Requestors: bohica [...] ntlworld.com
Cc: mahdi_sbeih [...] hotmail.com
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



CC: mahdi_sbeih [...] hotmail.com
Subject: zombie processes with ora_connect_with_default_signals = ['CHILD']
Originally reported by Mahdi Sbeih via email to Tim. The docx attachment is the original; I converted the HTML version in OpenOffice, so the two may not be identical.

Dear Tim,

Sorry for sending this email directly to you, but I am not an active member of the Perl lists and forums; perhaps you can forward it to whoever is responsible for the development of the Perl DBD::Oracle module.

I found a bug related to the ora_connect_with_default_signals feature. On our system (RHEL5, Oracle 11gR2) we had to use this feature on the child signal in order to avoid the "-1" return from the system call:

    ora_connect_with_default_signals => ['CHLD']

This caused a severe bug: if the Perl script runs in the background and doesn't exit, every time it connects to the Oracle database server it creates a zombie process. This will later crash the machine itself, since the zombies consume all the processes on the machine.

Attached is an internal document that explains the problem with examples. I thought I should share this with the world.

Mahdi
Subject: system_signals_oracle_probelm.html


Subject: system_signals_oracle_probelm.docx

Subject: RE: [rt.cpan.org #83766] zombie processes with ora_connect_with_default_signals = ['CHILD']
Date: Tue, 5 Mar 2013 03:25:03 -0800
To: "bug-DBD-Oracle [...] rt.cpan.org" <bug-dbd-oracle [...] rt.cpan.org>
From: Mahdi Sbeih <mahdi_sbeih [...] hotmail.com>
Here are the exact platform and database versions:

    Linux 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
    Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

Subject: [rt.cpan.org #83766] zombie processes with ora_connect_with_default_signals = ['CHILD']
From: bug-DBD-Oracle@rt.cpan.org
CC: mahdi_sbeih@hotmail.com
Date: Tue, 5 Mar 2013 05:41:09 -0500

<URL: https://rt.cpan.org/Ticket/Display.html?id=83766 >

--Forwarded Message Attachment--

Problem

The problem occurs when all of the following are true:

- The application connects to an Oracle database server.
- The application and the database server are on the same machine, i.e. the database server is local and the connection is made using ORACLE_SID.
- The application uses the "system" command to execute an external command, such as a dP reader or any other tool, and the application MUST check the system return code for success or failure.
- The platform is RHEL5 with Oracle 11g, similar to our local Romania server, although we are not 100% sure that this is the only affected combination.

If all of the above are true, the system command always returns "-1" regardless of whether the external command succeeded or failed. This prevents the application from deciding how to proceed, and in most cases the application will treat every outcome as a failure.

Without going into the detailed technical information here, I provide a couple of links below that describe the technical details. Bottom line: this is a problem in how the Oracle client deals with signals when an application connects to a local database server. It seems the Oracle client interferes with signal handling and causes the system command to always return -1.

The problem can happen in any of our applications that connect to a local database server and use the system command, such as dP readers, DpLoad, DpCreateDb.pl, etc. This is the general case for dP software, so this bug in Oracle is critical.

Example

Here is a simple Perl script that illustrates the problem:

    #!/usr/bin/env perl_db
    use DBI;

    my %attr = (
        PrintError => 0,
        RaiseError => 1,
        AutoCommit => 0
    );

    # An empty database name in the DSN means a local connection via ORACLE_SID.
    my($db_handle) = DBI->connect("dbi:Oracle:","dbname","pwd",\%attr);

    my($sys_ret) = system("date");
    print("System return value should be 0 and we get <$sys_ret>\n");

The output of the above script will be something like this:

    Wed Feb 27 06:40:03 PST 2013
    System return value should be 0 and we get <-1>

Current Fix and the Zombie Processes

This was noticed a few years ago, and the same solution was implemented in Perl scripts and in dP tools such as dbascii. The solution was to alter the CHLD signal for such applications.
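The -1 described above can be reproduced without Oracle, which may help when checking a given platform. The following is a minimal sketch, assuming (the document does not confirm this) that the local connection leaves SIGCHLD ignored in the client process; the helper name run_and_describe is hypothetical. With SIGCHLD ignored the kernel reaps the child itself, so the wait inside system() has nothing to collect and Perl reports -1 with the reason in $!.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: run a command and describe the result.
# A return of -1 from system() means the program could not be started
# or the wait failed; $! then holds the reason.
sub run_and_describe {
    my (@cmd) = @_;
    my $ret = system(@cmd);
    return "system returned -1: $!" if $ret == -1;
    return sprintf("killed by signal %d", $? & 127) if $? & 127;
    return sprintf("exit status %d", $? >> 8);
}

# Default SIGCHLD disposition: the exit status is available.
{
    local $SIG{CHLD} = 'DEFAULT';
    print "default: ", run_and_describe('true'), "\n";
}

# SIGCHLD ignored (ASSUMED to mimic the state after a local Oracle connect):
# the child is reaped automatically, so the wait inside system() fails.
{
    local $SIG{CHLD} = 'IGNORE';
    print "ignored: ", run_and_describe('true'), "\n";
}
```

On Linux the second call usually reports -1 with "No child processes", while the first reports exit status 0. Printing $! next to the -1 in the affected application is a cheap way to confirm which failure is being hit.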
For example, Perl DBD::Oracle introduced a connect attribute called ora_connect_with_default_signals; using this attribute avoids the problem. More details about this attribute can be found at:

http://search.cpan.org/~pythian/DBD-Oracle-1.56/lib/DBD/Oracle.pm#ora_connect_with_default_signals

In C applications such as dbascii, the following code was added after connecting to Oracle:

    signal(SIGCLD, NULL);

In Perl applications, the following was added to the connection attributes:

    ora_connect_with_default_signals => ['CHLD']

The problem with this solution was discovered by a customer when using the "-log_db" option with DpLoad on a platform similar to the Romania local server. With the -log_db option and this fix in place, every time DpLoad initiates a connection to the database server it also creates a zombie process that remains on the system. This means that in production a server might run out of processes in hours, if not minutes, depending on the process limit on the machine.

The zombie problem certainly occurs in all the other applications mentioned above, but it only became critical in DpLoad because DpLoad in production runs in the background and doesn't exit. Other applications, like DpCreateDb.pl and dbascii, exit after finishing, so they create few zombie processes and those are removed once the application exits.

Here is an example to replicate the zombie processes. Run this script from one terminal and run the "top" command from another. At the top of the "top" output there is a zombie field showing the total number of zombie processes on the system:

    #!/usr/bin/env perl_db
    use DBI;

    my($db_handle);
    my($turn);
    my(@turns) = (1,2,3,4,5,6,7,8,9,10);

    my %attr = (
        PrintError => 0,
        RaiseError => 1,
        AutoCommit => 0,
        ora_connect_with_default_signals => ['CHLD']
    );

    foreach $turn (@turns) {
        ## connecting
        $db_handle = DBI->connect("dbi:Oracle:","dbname","pwd",\%attr);
        ## disconnecting
        $db_handle->disconnect();
    }

    print("Sleeping for 15 seconds, from another terminal run top command and see how number of zombie processes increase.....\n");
    sleep(15);

    my($sys_ret) = system("date");
    print("System return value should be 0 and we get <$sys_ret>\n");

In the above example, system returns 0, which is correct, but the script creates 10 zombie processes that are not removed until the script finishes or is killed.

Technical information:

http://www.nntp.perl.org/group/perl.dbi.dev/2012/02/msg6837.html
http://www.nntp.perl.org/group/perl.dbi.users/2009/06/msg34023.html

Suggested Fixes

With any of the suggested fixes below, we can safely remove the previous fix, since it is not recommended to play with default signals.

1. Stop using local connections when the client application and database server are on the same machine; use EZCONNECT instead. This requires updating the Perl scripts, dP readers and tools that rely on ORACLE_SID to connect to the local database server. In general, EZCONNECT looks like the best way to connect, since it only needs to be enabled once and can then be used to connect to any server.
   http://www.orafaq.com/wiki/EZCONNECT

2. Set the environment variable TWO_TASK to the Oracle SID value. If this variable exists, Oracle connects to the server as if it were remote, which fixes the problem. Make sure that ORACLE_SID is not used inside the script or application.

3. There is a parameter that can be set at the engine level, BEQUEATH_DETACH, which defaults to NO. We can enable it by adding this line to the file sqlnet.ora:

       BEQUEATH_DETACH = YES

   Here is a technical description of this option:
   http://www.nntp.perl.org/group/perl.dbi.users/2009/06/msg34023.html

Recommendation

My recommendation is to go with fix number 3, since it requires minimal changes and has very minimal side effects (none that we care about; see the link given for that option). With this option we can safely revert the previous fix from our Perl scripts and dP executables. Moving forward, we can consider using the first fix.

To get the customer going ASAP, we can fix DpLoad by removing the previous fix and asking the customer to add the following line to his sqlnet.ora file:

    BEQUEATH_DETACH = YES

We will also have to ask documentation to add this note to the release notes of 9.3, and we need to create a new defect to revert the previous fix from all other tools. If the customer can't wait at all, he can live with fix number 2 until we are done.
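The reproduction script relies on watching "top" in a second terminal. As a rough sketch, the zombies can also be counted from inside the script, which makes it easier to compare runs before and after a change to sqlnet.ora. The helper names count_zombie_children and demo are hypothetical, the code is Linux-only because it reads /proc, and it assumes that the leftover Oracle processes are direct children of the Perl process, which the report suggests but does not state. The demonstration below uses a plain fork instead of an Oracle connection.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use POSIX ();

# Count direct children of this process that are zombies (state 'Z').
# Relies on the Linux /proc/<pid>/stat layout: "pid (comm) state ppid ...".
sub count_zombie_children {
    my $count = 0;
    for my $stat_file (glob '/proc/[0-9]*/stat') {
        open(my $fh, '<', $stat_file) or next;    # process may be gone already
        my $line = do { local $/; <$fh> };
        close($fh);
        next unless defined $line;
        # comm may contain spaces or parentheses, so match up to the last ')'.
        my ($state, $ppid) = $line =~ /^\d+ \(.*\) (\S) (\d+)/s or next;
        $count++ if $state eq 'Z' && $ppid == $$;
    }
    return $count;
}

# Demonstration without Oracle: a child that exits and is not waited for
# stays a zombie until the parent reaps it.
sub demo {
    local $SIG{CHLD} = 'DEFAULT';
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    POSIX::_exit(0) if $pid == 0;    # child: exit at once, skip END blocks
    sleep 1;                         # give the child time to exit
    my $before = count_zombie_children();
    waitpid($pid, 0);                # reap the child
    my $after = count_zombie_children();
    return ($before, $after);
}

my ($before, $after) = demo();
print "zombie children before waitpid: $before, after: $after\n";
```

In the reproduction script, calling count_zombie_children() after the connect/disconnect loop should show whether the count grows with each connection, and whether it stays at zero once the previous fix is removed and BEQUEATH_DETACH = YES is set.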


