LinuxQuestions.org
Old 06-02-2017, 02:24 PM   #1
kalyan.vangipurapu
LQ Newbie
 
Registered: Jun 2017
Location: India Hyderabd
Posts: 3

Rep: Reputation: Disabled
pthread_cond_wait fails an assertion under Valgrind's Helgrind tool, but not under the Memcheck tool


I ran our application under Helgrind, one of the tools in the Valgrind suite.

Our application uses pthread_cond_wait to synchronize its threads.
Under Helgrind that call goes through pthread_cond_wait_WRK(), which lives in the Helgrind library.
The application fails inside that call with an assertion failure message from the Helgrind library.

Under Helgrind our application also runs very slowly, which we did not observe with the Memcheck tool.

Why does Helgrind slow execution down so much, and is that slowdown the root cause of the assertion failure?
Or is this a bug in Helgrind?

Last edited by kalyan.vangipurapu; 06-02-2017 at 02:28 PM.
 
Old 06-02-2017, 09:14 PM   #2
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=14, FreeBSD_10{.0|.1|.2}
Posts: 4,337
Blog Entries: 1

Rep: Reputation: 2329
Without seeing any context code or knowing what valgrind/helgrind are actually doing, it is not likely that anyone here can answer that question.

Please see this page for guidance in asking a more complete question, and try to provide a better description of your actual case.

Last edited by astrogeek; 06-03-2017 at 12:00 PM. Reason: typo
 
Old 06-03-2017, 07:42 AM   #3
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 8,208
Blog Entries: 4

Rep: Reputation: 2764
Multithreaded code typically operates very differently in the presence of any debugger than it does in real life.

You should simply assume that "Zed, you have a bug!"
 
Old 06-05-2017, 12:22 AM   #4
kalyan.vangipurapu
LQ Newbie
 
Registered: Jun 2017
Location: India Hyderabd
Posts: 3

Original Poster
Rep: Reputation: Disabled
Source code for the issue
int xxxCore::runxxx()
{
    int exit_status = 0;
    pthread_mutex_lock( &_CORE_LOCK );
    for (;;) {
        {
            ClosexxxLockType lock(_close_xxx_mutex);
            if (_close_xxx)
                break;
        }
        if(_actions.empty()) {
            pthread_cond_wait( &_CORE_COND, &_CORE_LOCK );
        }
        while ( !_actions.empty() ) {
            Action* action = _actions.front();
            _actions.pop_front();
            action->execute();
            delete action;
        }
    }
    // Terminate xxx nicely
    exit_status = eventHandler ();
    pthread_mutex_unlock( &_CORE_LOCK );
    return exit_status;
}

void xxx::addAction( Action* action )
{
    if (pthread_mutex_lock( &_CORE_LOCK ) == EDEADLK) {
        ERROR_CORE( "Mutex Deadlock, The current thread already owns the mutex");
        assert(0);
    }
    _actions.push_back( action );
    pthread_cond_signal( &_CORE_COND );
    pthread_mutex_unlock( &_CORE_LOCK );
}


static void* addClosexxxAction(void* sig)
{
    int receivedSignal = *(int*)sig;
    INFO_CORE( "xxx received signal %d, will close", receivedSignal);
    PRINT_STDERR_FMT("xxx received signal %d, will close", receivedSignal);
    xxx::xxxCore::instance().addAction( new Closexxx() );
    return NULL;
}

void xxx::signalCatcher( int sig )
{
    // NOTE: we can not call addAction here, because
    // we will deadlock in case xxx is not waiting on the condition.
    // I.e. a signal during action->execute();
    pthread_t thread_id;
    /* Create a new thread. The newthread will run
     * addClosexxxAction function
     */
    pthread_create (&thread_id, NULL, &addClosexxxAction, &sig);
}

xxx::init()
{
    struct sigaction sigact;

    // Register a signal handler
    sigemptyset( &sigact.sa_mask );
    sigact.sa_flags = 0;
    sigact.sa_handler = signalCatcher;
    sigaction( SIGQUIT, &sigact, &old_sigquit_handler );
    sigaction( SIGTERM, &sigact, &old_sigterm_handler );
}

Description:
We ran XXX under Valgrind using the Helgrind tool.
This seems to trigger some race condition in the xxx startup code. The problem does not appear under Valgrind's Memcheck tool.

We get either this stack trace:

==1088== Process terminating with default action of signal 6 (SIGABRT): dumping core
==1088== at 0x67E60C7: raise (in /lib64/libc-2.19.so)
==1088== by 0x67E7477: abort (in /lib64/libc-2.19.so)
==1088== by 0x67DF145: __assert_fail_base (in /lib64/libc-2.19.so)
==1088== by 0x67DF1F1: __assert_fail (in /lib64/libc-2.19.so)
==1088== by 0x4C30253: pthread_cond_timedwait_WRK (hg_intercepts.c:1274)
==1088== by 0x4C30E48: pthread_cond_timedwait@* (hg_intercepts.c:1322)
==1088== by 0x5313F1D: Poco::EventImpl::waitImpl(long) (in /opt/nels/lib/poco-1.7.6/libPocoFoundation.so.46)
==1088== by 0x5F2118E: Poco::Event::tryWait(long) (in /opt/nels/lib/libxxx_common.so)
==1088== by 0x5F20BDC: xxx::Timer::run() (in /opt/nels/lib/libxxx_common.so)
==1088== by 0x5F20DA7: xxx::Timer::runnableEntry(void*) (in /opt/nels/lib/libxxx_common.so)
==1088== by 0x4C2FB32: mythread_wrapper (hg_intercepts.c:389)
==1088== by 0x5AF40A3: start_thread (in /lib64/libpthread-2.19.so)

or this one:

==1266== Process terminating with default action of signal 6 (SIGABRT): dumping core
==1266== at 0x67E60C7: raise (in /lib64/libc-2.19.so)
==1266== by 0x67E7477: abort (in /lib64/libc-2.19.so)
==1266== by 0x67DF145: __assert_fail_base (in /lib64/libc-2.19.so)
==1266== by 0x67DF1F1: __assert_fail (in /lib64/libc-2.19.so)
==1266== by 0x4C2FE56: pthread_cond_wait_WRK (hg_intercepts.c:1184)
==1266== by 0x4C30E38: pthread_cond_wait@* (hg_intercepts.c:1222)
==1266== by 0x5B5CC1: xxx::xxxCore::runxxx() (in /opt/XXX/bin/xxx)
==1266== by 0x58DC53: main (in /opt/XXX/bin/xxx)

The problem is persistent: the process dumps core repeatedly. If we switch to the Memcheck tool, things work well again.

Valgrind 3.11.0
Helgrind is the tool we use to verify shared resource access. We realize it is a heavyweight tool and can slow things down a lot. We tested in a virtualized Xen environment.

Helgrind options used

# setup valgrind options needed later by valgrind command
export VALGRIND_OPTS="--tool=helgrind \
-v \
--log-file=/home/valgrind/XXX_helgrind_sc1_%p.log"
 
Old 06-05-2017, 12:56 AM   #5
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=14, FreeBSD_10{.0|.1|.2}
Posts: 4,337
Blog Entries: 1

Rep: Reputation: 2329
Please place your code snippets inside [CODE]...[/CODE] tags for better readability. You may type those yourself or click the "#" button in the edit controls.
 
Old 06-06-2017, 09:16 AM   #6
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 8,208
Blog Entries: 4

Rep: Reputation: 2764
Also – LQ is not a "debugging service."

Your logic contains a number of race conditions and other fundamental problems, as evidenced by (e.g.) its constant checks for "deadlock" in improbable places (such as signal handlers?! and "addAction").
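
A side note on the EDEADLK check in addAction: as far as I know, POSIX only promises that pthread_mutex_lock returns EDEADLK on a relock by the owning thread when the mutex was created with the PTHREAD_MUTEX_ERRORCHECK type. The posted code does not show how _CORE_LOCK is initialized, so whether that check can ever fire there is an open question. A minimal sketch of the error-checking case (relockResult is a made-up helper name, not something from this thread):

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

// Hypothetical helper: creates an error-checking mutex, locks it twice from
// the same thread, and returns the result of the second lock attempt.
int relockResult()
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);

    pthread_mutex_t mutex;
    pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    int first = pthread_mutex_lock(&mutex);    // first lock succeeds
    assert(first == 0);
    (void)first;

    // With an error-checking mutex the relock is reported instead of hanging.
    int second = pthread_mutex_lock(&mutex);

    pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    return second;
}
```

With a default mutex the same relock is undefined behaviour and in practice commonly just hangs, so a check for EDEADLK would never get the chance to run.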

And that, no doubt, is why you are having to use a debugger in the first place!
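
On the signal handler specifically: pthread_create is not on the POSIX list of async-signal-safe functions, and &sig is the address of the handler's own parameter, which may no longer be valid by the time the new thread reads it. One common alternative is to block the signals and collect them in a dedicated thread with sigwait(). A rough sketch, assuming the signals are blocked before any other thread is created; signalThread, startSignalThread and g_lastSignal are hypothetical names, and the real code would presumably queue its close action where the sketch only records the signal number:

```cpp
#include <cassert>
#include <pthread.h>
#include <signal.h>

// Hypothetical stand-in for "queue a close action": the sketch only records
// which signal arrived.
static int g_lastSignal = 0;

static void* signalThread(void* arg)
{
    const sigset_t* set = static_cast<const sigset_t*>(arg);
    int sig = 0;
    // sigwait() returns in ordinary thread context, not in a signal handler,
    // so taking a mutex (e.g. inside addAction) would be safe at this point.
    if (sigwait(set, &sig) == 0)
        g_lastSignal = sig;
    return nullptr;
}

// Blocks SIGQUIT and SIGTERM in the calling thread (threads created later
// inherit the mask) and starts the thread that waits for them.
// The caller must keep *set alive until the thread has been joined.
int startSignalThread(sigset_t* set, pthread_t* tid)
{
    sigemptyset(set);
    sigaddset(set, SIGQUIT);
    sigaddset(set, SIGTERM);
    int rc = pthread_sigmask(SIG_BLOCK, set, nullptr);
    if (rc != 0)
        return rc;
    return pthread_create(tid, nullptr, signalThread, set);
}
```

Calling startSignalThread() early in main(), before any worker threads exist, means no sigaction handler is needed for these two signals at all.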

The best thing to do would be to grab a C++ "thread-safe queue" class off the shelf, which correctly encapsulates the pthread_cond logic that you need and does it "known-correctly." (Consider this discussion on Quora.)
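
For reference, the core of such a queue is small. Below is a minimal sketch using the C++11 standard library rather than raw pthreads; ThreadSafeQueue and its member names are illustrative and not taken from any particular library, and a production-quality class would want more (timeouts, bounded capacity, and so on):

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>

// Illustrative class, not taken from any library.
template <typename T>
class ThreadSafeQueue {
public:
    void push(T value)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push_back(std::move(value));
        }
        ready_.notify_one();   // wake one waiting consumer, if any
    }

    // Blocks until an item is available or close() has been called.
    // Returns false only when the queue is closed and fully drained.
    bool waitAndPop(T& out)
    {
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate form re-checks the condition after every wakeup,
        // which also covers spurious wakeups.
        ready_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty())
            return false;
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

    void close()
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();   // release every waiting consumer
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<T> items_;
    bool closed_ = false;
};
```

A consumer would loop on waitAndPop() until it returns false, and the shutdown path would call close() once no more items are coming.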

"Thread-safe queues" are something that everyone has need of, but therefore that has "been done to death." You don't need to do it again.

Quote:
Actum Ne Agas:
"Do Not Do A Thing Already Done"

Last edited by sundialsvcs; 06-06-2017 at 09:18 AM.
 
1 members found this post helpful.
Old 06-08-2017, 08:24 AM   #7
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 8,208
Blog Entries: 4

Rep: Reputation: 2764
(Incidentally, one of the most significant problems with this code is that you hold the mutex while you are executing the action . . . )

This logic should simply consist of an endless loop that dequeues Action objects from a thread-safe queue, executes them, and discards them. The mutex will be obtained and released with each operation that is made to the queue, but all of this will be transparently done within the queue-object.
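
To make the point concrete, here is one way the posted loop could be restructured so that the mutex is held only while the list is being touched. This is a sketch under assumptions, not the poster's actual classes: Core, CountAction, requestClose and the closing flag are hypothetical stand-ins (the original uses a separate _close_xxx flag with its own mutex), and error checking is omitted.

```cpp
#include <cassert>
#include <list>
#include <pthread.h>

// Hypothetical stand-ins for the poster's types.
struct Action {
    virtual ~Action() {}
    virtual void execute() = 0;
};

struct CountAction : Action {
    explicit CountAction(int* c) : counter(c) {}
    void execute() override { ++*counter; }
    int* counter;
};

struct Core {
    Core()  { pthread_mutex_init(&lock, nullptr); pthread_cond_init(&cond, nullptr); }
    ~Core() { pthread_cond_destroy(&cond); pthread_mutex_destroy(&lock); }

    void addAction(Action* action)
    {
        pthread_mutex_lock(&lock);
        actions.push_back(action);
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
    }

    void requestClose()
    {
        pthread_mutex_lock(&lock);
        closing = true;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
    }

    // Runs until requestClose() has been called and the list is drained.
    // Returns the number of actions executed.
    int run()
    {
        int executed = 0;
        for (;;) {
            std::list<Action*> batch;
            pthread_mutex_lock(&lock);
            // Wait in a loop: pthread_cond_wait may wake up spuriously.
            while (actions.empty() && !closing)
                pthread_cond_wait(&cond, &lock);
            batch.swap(actions);          // take everything queued so far
            bool done = closing;
            pthread_mutex_unlock(&lock);

            // The mutex is not held here, so execute() may call addAction().
            for (Action* action : batch) {
                action->execute();
                delete action;
                ++executed;
            }
            if (done && batch.empty())
                return executed;
        }
    }

    pthread_mutex_t lock;
    pthread_cond_t cond;
    std::list<Action*> actions;
    bool closing = false;
};
```

Because addAction() never has to wait for a running action to finish, the original reason for spawning a thread from the signal handler (a signal arriving during action->execute()) largely goes away, although a signal handler still should not take a mutex itself.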
 
Old 06-21-2017, 08:28 AM   #8
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: Debian, Mint, Puppy, Raspbian
Posts: 3,421

Rep: Reputation: 199
Try using OpenMP; it's a lot easier.
 
1 members found this post helpful.
  

