LinuxQuestions.org
10-26-2016, 08:48 AM   #1
p-nitin.verma
CPU core reduction on RHEL 6 causing network problems


Hi All,

I am reducing the number of active CPU cores on RHEL 6 servers. During testing, I observed that after the core reduction the PuTTY session stops responding.
Ping to the server also fails.

However, the server itself is still up (I logged in directly on the Windows hypervisor).

I ran the following script at boot (by creating a symlink in the /etc/rc3.d directory):

#!/bin/bash

# Take CPUs 12 through 15 offline, one after another.
for i in {12..15}
do
    echo "Reducing core cpu$i"
    echo 0 > "/sys/devices/system/cpu/cpu$i/online"
    echo "Core Reduced (cpu$i)"
done
echo "$(nproc) cores available"

##################################

The server remains up for some time, then the PuTTY session closes.

From the logs I can see:
: CPU 15 is now offline
Oct 26 15:56:41 NWLOG kdump: kexec: unloaded kdump kernel
Oct 26 15:56:41 NWLOG kdump: stopped
Oct 26 15:56:41 NWLOG kdump: kexec: loaded kdump kernel
Oct 26 15:56:41 NWLOG kdump: started up
Oct 26 15:56:41 NWLOG kdump: kexec: unloaded kdump kernel
Oct 26 15:56:41 NWLOG kdump: stopped
Oct 26 15:56:42 NWLOG kdump: kexec: loaded kdump kernel
Oct 26 15:56:42 NWLOG kdump: started up
Oct 26 15:56:42 NWLOG kdump: kexec: unloaded kdump kernel
Oct 26 15:56:42 NWLOG kdump: stopped
Oct 26 15:56:42 NWLOG kdump: kexec: loaded kdump kernel
Oct 26 15:56:42 NWLOG kdump: started up
Oct 26 15:56:42 NWLOG kdump: kexec: unloaded kdump kernel
Oct 26 15:56:42 NWLOG kdump: stopped
Oct 26 15:56:42 NWLOG kdump: kexec: loaded kdump kernel
Oct 26 15:56:42 NWLOG kdump: started up
Oct 26 15:56:42 NWLOG kdump: kexec: unloaded kdump kernel
Oct 26 15:56:42 NWLOG kdump: stopped
Oct 26 15:56:50 NWLOG irqbalance: WARNING, didn't collect load info for all cpus, balancing is broken
(the same happens for all the CPUs I took offline)

Please suggest.

Thanks
 
10-26-2016, 10:22 AM   #2
TB0ne
You're using RHEL6....the best suggestion would be to CALL RED HAT SUPPORT....you are PAYING FOR RHEL, RIGHT??? They can assist you with an sosreport and analyze things. Otherwise, you don't tell us what kind of VM this is, on what kind of hardware, or what exactly you're trying to accomplish. The message seems fairly clear...you've set the system up with a certain number of CPUs, then (essentially) yanked one out, so things aren't 'balanced' anymore. Reboot the server with the new CPU settings, and see what happens.

Since this is for 'testing', you have done something that causes your system to crash. So...DON'T DO IT, and consider the test over.

Last edited by TB0ne; 10-26-2016 at 10:23 AM.
 
10-26-2016, 12:28 PM   #3
p-nitin.verma
Original Poster
Well, that's the thing: if we had support from Red Hat I would not be posting this here. Seems logical, doesn't it? Anyway, if u have a solution please suggest it, else don't bother replying to the post. We have to do the core reduction activity, which is why I am asking on this open forum for some expert advice. My organisation didn't take support from Red Hat. Valuable suggestions are welcome.
 
10-26-2016, 01:21 PM   #4
TB0ne
Quote:
Originally Posted by p-nitin.verma View Post
Well, that's the thing: if we had support from Red Hat I would not be posting this here. Seems logical, doesn't it? Anyway, if u have a solution please suggest it, else don't bother replying to the post.
Read the LQ Rules about text-speak and not using it. And if you are NOT paying for RHEL, then why bother using it, when CentOS is totally free, and identical. Seems logical, doesn't it??? You use RHEL for stability, extensive testing, and SUPPORT FOR WHEN YOU HAVE ISSUES. Otherwise, there are LOTS of free distros you can use, that can do everything RHEL does.

And if you want to get snotty about "don't bother replying" when someone asks you questions or points out the VERY obvious "call support for the product you should be paying for" solution, don't bother POSTING.
Quote:
We have to do the core reduction activity, which is why I am asking on this open forum for some expert advice. My organisation didn't take support from Red Hat. Valuable suggestions are welcome.
..and just saying 'we have to do the core reduction activity', doesn't answer the questions asked. Your 'organization' needs to pay for the commercial products it uses, or use the free products.

AGAIN: what are you trying to accomplish and why??? The 'valuable suggestions' would be for you to answer the questions you've been asked, and think about what's been said. Again...you aren't providing details, and the error seems very clear. Since you aren't paying for RHEL, you obviously don't have the irqbalance bug fix that was reported in the RHEL knowledgebase, you can't even download it, or read the customer-only errata report.

So: pay for RHEL or use CentOS...those are your two fixes. Pick one.
 
10-26-2016, 04:56 PM   #5
jefro
Moderator
 
Wonder if the hardware is at fault/not supported. I'm going on this error too: "irqbalance: WARNING, didn't collect load info for all cpus, balancing is broken"

Look up some of the other web pages on that fault maybe??

Last edited by jefro; 10-26-2016 at 04:57 PM.
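
Since the irqbalance warning is about per-CPU load information, it may be worth confirming which CPUs the kernel itself reports as online and offline before digging further. The sketch below is only a diagnostic aid, not a fix. It assumes the kernel exposes the "online" and "offline" range lists under /sys/devices/system/cpu (something to verify on the RHEL 6 kernel in question); the function names expand_cpu_list and show_cpu_state are made up for illustration.

```shell
#!/bin/bash
# Diagnostic sketch (hypothetical helper names): print the CPUs the kernel
# reports as online and offline, expanding range lists such as "0-11".

expand_cpu_list() {
    # Turn a kernel CPU list like "0-3,8" into "0 1 2 3 8".
    local list="$1" part lo hi n
    local -a out=()
    local IFS=','
    for part in $list; do
        if [[ "$part" == *-* ]]; then
            lo="${part%-*}"
            hi="${part#*-}"
            for ((n = lo; n <= hi; n++)); do
                out+=("$n")
            done
        elif [[ -n "$part" ]]; then
            out+=("$part")
        fi
    done
    IFS=' '
    echo "${out[*]}"
}

show_cpu_state() {
    # The directory argument is an assumption added so the function can be
    # pointed at a copy of the files; it defaults to the real sysfs path.
    local dir="${1:-/sys/devices/system/cpu}"
    echo "online:  $(expand_cpu_list "$(cat "$dir/online" 2>/dev/null)")"
    echo "offline: $(expand_cpu_list "$(cat "$dir/offline" 2>/dev/null)")"
}
```

Calling show_cpu_state with no argument right after the core reduction script has run should list 12 to 15 under "offline" if the offlining worked as intended.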
 
10-26-2016, 05:19 PM   #6
syg00
I've successfully used that technique in the past on RHEL6 - but never in a tight loop like that. Try adding a sleep to the script to allow the offline action (moving running tasks to another CPU) to complete before offlining the next.
It's possible tasks are being moved to (say) 13 and then you try to offline 13 immediately.
Easy to test - I'd use a decent value, say 30 seconds.
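
A minimal sketch of that suggestion, assuming the same CPU range (12 to 15) as the original script and the 30-second pause mentioned above. The function names and the SYSFS_CPU_DIR and DELAY variables are hypothetical, introduced only so the path and the pause can be overridden for a dry run. Whether the pause actually cures the network drop is untested here.

```shell
#!/bin/bash
# Sketch: take CPUs offline one at a time, pausing after each one so the
# kernel has time to finish migrating tasks before the next CPU goes down.
# SYSFS_CPU_DIR and DELAY are hypothetical knobs added for illustration.

SYSFS_CPU_DIR="${SYSFS_CPU_DIR:-/sys/devices/system/cpu}"
DELAY="${DELAY:-30}"   # seconds to wait after each CPU

offline_cpu() {
    local cpu="$1"
    local ctl="$SYSFS_CPU_DIR/cpu$cpu/online"
    if [ ! -w "$ctl" ]; then
        echo "cpu$cpu: no writable 'online' file, skipping" >&2
        return 1
    fi
    echo 0 > "$ctl"
    # Read the state back instead of assuming the write took effect.
    [ "$(cat "$ctl")" = "0" ]
}

offline_cpus() {
    local cpu
    for cpu in "$@"; do
        echo "Reducing core cpu$cpu"
        if offline_cpu "$cpu"; then
            echo "Core Reduced (cpu$cpu)"
        else
            echo "Failed to offline cpu$cpu" >&2
        fi
        sleep "$DELAY"
    done
}

# Usage (as root), mirroring the original script:
#   offline_cpus 12 13 14 15
#   echo "$(nproc) cores available"
```

Reading the value back after each write makes a failed offline visible in the boot output rather than only in the logs.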
 
10-26-2016, 08:57 PM   #7
p-nitin.verma
Original Poster
I suspected that might be the issue: the tight loop, and also the irqbalance warning. Can you suggest a good advanced Bash scripting book? I have read many, but didn't find that quality in any of the ones available on the internet.

Thanks for your time
 
10-26-2016, 09:12 PM   #8
syg00
I have always used the Advanced Bash-Scripting Guide available here. Download as appropriate.
 
  

