LinuxQuestions.org
Old 11-19-2010, 06:50 PM   #16
paulsm4
LQ Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled

Hi -

Quote:
Right, so my question is now forcing the mount to be correctly recognizable... is this too crazy to ask?
Uh - yes

Because in real world networks - packets get dropped, connections fail and Ca-Ca Happens.

Suggestion:
Check for -1. Delay momentarily, then retry. This should probably work around the "isdir()" problem.
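As a rough illustration of the check-delay-retry idea, here is a minimal sketch in C. It assumes the directory test is built on stat(); the helper name is_dir_with_retry and its parameters are made up for this example and do not come from the original program.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper (not from the original program).
 * Returns 1 if path is a directory, 0 if it exists but is not a directory,
 * and -1 if stat() failed on every one of max_tries attempts. */
int is_dir_with_retry(const char *path, int max_tries, long delay_ms)
{
    struct stat st;
    struct timespec delay;

    assert(path != NULL && max_tries > 0 && delay_ms >= 0);
    delay.tv_sec = delay_ms / 1000;
    delay.tv_nsec = (delay_ms % 1000) * 1000000L;

    for (int attempt = 0; attempt < max_tries; attempt++) {
        if (stat(path, &st) == 0)
            return S_ISDIR(st.st_mode) ? 1 : 0;
        /* stat() returned -1: pause briefly unless this was the last try. */
        if (attempt + 1 < max_tries)
            nanosleep(&delay, NULL);
    }
    return -1; /* errno still describes the last stat() failure */
}
```

A call such as is_dir_with_retry("/mnt/share/somedir", 5, 200) (the path is only a placeholder) would try up to five times, 200 ms apart. Note that this simple version also retries permanent errors such as a missing path, which costs a little time but does no harm.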

Suggestion 2:
Any number of things could be causing the VFS errors. They could be on the Windows side: in Windows, or in the physical disk itself. It could be on the Linux side. Or it could be with the network.

Carefully check your Windows event logs and your Linux system logs.

Google for (possibly) related errors. For example:
* Problem with SATA driver

* Shutdown/Unmounting issues

* Etc

Suggestion 3:
Every time you get "ret == -1" (error), print the global variable "errno" and post back the results. That, too, might give us a valuable clue.
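A minimal sketch of what printing errno could look like, again assuming the failing call is stat(). The wrapper name stat_and_report is hypothetical and used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical wrapper (not from the original program): calls stat() and,
 * on failure, reports errno on stderr.
 * Returns 0 on success, or the errno value on failure. */
int stat_and_report(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) == -1) {
        int saved = errno; /* copy first: later library calls may change errno */
        fprintf(stderr, "stat(\"%s\") failed: errno=%d (%s)\n",
                path, saved, strerror(saved));
        return saved;
    }
    return 0;
}
```

The number and the text from strerror() are what would be worth posting back.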

'Hope that helps!

Last edited by paulsm4; 11-20-2010 at 01:03 AM.
 
Old 11-20-2010, 05:20 PM   #17
punt
Member
 
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371

Original Poster
Rep: Reputation: 30
I figured.

But why does one of the mounted drives *always* fail? It's never worked so far yet... that one isn't intermittent unfortunately.

What exactly do these errors mean?

Code:
[165880.016039] CIFS VFS: No response for cmd 50 mid 31807
[165930.046023] CIFS VFS: Unexpected lookup error -112
[165960.060029] CIFS VFS: Unexpected lookup error -112
[166000.061026] CIFS VFS: Unexpected lookup error -112
[198872.230024] CIFS VFS: No response to cmd 47 mid 57676
[198872.230177] CIFS VFS: Write2 ret -11, wrote 0
[198872.230588] CIFS VFS: No response to cmd 46 mid 57677
[198872.230726] CIFS VFS: Send error in read = -11
[199086.158029] CIFS VFS: No response to cmd 47 mid 21536
[199086.158187] CIFS VFS: Write2 ret -11, wrote 0
[202380.916027] CIFS VFS: No response to cmd 47 mid 42890
[202380.916178] CIFS VFS: Write2 ret -11, wrote 0
[212792.630045] CIFS VFS: No response to cmd 47 mid 9732
[212792.630201] CIFS VFS: Write2 ret -11, wrote 0
[213664.807036] CIFS VFS: No response to cmd 47 mid 52527
[213664.807188] CIFS VFS: Write2 ret -11, wrote 0
[213664.807626] CIFS VFS: No response for cmd 50 mid 52528
[213664.807777] CIFS VFS: No response for cmd 50 mid 52529
[213664.807921] CIFS VFS: No response for cmd 50 mid 52530
[213670.945315] CIFS VFS: Write2 ret -11, wrote 0
[248418.427027] CIFS VFS: No response for cmd 50 mid 28179
I'm not seeing anything on the Windows end...
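
For reference, the negative numbers in kernel messages like these are normally negated errno values. With the usual Linux numbering (x86 and most other architectures; a few differ), 11 is EAGAIN and 112 is EHOSTDOWN. The sketch below looks up the text for such a code; the function name describe_kernel_rc is made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: takes a negative return code as shown in a kernel
 * log line (e.g. -112) and returns the C library's description of the
 * corresponding errno value. */
const char *describe_kernel_rc(int rc)
{
    assert(rc < 0);
    return strerror(-rc);
}

/* Prints the two codes that appear in the log above. */
void print_logged_codes(void)
{
    printf("-11  -> %s\n", describe_kernel_rc(-11));
    printf("-112 -> %s\n", describe_kernel_rc(-112));
}
```

Read this way, -11 suggests a request that should be tried again and -112 suggests the remote host was considered down, which fits the "No response" lines around them.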
 
Old 11-22-2010, 03:15 PM   #18
punt
Member
 
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371

Original Poster
Rep: Reputation: 30
Guess I have to give up?
 
Old 11-22-2010, 03:49 PM   #19
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
Is it only your code that isn't working, or do other functions (normal filesystem browsing, file copying, etc.) also misbehave? It seems reasonable to speculate that there may be a bona fide problem with the connection. Have you tried un-mounting/mounting the share? What about other networking between the two hosts in question, such as scp transfers, sftp, etc.? Could you attach a network sniffer and compare traffic patterns between working and non-working systems?

--- rod.
 
Old 11-22-2010, 11:17 PM   #20
punt
Member
 
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371

Original Poster
Rep: Reputation: 30
I've rebooted both machines more than once. I don't have any other transfers ongoing when troubleshooting. And yes, I've manually unmounted and remounted as well.

The code isn't working, which prevents accurate entries from being written to this logfile, and that logfile is a critical element of this server's purpose. Nothing else seems to misbehave, but the functionality of this code is absolutely key.

How would network sniffing assess a problem with the mount? I'm not even sure what kind of thing to look at or how to do so effectively. Can you offer any tips? I'm a newbie when it comes to this.

Do the errors I've pasted in my last message mean anything or are they too vague?
 
Old 11-23-2010, 11:59 AM   #21
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
Well, there is clearly something different about this connection. My strategy would be to compare traffic generated by a working run of your software to a non-working one. Focusing on the difference (if any) may reveal something.

Are all of the parameters of the mount equivalent between working and non-working systems?

Have you adjusted your code to accommodate error code returns (per message #16)? Result?

Right now, my best guess is that there is a low-level networking fault, and the driver and OS software are concealing it through retries, etc.

--- rod.
 
Old 11-23-2010, 09:28 PM   #22
Dogs
Member
 
Registered: Aug 2009
Location: Houston
Distribution: Slackware 13.37 x64
Posts: 105

Rep: Reputation: 25
Rather than do it that way, you could use a system call instead...

system("ls -l deeper/and/deeper/path > somefile.txt");

You can build that command passed to system() in a typical array based on which directories you're going through.

and then simply pass the array to system().

then search through that file for lines beginning with d, and those are your directories.
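
A minimal sketch of that approach in C, under a few assumptions: the helper name list_dirs_via_ls and the file names are hypothetical, the paths contain no single quotes, and the ls output is in the usual long format where directory lines begin with 'd'.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: runs "ls -l <dir> > <listing>" through system(),
 * then prints and counts the listing lines that begin with 'd'.
 * Returns the number of such lines, or -1 on any failure. */
int list_dirs_via_ls(const char *dir, const char *listing)
{
    char cmd[1024];
    char line[1024];
    int count = 0;

    assert(dir != NULL && listing != NULL);

    /* Build the command in an array; single quotes guard against spaces. */
    int n = snprintf(cmd, sizeof cmd, "ls -l '%s' > '%s'", dir, listing);
    if (n < 0 || (size_t)n >= sizeof cmd)
        return -1; /* command did not fit in the buffer */
    if (system(cmd) != 0)
        return -1; /* the shell or ls reported a failure */

    FILE *fp = fopen(listing, "r");
    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (line[0] == 'd') { /* 'd' in column one marks a directory */
            fputs(line, stdout);
            count++;
        }
    }
    fclose(fp);
    return count;
}
```

Checking the value returned by system() matters here: if the mount is misbehaving, ls will fail too, and the caller should see that rather than an empty listing.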

Last edited by Dogs; 11-23-2010 at 09:29 PM.
 
Old 11-23-2010, 11:12 PM   #23
punt
Member
 
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371

Original Poster
Rep: Reputation: 30
Quote:
My strategy would be to compare traffic generated by a working run of your software to a non-working one. Focusing on the difference (if any) may reveal something.
I'm game. What do I need to do, strace?

Quote:
Are all of the parameters of the mount equivalent between working and non-working systems?
Here's the strange part: it is the same system. It's the same kind of mount. The hard drives are identical makes. I have no idea what differentiates both... I defer to my earlier question.

Quote:
Have you adjusted your code to accommodate error code returns (per message #16)? Result?
Again, I'm a newbie. I'm not a coder. I didn't even post this in the programming forum. It was moved based on the circumstances, but I don't know anymore if it belongs.

I only did troubleshooting to try to identify the problem, but the previous page should provide the results on that.

Quote:
Right now, my best guess is that there is a low-level networking fault, and the driver and OS software are concealing it through retries, etc.
That's promising. How do I confirm?

Quote:
system("ls -l deeper/and/deeper/path > somefile.txt");
Yeah, so here's why I get stuck. I'm NOT a coder. I don't know how to implement this or how to edit my code without breaking things.
 
Old 11-24-2010, 09:22 AM   #24
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
This is going to be difficult to fix without some programming. You have code that returns error statuses. In order to properly handle those errors, you need to get the error codes, and act according to what error statuses it sees. Depending on the error status, it may be either recoverable, or not.
One pretty good suggestion has already been made: if the return status is -1, sleep briefly and repeat the call. Repeat until the return status is good, or give up after a set number of attempts. Since the kernel and filesystem make non-deterministic network calls, it is entirely reasonable to expect calls to fail (that's why error conditions are returned), and also reasonable to expect the failures to be short in duration. My speculation is that physical problems with the network exist; the errors reported in your log file are the basis for this. Other applications are doing what your code probably needs to do, which is to have some persistence. I suggested putting the call into a tight loop and printing the error status repeatedly; the results may show transient error conditions, or they may show some steady-state condition.
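A minimal sketch of such a diagnostic loop in C, assuming the call under test is stat(). The function name probe_path and its parameters are hypothetical and not taken from the original program.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical diagnostic: calls stat() on path `tries` times, printing the
 * outcome of every attempt. Returns the number of attempts that failed. */
int probe_path(const char *path, int tries, long delay_ms)
{
    struct stat st;
    struct timespec delay;
    int failures = 0;

    assert(path != NULL && tries > 0 && delay_ms >= 0);
    delay.tv_sec = delay_ms / 1000;
    delay.tv_nsec = (delay_ms % 1000) * 1000000L;

    for (int i = 1; i <= tries; i++) {
        if (stat(path, &st) == 0) {
            printf("attempt %d: ok\n", i);
        } else {
            int saved = errno; /* copy before printf can disturb errno */
            printf("attempt %d: errno=%d (%s)\n", i, saved, strerror(saved));
            failures++;
        }
        nanosleep(&delay, NULL); /* short pause between attempts */
    }
    return failures;
}
```

If the returned count is somewhere between zero and `tries`, the fault looks transient; if every attempt fails with the same errno, it looks like a steady-state condition.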
You are going to have to first do some diagnosis of the problem, and only then can you arrive at a viable tack for a solution. Again, looking at the low-level data on the wire, using a packet sniffer, can produce data that shows whether the problem is originating there. If it is, then a hardware-based remedy would be indicated. Such problems should affect all networking traffic more or less equally. Have you done any comparative testing using other data-transfer methods?

--- rod.
 
Old 11-24-2010, 11:11 AM   #25
punt
Member
 
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371

Original Poster
Rep: Reputation: 30
No, I haven't done any other comparative testing since I'm not sure of any other things I can do. I'm just running a cifs mount. Is there an alternative?

Again, I really cannot code, so I guess I'm just going to have to diagnose this mount problem to figure out what the issue is -- and I guess now I would hope that a mod moves this back to Linux - Server....

Ideally, I'm looking for a solution that doesn't involve coding. This code works. It's worked for years. The problem is the mounts and whatever else might be getting in the way.
 
Old 11-24-2010, 11:21 AM   #26
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
Quote:
Originally Posted by punt
I guess now I would hope that a mod moves this back to Linux - Server....
Hi,

I sent you a PM earlier about the thread move - perhaps you have not noticed you have a message? Not to worry, I sometimes don't notice PM's for days.

Anyhow, as written in the PM I wrote you, you can click the REPORT button on your first post, and ask a moderator to move the thread back to the "Server" forum for you, if that is what you wish.

Cheers!
 
Old 11-24-2010, 11:37 AM   #27
punt
Member
 
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371

Original Poster
Rep: Reputation: 30
OOPS. Thanks Celine - I didn't notice, but I see the PM from this morning now! (Usually, I get those popups notifying me of PMs... hadn't seen one!)
 
Old 11-24-2010, 12:21 PM   #28
punt
Member
 
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371

Original Poster
Rep: Reputation: 30
Okay, so just an update - it's been a while since I opened my console, and all those CIFS messages have been displaying prominently on my desktop.

There are a lot of messages (I actually am trying to reboot, but I'm getting umount errors):

Code:
CIFS VFS: No response for cmd 50 mid 34276
CIFS VFS: Send error in SessSetup = -11
And a whole variety of other errors as you'd seen earlier in this thread. The machine finally did go down for a reboot but its response was rather slow...

Thoughts?
 
Old 11-24-2010, 01:55 PM   #29
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
Quote:
Originally Posted by punt
This code works. It's worked for years. The problem is the mounts and whatever else might be getting in the way.
The code may have worked until the OS reported errors. It clearly disregards those errors, not looking at the return code from system calls. Only by good luck or lack of scrutiny did it survive until now.

--- rod.
 
Old 11-24-2010, 03:19 PM   #30
punt
Member
 
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by theNbomr
The code may have worked until the OS reported errors. It clearly disregards those errors, not looking at the return code from system calls. Only by good luck or lack of scrutiny did it survive until now.

--- rod.
Shouldn't we try to figure out why the OS is reporting errors? I don't think this is a coding issue anymore.
 
All times are GMT -5. The time now is 09:31 PM.