Linux - Server This forum is for the discussion of Linux Software used in a server related context. |
11-19-2010, 07:50 PM
|
#16
|
LQ Guru
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Rep: 
|
Hi -
Quote:
Right, so my question is now forcing the mount to be correctly recognizable... is this too crazy to ask?
|
Uh - yes
Because in real world networks - packets get dropped, connections fail and Ca-Ca Happens.
Suggestion:
Check for -1. Delay momentarily, then retry. This should probably work around the "isdir()" problem.
Suggestion 2:
Any number of things could be causing the VFS errors. The cause could be on the Windows side (in Windows itself, or in the physical disk), on the Linux side, or in the network.
Carefully check your Windows event logs and your Linux system logs.
Google for (possibly) related errors. For example:
* Problem with SATA driver
* Shutdown/Unmounting issues
* Etc
Suggestion 3:
Every time you get "ret == -1" (error), print the global variable "errno" and post back the results. That, too, might give us a valuable clue.
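A minimal sketch of Suggestion 3, assuming the failing call is `stat()` (the original code is not shown in this part of the thread, so the helper `check_dir` and its return convention are hypothetical). It prints `errno` whenever the call returns -1:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if path is a directory, 0 if it is
 * something else, and -1 if stat() failed. On failure it prints errno
 * (number and text) to stderr and leaves errno set for the caller. */
int check_dir(const char *path)
{
    struct stat st;

    assert(path != NULL);

    if (stat(path, &st) == -1) {
        int saved = errno;   /* fprintf may overwrite errno, so save it */
        fprintf(stderr, "stat(%s) failed: errno=%d (%s)\n",
                path, saved, strerror(saved));
        errno = saved;
        return -1;
    }
    return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

Calling `check_dir()` on the mount point that always fails, and posting the line it prints, would show which error the kernel is actually reporting.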
'Hope that helps!
Last edited by paulsm4; 11-20-2010 at 02:03 AM.
|
|
|
11-20-2010, 06:20 PM
|
#17
|
Member
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371
Original Poster
Rep:
|
I figured.
But why does one of the mounted drives *always* fail? It has never worked so far... that one isn't intermittent, unfortunately.
What exactly do these errors mean?
Code:
[165880.016039] CIFS VFS: No response for cmd 50 mid 31807
[165930.046023] CIFS VFS: Unexpected lookup error -112
[165960.060029] CIFS VFS: Unexpected lookup error -112
[166000.061026] CIFS VFS: Unexpected lookup error -112
[198872.230024] CIFS VFS: No response to cmd 47 mid 57676
[198872.230177] CIFS VFS: Write2 ret -11, wrote 0
[198872.230588] CIFS VFS: No response to cmd 46 mid 57677
[198872.230726] CIFS VFS: Send error in read = -11
[199086.158029] CIFS VFS: No response to cmd 47 mid 21536
[199086.158187] CIFS VFS: Write2 ret -11, wrote 0
[202380.916027] CIFS VFS: No response to cmd 47 mid 42890
[202380.916178] CIFS VFS: Write2 ret -11, wrote 0
[212792.630045] CIFS VFS: No response to cmd 47 mid 9732
[212792.630201] CIFS VFS: Write2 ret -11, wrote 0
[213664.807036] CIFS VFS: No response to cmd 47 mid 52527
[213664.807188] CIFS VFS: Write2 ret -11, wrote 0
[213664.807626] CIFS VFS: No response for cmd 50 mid 52528
[213664.807777] CIFS VFS: No response for cmd 50 mid 52529
[213664.807921] CIFS VFS: No response for cmd 50 mid 52530
[213670.945315] CIFS VFS: Write2 ret -11, wrote 0
[248418.427027] CIFS VFS: No response for cmd 50 mid 28179
I'm not seeing anything on the Windows end...
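One partial answer to the question above: the negative numbers in these messages are kernel return codes, which by convention are negated `errno` values. On Linux, 11 is `EAGAIN` ("try again") and 112 is `EHOSTDOWN` ("host is down"), which fits a server that has stopped answering. A minimal sketch that prints the C library's text for such a code (the helper names are made up for illustration):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: kernel code returns errors as negative errno
 * values, so -112 in a log line corresponds to errno 112. */
int kernel_ret_to_errno(int kernel_ret)
{
    assert(kernel_ret < 0);   /* only negative returns are errors */
    return -kernel_ret;
}

/* Print the C library's description of a negative kernel return code. */
void explain_kernel_ret(int kernel_ret)
{
    int err = kernel_ret_to_errno(kernel_ret);
    printf("%d -> errno %d: %s\n", kernel_ret, err, strerror(err));
}
```

Running `explain_kernel_ret(-112)` and `explain_kernel_ret(-11)` on the Linux machine prints the description for each code. The errno numbering is platform-specific, so this should be run on the same system that produced the log.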
|
|
|
11-22-2010, 04:15 PM
|
#18
|
Member
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371
Original Poster
Rep:
|
Guess I have to give up? 
|
|
|
11-22-2010, 04:49 PM
|
#19
|
LQ 5k Club
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
|
Is it only your code that isn't working, or do other functions (normal filesystem browsing, file copying, etc.) also misbehave? It seems reasonable to speculate that there may be a bona fide problem with the connection. Have you tried unmounting and remounting the share? What about other networking between the two hosts in question, such as scp transfers, sftp, etc.? Could you attach a network sniffer and compare traffic patterns between a working and a non-working system?
--- rod.
|
|
|
11-23-2010, 12:17 AM
|
#20
|
Member
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371
Original Poster
Rep:
|
I've rebooted both machines more than once. I don't have any other transfers ongoing when troubleshooting. And yes, I've manually unmounted and remounted as well.
The code isn't working, which prevents accurate entries in this logfile, and that logfile is critical to the purpose of this server. Nothing else seems to misbehave, but the functionality of this code is absolutely key.
How would network sniffing assess a problem with the mount? I'm not even sure what kind of thing to look at or how to do so effectively. Can you offer any tips? I'm a newbie when it comes to this.
Do the errors I've pasted in my last message mean anything or are they too vague? 
|
|
|
11-23-2010, 12:59 PM
|
#21
|
LQ 5k Club
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
|
Well, there is clearly something different about this connection. My strategy would be to compare traffic generated by a working run of your software to a non-working one. Focusing on the difference (if any) may reveal something.
Are all of the parameters of the mount equivalent between the working and non-working systems?
Have you adjusted your code to accommodate error code returns (per message #16)? What was the result?
Right now, my best guess is that there is a low-level networking fault, and the driver and OS software are concealing it through retries, etc.
--- rod.
|
|
|
11-23-2010, 10:28 PM
|
#22
|
Member
Registered: Aug 2009
Location: Houston
Distribution: Slackware 13.37 x64
Posts: 105
Rep:
|
Rather than do it that way, you could use a system call instead...
system("ls -l deeper/and/deeper/path > somefile.txt");
You can build the command string passed to system() in an ordinary char array, based on which directories you're going through,
and then simply pass the array to system().
Then search through that file for lines beginning with d; those are your directories.
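A rough sketch of that approach, under stated assumptions: the function name `count_dirs_via_ls` and the file paths are hypothetical, and the single-quote wrapping only works for paths that contain no single quotes themselves.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: runs "ls -l path > listing_file" through system()
 * and counts the lines of the listing that begin with 'd' (directories).
 * Returns the count, or -1 if the command or the file read fails.
 * Assumption: neither path contains a single quote. */
int count_dirs_via_ls(const char *path, const char *listing_file)
{
    char cmd[1024];
    char line[4096];
    int dirs = 0;
    int at_line_start = 1;

    assert(path != NULL && listing_file != NULL);

    /* Build the command in a char array, as suggested above. */
    int n = snprintf(cmd, sizeof cmd, "ls -l '%s' > '%s'", path, listing_file);
    if (n < 0 || (size_t)n >= sizeof cmd)
        return -1;                      /* command did not fit */

    if (system(cmd) != 0)
        return -1;                      /* ls (or the shell) reported failure */

    FILE *fp = fopen(listing_file, "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        /* Only test the first character of each real line, even if a
         * very long line was split across several fgets() calls. */
        if (at_line_start && line[0] == 'd')
            dirs++;
        at_line_start = (strchr(line, '\n') != NULL);
    }
    fclose(fp);
    return dirs;
}
```

Note that this still depends on the mount responding: if the CIFS share is hung, `ls` will fail or block in the same way a direct call would.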
Last edited by Dogs; 11-23-2010 at 10:29 PM.
|
|
|
11-24-2010, 12:12 AM
|
#23
|
Member
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371
Original Poster
Rep:
|
Quote:
My strategy would be to compare traffic generated by a working run of your software to a non-working one. Focusing on the difference (if any) may reveal something.
|
I'm game. What do I need to do, strace?
Quote:
Are all of the parameters of the mount equivalent between the working and non-working systems?
|
Here's the strange part: it is the same system. It's the same kind of mount. The hard drives are identical makes. I have no idea what differentiates both... I defer to my earlier question.
Quote:
Have you adjusted your code to accommodate error code returns (per message #16)? Result?
|
Again, I'm a newbie. I'm not a coder. I didn't even post this in the programming forum. It was moved based on the circumstances, but I don't know anymore if it belongs.
I only did troubleshooting to try to identify the problem, but the previous page should provide the results on that.
Quote:
Right now, my best guess is that there is a low-level networking fault, and the driver and OS software are concealing it through retries, etc.
|
That's promising. How do I confirm?
Quote:
system("ls -l deeper/and/deeper/path > somefile.txt");
|
Yeah, so here's why I get stuck. I'm NOT a coder. I don't know how to implement this or how to edit my code without breaking things.
|
|
|
11-24-2010, 10:22 AM
|
#24
|
LQ 5k Club
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
|
This is going to be difficult to fix without some programming. You have code that makes calls which return error statuses. In order to properly handle those errors, your code needs to capture the error codes and act on them. Depending on the error status, the condition may or may not be recoverable.
One pretty good suggestion has already been made: if the return status is -1, sleep briefly and repeat the call. Repeat until the return status is good, or give up. Since the kernel and filesystem make non-deterministic network calls on your behalf, it is entirely reasonable to expect calls to fail (that's why error conditions are returned), and also reasonable to expect the failures to be short in duration. My speculation, based on the errors reported in your log file, is that physical problems with the network exist. Other applications are doing what your code probably needs to do, which is to have some persistence. I suggested putting the call into a tight loop and printing the error status repeatedly; the results may show transient error conditions, or may show some steady-state condition.
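The sleep-and-retry idea can be sketched as follows. This is a minimal illustration, not the original program: it assumes the failing call is `stat()`, the wrapper name `stat_with_retry` is hypothetical, and the list of errors treated as transient is a guess that should be adjusted once the real `errno` values are known.

```c
#define _POSIX_C_SOURCE 200809L  /* for nanosleep() */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>

/* Assumption: these are the errors worth retrying on a network mount. */
static int is_transient(int err)
{
    return err == EAGAIN || err == EINTR || err == EIO ||
           err == EHOSTDOWN || err == ETIMEDOUT;
}

/* Hypothetical wrapper: call stat() up to max_tries times, sleeping
 * delay_ms between attempts. Returns 0 on success, or -1 on failure
 * with errno set by the last stat() call. Every failure is logged. */
int stat_with_retry(const char *path, struct stat *st,
                    int max_tries, long delay_ms)
{
    assert(path != NULL && st != NULL && max_tries > 0 && delay_ms >= 0);

    for (int attempt = 1; attempt <= max_tries; attempt++) {
        if (stat(path, st) == 0)
            return 0;

        int saved = errno;   /* fprintf may overwrite errno, so save it */
        fprintf(stderr, "attempt %d: stat(%s) failed: errno=%d (%s)\n",
                attempt, path, saved, strerror(saved));

        /* Give up at once on permanent errors, or when out of tries. */
        if (!is_transient(saved) || attempt == max_tries) {
            errno = saved;
            return -1;
        }

        struct timespec pause = { delay_ms / 1000,
                                  (delay_ms % 1000) * 1000000L };
        nanosleep(&pause, NULL);
    }
    return -1;   /* not reached: the loop always returns */
}
```

If every attempt logs the same error, the condition is steady-state rather than transient, and retrying alone will not help; that is itself useful diagnostic information.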
You are going to have to first do some diagnosis of the problem, and only then can you arrive at a viable tack for a solution. Again, looking at the low-level data on the wire, using a packet sniffer, can produce data that shows whether the problem is originating there. If it is, then a hardware-based remedy would be indicated. Such problems should affect all networking traffic more or less equally. Have you done any comparative testing using other data-transfer methods?
--- rod.
|
|
|
11-24-2010, 12:11 PM
|
#25
|
Member
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371
Original Poster
Rep:
|
No, I haven't done any other comparative testing since I'm not sure of any other things I can do. I'm just running a cifs mount. Is there an alternative?
Again, I really cannot code, so I guess I'm just going to have to diagnose this mount problem to figure out what the issue is -- and I guess now I would hope that a mod moves this back to Linux - Server....
Ideally, I'm looking for a solution that doesn't involve coding. This code works. It's worked for years. The problem is the mounts and whatever else might be getting in the way.
|
|
|
11-24-2010, 12:21 PM
|
#26
|
LQ Guru
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594
|
Quote:
Originally Posted by punt
I guess now I would hope that a mod moves this back to Linux - Server....
|
Hi,
I sent you a PM earlier about the thread move - perhaps you have not noticed you have a message? Not to worry, I sometimes don't notice PM's for days.
Anyhow, as written in the PM I wrote you, you can click the REPORT button on your first post, and ask a moderator to move the thread back to the "Server" forum for you, if that is what you wish.
Cheers!
|
|
|
11-24-2010, 12:37 PM
|
#27
|
Member
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371
Original Poster
Rep:
|
OOPS. Thanks Celine - I didn't notice, but I see the PM from this morning now! (Usually, I get those popups notifying me of PMs... hadn't seen one!)
|
|
|
11-24-2010, 01:21 PM
|
#28
|
Member
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371
Original Poster
Rep:
|
Okay, so just an update - it's been awhile since I opened my console and all those CIFS messages have been displaying prominently on my desktop.
There are a lot of messages (I actually am trying to reboot, but I'm getting umount errors):
Code:
CIFS VFS: No response for cmd 50 mid 34276
CIFS VFS: Send error in SessSetup = -11
And a whole variety of other errors as you'd seen earlier in this thread. The machine finally did go down for a reboot but its response was rather slow...
Thoughts?
|
|
|
11-24-2010, 02:55 PM
|
#29
|
LQ 5k Club
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
|
Quote:
Originally Posted by punt
This code works. It's worked for years. The problem is the mounts and whatever else might be getting in the way.
|
The code may have worked until the OS reported errors. It clearly disregards those errors, not looking at the return code from system calls. Only by good luck or lack of scrutiny did it survive until now.
--- rod.
|
|
|
11-24-2010, 04:19 PM
|
#30
|
Member
Registered: Jun 2001
Distribution: Fedora 22
Posts: 371
Original Poster
Rep:
|
Quote:
Originally Posted by theNbomr
The code may have worked until the OS reported errors. It clearly disregards those errors, not looking at the return code from system calls. Only by good luck or lack of scrutiny did it survive until now.
--- rod.
|
Shouldn't we try to figure out why the OS is reporting errors? I don't think this is a coding issue anymore.
|
|
|
All times are GMT -5. The time now is 04:19 PM.