[SOLVED] nc (netcat): script works when sourced but not if executed via its path

Heraton · 10-26-2012, 02:39 PM

Hello everybody!

Right now I am completely puzzled because of the behaviour of nc in this script:
(nc version: OpenBSD netcat (Debian patchlevel 1.89-3ubuntu2) )

cat script_test:

Code:

#!/bin/bash
(nc -kl 192.168.2.100 2222) >> /tmp/myfifo &      # /tmp/myfifo is a named pipe
( while true; do read in; if [ "$in" == "" ]; then continue; fi; echo "$in"; done ) < /tmp/myfifo &
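
The script assumes that /tmp/myfifo already exists. A minimal sketch of the same pipe-and-reader pattern without nc, assuming bash and a throwaway temp directory (the names tmpdir, fifo and received are hypothetical, and printf stands in for nc as the writer):

```shell
#!/bin/bash
# Sketch only: temp paths replace /tmp/myfifo, printf replaces nc.
tmpdir=$(mktemp -d)
fifo="$tmpdir/myfifo"
mkfifo "$fifo"                         # create the named pipe

# Reader: skip empty lines, echo the rest; stops when the writer closes the pipe.
( while IFS= read -r line; do
      [ -z "$line" ] && continue
      echo "$line"
  done ) < "$fifo" > "$tmpdir/out" &

# Writer: opening the FIFO for writing unblocks the reader's open.
printf 'hello\n\nworld\n' > "$fifo"
wait                                   # let the reader drain the pipe and exit

received=$(cat "$tmpdir/out")
rm -rf "$tmpdir"
echo "$received"
```

Unlike the `while true` loop above, this reader stops at end of file. With `while true`, `read` fails immediately once every writer has closed the pipe, `$in` stays empty, and the loop spins without ever blocking.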

If I source the script from PC1 by typing

Code:

. script_test

I can connect from PC2 with

Code:

nc 192.168.2.100 2222

and everything I am typing in the terminal of PC2 is displayed in the terminal of PC1. It is possible to terminate and reconnect to the server at will.

I am well aware that I could simply run

Code:

nc -lk 192.168.2.100 2222

to achieve the same effect, but I decided to reinvent the wheel to isolate the problem from a much longer and more complex script.

If I run the script directly with

Code:

./script_test

I cannot connect to PC1. I captured the connection with Wireshark. The first time I try to connect, I see a three-way handshake followed immediately by a TCP teardown initiated by PC1. The total length of the stream is 0 bytes. Every connection attempt after the first one is answered with a TCP RESET.

The difference I noticed between sourcing and running the script is that sourcing results in two new jobs, while running the script results in a new bash instance. This made me think of interactive vs. non-interactive shells. Some research led me to this in the ABS-Guide, but the "-i" option did not help my script. I tried to run nc with "-q -1" and with "-d", but this only broke the sourced variant as well.

So right now I do not have any idea
1st: what causes nc to terminate immediately in the script
2nd: why it is not starting another listening attempt although -k is set

Any help would be highly appreciated.

Regards, Heraton

ntubski · 10-27-2012, 11:02 AM

You are running 2 commands in the background and nothing else, so your main script is exiting immediately and closing its subprocesses. You can fix this by doing one of the following:

Using the wait command at the end of the main script.
Using disown on each background command so that they are not terminated by the ending of their parent.
Not putting the second command in the background.

Heraton · 10-30-2012, 05:52 AM

Dear ntubski!

Thank you very much for your reply. I apologise for answering so very late, but I was ill for a few days and was unable to work on my project.

I am afraid the proposals you made did not solve my problem. The issue you pointed out is indeed likely to cause trouble, so I will keep it in mind when I continue to experiment. Thanks to your hints I got some hands-on experience with wait and disown, so on the bright side I learned something from this.

I will paste the test scripts I wrote, so you can have a look at them and see whether I broke anything. With every script, the first connection attempt established a TCP connection that was torn down at once, and every later attempt was answered with a RESET.

no_bg_script:

Code:

#!/bin/bash
(nc -kl 192.168.2.100 2222) >> /tmp/myfifo &      # /tmp/myfifo is a named pipe
( while true; do read in; if [ "$in" == "" ]; then continue; fi; echo "$in"; done ) < /tmp/myfifo

switched_no_bg_script:

Code:

#!/bin/bash
( while true; do read in; if [ "$in" == "" ]; then continue; fi; echo "$in"; done ) < /tmp/myfifo &
(nc -kl 192.168.2.100 2222) >> /tmp/myfifo &      # /tmp/myfifo is a named pipe

disown_script:

Code:

#!/bin/bash
(nc -kl 192.168.2.100 2222) >> /tmp/myfifo & disown     # /tmp/myfifo is a named pipe
( while true; do read in; if [ "$in" == "" ]; then continue; fi; echo "$in"; done ) < /tmp/myfifo & disown

wait_script:

Code:

#!/bin/bash
(nc -kl 192.168.2.100 2222) >> /tmp/myfifo &      # /tmp/myfifo is a named pipe
( while true; do read in; if [ "$in" == "" ]; then continue; fi; echo "$in"; done ) < /tmp/myfifo &
# according to bash manpage wait without arguments waits for all child processes
wait
# the next line was never reached in my tests
echo "The End"

Regards, Heraton

ntubski · 10-30-2012, 07:38 AM

I think I was on the wrong track with my previous answer. I tried running some of your scripts (with the IP changed to 127.0.0.1) and I'm getting completely bizarre behaviour: the first connection attempt with nc exits immediately, the second time it connects but then every other line is echoed locally instead of on the other end!? And several times I got nc or the script going into a loop taking 100% CPU while trying to quit.

I'm stumped.

Heraton · 10-30-2012, 07:27 PM

Hi!

I would like to know which distro you used to break this even further. My Linux Mint (still release number 10 on my testing box) had no such problems. In fact, I started off testing with two console windows against localhost too, but switched to a network approach to be able to sniff the traffic with Wireshark conveniently.

Well, I am still stuck. I think that maybe something regarding stdin is handled differently in the two situations, so at least the closing of the connection could be explained. But I still have no clue what that difference might be. I am thinking right now about reimplementing nc, to be able to test what the program is "seeing" in each situation. This might take a while, and I do not think this can be done soon, but I am still tempted to figure out what is going on there.

Regards, Heraton

ntubski · 10-31-2012, 02:46 AM

Quote:

Originally Posted by Heraton

I would like to know which distro you used to break this even further. My Linux Mint (still release number 10 on my testing-box) had no such problems.

Debian testing "Wheezy", with netcat-openbsd package version 1.105-7 (the executable doesn't appear to have a version switch).

Quote:

I am thinking right now about reimplementing nc, to be able to test what the program is "seeing" in each situation. This might take a while, and I do not think this can be done soon, but I am still tempted to figure out what is going on there.

Yeah, I was thinking of trying that as well.

Heraton · 10-31-2012, 03:04 AM

Got the source code of BSD netcat. About 1000 lines for the main file does not sound too bad. As far as I have seen, it should not be too hard to add some print lines for debugging, as the file is well structured and commented. Sadly I do not have much time at present, so this might take a while.

On the bright side: as nc reports errors on stderr, it could be worth a try to have a closer look at what is logged. I will look into that when I find the time.

<edit>
This is the version I was looking into:
/* $OpenBSD: netcat.c,v 1.109 2012/07/07 15:33:02 haesbaert Exp $ */
</edit>
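
One way to look at what a backgrounded command writes to stderr is to append it to a log file. A minimal sketch, assuming bash; run_logged and the temp log file are hypothetical names, and a small sh command that prints to stderr stands in for nc:

```shell
#!/bin/bash
# Hypothetical helper: run a command in the background and append its
# stderr to the given log file.
run_logged() {
    local log=$1
    shift
    "$@" 2>>"$log" &
}

err_log=$(mktemp)                      # hypothetical log location
run_logged "$err_log" sh -c 'echo "simulated nc error" >&2'
wait                                   # wait for the background command to finish

logged=$(cat "$err_log")
rm -f "$err_log"
echo "$logged"
```

For the script in this thread, the same idea would mean adding a `2>>` redirection to the nc line, with a log path of your choosing.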

ntubski · 10-31-2012, 10:57 AM

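I took the reimplement-nc route and found that backgrounded processes don't get to keep stdin (in a non-interactive script, where job control is off, bash connects the stdin of a backgrounded command to /dev/null), so nc sees EOF on stdin and exits. Adding the -d option, which tells nc not to read from stdin, fixes this. On the other hand, you said -d didn't work for you, so maybe you have some other issues?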

Code:

#!/bin/bash

# This worked for me

(nc -dkl 127.0.0.1 2222) >> /tmp/myfifo &      # /tmp/myfifo is a named pipe
nc_pid=$!
( while true; do read in; if [ "$in" == "" ]; then continue; fi; echo "$in"; done) < /tmp/myfifo &
cat_fifo_pid=$!

trap 'kill $nc_pid $cat_fifo_pid' SIGTERM SIGINT

wait

Also I found the following has the exact same behaviour (including the symptoms described in your first post if -d is removed):

Code:

#!/bin/bash

nc -d -l 127.0.0.1 2222 &
wait
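
The stdin effect can also be checked without nc. A minimal sketch, assuming bash and a POSIX wc (the variable names are made up for illustration): feed the same six bytes to a foreground and to a backgrounded wc inside a non-interactive bash, and compare how much each one reads.

```shell
#!/bin/bash
# "hello" plus newline is 6 bytes. tr strips the padding some wc
# implementations put around the count.

# Foreground: wc inherits the pipe as its stdin.
fg_bytes=$(echo hello | bash -c 'wc -c' | tr -d ' ')

# Background in a non-interactive shell: stdin is /dev/null instead.
bg_bytes=$(echo hello | bash -c 'wc -c & wait' | tr -d ' ')

echo "foreground read $fg_bytes bytes, background read $bg_bytes bytes"
```

The backgrounded wc should report 0 bytes, which is the same immediate EOF that makes nc quit unless -d is given.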

My nc implementation (doesn't support -d, -l implies -k):

Code:

#define _POSIX_C_SOURCE 200112L  /* getaddrinfo() needs this level on current glibc */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <sys/wait.h>
#include <signal.h>
#include <sys/select.h>


static void die_if(int cond, const char* name) {
    if (cond) {
        perror(name);
        exit(EXIT_FAILURE);
    }
}

/* return 1 on EOF, 0 otherwise */
static int send_from_to(int from_fd, int to_fd) {
    unsigned char buffer[4096];  /* byte buffer, not an array of pointers */

    int nread = read(from_fd, buffer, sizeof buffer);
    die_if(nread < 0, "read");
    if (nread == 0) return 1;

    int nwrote = write(to_fd, buffer, nread);
    die_if(nwrote < 0, "write");
    if (nwrote != nread) fprintf(stderr, "TODO: handle partial write\n");

    return 0;
}

static void send_recv_loop(int sockfd) {

    fd_set read_fds;

    for (;;) {
        FD_ZERO(&read_fds);
        FD_SET(STDIN_FILENO, &read_fds);
        FD_SET(sockfd, &read_fds);

        int ready_fds = select(sockfd+1, &read_fds, NULL, NULL, NULL);
        die_if(ready_fds < 0, "select");

        if (FD_ISSET(STDIN_FILENO, &read_fds)) {
            if (send_from_to(STDIN_FILENO, sockfd)) {
                fprintf(stderr, "EOF on stdin\n");
                exit(EXIT_SUCCESS);
            }
        }

        if (FD_ISSET(sockfd, &read_fds)) {
            if (send_from_to(sockfd, STDOUT_FILENO)) {
                fprintf(stderr, "EOF on socket\n");
                break;
            }
        }
    }
}


int main(int argc, char** argv) {
    int server_mode = 0;
    int i = 1;
    int failed;

    if (argc > 1
        && (strcmp(argv[i], "-l") == 0 ||
            strcmp(argv[i], "-kl") == 0 ||
            strcmp(argv[i], "-lk") == 0))
    {
        server_mode = 1;
        i++;
    }

    if (argc < i + 2) {
        fprintf(stderr, "Usage: [-lk] <host> <port>\n");
        exit(EXIT_FAILURE);
    }

    const char* host = argv[i++];
    const char* port = argv[i++];

    struct addrinfo* ainfo = NULL;

    struct addrinfo hints;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    failed = getaddrinfo(host, port, &hints, &ainfo);  /* pass the hints filled in above */
    if (failed) {
        fprintf(stderr, "getaddrinfo(%s, %s): %s\n",
            host, port, gai_strerror(failed));
        exit(EXIT_FAILURE);
    }

    int listen_fd = socket(ainfo->ai_family, ainfo->ai_socktype, ainfo->ai_protocol);
    die_if(listen_fd < 0, "socket");

    if (server_mode) {
        int truth = 1;
        failed = setsockopt(listen_fd, SOL_SOCKET, SO_REUSEADDR,
            &truth, sizeof truth);
        die_if(failed, "setsockopt");

        failed = bind(listen_fd, ainfo->ai_addr, ainfo->ai_addrlen);
        die_if(failed, "bind");

        for (;;) {
            failed = listen(listen_fd, 1);
            die_if(failed, "listen");

            int sockfd = accept(listen_fd, NULL, NULL);
            die_if(sockfd < 0, "accept");

            send_recv_loop(sockfd);
            close(sockfd);  /* release the finished connection before accepting the next */
        }
    } else {
        failed = connect(listen_fd, ainfo->ai_addr, ainfo->ai_addrlen);
        die_if(failed, "connect");

        send_recv_loop(listen_fd);
    }

    return 0;
}

Heraton · 11-01-2012, 06:35 PM

Dear ntubski!

Thank you very much for your assistance and all the effort you put into my problem. The script you posted worked for me too.

Interestingly, Linux Mint 10 and 12 behave differently. When I ran my old script on release 12, suddenly every connection attempt behaved like the first one: three-way handshake and TCP teardown.

I was able to get the old script working with the "-d" switch too, which was very surprising to me. The wait issue seemingly was not the cause of the trouble. I do not have any explanation for the "-d" switch working now, except that I must have messed up somehow the first time. I guess I will blame the flu.

Although I still do not fully understand what happened here, I will let it be. It suffices to know that stdin is lost whenever a process is backgrounded from a script, and that netcat needs to be run with "-d" in this case.

All the best, Heraton