LinuxQuestions.org


sanitynotvanity 06-15-2009 10:12 AM

My Linux Bash Script Fails to run when automated with OS Startup
 
Hi,

I'm pulling my hair out over this (literally). I have written a program that effectively builds up a new HDD for a new computer system. The program is run from a Linux OS like SLAX (but my own) that resides on a USB drive.

The idea is that I plug my USB drive into a new computer and my OS automatically builds up the HDD in the new computer system.

This process includes partitioning, formatting and copying the OS onto the drive.

My program works perfectly every time I run it manually from the prompt. However, if I try to automate it through the startup scripts of the Linux (USB) OS, it fails.

The error is that fdisk cannot update the partition image because the kernel is still using it.

This is what I have tried:
1) S99 script in rc.5 (5 because it's unused in this system)
2) autologin script and .bash_profile
3) & to launch the program as a new process
4) sleep 100 command to make the process wait 100 seconds before commencing

I have tried many variations, but it's as if the program is running in a completely different environment.

Any ideas?
really could do with a good hint here :|

Cheers,

Andy

unSpawn 06-15-2009 04:55 PM

Attach the script for people to look at?

jlinkels 06-15-2009 05:49 PM

Quote:

Originally Posted by sanitynotvanity (Post 3574551)
The error is that fdisk cannot update the partition image because the kernel is still using it.

Please do not describe the error messages, copy and paste them. Every comma can be significant.

Quote:

Originally Posted by sanitynotvanity (Post 3574551)
1) S99 script in rc.5 (5 because it's unused in this system)

I hope you made a mistake here. When runlevel 5 is not used, scripts in rc.5 will not be executed.

Put this at the start of your script:
Code:

set -x

Call your script from another script and put the other script in rc.d:
Code:

S99yourotherscript

From yourotherscript call:
Code:

/path/to/original/script &> path/to/logfile

Now you should start seeing something. For good measure, you can throw in a mount command in your original script to see what is mounted.

Oh and of course what Unspawn said about attaching the script...
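
A minimal sketch of the wrapper idea above, assuming bash is available at boot. The function name run_logged and the example paths are placeholders, not part of the original script. It records what is mounted, then captures both stdout and stderr of the real script in one log file and passes the exit status back to the caller.

```shell
#!/bin/bash
# Hypothetical wrapper (would live in "yourotherscript").
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    local logfile=$1 status
    shift
    {
        echo "=== $(date): running: $*"
        echo "--- mounted filesystems ---"
        # /proc/mounts reflects the kernel's view; fall back to mount(8)
        cat /proc/mounts 2>/dev/null || mount
        echo "--- command output ---"
        "$@"
        status=$?
        echo "--- exit status: $status ---"
    } > "$logfile" 2>&1
    return "$status"
}

# Example call (placeholder paths and arguments):
# run_logged /path/to/logfile /path/to/original/script sdc1 sdd1 -p
```

Because the mount listing is written before the script runs, the log from a boot-time run can be compared line by line with the log from a manual run.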

jlinkels

sanitynotvanity 06-16-2009 06:49 AM

Hi jlinkels,
thanks for the time you have taken,

Quote:

Originally Posted by jlinkels (Post 3575123)
Please do not describe the error messages, copy and paste them. Every comma can be significant.

fdisk error:
"Re-reading the partition table failed with error 16: Device or resource busy.
The Kernel still uses the old table.
The new table will be used at the next reboot."
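
Error 16 is EBUSY. The kernel typically refuses to re-read a partition table while a partition on that disk is mounted, used as swap, or otherwise held open. A small sketch for checking the first two cases is below; the function name partitions_in_use is made up for illustration, and the optional file arguments exist only so it can be tried against sample data.

```shell
#!/bin/bash
# Hypothetical helper: list partitions of a disk that are mounted or used as swap.
# Usage: partitions_in_use DISK [MOUNTS_FILE] [SWAPS_FILE]
#   DISK is a bare name such as "sdd" (like $NEWDEV2 in the script).
partitions_in_use() {
    local disk=$1
    local mounts=${2:-/proc/mounts}
    local swaps=${3:-/proc/swaps}
    # The first field in both files is the device path, e.g. /dev/sdd1.
    # Print every entry whose device path starts with /dev/DISK.
    awk -v dev="/dev/$disk" 'index($1, dev) == 1 { print $1 }' \
        "$mounts" "$swaps" | sort -u
}

# Example: refuse to partition while something is still in use.
# if [ -n "$(partitions_in_use "$NEWDEV2")" ]; then
#     echo "still in use:"; partitions_in_use "$NEWDEV2"
# fi
```

This does not catch every holder (for example a process that has the raw device open), so an empty result is a hint rather than proof that the disk is free.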


Quote:

Originally Posted by jlinkels (Post 3575123)
I hope you made a mistake here. When runlevel 5 is not used, scripts in rc.5 will not be executed.

Runlevel 5 is used when I tell it to be used ;) (put a 5 on the cmdline at boot, or run 'init 5' to invoke it manually). Also, please note that my Linux USB OS is based on LFS, not Debian or the like, so what I meant when I said runlevel 5 was not used was that it was free for me to use as the mode in which this script is run (I don't want this script to run all the time; that could be rather dangerous!)

Also, an interesting point: when I invoke runlevel 5 with 'init 5', the script works fine (but to reiterate, if this is done from bootup it fails).

Quote:

Originally Posted by jlinkels (Post 3575123)
put this at the start of your script:
Code:

set -x

I have done this already. I already know which line of code is failing; the question is why it's failing. What difference does it make to run a script automatically compared to running it manually?

Code that fails:

Code:

# Delete last partition
echo -e "d\nw\n" | fdisk /dev/$NEWDEV2

# Add new partition
echo -e "n\np\n1\n\n\nt\nb\na\n1\nw\n" | fdisk /dev/$NEWDEV2

Quote:

Originally Posted by jlinkels (Post 3575123)
call your script from another script and put the other script in rc.d:
Code:

S99yourotherscript

From yourotherscript call:
Code:

/path/to/original/script &> path/to/logfile

I can put another script in between if you think that will help. Currently my S99 file is a symbolic link to a script in the init.d folder that switches between start|stop etc., and in the start case it starts the script stored in /usr/sbin/.

Quote:

Originally Posted by jlinkels (Post 3575123)
Now you should start seeing something. For good measure, you can throw in a mount command in your original script to see what is mounted.

I've done something similar using /proc/partitions and found no differences, but I shall try this as well.

Quote:

Originally Posted by jlinkels (Post 3575123)
Oh and of course what Unspawn said about attaching the script...

I only didn't attach the script because it's a good 250 lines, all of which work. However, what I can do is copy the bits (plus a bit more) that I believe are relevant:

Code:

#!/bin/bash
set -x
# Description:
# Copies Linux OS to IDE flash Drive

CURDEV=$1
NEWDEV=$2
shift 2

beep -r2
while [ $# -gt 0 ]
do
        case ${1} in
                -p)
                        shift 1
                        # Partition the Drive

                        # Remove Partition Number
                        NEWDEV2=${NEWDEV:0:3}
                        CheckDevExists $NEWDEV2
                        UnMount $NEWDEV2
                        GetDevSize $NEWDEV2
                        RES=$?


                        if [ "$RES" -eq 0 ] ; then
                                echo "Partitioning /dev/$NEWDEV2..."

                                NUM_PARTITIONS=$(echo "p" | fdisk /dev/$NEWDEV2 | grep -c $NEWDEV2)

                                # The first result must be ignored
                                # The second result means that the default fdisk option will be taken
                                # only loop through 3rd and more results
                                NUM_PARTITIONS=$(( $NUM_PARTITIONS - 2 ))
                                while [ $NUM_PARTITIONS -gt 0 ]
                                do
                                        echo -e "d\n$NUM_PARTITIONS\nw\n" | fdisk /dev/$NEWDEV2
       
                                        NUM_PARTITIONS=$(( $NUM_PARTITIONS - 1 ))
                                done

                                # Delete last partition
                                echo -e "d\nw\n" | fdisk /dev/$NEWDEV2
                               
                                # Add new partition
                                echo -e "n\np\n1\n\n\nt\nb\na\n1\nw\n" | fdisk /dev/$NEWDEV2
                        else
                                echo "Device: /dev/$NEWDEV2 is larger than expected"
                                echo "Operation Aborted"
                                exit 1
                        fi
                        ;;       
                *)
                        shift 1
                        # Default
                        echo        "Usage: [<Source>] [<Destination>]"
                        echo        "        $0 [-f] FORMAT [<NAME>] DRIVE NAME"
                        echo        "        $0 [-i] INSTALL"
                        echo        "        $0 [-g] GRUB [<Destination>] | $0 [-s] SYSLINUX"
                        echo        "Example: $0 sdc1 sdd1 -p -f ARIES -i -s"
                        echo        "Example: $0 sdc1 sdd1 -p -f IDE -i -g hd0"
       
                        exit 1
                        ;;
        esac
done

echo "Finished!"
beep -r4
# Success: exit with status 0 (non-zero is reserved for the error paths above)
exit 0

----------------------------
----------------------------

My current thinking is that the script, when invoked by the system, is somehow being run with different privileges. I only have a root account on this system (it's a specialist system with no regard for its own safety :p), so when I run the script manually, it is run as root.

Am I on the right lines?

jlinkels 06-16-2009 07:50 AM

Quote:

Originally Posted by sanitynotvanity (Post 3575743)
Also, please note that my Linux USB OS is based on LFS, not debian or alike.

I know that Debian is different on this point, and I even know and work with LFS, although I have never built a system from scratch. But it is clear you know what you are doing and that you did not put your script in a runlevel that is never entered.

Quote:

Originally Posted by sanitynotvanity (Post 3575743)
my current thoughts on this problem is that the script when being invoked by the system is somehow being run with different privileges. I only have a root account on this system (its a specialist system with no regard to its own safety :p) and so when I run the script manually, it is being run from a root account.
Am I on the right lines?

Scripts in rc.d are run with root privileges by definition; otherwise nothing could get done.

Considering the debugging you have already done, a whole number of trivial tests can be skipped.

There is one big difference between running your script manually and running it during boot time. When you run it during boot time your script is called as a child of init or sysinit. That means that init is still executing while your script is called, and I don't know what else init does before handing over control to ...uhm ... actually what?

While init is still executing, it might keep the drive locked for some reason. Your idea to put a pause in your script was good, but then again we don't know if init waits for all child processes to be terminated before it terminates itself. You could echo the output of ps -aux or pstree to a log file while running your script during boot time to discover this.
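
A rough sketch of such a snapshot, assuming bash and a mounted /proc. The function name snapshot_state and the log path are hypothetical. Calling it immediately before the failing fdisk line, once at boot and once manually, gives two logs to diff.

```shell
#!/bin/bash
# Hypothetical helper: append a snapshot of processes and disk state to a log.
# Usage: snapshot_state LOGFILE
snapshot_state() {
    local logfile=$1 f
    {
        echo "=== snapshot at $(date) ==="
        echo "--- process list ---"
        # Prefer the tree view if pstree is installed, else a flat listing
        if command -v pstree >/dev/null 2>&1; then
            pstree -p
        else
            ps aux
        fi
        # Kernel's view of partitions, mounts and active swap areas
        for f in /proc/partitions /proc/mounts /proc/swaps; do
            echo "--- $f ---"
            cat "$f"
        done
    } >> "$logfile" 2>&1
}

# Example (placeholder path):
# snapshot_state /path/to/logfile
```

Anything that appears only in the boot-time snapshot (an extra mount, an active swap partition, a still-running rc script) is a candidate for what keeps the device busy.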

What if you switch manually to runlevel 5 once the system is booted? Is your script being executed correctly? I think that switching runlevels is also performed by init, isn't it? If not, for sure init blocks something during execution.

Have you tried to do a umount -f /dev/yourdisk in your script before calling fdisk?

Have you tried to perform the boot-up on a completely empty disk (use dd to put 512 zero bytes in the MBR)?

jlinkels

sanitynotvanity 06-16-2009 10:50 AM

Quote:

Originally Posted by jlinkels (Post 3575815)
Scripts in rc.d are run with root privileges by definition; otherwise nothing could get done.

good point!

Quote:

Originally Posted by jlinkels (Post 3575815)
There is one big difference between running your script manually and running it during boot time. When you run it during boot time your script is called as a child of init or sysinit. That means that init is still executing while your script is called, and I don't know what init does more before handing over the control to ...uhm ... actually what?

interesting...

Quote:

Originally Posted by jlinkels (Post 3575815)
You could echo the output of ps -aux or pstree to a log file while running your script during boot time to discover this.

Cheers! I shall look into that too.

Quote:

Originally Posted by jlinkels (Post 3575815)
What if you switch manually to runlevel 5 once the system is booted? Is your script being executed correctly? I think that switching runlevels is also performed by init, isn't it? If not, for sure init blocks something during execution.

Well, I've been fiddling with this for some time, and I found that it ran fine when I changed the runlevel.

I am sure you are correct that init performs the task whether it is invoked manually or not.

Quote:

Originally Posted by jlinkels (Post 3575815)
Have you tried to do a umount -f /dev/yourdisk in your script before calling fdisk?

I have been ensuring the drives are unmounted first, and this is scripted before the calls to fdisk. However, I have not been using the force option; I will give that a go. Thanks.

Quote:

Originally Posted by jlinkels (Post 3575815)
Have you tried to perform the boot-up on a completely empty disk (dd put 512 zeroes in the mbr?)

Yes, just tried it on a clean system... no change :(

I have a suspicion that this is just one of those things and that maybe the script runs the same way in both cases. I am thinking that if I script in a reboot for when fdisk reports 'Device or resource busy', this may solve the problem.

At first, the script always worked when run manually; however, with all the testing I have been doing lately, I have seen it fail manually quite often.

Perhaps this situation is more likely to happen early in the booting process?

It would be nice to know why the kernel is locking the drive and how to get around it without a reboot, but at present it's the only solution I can think of.

Again, thank you for your time and direction on this.
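
One option short of an immediate reboot is to retry the partition-table re-read a few times and only fall back to rebooting if it stays busy. This is a sketch of that idea, not something tested in this thread: the retry function is hypothetical, and it assumes blockdev from util-linux is present on the USB system.

```shell
#!/bin/bash
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    local attempts=$1 delay=$2 i
    shift 2
    for (( i = 1; i <= attempts; i++ )); do
        "$@" && return 0
        echo "attempt $i/$attempts failed: $*" >&2
        # No point sleeping after the final attempt
        [ "$i" -lt "$attempts" ] && sleep "$delay"
    done
    return 1
}

# Example: after fdisk has written the table, ask the kernel to re-read it,
# retrying for a while, and reboot only as a last resort.
# retry 5 2 blockdev --rereadpt "/dev/$NEWDEV2" || reboot
```

If the re-read succeeds on a later attempt, that also suggests something transient during boot was holding the device.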


All times are GMT -5.