LinuxQuestions.org - Merging many files as one

Hi. I have around 71 files in a folder that I want to merge into one. The file names are numbered, i.e. 1_file.txt, 2_something.txt, 3_someohtername.txt and so on up to 71_lastfile.txt. I ran "cat * > ../notes.txt", but the contents do not come out in numerical order. Whatever command I use to merge them, the new file should contain the files in sequence, i.e. 1_*.txt, 2_*.txt, 3_*.txt and so on. Could anybody tell me how to merge all 71 files into one file in that order? Otherwise I will have to copy and paste each file one by one, which takes much more time.

Try >> instead of >
This will append to a file.

If I understand correctly, the issue is the order in which the files are being passed to cat. Try using sort to get them in the correct order and then pass them to cat.
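
A rough sketch of the sort-then-cat idea, in case it helps. It assumes GNU sort and xargs (for the -z and -0 options) and file names that begin with their number. merge_numeric is just a name made up for this example.

```shell
#!/bin/bash

# merge_numeric is a hypothetical helper name, not a standard command.
# Usage: merge_numeric OUTPUT_FILE FILE...
merge_numeric() {
    local out=$1
    shift
    # nothing to merge if no input files were given
    (( $# )) || return 1
    # NUL-separate the names so that spaces in file names survive,
    # sort them on their leading number, then concatenate in that order
    printf '%s\0' "$@" | sort -z -n | xargs -0 cat -- > "$out"
}

# Example call (same output path as in the original question):
#   merge_numeric ../notes.txt [0-9]*_*.txt
```

Writing the output outside the source folder, as in the original command, keeps the merged file from being picked up by the glob on a later run.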

Assuming that you are using Bash 3.0 or newer:

Code:

for i in {1..71}
  do
    #the underscore keeps 1 from also matching 10_*.txt, 11_*.txt, ...
    cat "$i"_*.txt >> newfile
  done

Quote:

Originally Posted by grail (Post 4443362)

If I understand correctly, the issue is the order in which the files are being passed to cat. Try using sort to get them in the correct order and then pass them to cat.

Ooops, I forgot about that.

I'm assuming that the problem you're experiencing is that filename globbing is based on dictionary sorting, so you get sequences like this:

10 11 12 ... 18 19 1 20 21 ... 29 2 30 ... 69 6 70 71 7 8 9

The only ways to get around this are to use sort with the numerical sorting option, some other technique for matching the actual sequence, like TobiSGD offered, or else to rename your files so that they are all zero-padded (or otherwise in alphanumeric order). I generally prefer the last myself, as it solves the problem permanently.
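
To see the difference that zero-padding makes, here is a small throwaway demonstration. The file names are invented, and LC_ALL=C is set only to make the ordering reproducible; other locales may collate the underscore differently.

```shell
#!/bin/bash

export LC_ALL=C     # fixed collation, so the glob order is predictable

# scratch directories with made-up file names
demo_dir=$(mktemp -d)
mkdir "$demo_dir/plain" "$demo_dir/padded"
touch "$demo_dir"/plain/{1_a,2_b,10_c}.txt
touch "$demo_dir"/padded/{01_a,02_b,10_c}.txt

# capture the order in which the shell expands the glob
plain_order=$(cd "$demo_dir/plain" && echo *.txt)
padded_order=$(cd "$demo_dir/padded" && echo *.txt)

echo "unpadded:    $plain_order"
echo "zero-padded: $padded_order"
```

In the C locale this prints "10_c.txt 1_a.txt 2_b.txt" for the unpadded names and "01_a.txt 02_b.txt 10_c.txt" for the padded ones, so a plain "cat *.txt" only does the right thing in the second case.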

There are several batch renaming utilities out there for cleaning up filenames, and the topic comes up here regularly, so search around a bit. But here's a quick script I just whipped up that can handle simple jobs.

Code:

#!/bin/bash

shopt -s extglob    #needed for zero-stripping below

#loop through the files given to the script
#(you can use a glob, like "*.txt")
#defaults to globbing everything in the PWD
(( $# )) || set -- *

for file in "$@" ; do

    #ignore any files without numbers
    [[ $file != *[0-9]* ]] && continue

    #break the filename into (prefix)-(number)-(suffix).
    #the substrings are stored in the BASH_REMATCH array
    [[ $file =~ ([^0-9]*)([0-9]+)(.*) ]]

    #pad the number (2 digits by default).  strip any existing
    #leading zeroes first, or bash will treat them as octal
    printf -v numpad "%02d" "${BASH_REMATCH[2]##*(0)}"

    #build the new filename
    newfile="${BASH_REMATCH[1]}${numpad}${BASH_REMATCH[3]}"

    #nothing to do if the name is already padded
    [[ $file == "$newfile" ]] && continue

    #confirm the result. remove the echo to rename
    echo mv "$file" "$newfile"

done

exit 0

This assumes that the names have only a single number sequence in them. It separates the number string from the non-number strings before and after it, and pads it to 2 places (simply change "%02d" if you want more). Then it reassembles the pieces into a new filename. You can give it a list of files, or else it defaults to everything in the present working directory.
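
The strip-then-pad step can be tried on its own. A minimal sketch, where pad_number is a hypothetical helper that is not part of the script above and the width argument is my own addition:

```shell
#!/bin/bash

shopt -s extglob    #enables the *(0) pattern used below

# pad_number is a hypothetical helper for experimenting with the padding.
# Usage: pad_number NUMBER [WIDTH]
pad_number() {
    local digits=${1##*(0)}     #drop leading zeroes: 008 -> 8
    local width=${2:-2}         #pad to 2 places unless told otherwise
    #an all-zero input strips down to an empty string; treat it as 0
    printf "%0${width}d\n" "${digits:-0}"
}

pad_number 7        #prints 07
pad_number 008 3    #prints 008
pad_number 42       #prints 42
```

Without the stripping, a value such as 08 would be read by printf as an invalid octal number, which is why the zeroes are removed before padding.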

Finally, be careful with it. It might have unforeseen side-effects, so I've disabled the actual renaming operation. Don't remove the echo at the end until you've confirmed that it works.