LinuxQuestions.org


JugaruGabi 06-25-2019 02:32 PM

CAT multiple files with filename pattern
 
Hello,

I have the following scenario:
An application creates a file each day containing some records in plain text. The files are saved with names like the following:

Code:

auto_test_01-Jan-2019.txt
auto_test_02-Jan-2019.txt
auto_test_03-Jan-2019.txt
auto_test_04-Jan-2019.txt
..
auto_test_31-Jan-2019.txt
auto_test_01-Feb-2019.txt

I need to build a bash script that asks for a starting date and an ending date and, based on those two dates, cats the files on the server between these dates (including the starting and ending dates) into one big file, which will be processed later by an awk command that is currently stored on a different server.

Can someone help me build this script so that the requirement is met?

I have tried some ideas, but I could not get them to work as expected (the script shows either the output of the first file only or the output of the last file multiple times).

My knowledge of bash scripting is at beginner level, so please bear with me. :)

Thanks!

teckk 06-25-2019 03:19 PM

Quote:

I need to build up a bash file that is requesting a starting date, an ending date
Get the dates as variables.
Code:

# -r keeps backslashes literal; the trailing ": " separates prompt and input
read -r -p "Enter starting date: " s_date
read -r -p "Enter ending date: " e_date
echo "$s_date"
echo "$e_date"
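
Since the dates are typed in by hand, it may be worth checking them before they are used. The following is only a sketch: `valid_date` is a hypothetical helper name, and it assumes GNU date, which exits with an error when it cannot parse its `--date` argument.

```shell
#!/bin/bash
# valid_date is a hypothetical helper: it succeeds only if GNU date
# can parse its argument as a date.
valid_date() {
    LC_ALL=C date --date "$1" +'%d-%b-%Y' >/dev/null 2>&1
}

# Try one plausible input and one impossible day of the month.
for d in 01-Jan-2019 32-Jan-2019; do
    if valid_date "$d"; then
        echo "$d: ok"
    else
        echo "$d: not a valid date"
    fi
done
```

In a real script the check would sit right after each read, asking again or exiting when it fails.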

Quote:

it needs to CAT the files on the server between these dates
Examples:
Code:

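a=2
b=20

# Brace expansion happens before variable expansion, hence the eval
eval "c=({$a..$b})"; echo "${c[*]}"

# The same range with a C-style loop, no eval needed
for ((i=a; i<=b; ++i)); do echo "$i"; done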

Quote:

into a big file which will be processed later
Example:
Code:

list=(auto_test_01-Jan-2019.txt
auto_test_02-Jan-2019.txt
auto_test_03-Jan-2019.txt
auto_test_04-Jan-2019.txt
)

for i in "${list[@]}"; do
    cat "$i" >> BigFile.txt
done
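
As a side note to the loop above, cat accepts any number of file names, so the whole array can be passed in one call. A minimal sketch, using a throwaway directory and made-up one-line sample files so that it can be run anywhere:

```shell
#!/bin/bash
# Work in a throwaway directory so nothing real is touched.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

list=(auto_test_01-Jan-2019.txt auto_test_02-Jan-2019.txt auto_test_03-Jan-2019.txt)

# Create small sample files, one record each (made-up content).
for f in "${list[@]}"; do
    echo "record from $f" > "$f"
done

# One cat call for the whole array. A single > truncates BigFile.txt
# once per run, whereas >> would also keep the output of older runs.
cat "${list[@]}" > BigFile.txt

cat BigFile.txt
```

Whether > or >> is the better choice depends on whether the big file should be rebuilt on each run or keep growing.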


BW-userx 06-25-2019 03:21 PM

My draft:
Code:

#!/bin/bash
#set -x

testfiles=(
auto_test_01-Feb-2019.txt
auto_test_01-Jan-2019.txt
auto_test_02-Jan-2019.txt
auto_test_03-Jan-2019.txt
auto_test_04-Jan-2019.txt
auto_test_31-Jan-2019.txt
)

wd="$HOME/testing"
wd2=$HOME/filess

mkdir -p "$wd"
mkdir -p "$wd2"

##this loop was just for creating the test files
for i in "${testfiles[@]}"
do
        touch "$wd/$i"
done

#populate file names (without path) in dir into an array
#(-printf needs GNU find)
mapfile -t workingArray < <(find "$wd" -maxdepth 1 -type f -name "*.txt" -printf '%f\n')

#sort the names by year, month, day out into another array
#with -t- the fields are: auto_test_dd, Mon, yyyy.txt
#and the day is characters 11-12 of field 1
mapfile -t arr < <(printf '%s\n' "${workingArray[@]}" | sort -t- -k 3,3n -k 2,2M -k 1.11,1.12n)


#here we finalize the process to a main file
#cat contents of files in date order into a main file with path/file
#to tell where it came from, then contents, then new line for next file
for i in "${arr[@]}"
do
#adds in file name of file and path
        printf "%s\n" "$wd/$i" >> "$wd2/logfile"
#contents of file
        cat "$wd/$i" >> "$wd2/logfile"
#new line for next file
        printf "\n" >> "$wd2/logfile"
done


JugaruGabi 06-26-2019 12:57 AM

I am not sure my initial post was clear enough:

The files that are in dir1 are:
Code:

auto_test_01-Jan-2019.txt
auto_test_02-Jan-2019.txt
auto_test_03-Jan-2019.txt
auto_test_04-Jan-2019.txt
.
.
.
.
.
.
.
auto_test_15-Jan-2019.txt

The dots in this listing stand for the files of the days in between, up until 31st January.

Based on the strings entered at:
Code:

read -p "Enter starting date" s_date
read -p "Enter ending date" e_date

where
Code:

s_date=01-Jan-2019
e_date=15-Jan-2019

the script must execute the following cat command automatically:
Code:

cat auto_test_01-Jan-2019.txt auto_test_02-Jan-2019.txt auto_test_03-Jan-2019.txt auto_test_04-Jan-2019.txt auto_test_05-Jan-2019.txt auto_test_06-Jan-2019.txt auto_test_07-Jan-2019.txt auto_test_08-Jan-2019.txt auto_test_09-Jan-2019.txt auto_test_10-Jan-2019.txt auto_test_11-Jan-2019.txt auto_test_12-Jan-2019.txt auto_test_13-Jan-2019.txt auto_test_14-Jan-2019.txt auto_test_15-Jan-2019.txt > Period:auto_test$s_date-$e_date.txt
In other words, the script should automatically concatenate the files matching the pattern auto_test_<date>.txt for the period from 1st January up to and including 15th January 2019.

Also, I have seen that both of you used a list of file names; isn't there a way to skip this part in the script?
If I am creating the bash file, is it mandatory to create a list with all 365 days of the calendar?
Isn't there an elegant way to derive the file names from the starting date, the ending date and the filename pattern?

Most likely, the report will be run with the following parameters as dates:
Code:

s_date=15-Jan-2019
e_date=15-Feb-2019

and it must cat all of the files in the interval between these two dates.

allend 06-26-2019 01:32 AM

Can be done with one line in bash. Keep it in a file and edit as appropriate.
Code:

cat auto_test_{15..31}-Jan-2019.txt auto_test_{01..15}-Feb-2019.txt >> Period
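
To see what such a one-liner hands to cat, the expansion can be printed first. A small sketch with shortened ranges; the names carry the .txt extension that the files in this thread have, and the zero-padded form {01..02} assumes bash 4 or newer:

```shell
#!/bin/bash
# Brace expansion is done by the shell before cat ever runs.
# {30..31} expands to 30 31; {01..02} keeps the leading zero.
names=(auto_test_{30..31}-Jan-2019.txt auto_test_{01..02}-Feb-2019.txt)

# Print one generated name per line instead of passing them to cat.
printf '%s\n' "${names[@]}"
```

The ranges have to be literal numbers: brace expansion runs before variable expansion, so {$a..$b} is not expanded as a range unless eval is used.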

evo2 06-26-2019 01:44 AM

Hi,

make use of the date command.
Code:

#!/bin/bash

s_date=15-Jan-2019
e_date=15-Feb-2019

if [ "$1" != "" ] ; then
  s_date=$1
fi
if [ "$2" != "" ] ; then
  e_date=$2
fi

fs=()   # files found in the range
i=0
while true ; do
    date=$(date +'%d-%b-%Y' --date "$s_date +$i days")
    ((i+=1))
    f="auto_test_${date}.txt"
    if [ -f "$f" ] ; then # Check the file exists.. maybe some days are missing?
      fs+=("$f")
    fi
    if [ "$date" == "$e_date" ] ; then
        break
    fi
done
cat "${fs[@]}" > "auto_test$s_date-$e_date.txt"

Modify the final line as desired.

Evo2.
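
One thing to keep in mind with this loop is that it only stops when the formatted date is exactly equal to the end date string, so a typo or a different date format makes it run forever. A possible guard, sketched under the assumption of GNU date, is to compare the dates as seconds since the epoch. `list_range` is a hypothetical helper name, and TZ=UTC is set only to keep the day arithmetic away from daylight saving changes.

```shell
#!/bin/bash
# list_range is a hypothetical helper: prints one file name per day from
# start to end (inclusive). Assumes GNU date and dates in a form it can
# parse, such as 15-Jan-2019.
list_range() {
    local start=$1 end=$2 i=0 cur cur_s end_s
    end_s=$(TZ=UTC LC_ALL=C date --date "$end" +%s) || return 1
    while true; do
        cur=$(TZ=UTC LC_ALL=C date --date "$start +$i days" +'%d-%b-%Y') || return 1
        cur_s=$(TZ=UTC LC_ALL=C date --date "$cur" +%s) || return 1
        # Stop once the end date is passed, even if it is never hit exactly.
        (( cur_s > end_s )) && break
        echo "auto_test_${cur}.txt"
        ((i+=1))
    done
}

list_range 30-Jan-2019 02-Feb-2019
```

The names could then be read into an array with mapfile, filtered with [ -f ... ] as in the script above, and passed to cat.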

JugaruGabi 06-26-2019 03:14 AM

That did the trick.
Thanks.

One last question though.
In the meantime, the business has decided that in the future the file name pattern will switch to:
auto_test_2019-Jun-15.txt

1. If I change the following line:
Code:

date=$(date +'%d-%b-%Y' --date "$s_date +$i days")
to
Code:

date=$(date +'%Y-%b-%d' --date "$s_date +$i days")
the script hangs and nothing else happens.

2. If I switch s_date and e_date to:
Code:

s_date=2019-Jan-15
e_date=2019-Feb-15

and
Code:

date=$(date +'%d-%b-%Y' --date "$s_date +$i days")
to
Code:

date=$(date +'%Y-%b-%d' --date "$s_date +$i days")
the script produces errors like:
Code:

date: invalid date ‘2019-Jan-15 +1199 days’
date: invalid date ‘2019-Jan-15 +1200 days’
date: invalid date ‘2019-Jan-15 +1201 days’
date: invalid date ‘2019-Jan-15 +1202 days’

@evo2 What am I doing wrong?

evo2 06-26-2019 03:32 AM

Hi,

It seems that '%Y-%b-%d' is not a format that the date command recognises automatically. So you'll need to use a different format for loop control than the one used in the file names.

Eg. Make the following change in the script:
Code:

((i+=1))
f="auto_test_${date}.txt"

becomes
Code:

f="auto_test_$(date +'%Y-%b-%d' --date "$s_date +$i days").txt"
((i+=1))

And keep the rest of the script as is.

Evo2.
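
If the start and end dates themselves ever arrive in the new yyyy-Mon-dd form, one option is to reorder them into the form the loop already handles before calling date. This is only a sketch under that assumption: `to_parseable` is a hypothetical helper name, and the date call assumes GNU date.

```shell
#!/bin/bash
# to_parseable is a hypothetical helper: turns 2019-Jan-15 into
# 15-Jan-2019, the form the script above already passes to date.
to_parseable() {
    local y m d
    IFS=- read -r y m d <<< "$1"
    echo "$d-$m-$y"
}

s_date=$(to_parseable 2019-Jan-15)

# Loop control keeps the old format; only the file name uses the new one.
f="auto_test_$(LC_ALL=C date +'%Y-%b-%d' --date "$s_date +1 days").txt"

echo "$s_date"
echo "$f"
```

This keeps the comparison against e_date in one format while the file names follow the new pattern.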

JugaruGabi 06-26-2019 05:24 AM

Yep, this worked.
Thanks a lot.

This can be closed now.

BW-userx 06-26-2019 08:23 AM

Just taking what works and adding more error checking, ease of use, and a log.
Code:

#!/bin/bash

#set -x

#So the script does not have to
#be run in the same dir as
#the files that need to be
#processed.
#working dir where files
#are kept
wd="$HOME/dayfiles"
#Where final processed file
#is being kept
pf="$HOME/Processed"

#make sure that
#directory is present.
mkdir -p "$pf"

#Log file of files
#that have been processed
LogFileDir=$HOME/logfile
LogFileName="processedFiles"

mkdir -p "$LogFileDir"

#for date header for logfile
FLAG=T

#for ease of changing specs
#in usage message
dateFormat="dd-Mon-yyyy"
example=" 03-Feb-2018 15-Mar-2018"

#default/test dates ??
s_date=15-Jan-2019
e_date=15-Feb-2019
#error checking
#if less than 2 arguments
#print error / usage message

if [[ $# -lt 2 ]] ; then
        printf "
        Needs two arguments
        beginning date, ending date
        date format $dateFormat
        example:
        $0 $example\n"
        exit 1
else
        e_date=$2 ; s_date=$1
fi

fs=""
i=0
while true ; do
    date=$(date +'%d-%b-%Y' --date "$s_date +$i days")
    ((i++))
    #f="auto_test_${date}.txt"
    #add path to files/file
   
    f="$wd"/"auto_test_${date}.txt"
  #############################################
  #to create some test files to ensure this
  #works, Can be removed
    if [[ "$MAKE_TEST" = "YES" ]] ; then
        touch "$f"
        echo "$f" > "$f"
    fi
    #################
    if [ -f "$f" ] ; then # Check the file exists.. maybe some days are missing?
      fs="$fs $f"
    fi
    if [ "$date" == "$e_date" ] ; then
        break
    fi
 
done
cat $fs > "$pf"/auto_test$s_date-$e_date.txt

#adds time stamp to logfile for each run
[[ $FLAG = T ]] && { date >> "$LogFileDir/$LogFileName" ;
        FLAG=F ; }
#append the names of the files that were just processed
#to a log file, so that in case of some type of failure
#one knows where the run left off
printf "%s\n" $fs >> "$LogFileDir/$LogFileName"


MadeInGermany 06-26-2019 01:17 PM

If there is a file each day, you can use simple start and stop matches, if the files are sorted in sequence.
Code:

# example:
s_date=15-Jan-2019
e_date=15-Feb-2019
ls | sort -t- -k 3,3n -k 2,2M -k 1,1 |
  sed -n "/$s_date/,/$e_date/p"

Or, relying on correct time stamps:
Code:

ls -tr |
  sed -n "/$s_date/,/$e_date/p"

If okay, add a pipe to
Code:

  xargs cat
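
The range selection can be tried without touching any files. In this sketch a made-up, unsorted list of names stands in for the output of ls; the sort and sed commands are the ones from the example above, with LC_ALL=C added so that the month names are compared as English abbreviations.

```shell
#!/bin/bash
s_date=30-Jan-2019
e_date=01-Feb-2019

# A made-up, unsorted listing standing in for the output of ls.
names='auto_test_01-Feb-2019.txt
auto_test_02-Feb-2019.txt
auto_test_29-Jan-2019.txt
auto_test_30-Jan-2019.txt
auto_test_31-Jan-2019.txt'

# Sort by year, month name, then day; print from the first line matching
# s_date through the next line matching e_date.
selected=$(printf '%s\n' "$names" |
    LC_ALL=C sort -t- -k 3,3n -k 2,2M -k 1,1 |
    sed -n "/$s_date/,/$e_date/p")

printf '%s\n' "$selected"
```

Note that the range depends on both boundary files being present: if the file for s_date is missing, nothing is selected, and if the file for e_date is missing, the selection runs on to the end of the list.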


All times are GMT -5.