LinuxQuestions.org - create file list: SED inline vs SED standalone, enormous speed difference


Hello to the whole community,

I've written this one-line script to create file lists. It works:

Code:

ls -R1 /rootpathname/ | while read l; do case $l in *:) d=${l%:};; "") d=;; *) echo "$d/$l";; esac; done > /tmp/filelistname.txt

Since I use it on disks mounted under /media (Ubuntu), the file lists grow larger than necessary, because every line begins with the mount-point prefix

Code:

/media/volumename

So I added a sed call to strip it, but that slowed things down terribly, by a factor of hundreds.

Here is the code. It does work:

Code:

ls -R1 /media/MAC01/ | while read l; do case $l in *:) d=${l%:};; "") d=;; *) echo $d/$l | sed 's/\/media\/MAC01//';; esac; done >  /tmp/MAC01-file-list.txt

But this command takes ages compared with doing it in two steps:

Code:

ls -R1 /media/MAC01/ | while read l; do case $l in *:) d=${l%:};; "") d=;; *) echo "$d/$l";; esac; done > /tmp/MAC01-file-list.txt

followed by

Code:

sed 's/\/media\/MAC01//' /tmp/MAC01-file-list.txt > /tmp/MAC01-file-list-cleaned.txt

The two-step version takes literally a few seconds, versus many minutes for the one-liner (this drive contains more than 400,000 files, which means its file list is more than 400,000 lines long).

Does anyone have a technical explanation for this enormous speed difference?

Have I placed the sed command in the wrong position?

Thanks for any hints

Cor

Maybe command substitution could help increase the speed, since, from what I've read, it captures the stdout of a command so that it can be assigned to a variable with the = operator.

But I can't find any reference that I understand well enough to apply it to the "slow" script above.
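For reference, here is a minimal sketch of what command substitution looks like, using a made-up path in the style of the ones above. It captures the output of the pipeline, but it still starts a new sed process every time it runs, so on its own it would not be expected to speed up the loop. The shell's built-in prefix removal, ${var#prefix}, is shown next to it for comparison. The variable names path, prefix, short and short2 are only for illustration.

```shell
# Example values in the style of the thread (hypothetical file name)
path=/media/MAC01/docs/b.txt
prefix=/media/MAC01

# Command substitution: run a pipeline and capture its stdout in a variable.
# Each time this line runs, the shell starts a new sed process.
short=$(echo "$path" | sed 's|^/media/MAC01||')

# Parameter expansion: the shell strips the prefix itself, no new process.
short2=${path#"$prefix"}

echo "$short"
echo "$short2"
```

Both variables end up holding the path without the /media/MAC01 prefix.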

That seems like an awfully complicated way of producing a file list.
Can't you just use 'find'?

Code:

find "/media/MAC01/" -type f -printf "%P\n" | sort > filelist.txt

(the %P format code will strip off the /media/MAC01/ prefix, so no need for any sed'ing)

Your code is so slow because you placed the call to sed inside the body of the loop, so you're asking the shell to start 400,000 instances of sed, one after the other.
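If you would rather keep the loop, one option is to move sed to after the "done", so that a single sed process filters the whole output of the loop. A rough sketch on a throwaway directory follows. The mktemp tree is only a stand-in for /media/MAC01, the GNU ls output format is assumed, and the file names are made up.

```shell
# Hypothetical demo tree standing in for /media/MAC01
root=$(mktemp -d)
mkdir -p "$root/docs"
touch "$root/a.txt" "$root/docs/b.txt"

# Same loop as in the original post, but sed is attached to the loop's
# output, so it is started once instead of once per line.
# '|' is the s delimiter so the slashes in the path need no escaping
# (assumes $root contains no '|' and nothing sed would misread as a regex).
list=$(ls -R1 "$root" | while read -r l; do
  case $l in
    *:) d=${l%:} ;;        # "dir:" header line: remember the directory
    "") d= ;;              # blank separator line
    *)  echo "$d/$l" ;;    # entry line: print directory/name
  esac
done | sed "s|^$root||")

printf '%s\n' "$list"

rm -rf "$root"             # remove the demo tree
```

With the real drive, the last line of the pipeline would be the same sed expression as in the original post, just placed after "done" and before the redirection to the list file.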

Thank you, GazL.

It works and it is fast.

While searching I also found this command:

Code:

find . > filelist.txt

Run it from the root of the tree you want to list, since . refers to the current directory.

To just count the entries in a tree, pipe the output to wc, like this: find . | wc -l (note that find . lists directories as well as files, including . itself).
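A small sketch of the difference between counting every entry and counting regular files only, again on a throwaway directory (the tree and its file names are made up for the example):

```shell
# Hypothetical demo tree: one file at the top, one file in a subdirectory
tree=$(mktemp -d)
mkdir -p "$tree/sub"
touch "$tree/one" "$tree/sub/two"

# Command substitution runs in a subshell, so the cd does not
# change the directory of the calling shell.
all_entries=$(cd "$tree" && find . | wc -l)          # ., ./one, ./sub, ./sub/two
files_only=$(cd "$tree" && find . -type f | wc -l)   # ./one, ./sub/two

echo "entries: $all_entries, regular files: $files_only"

rm -rf "$tree"             # remove the demo tree
```

For this tree the first count is 4 and the second is 2, so -type f is the one to use when only files should be counted.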

Cor

Please remember to mark as SOLVED once you have a solution