Removi

joejobs · 12-31-2020, 10:05 AM

Somehow I messed up the title of this post and I can't change it. Is it possible to delete it, please?

Or to change its name and delete the duplicate post?
Remove a multi-line text from another text file

I need to remove a multi-line text (contained in a file) from another text file, but only the first occurrence of the entire multi-line text.

The text file looks like this:

Quote:

the
quick
the
brown
quick
brown
fox
jumps
quick
brown
fox

And I need to remove the following lines (contained in a file):

Quote:

quick
brown
fox

And then the result will be:

Quote:

the
quick
the
brown
jumps
quick
brown
fox

TB0ne · 12-31-2020, 10:19 AM

Quote:

Originally Posted by joejobs

I need to remove a multi-line text (contained in a file) from another text file but only the first occurrence
The text file looks like this:

Code:

the
quick
the
brown
quick
brown
fox
jumps
quick
brown
fox

And I need to remove the following lines (contained in a file)

Code:

quick
brown
fox

And then the result will be:

Code:

the
quick
the
brown
jumps
quick
brown
fox

Sounds like an interesting homework question; what have you done/tried so far to get this done? What language, shell, or utility do you want or need to use? Read the "Question Guidelines" link in my posting signature...we're happy to help with things, but we aren't going to do homework for you.

I'd suggest looking at sed, and incorporating it into a simple bash script to loop through your input file. And based on your input file and what it takes in, the output SHOULD be:

Code:

the
the
quick
brown
jumps
quick
brown
fox

...since you'd remove the first occurrence of each word.
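
For reference, the per-line reading described above (drop the first occurrence of each listed line, wherever it appears) could be sketched with awk as follows. This is only an illustration of that interpretation, not of the whole-block removal asked for later in the thread; the function name remove_first_of_each and the argument order are made up for the example.

```shell
# Hypothetical helper: print text file $2 with the first occurrence of each
# line listed in file $1 removed (per-line reading, not block reading).
remove_first_of_each() {
    awk '
        # Lines of the first file: count how often each line must be dropped.
        FILENAME == ARGV[1] { pending[$0]++; next }
        # Lines of the second file: drop a line while its count is positive.
        pending[$0] > 0 { pending[$0]--; next }
        # Everything else is kept.
        { print }
    ' "$1" "$2"
}
```

Called as remove_first_of_each file2 file1, with file1 holding the text and file2 the three lines to drop.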

::EDIT::

Quote:

Originally Posted by joejobs

Somehow I messed up the title of this post and I can't change it. Is it possible to delete it, please?

Again, read the "Question Guidelines" and LQ Rules...don't delete or edit posts like that, since it only makes things hard to follow when others search for things. You can always ask a moderator to edit a subject line, but deleting the whole thing for essentially no reason isn't good.

joejobs · 12-31-2020, 11:01 AM

I reposted the question; sorry for trying to delete it and for posting again.

I need to remove the entire multi-line text as one block. I do not need to remove the individual lines of that multi-line text.

I am using gnuwin32 on Windows.
I managed to find a very complicated solution:
- transform the newlines into a special character "@"
- read the content of the first file into a variable
- use sed to remove the line from the second file
- transform again the special character "@" into newline

However, this is quite a mess for very large files, which may themselves contain the "@" character.

This is my script:

Quote:

TB0ne · 12-31-2020, 11:37 AM

Quote:

Originally Posted by joejobs

I reposted the question, sorry for trying to delete it and reposting again
I need to remove the entire multi-line text, I do not need to remove the individual lines of that multi-line text

I am using gnuwin32 on Windows.
I managed to find a very complicated solution:
- transform the newlines into a special character "@"
- read the content of the first file into a variable
- use sed to remove the line from the second file
- transform again the special character "@" into newline

However this is quite a mess for very large files that can contain the "@" character

This is my script:

Code:

cat a.txt | tr -d "\r" | tr "\n" "@" > a1.txt
cat b.txt | tr -d "\r" | tr "\n" "@" > b1.txt

set /p MyText=<a1.txt

sed -e "s/%MyText%//" b1.txt | tr "@" "\n"
pause

Bolded a part above for emphasis only. I understand what you're asking now, and thanks for showing your efforts. Sed may be what you're looking for:

Code:

sed '/start word/{:a;N;/end word/!ba;d}'

Replace start/end words as needed, and this is untested. And use caution, since the Windows utilities aren't 100% compatible with their 'real' Linux counterparts. Suggest using Linux in a VM on your Windows system if you really want to test things in a more stable way.
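
A start/end range like the one above deletes everything from a start line through the next end line, including lines in between that are not part of the block, and it does so for every such range. On the sample input, a quick/fox range would also take the stray "the" and "brown" lines. A minimal sketch of an exact, first-occurrence-only block removal follows, assuming a POSIX awk is available and both files fit in memory; the function name remove_first_block and the argument order are made up for illustration.

```shell
# Hypothetical helper: print text file $2 with the first exact occurrence of
# the multi-line block stored in file $1 removed.
remove_first_block() {
    awk '
        # Collect the block lines; appending "" forces string comparison later.
        FILENAME == ARGV[1] { blk[++n] = $0 ""; next }
        # Collect the lines of the text file.
        { txt[++m] = $0 "" }
        END {
            start = 0
            # Find the first position where all n block lines match in a row.
            for (i = 1; n > 0 && i + n - 1 <= m && !start; i++) {
                hit = 1
                for (j = 1; j <= n; j++)
                    if (txt[i + j - 1] != blk[j]) { hit = 0; break }
                if (hit) start = i
            }
            # Print every line outside the matched block (or all, if no match).
            for (i = 1; i <= m; i++)
                if (!start || i < start || i >= start + n) print txt[i]
        }
    ' "$1" "$2"
}
```

Called as remove_first_block file2 file1 > result.txt, with file1 holding the text and file2 the block. If only one of the files uses CRLF line endings, the lines will not compare equal, so the carriage returns would need to be stripped first, for example with tr -d "\r" as in the script quoted above.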

MadeInGermany · 12-31-2020, 12:38 PM

If your system's memory can store both files:

Code:

#!/bin/bash
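# file1.txt is the text to edit, file2.txt holds the block to remove
f1=$( < file1.txt )$'\n'
f2=$( < file2.txt )$'\n'
# single / removes only the first match; "$f2" is quoted so it is taken literally
printf "%s" "${f1/"$f2"/}"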
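
The same in-memory idea can be written so that only the first occurrence is removed and the block can only match on whole lines. The sketch below is a variant, not the code above: it uses POSIX prefix/suffix stripping instead of bash pattern substitution so that it also runs under plain sh. The function name remove_first_block_mem and the argument order are made up for illustration, and trailing empty lines in either file are lost because command substitution strips them.

```shell
# Hypothetical helper: print text file $1 with the first whole-line occurrence
# of the block stored in file $2 removed. Both files are held in memory.
# The body runs in a subshell, so its variables do not leak to the caller.
remove_first_block_mem() (
    nl='
'
    # Wrap both strings in newlines so the block matches on line boundaries only.
    text="$nl$(cat "$1")$nl"
    block="$nl$(cat "$2")$nl"
    case $text in
        *"$block"*)
            before=${text%%"$block"*}   # everything before the first match
            after=${text#*"$block"}     # everything after the first match
            text="$before$nl$after"
            ;;
    esac
    # Drop the leading newline added above; the trailing one ends the output.
    printf '%s' "${text#"$nl"}"
)
```

Called as remove_first_block_mem file1.txt file2.txt. Quoting "$block" inside the expansions keeps characters such as * or ? literal instead of treating them as glob patterns.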

shruggy · 12-31-2020, 01:28 PM

With sgrep (probably could be refined further):

Code:

sgrep -N 'start .. end extracting first(1,"quick\nbrown\nfox\n")' file1

Instead of the constant "quick\nbrown\nfox\n", there could be something like

Code:

"$(hexdump -e'"%_c"' file2)"

sgrep comes with a sample config file that includes the following m4 macro definition:

Code:

define(ALL,( start .. end ) )

Putting them together:

Code:

sgrep -N 'ALL extracting first(1,"'"$(hexdump -e'"%_c"' file2)"'")' file1

Some other grep-like tools, e.g. the silver searcher, also allow this kind of matching:

Code:

ag --nonumbers --silent -Qvm1 -- "$(<file2)" file1

Another curious option is udiff from Schily Tools. Because it works slightly differently from the standard diff, we can abuse it for the side effect:

Code:

udiff file2 file1|sed -E '/^$/{N;/^\n-{8} .*:$/d}'

Something similar could be achieved with icdiff, too:

Code:

icdiff --no-header file1 file2|sed -n 's/^\x1B\[[0-9;]*m//gp'