LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Old 12-31-2020, 10:05 AM   #1
joejobs
LQ Newbie
 
Registered: Dec 2020
Posts: 12

Rep: Reputation: Disabled
Removi


Somehow I messed up the title of this post and I can't change it. Is it possible to delete it, please?

Or to change its title and delete the duplicate post?
Remove a multi-line text from another text file

I need to remove a multi-line block of text (contained in one file) from another text file, but only the first occurrence of the entire block.

The text file looks like this:

Quote:
the
quick
the
brown
quick
brown
fox
jumps
quick
brown
fox
And I need to remove the following lines (contained in a file)
Quote:
quick
brown
fox
And then the result will be:

Quote:
the
quick
the
brown
jumps
quick
brown
fox

Last edited by joejobs; 12-31-2020 at 10:45 AM.
 
Old 12-31-2020, 10:19 AM   #2
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 26,636

Rep: Reputation: 7965
Quote:
Originally Posted by joejobs
I need to remove a multi-line text (contained in a file) from another text file but only the first occurrence
The text file looks like this:
Code:
the
quick
the
brown
quick
brown
fox
jumps
quick
brown
fox
And I need to remove the following lines (contained in a file)
Code:
quick
brown
fox
And then the result will be:
Code:
the
quick
the
brown
jumps
quick
brown
fox
Sounds like an interesting homework question; what have you done/tried so far to get this done?? What language, shell, or utility do you want or need to use?? Read the "Question Guidelines" link in my posting signature...we're happy to help with things, but we aren't going to do homework for you.

I'd suggest looking at sed, and incorporating it into a simple bash script that loops through your input file. Based on your input file and the lines to be removed, the output SHOULD be:
Code:
the
the
quick
brown
jumps
quick
brown
fox
...since you'd remove the first occurrence of each word.
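
For illustration, the per-word reading described above (drop the first occurrence of each listed line) could be sketched roughly as follows. This is only a sketch: the function name `remove_first_of_each` and the file names are invented here, and awk is used in place of a sed loop to keep it short. It also assumes the word list file is not empty.

```shell
#!/bin/sh
# usage: remove_first_of_each TEXT_FILE WORD_FILE   (names are hypothetical)
# Prints TEXT_FILE, skipping the first line that equals each line of WORD_FILE.
remove_first_of_each() {
    awk '
        # First file on the command line (the word list): count pending removals.
        NR == FNR { pending[$0]++; next }
        # Second file (the text): skip a line while a removal is still pending for it.
        pending[$0] > 0 { pending[$0]--; next }
        { print }
    ' "$2" "$1"
}

# Example call, assuming the sample data is in b.txt and the three words in a.txt:
# remove_first_of_each b.txt a.txt
```

Note that this removes individual lines wherever they first appear; it does not treat the three lines as one block.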

::EDIT::
Quote:
Originally Posted by joejobs
Somehow I messed the name of this post and I can't change it, is it possible to delete it please?
Again, read the "Question Guidelines" and LQ Rules...don't delete or edit posts like that, since it only makes things hard to follow when others search for things. You can always ask a moderator to edit a subject line, but deleting the whole thing for essentially no reason isn't good.

Last edited by TB0ne; 12-31-2020 at 10:21 AM.
 
Old 12-31-2020, 11:01 AM   #3
joejobs
LQ Newbie
 
Registered: Dec 2020
Posts: 12

Original Poster
Rep: Reputation: Disabled
I reposted the question, sorry for trying to delete it and reposting again

I need to remove the entire multi-line text as one block; I do not need to remove the individual lines of that multi-line text.

I am using gnuwin32 on Windows.
I managed to find a very complicated solution:
- transform the newlines into a special character "@"
- read the content of the first file into a variable
- use sed to remove the line from the second file
- transform again the special character "@" into newline

However, this is quite a mess for very large files, and it breaks if the files themselves contain the "@" character.

This is my script:

Quote:
cat a.txt | tr -d "\r" | tr "\n" "@" > a1.txt
cat b.txt | tr -d "\r" | tr "\n" "@" > b1.txt

set /p MyText=<a1.txt

sed -e "s/%MyText%//" b1.txt | tr "@" "\n"
pause
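
The join-and-replace idea in the steps above can probably be kept while avoiding the "@" collision, for example by joining lines with a control byte instead. A rough sketch under several assumptions: it needs bash rather than cmd, both files end with a newline, neither file contains the byte 0x01, and the function name `remove_block_joined` is invented for this example.

```shell
#!/bin/bash
# usage: remove_block_joined TEXT_FILE BLOCK_FILE   (names are hypothetical)
remove_block_joined() {
    local sep=$'\001' hay needle
    # Join every line with the separator; the leading separator anchors the
    # match to the start of a line, the trailing one to the end of a line.
    hay=$sep$(tr -d '\r' < "$1" | tr '\n' '\001')
    needle=$sep$(tr -d '\r' < "$2" | tr '\n' '\001')
    # Single slash: replace only the first occurrence. Quoting "$needle"
    # makes it a literal string, so regex or glob characters are harmless.
    hay=${hay/"$needle"/$sep}
    # Drop the leading separator again and turn separators back into newlines.
    hay=${hay#"$sep"}
    printf '%s' "$hay" | tr '\001' '\n'
}

# Example call with the file names from the script above:
# remove_block_joined b.txt a.txt
```

Because the replacement is a plain string operation, nothing in the block file is interpreted as a sed pattern.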

Last edited by joejobs; 12-31-2020 at 11:02 AM.
 
Old 12-31-2020, 11:37 AM   #4
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 26,636

Rep: Reputation: 7965
Quote:
Originally Posted by joejobs
I reposted the question, sorry for trying to delete it and reposting again
I need to remove the entire multi-line text, I do not need to remove the individual lines of that multi-line text

I am using gnuwin32 on Windows.
I managed to find a very complicated solution:
- transform the newlines into a special character "@"
- read the content of the first file into a variable
- use sed to remove the line from the second file
- transform again the special character "@" into newline

However this is quite a mess for very large files that can contain the "@" character

This is my script:
Code:
cat a.txt | tr -d "\r" | tr "\n" "@" > a1.txt
cat b.txt | tr -d "\r" | tr "\n" "@" > b1.txt

set /p MyText=<a1.txt

sed -e "s/%MyText%//" b1.txt | tr "@" "\n"
pause
Bolded a part above for emphasis only. Understand what you're asking now, and thanks for showing your efforts. Sed may be what you're looking for:
Code:
sed '/start word/{:a;N;/end word/!ba;d}'
Replace start/end words as needed, and this is untested. And use caution, since the Windows utilities aren't 100% compatible with their 'real' Linux counterparts. Suggest using Linux in a VM on your Windows system if you really want to test things in a more stable way.
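
One thing worth checking before relying on a start/end-word range like this: it deletes everything from a line matching the start word through the next line matching the end word, and it does so for every such range, not just the first. A small sketch that makes this visible on the sample data from post #1 (the function name `delete_word_ranges` is invented here, and the command is split into separate `-e` parts only for portability):

```shell
#!/bin/sh
# usage: delete_word_ranges START_WORD END_WORD < input   (name is hypothetical)
# Deletes each range of lines from one matching START_WORD through the next
# line matching END_WORD.
delete_word_ranges() {
    sed -e "/$1/{" -e ':a' -e 'N' -e "/$2/"'!ba' -e 'd' -e '}'
}

# With "quick" and "fox" on the sample input, only two lines survive:
printf '%s\n' the quick the brown quick brown fox jumps quick brown fox |
    delete_word_ranges quick fox
# prints:
# the
# jumps
```

So on this input the range form removes more than the single three-line block the question asks about.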

Last edited by TB0ne; 12-31-2020 at 11:41 AM.
 
Old 12-31-2020, 12:38 PM   #5
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,794

Rep: Reputation: 1201
If your system's memory can store both files:
Code:
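#!/bin/bash
# Read both files into memory (command substitution strips trailing newlines).
f1=$( < file1.txt )
f2=$( < file2.txt )
# Single slash removes only the first occurrence; quoting "$f2" makes it
# a literal string rather than a glob pattern.
printf "%s\n" "${f1/"$f2"/}"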
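
A substring replacement like the one above works on characters rather than whole lines, so it can match in the middle of a line and leaves an empty line where the block used to be. If that matters, comparing line by line is one alternative. A minimal sketch, assuming both files fit in memory; the function name `remove_first_block` is invented, and awk is used so it should also run where bash is not available:

```shell
#!/bin/sh
# usage: remove_first_block TEXT_FILE BLOCK_FILE   (names are hypothetical)
# Prints TEXT_FILE without the first run of whole lines equal to BLOCK_FILE.
remove_first_block() {
    awk '
        { sub(/\r$/, "") }                              # tolerate CRLF line endings
        FILENAME == ARGV[1] { needle[++m] = $0; next }  # first argument: the block
        { hay[++n] = $0 }                               # second argument: the text
        END {
            start = 0
            # Try every position where the whole block could still fit.
            for (i = 1; m > 0 && i + m - 1 <= n && !start; i++) {
                # Appending "" forces a string (not numeric) comparison.
                for (j = 1; j <= m && (hay[i + j - 1] "") == (needle[j] ""); j++)
                    ;
                if (j > m) start = i
            }
            # Print everything except the matched block, if one was found.
            for (i = 1; i <= n; i++)
                if (!start || i < start || i >= start + m) print hay[i]
        }
    ' "$2" "$1"
}

# Example call with the file names used above:
# remove_first_block file1.txt file2.txt
```

On the sample data from post #1 this leaves the second "quick / brown / fox" block in place and produces no blank line.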
 
Old 12-31-2020, 01:28 PM   #6
shruggy
Senior Member
 
Registered: Mar 2020
Posts: 3,670

Rep: Reputation: Disabled
With sgrep (probably could be refined further):
Code:
sgrep -N 'start .. end extracting first(1,"quick\nbrown\nfox\n")' file1
Instead of the constant "quick\nbrown\nfox\n", there could be something like
Code:
"$(hexdump -e'"%_c"' file2)"
sgrep comes with a sample config file that includes the following m4 macro definition:
Code:
define(ALL,( start .. end ) )
Putting them together:
Code:
sgrep -N 'ALL extracting first(1,"'"$(hexdump -e'"%_c"' file2)"'")' file1
Some other grep-like tools, e.g. the silver searcher, also allow this kind of matching:
Code:
ag --nonumbers --silent -Qvm1 -- "$(<file2)" file1
Another curious option is udiff from Schily Tools. Because it works slightly differently from the standard diff, we can abuse it for the side effect:
Code:
udiff file2 file1|sed -E '/^$/{N;/^\n-{8} .*:$/d}'
Something similar could be achieved with icdiff, too:
Code:
icdiff --no-header file1 file2|sed -n 's/^\x1B\[[0-9;]*m//gp'

Last edited by shruggy; 01-08-2021 at 04:59 AM.
 
All times are GMT -5.