LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!

Old 10-18-2019, 02:03 PM   #16
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783

which also takes us back to @pan64's mention of wget mirror mode
 
Old 10-21-2019, 05:35 AM   #17
toothwright
LQ Newbie
 
Registered: Nov 2018
Posts: 11

Original Poster
Rep: Reputation: Disabled
@Firerat and scasey

I would argue that the reason for needing local copies is evident, though, in fairness, maybe not to non-dance-playing amateur musicians.

Musicians playing for folk dancing often used to need to carry bags of sheet music so that they could comply with various callers' requirements for tune sets for dances.

For many, this is no longer necessary, since digital scores are more convenient.

So, the answer to 'Why?' is that it is not usual to have a reliable internet connection when you are sitting on a farmer's wagon in a barn playing for a dance.

Last edited by toothwright; 10-21-2019 at 05:46 AM.
 
Old 10-21-2019, 05:53 AM   #18
scasey
LQ Veteran
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.9.2009
Posts: 5,727

Rep: Reputation: 2211
Quote:
Originally Posted by toothwright View Post
@scasey and others

I would argue that the reason is evident, though in fairness maybe not to non-dance-playing amateur musicians.

Musicians playing for folk dancing often used to carry bags of sheet music so that they could comply with the caller's wishes.
This is not now necessary; digital scores are more convenient.

So, the answer to 'Why not leave it as it is?' is that it is not usual to have an internet connection when you are sitting on a farmer's wagon at a barn dance.

Aha! So, you have copied the scores to your laptop and now want to be able to display them. So why use a web page to do that? Just open them from the file browser using the PDF viewer. Mayhaps you’re working harder than you need to.

The musicians I know and play with are very comfortable with their three-ring binders that contain only lyrics. Most of them don’t even read music, and a couple are complete Luddites who wouldn’t know how to turn a laptop on...
 
Old 10-21-2019, 06:07 AM   #19
toothwright
LQ Newbie
 
Registered: Nov 2018
Posts: 11

Original Poster
Rep: Reputation: Disabled
Also: many talented amateur dance musicians seem to have an almost infinite memory for scores; others require an Android pad.

The database is not as straightforward as it appears. Some entries bring up whole sets; many others are links to individual tunes.

I like to use HTML to arrange the order that I want. The browser alone does not allow this personalisation.
 
Old 10-21-2019, 06:10 AM   #20
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,842

Rep: Reputation: 7308
Hm. If I understand correctly, you want to edit that HTML? Browsers usually open pages in read-only mode, but obviously you can edit your own files (using an HTML editor).
 
Old 10-21-2019, 06:53 AM   #21
scasey
LQ Veteran
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.9.2009
Posts: 5,727

Rep: Reputation: 2211
Quote:
Originally Posted by toothwright View Post
Also: many talented amateur dance musicians seem to have an almost infinite memory for scores; others require an Android pad.

The database is not as straightforward as it appears. Some entries bring up whole sets; many others are links to individual tunes.

I like to use HTML to arrange the order that I want. The browser alone does not allow this personalisation.

OK. I get that.
But I don’t see how you would do that by changing “the content of lines in an html file using regexp...” I think you’ll just need to sit down and code it.
SciTE has a copy function: ctrl+d will duplicate a line (or what is selected), then the copy can be edited to change the page/document being linked to.

Hopefully you’ve been working on that since you posted in #22 and are almost done...
 
Old 10-21-2019, 06:59 AM   #22
toothwright
LQ Newbie
 
Registered: Nov 2018
Posts: 11

Original Poster
Rep: Reputation: Disabled
@pan64
Yes, that is exactly what I am doing (using Bluefish).

My original question came from a hope that the repetitive parts of the edit could be automated, to reduce the manual work.

It has proved difficult to analyse the HTML lines I need to change in order to develop a working regex, so I have returned to manual edits - 1000 lines to go!
 
Old 10-21-2019, 07:15 AM   #23
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,842

Rep: Reputation: 7308
Again, wget can download a page and all the links contained on that page, store them on the local disk/pendrive, and also rewrite the links [on the downloaded page] to use the downloaded files.
https://superuser.com/questions/8006...asnt-specified

On the other hand, you only need a bulk search/replace to modify the original URL; you do not need to do it line by line.
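
A rough sketch of the wget approach, in case it helps. The function name and the example URL are placeholders of mine, not anything from the real site; the options themselves are standard wget ones, but check `man wget` on your system before relying on them.

```shell
# Hypothetical helper: fetch one index page plus the pages it links to
# (one level deep) and rewrite the links to point at the local copies.
mirror_index() {
    # --recursive --level=1 : follow links found on the start page only
    # --page-requisites     : also fetch images/CSS needed to show the pages
    # --convert-links       : rewrite links in the saved pages to local files
    # --no-parent           : never climb above the start directory
    wget --recursive --level=1 --page-requisites --convert-links --no-parent "$1"
}

# Usage (placeholder URL, replace with the real index page):
# mirror_index "http://www.example.com/mm/sheets.htm"
```

This only helps if the sheet music files are reachable by following links from that one page; anything served another way would still need handling by hand.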

Last edited by pan64; 10-21-2019 at 07:16 AM.
 
Old 10-22-2019, 04:00 AM   #24
ondoho
LQ Addict
 
Registered: Dec 2013
Posts: 19,872
Blog Entries: 12

Rep: Reputation: 6053
Are the files named in a searchable, recognizable manner?
Maybe you can simply use something like fzf (fuzzy find):
Code:
mupdf "$(find "local directory" -iname '*pdf'|fzf)"
 
Old 10-22-2019, 10:46 AM   #25
toothwright
LQ Newbie
 
Registered: Nov 2018
Posts: 11

Original Poster
Rep: Reputation: Disabled
I started with an original line:

<li>Ashokan farewell<a href="http://www.xx.co.uk/mm/M0245_Ashokan_farewell.htm">MM245</a></li>

Used SciTE (as scasey suggested) to produce :

<li>Ashokan farewell <a href="Tunes/MM0245_Ashokan_farewell.pdf">MM245</a></li>

which is how the file looks now.

The final effort should look like this (done manually):

<li><a href="Tunes/MM0245_Ashokan_farewell.pdf">Ashokan farewell</a></li>

So, in the SciTE-processed line, for example, I would like to exchange the tune name "Ashokan farewell" with "MM245".
The problem is that both items vary in length from line to line in the file, and the names do not always use the UK font... this is why I'm stuck.
I'll explore fzf next, thank you for the example.

Last edited by toothwright; 10-22-2019 at 10:50 AM.
 
Old 10-22-2019, 11:30 AM   #26
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
this

Code:
sed 's@\(<li>\)\(.*\+\)<a href="http.*/\(.*\+\).htm.*@\1<a href="Tunes/M\3.pdf">\2</a></li>@'
does what you want to this line

Code:
<li>Ashokan farewell<a href="http://www.xx.co.uk/mm/M0245_Ashokan_farewell.htm">MM245</a></li>
however, I did hardcode the extra M
more work would be required if MM needs to replace M

it is difficult to find a pattern in one example

I suppose it could be the upper-case form of mm in co.uk/mm/


input
Code:
<ul>
<li>Beans <a href="http://www.anything.co.uk/Contra_reels.htm">MM339</a></li>
<li>Beaulieu <a href="http://www.anything.co.uk/French_Canadian_reels.htm">MM223</a></li>
<li>Beaumont rag <a href="http://www.anything.co.uk/Rags.htm">MM58</a></li>
<li>Bedd y morwr <a href="http://www.anything.co.uk/A_Welsh_medley.htm">MM321</a></li>
<li>Ashokan farewell<a href="http://www.xx.co.uk/mm/M0245_Ashokan_farewell.htm">MM245</a></li>
</ul>
the sed
Code:
<input sed 's@\(<li>\)\(.*\+\)<a href="http.*/\(.*\+\).htm.*@\1<a href="Tunes/M\3.pdf">\2</a></li>@'
output
Code:
<ul>
<li><a href="Tunes/MContra_reels.pdf">Beans </a></li>
<li><a href="Tunes/MFrench_Canadian_reels.pdf">Beaulieu </a></li>
<li><a href="Tunes/MRags.pdf">Beaumont rag </a></li>
<li><a href="Tunes/MA_Welsh_medley.pdf">Bedd y morwr </a></li>
<li><a href="Tunes/MM0245_Ashokan_farewell.pdf">Ashokan farewell</a></li>
</ul>
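
If the file is already in the SciTE-processed form from post #25 (local Tunes/ link, with the MM code as the link text), something along these lines might do the swap instead. This is only a sketch: it assumes one link per line and that the link text is always upper-case letters followed by digits, and I have only tried it on the one example line.

```shell
# One line in the SciTE-processed form from post #25.
line='<li>Ashokan farewell <a href="Tunes/MM0245_Ashokan_farewell.pdf">MM245</a></li>'

# \1 = tune name (trailing spaces excluded), \2 = the opening <a ...> tag.
# The old link text (upper-case letters then digits) is matched but dropped.
swap='s@<li>(.*[^ ]) *(<a href="Tunes/[^"]+">)[[:upper:]]+[0-9]+</a></li>@<li>\2\1</a></li>@'

result=$(printf '%s\n' "$line" | sed -E "$swap")
printf '%s\n' "$result"
```

Lines that do not fit the pattern are passed through unchanged, so it should be safe to run over the whole file and then look for whatever is left. If some names contain accented characters in a different encoding, running sed with LC_ALL=C in front may help, since sed then treats every byte as a character.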

Last edited by Firerat; 10-22-2019 at 11:41 AM. Reason: added my sample input and the output of the sed
 
Old 10-22-2019, 12:22 PM   #27
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
there was mention of duplicate names with variations (the unique number)

so I pre-empted

Code:
sed 's@\(<li>\)\(.\+\)<a href="http.\+/[[:alpha:]]\+\([0-9]\+\)\(.\+\).htm.\+\([[:alpha:]]\{2\}\)\([0-9]\+\).\+@<li><a href="Tunes/\5\3\4.pdf">\5\3 \2</a></li>@'
this assumes MM is always two Alpha characters

Code:
<ul>
<li>Beans <a href="http://www.anything.co.uk/Contra_reels.htm">MM339</a></li>
<li>Beaulieu <a href="http://www.anything.co.uk/French_Canadian_reels.htm">MM223</a></li>
<li>Beaumont rag <a href="http://www.anything.co.uk/Rags.htm">MM58</a></li>
<li>Bedd y morwr <a href="http://www.anything.co.uk/A_Welsh_medley.htm">MM321</a></li>
<li><a href="Tunes/MM0245_Ashokan_farewell.pdf">MM0245 Ashokan farewell</a></li>
</ul>
notice that the early sample data is no longer touched

essentially the same
Code:
sed 's@\(<li>\)\(.\+\)\(<a href="\)http.\+/[[:alpha:]]\+\([0-9]\+\)\(.\+\).htm.\+\([[:alpha:]]\{2\}\)\([0-9]\+\)\(.\+\)@\1\3Tunes/\6\4\5.pdf">\6\4 \2\8@'
notice that \1 \2 \3 are the "chunks" wrapped in \(\)
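
To make the numbering concrete, here is a tiny sketch on made-up input (not from the real file): two chunks are captured and then written back in the opposite order.

```shell
# Made-up input: two fields separated by a colon.
# \(...\) marks a chunk; \1 is the first chunk, \2 is the second.
swapped=$(printf '%s\n' 'name:MM245' | sed 's@\([^:]*\):\(.*\)@\2 \1@')
printf '%s\n' "$swapped"   # MM245 name
```

The chunks are numbered by the position of their opening \( from left to right, which is why adding a chunk early in the pattern shifts all the later numbers.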


Edit3
and this one is more like the original
Code:
<input sed 's@\(<li>\)\(.\+\)\(<a href="\)http.\+/[[:alpha:]]\+\([0-9]\+\)\(.\+\).htm\(.\+\)\([[:alpha:]]\{2\}\)\([0-9]\+\)\(.\+\)@\1\2\3Tunes/\7\4\5.pdf\6\7\8\9@'
Code:
...
<li>Ashokan farewell<a href="Tunes/MM0245_Ashokan_farewell.pdf">MM245</a></li>
...
a 'simpler' ( shorter ) version
Code:
sed 's@http.\+/[[:alpha:]]\+\([0-9]\+\)\(.\+\).htm\(.\+\)\([[:alpha:]]\{2\}\)\([0-9]\+\)@Tunes/\4\1\2.pdf\3\4\5@'

Last edited by Firerat; 10-22-2019 at 03:49 PM. Reason: sed got wrapped over two lines ( copy paste fail )
 
Old 10-22-2019, 03:54 PM   #28
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
my escapes are a very bad habit

Code:
sed -E 's@(<li>)(.+)(<a href=")http.+/[[:alpha:]]+([0-9]+)(.+).htm">([[:alpha:]]{2})([0-9]+)(.+)@\1\3Tunes/\6\4\5.pdf">\6\4 \2\8@'
Code:
sed -E 's@http.+/[[:alpha:]]+([0-9]+)(.+).htm(">)([[:alpha:]]{2})([0-9]+)@Tunes/\4\1\2.pdf\3\4\5@'
technically the MM bit should be ([[:upper:]]{2})
if it is not always two, then
(">)([[:upper:]]+)([0-9]+)

so that is UpperCase letter 1 or more times and digit 1 or more times

you may have noticed I corrected another bad habit, the use of *
that means the previous match zero or more times, and a lot of the time it is used incorrectly

I don't use sed much these days, and I have still not got rid of the bad habits I picked up following early examples

? is the previous match 0 or 1 times

it does seem very confusing, but once you get your head around it, it does make perfect sense.
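
A quick way to see the difference between *, + and ? is to try them on a few test strings. The helper name and the strings below are made up for illustration; grep -E uses the same extended syntax as sed -E.

```shell
# Hypothetical helper: succeeds only when the whole string matches the ERE.
matches() { printf '%s\n' "$1" | grep -Eq "^($2)\$"; }

# *  : previous item zero or more times, so a bare number still matches
matches '245'    'M*[0-9]+'  && echo '245    matches        M*[0-9]+'
# +  : previous item one or more times, so at least one M is required
matches '245'    'M+[0-9]+'  || echo '245    does not match M+[0-9]+'
matches 'MM245'  'M+[0-9]+'  && echo 'MM245  matches        M+[0-9]+'
# ?  : previous item zero or one time, so one or two Ms but never three
matches 'M245'   'MM?[0-9]+' && echo 'M245   matches        MM?[0-9]+'
matches 'MMM245' 'MM?[0-9]+' || echo 'MMM245 does not match MM?[0-9]+'
```

All five lines should be printed; the point is that * quietly accepts "nothing at all", which is usually not what was intended.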

Last edited by Firerat; 10-22-2019 at 04:32 PM.
 
Old 10-22-2019, 05:22 PM   #29
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
and this fits the "original" original

Code:
curl secretsiteplaceholder/mm/sheets.htm | \
  sed -E '/sheet_list|^\//s@/mm/([[:upper:]]+[0-9]+.+).htm@Tunes/\1.pdf@' \
  > sheetmusic.htm
that is much cleaner

curl is probably not installed by default
you don't *need* it, just run the sed on local copy

edit: if the local copy has http link in it
Code:
sed -E '/sheet_list|^\//s@http.+/mm/([[:upper:]]+[0-9]+.+).htm@Tunes/\1.pdf@'

real sample
Code:
<p class="sheet_list_title"><a name="MM9001"><a href="/mm/MM09001_pipes_from_the_world_of_bash.htm">MM9001 <span class="sheet_title">pipes from the world of bash</span></a></a></p>
<p class="sheet_list_tunes">The Chelsea Flower Show
/ <span class="disabled">Choo Choo</span>
/ Acid Burns <a href="/mm/MM09002_wooden_shoe_antlerpipes.htm">MM9002</a></p>
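
As a check, the four sample lines above can be fed straight to the first sed from this post without touching the real file. This is just a sketch for trying the expression out; the grep at the end only pulls out the href attributes so the result is easy to read.

```shell
# The four sample lines above, fed to the /mm/ version of the sed.
converted=$(sed -E '/sheet_list|^\//s@/mm/([[:upper:]]+[0-9]+.+).htm@Tunes/\1.pdf@' <<'EOF'
<p class="sheet_list_title"><a name="MM9001"><a href="/mm/MM09001_pipes_from_the_world_of_bash.htm">MM9001 <span class="sheet_title">pipes from the world of bash</span></a></a></p>
<p class="sheet_list_tunes">The Chelsea Flower Show
/ <span class="disabled">Choo Choo</span>
/ Acid Burns <a href="/mm/MM09002_wooden_shoe_antlerpipes.htm">MM9002</a></p>
EOF
)

# Show only the rewritten link targets.
printf '%s\n' "$converted" | grep -o 'href="[^"]*"'
```

Both links should now point at Tunes/...pdf, and the two lines without a link should come through untouched. Note that the greedy .+ assumes one link per line; a line with two links would need a tighter pattern.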

Last edited by Firerat; 10-22-2019 at 05:52 PM.
 
Old 10-23-2019, 06:45 PM   #30
toothwright
LQ Newbie
 
Registered: Nov 2018
Posts: 11

Original Poster
Rep: Reputation: Disabled
@Firerat
I should like to thank you for the analysis and programming suggestions.
It is very patient of you to try to lead me to a solution and I am trying to implement your technique.
I shall post when I make any progress.
 
  

