LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Old 05-17-2020, 05:12 PM   #1
RandomTroll
Senior Member
 
Registered: Mar 2010
Distribution: Slackware
Posts: 1,959

Translating %-introduced hex characters into ASCII


I want to extract information from files that render a lot
of non-alphanumeric characters in hex, for example '%3D' for
'=', '%2B' for '+'. Is there a name for this practice? Is
there already a program that translates them into ASCII, which
would save me the trouble of writing one? Here's a sample line:

Quote:
<a href="#" data-db-target-for="7053458b-d78a-4662-8e29-53bc26fbb44e" data-db-switch="" aria-haspopup="true" aria-controls="7053458b-d78a-4662-8e29-53bc26fbb44e_Pop" role="button" id="7053458b-d78a-4662-8e29-53bc26fbb44e_Ctrl" data-slide-target="#7053458b-d78a-4662-8e29-53bc26fbb44e_Pop" class="w-slide__btn visibleOverflow"><i aria-hidden="true" class="icon-build"></i><span>Tools</span></a><div data-db-target-of="7053458b-d78a-4662-8e29-53bc26fbb44e" aria-labelledby="7053458b-d78a-4662-8e29-53bc26fbb44e_Ctrl" role="menu" id="7053458b-d78a-4662-8e29-53bc26fbb44e_Pop"><ul class="rlist w-slide--list"><li role="presentation" class="article-tool"><a href="/personalize/addFavoritePublication?doi=10.7326%2FL19-0550" role="menuitem"><i aria-hidden="true" class="icon-star"></i><span>Add to favorites</span></a></li><li role="presentation" class="article-tool"><a href="/action/showCitFormats?doi=10.7326%2FL19-0550" role="menuitem"><i aria-hidden="true" class="icon-download"></i><span>Download Citations</span></a></li><li role="presentation" class="article-tool"><a href="/action/addCitationAlert?doi=10.7326%2FL19-0550" role="menuitem"><i aria-hidden="true" class="icon-my_location"></i><span>Track Citations</span></a></li><li role="presentation" class="article-tool"><a href="/servlet/linkout?type=rightslink&amp;url=startPage%3D634%26pageCount%3D3%26author%3DAlan%2BJ.%2BHunter%26orderBeanReset%3Dtrue%26imprint%3DAmerican%2BCollege%2Bof%2BPhysicians%26volumeNum%3D172%26issueNum%3D9%26contentID%3D10.7326%252FL19-0550%26title%3DUnilateral%2BFacial%2BParalysis%2BDuring%2Ban%2BAirline%2BFlight%26numPages%3D3%26pa%3D%26issn%3D0003-4819%26publisherName%3DACP%26publication%3D0003-4819%26rpt%3Dn%26endPage%3D636%26publicationDate%3D05%252F05%252F2020" role="menuitem" target="popup" onclick="window.open(&quot;/servlet/linkout?type=rightslink&amp;url=startPage%3D634%26pageCount%3D3%26author%3DAlan%2BJ.%2BHunter%26orderBeanReset%3Dtrue%26imprint%3DAmerican%2BCollege%2Bof%2BPhysicians%26volumeNum%3D172%26issueNum%3D9%26contentID%3D10.7326%252FL19-0550%26title%3DUnilateral%2BFacial%2BParalysis%2BDuring%2Ban%2BAirline%2BFlight%26numPages%3D3%26pa%3D%26issn%3D0003-4819%26publisherName%3DACP%26publication%3D0003-4819%26rpt%3Dn%26endPage%3D636%26publicationDate%3D05%252F05%252F2020&quot;,&quot;popup&quot;,&quot; width=750,height=650&quot; return false;"><i aria-hidden="true" class="icon-lock_open"></i><span>Permissions</span></a></li></ul></div>
 
Old 05-17-2020, 05:34 PM   #2
michaelk
Moderator
 
Registered: Aug 2002
Posts: 25,691

Reserved characters in URLs are encoded as a '%' followed by two hexadecimal digits. This is called percent-encoding (also known as URL encoding).

https://en.wikipedia.org/wiki/Percent-encoding

You can decode percent-encoded characters with whatever language or tool you prefer: bash, sed, PHP, Python, etc.
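As one illustration in Python, here is a minimal sketch using the standard library's `urllib.parse.unquote`. The `sample` string is a shortened piece of the line quoted in the first post, and `decode_fully` is a hypothetical helper name, not something from the thread. It assumes that repeating the decode until nothing changes is acceptable for your data, which is not always true (a literal '%' that was encoded on purpose would be decoded too).

```python
from urllib.parse import unquote


def decode_fully(text: str, max_passes: int = 5) -> str:
    """Percent-decode repeatedly until the text stops changing.

    Needed for values that were encoded more than once, e.g. '%252F',
    where '%25' is itself an encoded '%'.
    """
    for _ in range(max_passes):
        decoded = unquote(text)
        if decoded == text:
            break
        text = decoded
    return text


# Shortened fragment of the sample line from the first post.
sample = "startPage%3D634%26pageCount%3D3%26contentID%3D10.7326%252FL19-0550"

once = unquote(sample)
print(once)                  # startPage=634&pageCount=3&contentID=10.7326%2FL19-0550
print(decode_fully(sample))  # startPage=634&pageCount=3&contentID=10.7326/L19-0550
```

A single pass leaves `%2F` behind because that value was encoded twice in the original page.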
 
Old 05-17-2020, 06:07 PM   #3
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,124

Browsers, even the CLI ones, know how to interpret that stuff.
 
Old 05-19-2020, 01:14 AM   #4
RandomTroll
Senior Member
 
Registered: Mar 2010
Distribution: Slackware
Posts: 1,959

Original Poster
Quote:
Originally Posted by michaelk View Post
You can decode percent-encoded characters with whatever language or tool you prefer: bash, sed, PHP, Python, etc.
I had already written a program to do this but had forgotten about it.

Quote:
Originally Posted by syg00 View Post
Browsers, even the CLI ones, know how to interpret that stuff.
I want to extract information programmatically.
 
Old 05-19-2020, 01:30 AM   #5
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,303
Blog Entries: 3

It's an encoding and decoding process, not extraction. See RFC 3986.

Code:
perl -MURI::Escape -n -e 'print uri_unescape($_);'
See "man URI::Escape"
 
Old 05-20-2020, 12:11 AM   #6
RandomTroll
Senior Member
 
Registered: Mar 2010
Distribution: Slackware
Posts: 1,959

Original Poster
Quote:
Originally Posted by Turbocapitalist View Post
It's an encoding and decoding process not extraction.
I started out wanting to extract data from an HTML page that uses this encoding protocol. I had to decode to extract; decoding is a subsidiary process.
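For anyone landing here with the same task, a rough sketch of that decode-then-extract flow in Python, standard library only. The `page` string is a shortened version of the "Permissions" link from my first post, and `HrefCollector` and `nested_query_fields` are names made up for this illustration. It assumes the data of interest sits in a query parameter called `url` that itself holds an encoded query string, as in that sample. Other pages may be laid out differently.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlsplit


class HrefCollector(HTMLParser):
    """Collect the href of every <a> tag.

    HTMLParser unescapes attribute values, so '&amp;' arrives as '&'.
    """

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for name, v in attrs if name == "href" and v)


def nested_query_fields(href, param="url"):
    """Decode the query string, then decode the query string nested in `param`."""
    outer = parse_qs(urlsplit(href).query)   # first decoding pass
    inner = outer.get(param, [""])[0]
    # Second pass: parse_qs also turns '+' into a space and '%2F' into '/'.
    return {key: values[0] for key, values in parse_qs(inner).items()}


# Shortened form of the sample link (assumption: same layout as the real page).
page = ('<a href="/servlet/linkout?type=rightslink&amp;url=startPage%3D634'
        '%26author%3DAlan%2BJ.%2BHunter%26contentID%3D10.7326%252FL19-0550'
        '%26publicationDate%3D05%252F05%252F2020" role="menuitem">Permissions</a>')

collector = HrefCollector()
collector.feed(page)
collector.close()

fields = nested_query_fields(collector.hrefs[0])
for key, value in fields.items():
    print(f"{key}: {value}")
```

With this input it prints startPage, author ("Alan J. Hunter"), contentID ("10.7326/L19-0550") and publicationDate ("05/05/2020"). Note that `parse_qs` drops parameters with empty values unless you pass `keep_blank_values=True`.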
 
All times are GMT -5. The time now is 01:10 AM.