LinuxQuestions.org
Old 09-18-2016, 12:10 AM   #1
timl
Member
 
Registered: Jan 2009
Location: Sydney, Australia
Distribution: Fedora,CentOS
Posts: 722

Rep: Reputation: 152
strip out a random part of a file name


Hi,

I have a bunch of files I ripped from YouTube. Every file name contains a seemingly random string before the extension, e.g.
Quote:
Working Week - Sweet Nothing (Live)-3Hm1VFCYsxw.aac
Working Week - This Time-8kF0yjh-q-Y.aac
(I converted them to aac.) As you can see, there is no fixed length or format to these file names. Looking at the examples, the complication is that the random string in the second file name itself contains two dashes; I had thought there was consistently only one dash, the one in front of the random string. I want to remove these strings, thus
Quote:
Working Week - Sweet Nothing (Live).aac
Working Week - This Time.aac
It would be really good to employ a command line tool to achieve this. Is there a process I can use which locates and removes the last string before the dot?

I also thought about stripping out the prefix:
Quote:
Working Week -
and then removing all fields between the first dash and the dot. Any ideas?

Apologies for providing a blank template but my sed/awk knowledge is very limited.

Cheers
 
Old 09-18-2016, 12:27 AM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 19,973

Rep: Reputation: 3633
The only way to solve this sort of problem is to define the data precisely so that a regex can be built.
You've only half defined it.
 
Old 09-18-2016, 12:45 AM   #3
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 5,864
Blog Entries: 3

Rep: Reputation: 3052
Yes, you'll need to precisely identify the pattern in order to move forward.

One tool that will help is the perl-based version of rename. Not only will it take perl regex, which is much more powerful and flexible than "awk", it also has the -n option to do a dry run without changing anything. The dry run allows you to practice before changing anything.
 
Old 09-18-2016, 03:59 AM   #4
ondoho
LQ Addict
 
Registered: Dec 2013
Posts: 17,976
Blog Entries: 12

Rep: Reputation: 5436
First of all (!) I'd check whether the software used for ripping has options for how filenames are created.
Then, it seems to me that these last characters are the YouTube video IDs, not random strings.
And looking at some YouTube video links, the IDs always seem to be 11 characters long.
So just remove the 11 characters (and the dash) before the extension?
I would use bash for that.
see here:
http://www.tldp.org/LDP/abs/html/str...ipulation.html
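A minimal sketch of that idea in bash, assuming every name ends in a dash, an 11-character ID, and an extension. The helper name `strip_id` is made up for this illustration, and the loop only prints the `mv` commands, so nothing is renamed until the `echo` is removed:

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop a trailing "-<11-character id>" from a file name
# while keeping the extension. Assumes the name contains at least one dot.
strip_id() {
    local name=$1
    local ext=${name##*.}    # text after the last dot, e.g. "aac"
    local stem=${name%.*}    # everything before the last dot
    # Twelve "?" wildcards: one dash plus the 11-character id.
    printf '%s.%s\n' "${stem%????????????}" "$ext"
}

# Dry run: print the planned renames; remove "echo" to perform them.
for f in *-???????????.aac; do
    [ -e "$f" ] || continue  # the glob matched nothing
    echo mv -n -- "$f" "$(strip_id "$f")"
done
```

Because the cut is by length rather than by pattern, dashes inside the ID (as in the second example file) do not matter.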
 
Old 09-18-2016, 04:11 AM   #5
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 16,989

Rep: Reputation: 5736
Yes, they told you what to do: you need to work out the rule you want to apply and construct a regex that implements it.
You may try the rename tool, which is already available, so you only need to run it and do not need to write a script.
The only additional advice I can give is to use an online regexp tester such as http://www.regexr.com/. You may try this regexp, although I'm not really sure it fits your needs:
Code:
^[^-]+- ([^-]+)-.*\.([^.]+)$
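To see what this pattern captures without leaving the terminal, it can be fed to bash's `[[ ... =~ ... ]]` operator. This is only a sketch; the function name `show_groups` is invented for the illustration. Note that the first group holds the title alone, so the "Working Week - " prefix is dropped, and a title that itself contains a dash would be cut short:

```shell
#!/usr/bin/env bash
# The pattern from the post, kept in a variable so bash reads it as an ERE.
re='^[^-]+- ([^-]+)-.*\.([^.]+)$'

# Hypothetical helper: print "<group 1>.<group 2>" when the name matches.
show_groups() {
    if [[ $1 =~ $re ]]; then
        printf '%s.%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
    else
        return 1             # name does not fit the pattern
    fi
}

show_groups 'Working Week - Sweet Nothing (Live)-3Hm1VFCYsxw.aac'
show_groups 'Working Week - This Time-8kF0yjh-q-Y.aac'
```

For the two sample names this prints "Sweet Nothing (Live).aac" and "This Time.aac".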
 
Old 09-18-2016, 05:12 AM   #6
timl
Member
 
Registered: Jan 2009
Location: Sydney, Australia
Distribution: Fedora,CentOS
Posts: 722

Original Poster
Rep: Reputation: 152
thanks for the suggestions all. I need to look into the links and thoughts provided. It looks like a bit of work involved so I will see how I go.

Cheers
 
Old 09-18-2016, 05:34 AM   #7
hyperhead
Member
 
Registered: Mar 2011
Location: UK
Distribution: Slackware-14.2
Posts: 117

Rep: Reputation: 19
Hi

This worked for me

ls | grep Working | sed 's/[-][0-9].*[.]/./g'

However, if any file name has a YouTube ID that starts with a letter, it won't work. It also assumes all the files start with the word Working, and it only prints the new names; it does not rename anything.

It's enough to get you started anyhow.
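A possible refinement of the same sed approach, anchored on the ID instead of a leading digit. It assumes the ID is exactly 11 characters drawn from letters, digits, underscore and dash; that character set is an assumption made for this sketch, not something established in the thread. The function name `strip_id_sed` is likewise made up:

```shell
#!/usr/bin/env bash
# Hypothetical filter: remove "-<id>" directly before the final extension.
# Assumption: the id is exactly 11 characters from [A-Za-z0-9_-].
strip_id_sed() {
    sed -E 's/-[A-Za-z0-9_-]{11}(\.[^.]+)$/\1/'
}

# Like the original one-liner, this only prints the cleaned names.
printf '%s\n' \
    'Working Week - Sweet Nothing (Live)-3Hm1VFCYsxw.aac' \
    'Working Week - This Time-8kF0yjh-q-Y.aac' | strip_id_sed
```

Anchoring the match at the end of the name means dashes earlier in the title are left alone, and the `grep Working` step is no longer needed.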
 
Old 09-18-2016, 07:28 AM   #8
keefaz
LQ Guru
 
Registered: Mar 2004
Distribution: Slackware
Posts: 6,325

Rep: Reputation: 757
If the ID is always 11 characters long:
Code:
file="Working Week - Sweet Nothing (Live)-3Hm1VFCYsxw.aac"
# Drop the last 16 characters: "-" + 11-character ID + ".aac" (a negative length needs bash 4.2 or later)
echo "${file::-16}.aac"
 
1 members found this post helpful.
Old 09-18-2016, 07:18 PM   #9
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 19,973

Rep: Reputation: 3633
cute - I always forget about poor old bash.
 
Old 09-18-2016, 08:45 PM   #10
Sefyir
Member
 
Registered: Mar 2015
Distribution: Linux Mint
Posts: 633

Rep: Reputation: 316
Others have shown you methods of filtering the information.
I'm going to suggest youtube-dl, as it lets you format the output file name with a template.

OUTPUT TEMPLATE parameters
https://github.com/rg3/youtube-dl/bl...utput-template

Example:
Code:
youtube-dl -o "%(title)s.%(ext)s"  --restrict-filenames 'https://www.youtube.com/watch?v=_HONxwhwmgU'
->Come_Together_-_John_Lennon_The_Beatles_Live_In_New_York_City.mp4
Use --restrict-filenames if you want to avoid nasty whitespace and special characters like ()''.
youtube-dl -h for more options
 
Old 09-19-2016, 12:40 AM   #11
timl
Member
 
Registered: Jan 2009
Location: Sydney, Australia
Distribution: Fedora,CentOS
Posts: 722

Original Poster
Rep: Reputation: 152
Yep, Sefyir, that suggestion works. I'll have a play around with string manipulation as well.
 
  

