LinuxQuestions.org
Old 03-07-2022, 10:24 PM   #1
rcdawson
LQ Newbie
 
Registered: Jan 2012
Posts: 14

Rep: Reputation: Disabled
Transfer files whose paths or filenames contain characters prohibited by Microsoft.


In transferring files from a Linux computer to a Windows computer, the folders and files with names or paths containing characters not allowed on Windows were not transferred. Some 700 have illegal characters in their file names, their paths, or both. I found a way to replace all the illegal characters in all folders and files. That would let me copy everything to the Windows computer again, skipping the files with legal names, but I would prefer to create a new folder containing only the offending folders and files, with their illegal characters replaced but with their paths preserved relative to the user's home directory. This would give me a folder containing all of the files that didn't transfer originally, that is, all the files that had illegal characters in either their path or their file name.

I want to replace the characters < > : \ / ? | { }* with an underscore.

For instance if these were all the files on the Linux computer (Here a ? could be any of the above listed prohibited characters):
/home/user/Documents/file01.txt
/home/user/Documents/folderA/file???02.txt
/home/user/Documents/folder???B/file03.txt
/home/user/Documents/Folder???C/file?04.txt

Then I would like to have a new folder with the following files on the external drive:
Documents/folderA/file___02.txt
Documents/folder___B/file03.txt
Documents/Folder___C/file_04.txt

I could then copy that folder to the user's directory in Windows.
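The mapping in the example above could be sketched in bash as follows. This is only an illustration: sanitize_relpath is a hypothetical helper name, and it assumes the path is given relative to the home directory, so "/" is left alone as the separator between folders.

```bash
#!/bin/bash
# sanitize_relpath is a hypothetical helper, not part of any existing tool.
# It replaces each of < > : \ ? | { } * with "_" and leaves "/" alone,
# because "/" separates the folders in the relative path.
sanitize_relpath() {
  local bad='[<>:\\?|{}*]'     # glob bracket expression listing the unwanted characters
  printf '%s\n' "${1//$bad/_}" # ${var//pattern/_} replaces every match
}

sanitize_relpath 'Documents/folder<>:B/file03.txt'
sanitize_relpath 'Documents/Folder?|*C/file{04.txt'
```

The pattern is kept in a variable because a literal "}" inside ${...} would otherwise end the expansion early.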

I know I'm asking a lot, but if someone is so good as to provide a solution, please explain why it works. I have found solutions to similar requests online, but I don't understand them, certainly not enough to modify them to suit my purpose.

Thank you
 
Old 03-08-2022, 10:02 AM   #2
computersavvy
Senior Member
 
Registered: Aug 2016
Posts: 3,345

Rep: Reputation: 1484
I think you will have to do a little script writing to achieve your wishes. Only you have the full access to see the file names and what characters are causing the issue.

With your description it also appears that you may need to either do one directory at a time, or have the script be enabled to parse the directory tree and descend as needed to work one file at a time.

The generic flow would be similar to this in whatever language you use, though using bash is pretty simple. Note that this is an English-language summary of what the program would do, not runnable code.

Code:
for FILE in * ; do
  if "$FILE" contains <any of the unwanted characters> ; then
     TMPFILE="$FILE"
     while "$TMPFILE" contains <any of the unwanted characters> ; do
         replace the unwanted characters in "$TMPFILE"
     done
     cp "$FILE" /path/to/new/folder/"$TMPFILE"
  fi
done
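A runnable bash version of the outline above might look like the sketch below. It is an assumption-laden illustration rather than the poster's own script: copy_renamed and its two arguments are hypothetical names, it handles a single directory without descending into subfolders, and it replaces all unwanted characters in one step instead of looping.

```bash
#!/bin/bash
# copy_renamed SRC DEST (hypothetical helper): copy the regular files in SRC
# whose names contain unwanted characters into DEST, replacing each of
# those characters with "_". Files with clean names are skipped.
copy_renamed() {
  local src=$1 dest=$2 bad='[<>:\\?|{}*]' path name newname
  mkdir -p "$dest" || return 1
  for path in "$src"/*; do
    [ -f "$path" ] || continue          # regular files only; no recursion
    name=${path##*/}                    # strip the leading directory part
    if [[ $name == *$bad* ]]; then      # name contains an unwanted character
      newname=${name//$bad/_}           # replace every unwanted character
      cp -p -- "$path" "$dest/$newname"
    fi
  done
}
```

It could be called as copy_renamed "$HOME/Documents" /path/to/new/folder. Two names that differ only in their unwanted characters would map to the same new name, so the later copy would overwrite the earlier one.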
Once all the files have been processed and the file names adjusted to your liking then the files in the new path could be moved to windows.

Personally, the very first character I would replace would be the spaces in the file names.

This is an example of one script I did to rename and move some files when I was ripping some audio books to my pc.
Code:
#!/usr/bin/sh

# rename.sh (this file name)
# Call it with a structure like the following:
#
# for DIR in 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 ; do ./rename.sh Elizabeth_Moon/Elizabeth_Moon_-_Kings_of_the_North .._Track Kings_of_the_North ".mp3" "$DIR" ; done
#
# where DIR is the trailing number of each CD volume (passed as $5),
# MATCH ($2) is the part of the original filename to be replaced (a sed pattern),
# SUB ($3) is the new filename to be assigned,
# and EXT ($4) is the extension of the file being renamed.
# The order of the arguments is important.
#
# It can be used for a single file or directory by judicious use of the command line
# structure but was designed to allow for easy renaming and consolidating all the track
# titles for a single book into a single directory named for the artist and book title.

if [ -z "$5" ]
then
  DIR=$1
  SUB=$3
else
  DIR=$1_$5
  SUB=$3_$5
fi

MATCH=$2
EXT=$4

# mkdir "$3"

# Stop if the directory is missing, so nothing is renamed in the wrong place
cd "$DIR" || exit 1
for i in *"$EXT"
do
  NAME=$(basename "$i" "$EXT")
# echo "Original file name is $NAME"
  # MATCH and SUB must not contain "/" because it delimits the sed expression
  NEWNAME=$(printf '%s\n' "$NAME" | sed "s/$MATCH/$SUB/g")
# echo "New file name is $NEWNAME"
  mv "$NAME$EXT" "$NEWNAME$EXT"
#  cp "$NAME$EXT" "../$3/$NEWNAME$EXT"

done
# cd - >/dev/null 2>&1
cd -
 
Old 03-08-2022, 10:21 AM   #3
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,382

Rep: Reputation: 2761
Quote:
I found a way to replace all the illegal characters in all folders and files.
To me, what you want reduces to copying all files whose names or paths contain the illegal characters to a new directory (folder), so they can be handled separately.
Code:
#!/bin/bash

# Create the new directory (-p: no error if it already exists)
target="/tmp"
mkdir -p "$target/Documents"

# Set a shell option so that ** descends into subdirectories
shopt -s globstar

# Change to top directory
cd "$HOME/Documents" || exit 1

# Loop through files and copy if the path and/or name contains any of :?|{}*<>\ 
for f in **/*; do
  ## Some characters need to be escaped with a preceding backslash in the bracket expression
  [[ "$f" =~ [:?|{}*\<\>\\]+ ]] && cp -a "$f" "$target/Documents/$f"  
done
NB - I have not included the / character as it is the Linux delimiter between path and filename.
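The bracket-expression test used in the loop can be tried on its own before copying anything. The sketch below is illustrative: has_illegal and the sample names are hypothetical, and the regex is stored in a variable so that < and > need no backslash escaping inside [[ ... ]].

```bash
#!/bin/bash
# has_illegal NAME (hypothetical helper): succeed if NAME contains any of
# < > : ? | { } * or a backslash. "/" is deliberately absent, as noted above.
has_illegal() {
  local re='[<>:?|{}*\\]'    # extended-regex bracket expression
  [[ $1 =~ $re ]]
}

for name in 'file03.txt' 'file?04.txt' 'folder:B/file03.txt' 'a/b/c.txt'; do
  if has_illegal "$name"; then
    echo "copy: $name"
  else
    echo "skip: $name"
  fi
done
```

A single match is enough to select the path, so the trailing + in the original expression is not required for the test to work.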
 
Old 03-08-2022, 09:54 PM   #4
rcdawson
LQ Newbie
 
Registered: Jan 2012
Posts: 14

Original Poster
Rep: Reputation: Disabled
I think computersavvy's suggested outline will copy all files to one directory, but will not include the paths. I need to have all files in a file structure that replicates the original file structure, so that the modified files will find their way into their original folders when I copy them to the new computer.

I think allend's suggestion does what I want, but it choked for a reason that I cannot understand. I created a test folder containing several sample files. It copied most of them and put them in the folders where they belong, but for some reason it didn't copy all the folders. Here are the source and target directories.

Here is the source directory:

/home/rcdawson/Documents/kittytest
└── kittyhome
    ├── FilewithIllegalName?:Level1.txt
    ├── FileWithLegalNameLevel1.txt
    ├── Folder1WithIILegal:?*NameLevel1
    │   ├── FileWithIllegalName*:?inFolderWithLegalName.txt
    │   ├── FileWithLegalNameinFolderWithILLegalName.txt
    │   └── SubfolderWithLegalNameInSubfolderWithIllegalName
    │       ├── FileWithIllegalName*:?inFolderWithIllegalName.txt
    │       └── FileWithLegalNameinFolderWithIlLegalName
    └── FolderWithLegalNameLevel1
        ├── FileWithIllegalName*:?inFolderWithLegalName.txt
        └── FileWithLegalNameinFolderWithLegalName

Here is what made it to the target directory:
/tmp
├── Documents
│   ├── FilewithIllegalName?:Level1.txt
│   └── Folder1WithIILegal:?*NameLevel1
│       ├── FileWithIllegalName*:?inFolderWithLegalName.txt
│       ├── FileWithLegalNameinFolderWithILLegalName.txt
│       └── SubfolderWithLegalNameInSubfolderWithIllegalName
│           ├── FileWithIllegalName*:?inFolderWithIllegalName.txt
│           ├── FileWithLegalNameinFolderWithIlLegalName
│           └── SubfolderWithLegalNameInSubfolderWithIllegalName
│               ├── FileWithIllegalName*:?inFolderWithIllegalName.txt
│               └── FileWithLegalNameinFolderWithIlLegalName

Notice that the folder with a legal name didn't make it.

Here is the error message it produces upon exit:

cp: cannot create regular file '/tmp/Documents/FolderWithLegalNameLevel1/FileWithIllegalName*:?inFolderWithLegalName.txt': No such file or directory

Suggestions?

Last edited by rcdawson; 03-08-2022 at 10:01 PM. Reason: Correct error.
 
Old 03-09-2022, 04:16 AM   #5
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,382

Rep: Reputation: 2761
My apologies. That should have been 'cp -p' not 'cp -a'
Code:
  [[ "$f" =~ [:?|{}*\<\>\\]+ ]] && cp -p "$f" "$target/Documents/$f"
 
Old 03-09-2022, 12:33 PM   #6
computersavvy
Senior Member
 
Registered: Aug 2016
Posts: 3,345

Rep: Reputation: 1484
Note that I explained my use and sent an example that could easily be modified for your needs.
 
Old 03-09-2022, 09:54 PM   #7
rcdawson
LQ Newbie
 
Registered: Jan 2012
Posts: 14

Original Poster
Rep: Reputation: Disabled
It worked better with cp -a. With cp -p I got only the Documents folder with just one text file contained in the Documents folder. No subfolders.

Here are the error messages with cp -p:

rcdawson@PantherBox:~/Scripts$ ./RemoveIllegalFileNames4WindowsTest.sh
cp: -r not specified; omitting directory 'Folder1WithIILegal:?*NameLevel1'
cp: cannot create regular file '/tmp/Documents/Folder1WithIILegal:?*NameLevel1/FileWithIllegalName*:?inFolderWithLegalName.txt': No such file or directory
cp: cannot create regular file '/tmp/Documents/Folder1WithIILegal:?*NameLevel1/FileWithLegalNameinFolderWithILLegalName.txt': No such file or directory
cp: -r not specified; omitting directory 'Folder1WithIILegal:?*NameLevel1/SubfolderWithLegalNameInSubfolderWithIllegalName'
cp: cannot create regular file '/tmp/Documents/Folder1WithIILegal:?*NameLevel1/SubfolderWithLegalNameInSubfolderWithIllegalName/FileWithIllegalName*:?inFolderWithIllegalName.txt': No such file or directory
cp: cannot create regular file '/tmp/Documents/Folder1WithIILegal:?*NameLevel1/SubfolderWithLegalNameInSubfolderWithIllegalName/FileWithLegalNameinFolderWithIlLegalName': No such file or directory

Since it complained that -r wasn't specified, I changed to cp -r. This got me pretty much back to where I was with cp -a. It copied more, but not all. Here is the error message.
cp: cannot create regular file '/tmp/Documents/FolderWithLegalNameLevel1/FileWithIllegalName*:?inFolderWithLegalName.txt': No such file or directory
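The remaining error looks consistent with cp not creating missing parent directories: FolderWithLegalNameLevel1 has a legal name, so the loop never copies the folder itself, and the destination folder does not exist yet when the file inside it is copied. A hedged sketch of one way around this, assuming GNU cp (its --parents option recreates the source path below the destination directory) and bash 4 or later for globstar; copy_matching is a hypothetical name:

```bash
#!/bin/bash
# copy_matching SRC DEST (hypothetical helper), assuming GNU coreutils cp.
# The body runs in a subshell ( ... ), so the cd and shopt do not leak out.
copy_matching() (
  src=$1 dest=$2 re='[<>:?|{}*\\]'
  mkdir -p "$dest" || exit 1
  dest=$(cd "$dest" && pwd) || exit 1   # make absolute before changing directory
  cd "$src" || exit 1
  shopt -s globstar
  for f in **/*; do
    # Regular files only: copying whole directories with -a or -r appears to
    # be what produced the nested duplicate subfolder in the earlier run.
    if [[ -f $f && $f =~ $re ]]; then
      cp --parents -p -- "$f" "$dest"   # recreates the folders of $f under $dest
    fi
  done
)
```

This keeps the original names; it only addresses the missing-folder error, not the renaming.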
 
Old 03-10-2022, 06:31 AM   #8
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,832

Rep: Reputation: 1218
After reading
Code:
man pax
I was confident with
Code:
cd /home/user && pax -r -w -s'/[:?|\\{}]/_/g' Documents /path/to/destfolder
(The destfolder must exist.)
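pax's -s option takes an ed-style substitution (/old/new/ plus optional flags), so the effect of this expression on path names can be previewed with sed, which accepts the same s/old/new/g form. The sample paths below are made up for illustration. Note that this bracket expression does not list *, < or >.

```bash
#!/bin/sh
# Preview the substitution on hypothetical sample path names.
# Inside a bracket expression the backslash is literal, so \\ simply
# lists the backslash character among the characters to replace.
printf '%s\n' \
  'Documents/folder:B/file03.txt' \
  'Documents/Folder?C/file|04.txt' \
  'Documents/a{b}c\d.txt' |
  sed 's/[:?|\\{}]/_/g'
```

Each listed character becomes an underscore, while "/" is untouched, so the folder structure is preserved.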

Unfortunately pax is broken, at least in SUSE Linux: it omits the leading characters before a match, and the /g option does not (or cannot) apply.

EDIT:
Just read it on the Internet:
03.08.2017 — In SLE 12 SP1, pax was replaced with spax (from the package star ).
So it is spax that is broken.
The command works well in SLE 11 SP4, which has a POSIX-compliant pax.

Last edited by MadeInGermany; 03-10-2022 at 06:51 AM.
 
Old 03-10-2022, 07:27 AM   #9
jmgibson1981
Senior Member
 
Registered: Jun 2015
Location: Tucson, AZ USA
Distribution: Debian
Posts: 1,151

Rep: Reputation: 393
I wrote a script to download files from a specific website. I manually populate a list, then it does the rest at certain hours of the night. I wanted all the names in plain English with no punctuation save the .mp4/.mp3 extension. The script ended up with a nasty-looking sed line that could probably be better. This is how I accomplished the same type of thing, only for my own sanity in keeping to a single alphabet. It just filters through all the combinations on the download site and replaces each character with its appropriate English counterpart or a space.

Code:
op_name_edit=$(echo "$op_name" \
| sed -e 's/á/a/g' -e 's/[öø]/o/g' -e 's/Á/A/g' -e 's/í/i/g' \
-e 's/—/ /g' -e 's/[[:punct:]]//g' -e 's/\b\(.\)/\u\1/g' | tr -s ' ')
It works. Terrifying to look at though.
 
Old 03-10-2022, 08:22 AM   #10
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,382

Rep: Reputation: 2761
Refining the script to copy only regular files, with a mkdir command that creates the destination directory first.
Code:
[[ "$f" =~ [:?|{}*\<\>\\]+ ]] && [ -f "$f" ] && mkdir -p "$target/Documents/$(dirname "$f")" && cp -p "$f" "$target/Documents/$f"
Given a tree
Code:
kittytest/
├── Folder1WithIILegal:?*NameLevel1
│   ├── FileWithIlLegal:?*NameinFolderWithIlLegalName.txt
│   └── FileWithLegalNameinFolderWithIlLegalName.txt
├── FolderWithLegalNameLevel1
│   ├── FileWithIlLegal:?*NameinFolderWithLegalName.txt
│   └── FileWithLegalNameinFolderWithLegalName.txt
└── kittyhome
    ├── FileWithIlLegal:?*NameinFolderWithIlLegalName.txt
    └── FileWithLegalNameinFolderWithLegalName.txt
then I see in /tmp/Documents
Code:
/tmp/Documents/
├── Folder1WithIILegal:?*NameLevel1
│   ├── FileWithIlLegal:?*NameinFolderWithIlLegalName.txt
│   └── FileWithLegalNameinFolderWithIlLegalName.txt
├── FolderWithLegalNameLevel1
│   └── FileWithIlLegal:?*NameinFolderWithLegalName.txt
└── kittyhome
    └── FileWithIlLegal:?*NameinFolderWithIlLegalName.txt
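To also rename on the way, as the original question asked, the copy step could be combined with a substitution on the relative path. The following is a sketch under stated assumptions, not a tested replacement for the script above: copy_sanitized is a hypothetical name, find with -print0 is assumed to be available (GNU and BSD find both have it), and the destination is assumed to lie outside the source tree.

```bash
#!/bin/bash
# copy_sanitized SRC DEST (hypothetical helper): copy every regular file
# whose relative path contains an unwanted character, renaming both the
# folders and the files on the way. Runs in a subshell so cd does not leak.
copy_sanitized() (
  src=$1 dest=$2 bad='[<>:\\?|{}*]'
  mkdir -p "$dest" || exit 1
  dest=$(cd "$dest" && pwd) || exit 1     # absolute path, since we cd below
  cd "$src" || exit 1
  # -print0 with read -d '' keeps names containing spaces or newlines intact.
  find . -type f -print0 | while IFS= read -r -d '' f; do
    rel=${f#./}                           # path relative to the source root
    [[ $rel == *$bad* ]] || continue      # legal paths were already transferred
    new=${rel//$bad/_}                    # "/" is not in the set, so the tree shape survives
    mkdir -p "$dest/$(dirname "$new")"    # create any missing parent folders
    cp -p -- "$rel" "$dest/$new"
  done
)
```

For example, copy_sanitized "$HOME" /path/to/external/drive would place Documents/folder___B/file03.txt under the external drive. If two source names differ only in their unwanted characters, they map to the same new name and the later copy overwrites the earlier one.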
 
Old 03-10-2022, 06:16 PM   #11
rcdawson
LQ Newbie
 
Registered: Jan 2012
Posts: 14

Original Poster
Rep: Reputation: Disabled
Solved! Transfer files whose paths or filenames contain characters prohibited by Microsoft

At first the pax command stalled, but after one, or maybe two, corrections it works like a charm:

After poring over the pax man page and the examples there, I added a "p" after the "g" and a space after the "s". I thought the "p" was needed for pax to recognize that a pattern was involved, although the man page describes it as the flag that reports each successful substitution on standard error. The samples had a space after "s", so I incorporated that as well. Putting in the space without the "g" didn't solve the problem.

This moved everything in the test directory without a hitch!

pax -rw . -s '/[:?|\\{}]/_/gp' TargetDirectory
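As a quick check after a run like this, the target can be searched for names that still contain characters Windows rejects. This is a sketch with a hypothetical helper name. Because the bracket expression in the pax command does not list *, < or >, names containing those would be reported here.

```bash
#!/bin/bash
# list_illegal DIR (hypothetical helper): print every entry below DIR whose
# own name still contains one of < > : \ ? | { } *
list_illegal() {
  # -name tests each entry's base name; find visits folders too, so a
  # badly named folder is reported once rather than for every file in it.
  find "$1" -name '*[<>:\\?|{}*]*' -print
}
```

Running list_illegal TargetDirectory should print nothing once every name is acceptable.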

Thank you. Thanks to all who took the time to offer suggestions.
 
  


All times are GMT -5.