Tcl: Remove all alphabetic characters at beginning of string
Hello all,
I am working with a Tcl script and have some strings in the following format (RE): [a-zA-Z]+[0-9]{6}-[0-9]. There are some leading letters, a mix of capital and lowercase, then six digits, followed by a hyphen, then one more digit. I would like to remove all of the leading alphabetic characters from the string. The resulting string would then be in this format: [0-9]{6}-[0-9]. In other words, six numeric digits, a hyphen, then one more digit. I have tried:
Code:
set newstr [string trimleft $origstr alpha]
I couldn't get anything with regsub to work correctly, but I am somewhat of a noob with REs in general and regsub in particular. There are usually 5 leading letters at the beginning of these strings, and I could in most cases get away with using string replace and constant indices to extract the substring. However, my preference is for this to be robust enough to handle all cases with 1 through n leading alphabetic characters.
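A note on the string trimleft attempt above, offered as a small sketch rather than a definitive answer: the last argument to string trimleft is a set of characters to strip, not the name of a character class, so alpha is read as the four characters a, l, p and h. The sample value and the variable names attempt and letters are made up for illustration.

```tcl
# Sample value in the format described above (hypothetical data).
set origstr "hallo123456-3"

# "alpha" is treated as the character set {a l p h}, so trimming
# stops at the first character outside that set (the "o").
set attempt [string trimleft $origstr alpha]
puts $attempt   ;# prints o123456-3

# Passing every letter explicitly strips all leading letters.
set letters "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
set newstr [string trimleft $origstr $letters]
puts $newstr    ;# prints 123456-3
```

This works for any number of leading letters, though a regular expression states the intent more directly.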
The following code might help (it's part of a regular expression tester that I wrote as part of an application so users can test regular expressions).
Code:
proc try_it {regexp str} {
Your regular expression: ^([a-z]+)([0-9]{6}-[0-9])$
Your string: hallo123456-3
matchstr : hallo123456-3
submatch1 : hallo
submatch2 : 123456-3
The trick lies in the grouping (the parentheses). Hope this helps
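The body of try_it is not shown above, so the following is only a guess at how such a tester could be written, not the original code. It assumes a pattern with exactly two groups; show_match is a hypothetical name, and [a-zA-Z] is swapped in for [a-z] so that capital letters match as well.

```tcl
# Hypothetical helper, not the original try_it: report the full match
# and the two parenthesised submatches, and return the second one.
proc show_match {re str} {
    # regexp stores the full match and each group in the named variables.
    if {[regexp -- $re $str matchstr submatch1 submatch2]} {
        puts "matchstr  : $matchstr"
        puts "submatch1 : $submatch1"
        puts "submatch2 : $submatch2"
        return $submatch2
    }
    puts "no match"
    return ""
}

# The second group holds the part to keep.
set digits [show_match {^([a-zA-Z]+)([0-9]{6}-[0-9])$} "Hallo123456-3"]
```

Here digits ends up holding the six digits, the hyphen and the final digit, which is the part the original question wants to keep.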
My rather poor solution would be:
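tr -d '[:alpha:]' < filewithstrings > filewithstrings.new
The output goes to a separate file (the .new name is arbitrary) because redirecting straight back to filewithstrings would truncate it before tr could read it. Note that this deletes every letter, not only the leading ones. You can probably experiment and find a better solution though. Edit: Didn't realize Tcl was a language, the above is in bash, sorry :(.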
Thank you to both acvoight and Wim Sturkenboom. Those examples are both useful. Even if not Tcl, still good to know.
I had an epiphany this morning that should work:
Code:
regsub {^[a-zA-Z]+} $strwithletters "" noletters
I'd still prefer to do this the other way: search for the pattern that I want to keep, and then store just that pattern into a variable. Right now I am searching for the pattern belonging to the part I want to remove. Wim Sturkenboom's solution of subdividing the entire regular expression (both the letters part and the digits part) into subpatterns is probably the way to go...
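For the "keep what I want" approach described above, one possible sketch uses regexp with a single capture group around the digits part. The sample value is invented, and the else branch is just one assumed way of handling a string that does not fit the format.

```tcl
# Hypothetical sample value; variable names follow the regsub line above.
set strwithletters "AbCde123456-7"

# Match the whole string but capture only the part to keep.
# "->" is simply a throwaway variable name for the full match.
if {[regexp {^[a-zA-Z]+([0-9]{6}-[0-9])$} $strwithletters -> noletters]} {
    puts $noletters   ;# prints 123456-7
} else {
    puts "unexpected format: $strwithletters"
}
```

Because the pattern is anchored at both ends, a string that does not match the full format is rejected instead of being partly trimmed, which the regsub version would not detect.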