Thanks to both of you!
Quote:
Code:
tabCount = len( reMatches.group(1).split('\t') ) |
This is really turning into an embarrassment of riches--now there are 3 different scripts, one of them even having already gone through a revision!
Thanks for providing a python alternative, Dark_Helmet. I get an error when I try to run this one, though: Code:
[me@mymachine ~]$ ./outl2latex.py sample.outl > sample.tex
James
PS In my outlines I cannot anticipate any scenario where tab spaces would occur anywhere other than at the left margin. |
Odd.
The Python interpreter on my machine is 2.6.6. Though, I don't recall seeing that rstrip() was deprecated or any other change to strings that would account for that syntax error. I'll go check the docs.
EDIT: According to Online Python Docs that syntax should be fine--as long as inputLine is still treated as a string. I'll double-check whether they changed how a file iterator is handled.
EDIT2: I don't see any changes to file I/O that would explain it either. For jollies, I may download the 3.2.2 source and see what I can find out. In the meantime, the only other thing I can suggest is to check for typos if you manually typed in the script. |
I should be a betting man. Because I could make a lot of money betting that whenever I think I know what the problem is, it's something else.
"Yes, Mr. Bookie, I think strings and file I/O are the problem. So please put this $50 on anything but those two."
The syntax error occurs because print changed from a statement to a function in Python 3. I'll put up a corrected version of the script in a moment.
EDIT: The new script, modified for 3.2.2's print() function. This ran on my system with a compiled 3.2.2 interpreter. Code:
#!/usr/bin/python |
Thanks for that modified version, Dark_Helmet. Sorry to put you to that extra work: I discovered in the meantime that I do actually have an older version of python on this machine (2.7) and that, when invoked with the path to that version, the script ran fine. Still, I'm wondering whether the updated script will be backward-compatible, i.e., whether it'll run using older versions of python? Or does each version have to be used only with particular versions of python?
James
Later edit: Your new version works fine here with my python 3.2.2 as well. It does not work when invoked with the path to python 2.7. |
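On the backward-compatibility question: one way to sidestep the print change, sketched here rather than taken from either posted script, is to write through the output stream's write() method, which has the same call syntax under Python 2.x and 3.x. The helper name emit_line is hypothetical. (Another option on 2.6 and later is to put "from __future__ import print_function" at the very top of the script so that print() is a function there too.)

```python
import sys


def emit_line(text, stream=None):
    """Write text plus a newline; the syntax is valid in Python 2 and 3."""
    if stream is None:
        stream = sys.stdout
    # write() is an ordinary method call, so the print change does not affect it.
    stream.write(text + "\n")


if __name__ == "__main__":
    emit_line("\\documentclass{article}")
```

A script that sends all of its output through a helper like this should not care which interpreter the shebang line points at, at least as far as printing goes.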
No need to apologize. To be honest, I'm not sure why the Debian maintainers have not pushed Python 3.2 out yet as a replacement for 2.6. Then again, given the problem we just went through, they could be concerned that the upgrade could break lots of existing scripts.
Such is the case when a fundamental script tool (such as print) is changed and the change breaks backward compatibility. But some of that is to be expected in a major version change.
Back to your original problem, I completely forgot to address your question: could you write a bash script to solve your problem? Sure, you can do it. You could do it for the same reason I wrote this in Python: to teach yourself more of the ins-and-outs of the scripting language and its features.
As a rule of thumb: if you can accomplish a task with a sequence of commands at a terminal, you can code that task as a bash shell script. There may be languages that are better suited for specific tasks. I think Nominal Animal touched on this (though I only scanned his replies).
So, it boils down to your goal. Is your goal a functioning script, to build on your scripting experience, or a combination of both? If I were to start writing a bash shell script for this particular task, a very basic pseudocode outline would be: Code:
#!/bin/bash |
Thanks for the input on the bash script and for reworking the python script, Dark_Helmet. I'm still kind of inclined to cut my programming teeth, as it were, on something like bash since it's so relevant to administration of my Linux systems. So I'm glad you've provided some kind of starting point and thus some encouragement to pursue it further.
So far, I'm not sure anyone has looked very carefully at the bash script that I found that converts text to html, the one I thought might be adaptable to the task of adding LaTeX mark-up to my outline files (I provided a link above). So why don't I go ahead and paste it here for reference. Code:
#!/bin/bash
Any thoughts on the possibility of adapting this script, anyone?
James |
Quote:
Quote:
Here's something I've begun to wonder about in relation to the title/header thing. I'm still kind of attached to this pseudo-bullet (equals sign) idea and have even included it now in the color highlighting scheme I developed for my outline files under nano. So, what about the following scenario: if the first line of the file starts at the left margin and does not begin with an equals sign, nano won't color it. That's a good way, while looking at the file under nano, of helping to distinguish that first line as the title; but it could act, additionally, as an indicator to some conversion script that the text contained in that line needs to go into the header, i.e., between the curly brackets at \Large{***Header*title*here***}. Does that make sense? Quote:
Thanks, James |
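The title rule proposed above (the first line is the title if it starts at the left margin and does not begin with an equals sign) could be sketched in Python roughly as follows. This is only an illustration of the idea, not code from any of the posted scripts; the function names split_title and title_markup are hypothetical, and the title text is not escaped for LaTeX special characters.

```python
def split_title(lines):
    """Return (title, remaining_lines).

    title is None when the first line is indented, empty, or starts with '='.
    """
    if lines:
        first = lines[0].rstrip("\n")
        if first and not first[0].isspace() and not first.startswith("="):
            return first, lines[1:]
    return None, lines


def title_markup(title):
    # \Large{...} is the header wrapper mentioned in the post above.
    return "\\Large{" + title + "}"


if __name__ == "__main__":
    title, body = split_title(["My Outline\n", "= first point\n"])
    if title is not None:
        print(title_markup(title))
```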
Quote:
Creating a separate variable for each line of the header? Yikes. The "cat << EOF ... EOF" is, in my mind, far and away a much more suitable approach. Creating a function that issues an echo for each header line? Yikes. See previous point re: cat structure. Defining the functions themselves? I don't see the cost-benefit. The author took the time to write those functions. Why? Because the author had to believe that writing: Code:
write_headers Code:
cat << EOF
All that said, I'm not trying to bash the script. I just think the author went out of his way with good intentions, but they were unnecessary and backfired. That, or the author used the script as a learning tool and incorporated some shell features as an academic exercise--as opposed to a functional exercise.
Now, if you convert those variables and functions to the "cat << EOF" format, the html script starts to look a lot like the perl, python, and incomplete bash scripts from earlier. Quote:
1. grep is used to decide whether you need to modify a line from your outline
2. sed extracts and/or modifies the contents of the line
For the bash script above, you could replace the two sed commands with cut. For instance: Code:
literalTabs=$( echo "${inputLine}" | cut -f 1 -d '=' ) |
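For comparison, the same "take everything before the first equals sign" step can be sketched in Python with str.partition. This assumes, as stated earlier in the thread, that tabs only ever appear at the left margin; the function name split_outline_line is hypothetical and is not part of the posted Python script.

```python
def split_outline_line(input_line):
    """Split '\\t\\t=Some text' into (tab_count, text).

    Returns None when the line has no '=' or when something other than
    whitespace precedes the first '=' (for example, ordinary wrapped text).
    """
    prefix, sep, text = input_line.rstrip("\n").partition("=")
    if not sep or prefix.strip():
        return None
    return prefix.count("\t"), text


if __name__ == "__main__":
    print(split_outline_line("\t\t=Second level\n"))
```

Unlike cut, which prints the whole line when the delimiter is missing, this version reports such lines as None so the caller can pass them through unchanged.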
Quote:
Consider the following case: Quote:
I hope what I'm saying is clear and that I have, in fact, correctly identified the shortcoming of your script. Now, if the script were looking for the spacing plus equals sign, I think it would be catching all instances where the \outl{#} tag needs to be inserted. Unless there's some other way for it to detect something akin to carriage returns and furthermore, if the text editor's line-wrapping feature does not introduce something like the carriage return. Sorry I was unable, by looking at your code, to identify this. I managed to figure it out by reading over and thinking about the words you used to express your concept. I relate best to that sort of code, it seems :) James |
Quote:
Quote:
All are good solutions, it is just a matter of which one you are most comfortable with. EDIT 2: The script below uses a = before text to start a new paragraph. Spaces prior to the = are considered the indent that defines the outline level; any whitespace after the = is not considered (but is kept in the output). Quote:
Code:
% Template: template.tex Code:
% Defaults, if not specified in the text input file Code:
#!/usr/bin/awk -f Code:
./script.awk input.txt Code:
\documentclass{article}
The awk script above now allows defaults to be set in the template file itself. That way you only need to use the initial lines like % Title: if you want to override a string in the template. In the example cases above, you can add a % Date: string line to your input text file, if you want to specify a certain date. By default, the template file uses \today as you can see in the start of the new template file, and the output. |
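The "defaults in the template, overrides in the input file" idea can be illustrated with a short Python sketch. It is not a translation of the awk script; it only assumes that settings are leading comment lines of the form "% Key: value", as in the % Title: and % Date: examples, and the function names are hypothetical.

```python
import re

# Matches header lines such as "% Title: My Outline" or "% Date: \today".
SETTING = re.compile(r"^%\s*(\w+)\s*:\s*(.*)$")


def read_settings(lines):
    """Collect '% Key: value' pairs from the leading '%' lines of a file."""
    settings = {}
    for line in lines:
        if not line.startswith("%"):
            break  # header block ends at the first non-comment line
        match = SETTING.match(line.rstrip("\n"))
        if match:  # other comment lines are simply skipped
            settings[match.group(1)] = match.group(2)
    return settings


def effective_settings(template_lines, input_lines):
    """Template values act as defaults; the input file overrides them."""
    merged = read_settings(template_lines)
    merged.update(read_settings(input_lines))
    return merged
```

With a template that sets Date to \today and an input file that only sets Title, the merged result keeps the template's date and takes the title from the input file.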
I think I may have not properly conveyed where the equals sign gets placed in my outlines: it actually precedes immediately the text in each level. Look, for example, at the bullets in the sample outline below:
In case this might help, the color highlighting stipulations I've entered into .nanorc--which also show some regular expressions that help nano to decide where to apply highlighting--are as follows: Code:
syntax "outl" "\.outl$"
James |
I edited my post above. Does it match your use case better this way?
Note that I assume that all whitespace after the = is part of the text, not considered "indent" when selecting the outline level. I think, but am not sure, that this matches your use patterns. |
As for me, given that there are so many good scripting tools available, any of which can be used with a simple #!shebang such that the user doesn't have to know or care, I don't choose to use bash (or ksh) for scripting purposes.
The compelling advantage of using "the big guns," in addition to the simple fact that they are intended for the purpose and are therefore often a good bit easier to use (IMHO) than bash, is that you can grab complete and well-developed "packages" to use along with your language-of-choice. For instance, if you need to parse an Apache log file, well (for example, if you decide to use Perl), go to http://search.cpan.org, type in the search term "apache log," and there you have a list of (at the present moment) 373 full-featured packages to choose from. Install the package of your choice, use it in your script, and all of the functionality in that package is something that you didn't have to write. That's huge. And all of the languages are similar ... Perl, Python, Ruby, even PHP. The end-user of your command just invokes the command, identically to the way they'd invoke a bash-script, and doesn't have to know or care how it works. Always remember that, no matter what it is that you're trying to do, it has certainly been done before by someone else, and done very well. These complete and platform-independent packages of well tested tools are free for the asking. |
EDIT: I take back what I've said below. It seems somehow the mark-up got removed from the template. And now, back to your regularly scheduled programming (no pun intended) . . .
Thanks for your continued input, Nominal. The edited version does match better my use case. I've noted one minor anomaly though: it's somehow removing the \today mark-up from the header. I've tried it twice now with the same result each time. Since the code is so complex, I cannot determine from looking at it where/why that might be happening. James |