[SOLVED] Problems with getline and Windows generated input
I’m porting a C++ program from Windows to Linux (Debian/GCC). It gets its ‘instructions’ by reading lines from a text control file. The essentials of that part of the code are these. Irrelevant details such as setting string length and file opening are omitted; there is certainly no problem with them.
ifstream ifs ;
string strbuf ;
while (getline (ifs, strbuf))   // loop ends when a read fails
{
    strbuf += ",@" ;   // append guard chars
    // do more stuff
}
For convenience, I used a control file that had been prepared in Notepad++ on Windows, and immediately ran into trouble. The guard characters, rather than being appended, appeared to overwrite the first two characters of the control string, as follows:
Line read: CtrlCommand
Want to get: CtrlCommand,@
What I got: ,@rlCommand
It turned out that the problem was the command file format. Linux appears to terminate lines with 0AH, whereas Windows uses 0DH 0AH. A cout of the lines ahead of the guard char insertion didn't show any irregularities.
In practice, I can’t control where control files come from, so my solution has been to write my own getline routine, parsing the lines in binary, dealing correctly with the line terminators, and casting the asciz string to a std::string. So the problem is solved; I just wonder whether anyone can explain how the file format is causing the append to turn into an overwrite.
Also, this might help someone else who encounters the problem.
How are you moving the code from Windows to Linux?
The end-of-line characters are different (Windows uses CRLF, Linux just LF), and must be converted.
Probably the easiest way to do this is with an FTP transfer in ASCII mode, which does just that: it converts the line ends; that's the definition of an ASCII transfer. (Most SFTP clients and servers copy files byte-for-byte, so don't rely on SFTP for this.)
If you're sneakernetting, run the file(s) through the dos2unix utility:
Code:
dos2unix < filename > newfilename
Just re-read...sorry, you already knew about the line end stuff. Note that if you run dos2unix on a file that doesn't need its line ends changed, nothing will be changed, so you can run everything through it unconditionally. If you always transfer from Windows via FTP in ASCII mode, the line ends will be converted automatically.
The overwriting is caused by the CR. Think of a typewriter (remember those?) being set back to the beginning without advancing the paper...the next line would be typed over the previous one.
A CRLF in a script/program almost always causes the interpreter or compiler to choke.
Thanks for the responses. The dev system is networked off a PC running Ubuntu. I simply inserted a thumb drive with the Windows file on it, then copied and pasted the file into Users/Bruce. I then used FileZilla to move it onto the dev target. The file was transferred with no alterations, so the CRLFs were still present; that was the problem.
Thanks also for the suggestions re file conversion. In my situation I have no control over who uses the application, and no doubt some people will do what I did without being aware of the EOL mismatch. That's why I chose a bombproof approach of converting the file format myself.
Don't doubt that your script does the job. Just feel the need to point out (mostly to others) that you re-invented an existing wheel. dos2unix (and unix2dos) are existing utilities for converting EOL as needed.