LinuxQuestions.org
Old 03-18-2010, 01:52 PM   #1
mattca
Member
 
Registered: Jan 2009
Distribution: Slackware 14.1
Posts: 333

Rep: Reputation: 56
Using Vim to edit files that were created with Dreamweaver - special characters


Not sure if this is the right forum.. probably close enough.

I work as a web developer and am the lone Vim user in a sea of Dreamweaver users. Most of the time this isn't a problem.. the one big exception is special characters. For some ungodly reason, Dreamweaver seems to insert gobbledygook instead of, you know, good old-fashioned HTML. As a result, I can't read the characters, and I often have to change each one individually, or else Vim saves them as the gobbledygook they are, rendering the page.. unrenderable.

Can someone explain what these Dreamweaver characters are? Is it possible to configure Dreamweaver to insert proper HTML instead? I'd like to ask my manager if we can do something to avoid the headaches on my end. We're working on the French version of one of our sites, and right now I'm knee deep in gobbledygook created by one of my coworkers.

Also, anyone know if there's a Vim plugin that will translate the characters to their proper form?

Thanks!
 
Old 03-18-2010, 02:00 PM   #2
devnull10
Member
 
Registered: Jan 2010
Location: Lancashire
Distribution: Slackware Stable
Posts: 572

Rep: Reputation: 120
Any chance of screenshotting an example? It could be various things but I would have thought if it was compliant HTML which was output then it should be readable in any application! Could it be that the files were created in Windows, so they have CRLF instead of LF to mark the end of each line? Try
Code:
man fromdos
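
For readers who want to check the line-ending theory without installing fromdos, here is a minimal Python sketch of the same idea. The function names and the sample bytes are made up for illustration; this is not how fromdos itself is implemented.

```python
def has_crlf(data: bytes) -> bool:
    """Return True if the data contains at least one CRLF pair."""
    return b"\r\n" in data


def crlf_to_lf(data: bytes) -> bytes:
    """Convert DOS/Windows line endings (CRLF) to Unix ones (LF)."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical sample standing in for a file saved on Windows.
sample = b"<p>line one</p>\r\n<p>line two</p>\r\n"

print(has_crlf(sample))
print(crlf_to_lf(sample))
```

Working on bytes rather than decoded text means the check does not depend on knowing the file's character encoding first.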
 
Old 03-18-2010, 02:13 PM   #3
mattca
Member
 
Registered: Jan 2009
Distribution: Slackware 14.1
Posts: 333

Original Poster
Rep: Reputation: 56
Quote:
Originally Posted by devnull10
Any chance of screenshotting an example?
Attached

Quote:
It could be various things but I would have thought if it was compliant HTML which was output then it should be readable in any application!
There is absolutely no way anything we work with is even remotely close to standards compliant. I'm lucky if I don't have to deal with font tags.

Quote:
Try
Code:
man fromdos
I'll check that out, thanks!
Attached Thumbnails: shot.jpg (24.2 KB)
 
Old 03-18-2010, 03:02 PM   #4
penguiniator
Member
 
Registered: Feb 2004
Location: Olympia, WA
Distribution: SolydK
Posts: 442
Blog Entries: 3

Rep: Reputation: 60
It might be enlightening to view your files in a hex editor to see what those bytes are.
 
Old 03-18-2010, 03:18 PM   #5
AlucardZero
Senior Member
 
Registered: May 2006
Location: USA
Distribution: Debian
Posts: 4,824

Rep: Reputation: 615
It's an encoding failure on letters with accents. See if Dreamweaver is set to the wrong encoding or something.
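
To illustrate what an encoding failure on accented letters typically looks like, here is a small Python sketch. It assumes the common case of UTF-8 bytes being read as Latin-1; the thread does not confirm which wrong encoding (if any) was actually involved.

```python
text = "arrête"

# The bytes a UTF-8 aware editor would write to disk.
utf8_bytes = text.encode("utf-8")

# Assumption: a tool that believes the file is Latin-1 turns each byte
# into its own character, so the two-byte "ê" becomes two characters.
misread = utf8_bytes.decode("latin-1")
print(misread)

# The damage is reversible as long as the misread text was not re-saved
# in yet another encoding.
recovered = misread.encode("latin-1").decode("utf-8")
print(recovered)

# Plain ASCII cannot represent the accented letter at all.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)
```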
 
Old 03-18-2010, 03:18 PM   #6
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
No expert here, but I'm guessing some kind of encoding incompatibility. Possibly Dreamweaver is using Unicode combining characters instead of the more common precomposed versions, and Vim can't render them properly. Or vice-versa. Or something like that. I don't see any of the French diacriticals in your screenshot, other than in the corrupted areas, so I'd say it's likely that the problem is somewhere in that area.

Edit: Also, double-check what encoding the files have been saved as (the file command may tell you). It's possible there's a subtle difference there and you may have to convert the files into standard UTF-8 for Vim to handle them properly.

Last edited by David the H.; 03-18-2010 at 03:24 PM. Reason: addendum
 
Old 03-18-2010, 03:27 PM   #7
smoker
Senior Member
 
Registered: Oct 2004
Distribution: Fedora Core 4, 12, 13, 14, 15, 17
Posts: 2,279

Rep: Reputation: 250
This might shed some light:
http://www.adobe.com/support/documen...g_errata2.html
 
Old 03-18-2010, 05:30 PM   #8
mattca
Member
 
Registered: Jan 2009
Distribution: Slackware 14.1
Posts: 333

Original Poster
Rep: Reputation: 56
Thanks for all the suggestions. I tried changing my $LANG to utf8, and setting encoding and fileencoding to utf8 in Vim, and that didn't help. I also ran the file through the file command, and it seems the encoding is us-ascii:

Code:
[matt@hopper] $ file -i index.php
index.php: text/x-php; charset=us-ascii

It might be worthwhile to mention that accented characters copied from Open Office, a text file, or Firefox, are displayed properly by Vim. It seems to only choke on characters inserted by Dreamweaver.

In any case, I have Dreamweaver on the Windows partition on my work computer.. maybe I'll take it home for the weekend and see if I can figure out what's going on.
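
As a cross-check on the `file -i` result above, a few lines of Python can look at every byte of a file instead of relying on a heuristic. This is only a sketch: `classify_encoding` is a made-up helper, its labels are illustrative, and it is not the algorithm `file` uses.

```python
def classify_encoding(data: bytes) -> str:
    """Roughly classify bytes as ASCII, valid UTF-8, or something else."""
    try:
        data.decode("ascii")
        return "us-ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "not-utf-8"


# Hypothetical inputs; for a real file, pass open(path, "rb").read().
print(classify_encoding(b"<p>hello</p>"))
print(classify_encoding("<p>êtes</p>".encode("utf-8")))
print(classify_encoding(b"<p>\xeates</p>"))  # a lone Latin-1 byte
```

Every pure-ASCII file is also valid UTF-8, so a "us-ascii" answer only says that no byte above 0x7F was found in the data examined.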
 
Old 03-18-2010, 05:53 PM   #9
mattca
Member
 
Registered: Jan 2009
Distribution: Slackware 14.1
Posts: 333

Original Poster
Rep: Reputation: 56
Alright, I checked things out in hex mode.. I'm not sure what I'm looking for, so hopefully someone will be able to make sense of this.

Here is a line with mangled characters:
Code:
<p align="justify">La famille ne s’arrête pas à un groupe de personne qui prennent soin les uns des autres. En tant que parents et grands-parents, vous êtes aussi des mentors, des éducateurs pour vos enfants, que ce soit par les questions que vous posez, les exemples que vous donnez ou les actions que vous posez. Et vous avez une plus grande famille que vous ne le croyez. Une famille globale. Et tous ses membres ont e potentiel de vous apprendre quelque chose. </p></font><br />
and here are the associated lines in hex mode:
Code:
0000c50: 2f68 323e 0a20 2020 203c 7020 616c 6967  /h2>.    <p alig
0000c60: 6e3d 226a 7573 7469 6679 223e 4c61 2066  n="justify">La f
0000c70: 616d 696c 6c65 206e 6520 73e2 8099 6172  amille ne s...ar
0000c80: 72c3 aa74 6520 7061 7320 c3a0 2075 6e20  r..te pas .. un 
0000c90: 6772 6f75 7065 2064 6520 7065 7273 6f6e  groupe de person
0000ca0: 6e65 2071 7569 2070 7265 6e6e 656e 7420  ne qui prennent 
0000cb0: 736f 696e 206c 6573 2075 6e73 2064 6573  soin les uns des
0000cc0: 2061 7574 7265 732e 2045 6e20 7461 6e74   autres. En tant
0000cd0: 2071 7565 2070 6172 656e 7473 2065 7420   que parents et 
0000ce0: 6772 616e 6473 2d70 6172 656e 7473 2c20  grands-parents, 
0000cf0: 766f 7573 20c3 aa74 6573 2061 7573 7369  vous ..tes aussi
0000d00: 2064 6573 206d 656e 746f 7273 2c20 6465   des mentors, de
0000d10: 7320 c3a9 6475 6361 7465 7572 7320 706f  s ..ducateurs po
0000d20: 7572 2076 6f73 2065 6e66 616e 7473 2c20  ur vos enfants, 
0000d30: 7175 6520 6365 2073 6f69 7420 7061 7220  que ce soit par 
0000d40: 6c65 7320 7175 6573 7469 6f6e 7320 7175  les questions qu
0000d50: 6520 766f 7573 2070 6f73 657a 2c20 6c65  e vous posez, le
0000d60: 7320 6578 656d 706c 6573 2071 7565 2076  s exemples que v
0000d70: 6f75 7320 646f 6e6e 657a 206f 7520 6c65  ous donnez ou le
0000d80: 7320 6163 7469 6f6e 7320 7175 6520 766f  s actions que vo
0000d90: 7573 2070 6f73 657a 2e20 4574 2076 6f75  us posez. Et vou
0000da0: 7320 6176 657a 2075 6e65 2070 6c75 7320  s avez une plus 
0000db0: 6772 616e 6465 2066 616d 696c 6c65 2071  grande famille q
0000dc0: 7565 2076 6f75 7320 6e65 206c 6520 6372  ue vous ne le cr
0000dd0: 6f79 657a 2e20 556e 6520 6661 6d69 6c6c  oyez. Une famill
0000de0: 6520 676c 6f62 616c 652e 2045 7420 746f  e globale. Et to
0000df0: 7573 2073 6573 206d 656d 6272 6573 206f  us ses membres o
0000e00: 6e74 2065 2070 6f74 656e 7469 656c 2064  nt e potentiel d
0000e10: 6520 766f 7573 2061 7070 7265 6e64 7265  e vous apprendre
0000e20: 2071 7565 6c71 7565 2063 686f 7365 2e20   quelque chose. 
0000e30: 3c2f 703e 3c2f 666f 6e74 3e3c 6272 202f  </p></font><br /
Firefox seems to display the characters properly, so if another screenshot is needed let me know. Oh, and if it's relevant, I'm using urxvt.. and get the same behaviour in gvim and console vim.
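
For anyone trying to make sense of the dump, the non-ASCII byte runs can be decoded by hand. The Python sketch below copies four fragments from the hex lines above and decodes them as UTF-8; that they decode cleanly suggests, but does not prove, that the whole file is UTF-8.

```python
import unicodedata

# Byte sequences copied from the hex dump above (offsets 0xc7a-0xd1b).
dump_fragments = [
    "73 e2 80 99 61 72 72 c3 aa 74 65",
    "70 61 73 20 c3 a0 20 75 6e",
    "c3 aa 74 65 73",
    "c3 a9 64 75 63 61 74 65 75 72 73",
]

decoded = [bytes.fromhex(h).decode("utf-8") for h in dump_fragments]

for text in decoded:
    print(text)
    # Name each non-ASCII character so it is clear what was stored.
    for ch in text:
        if ord(ch) > 127:
            print(f"  U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The three-byte sequence e2 80 99 is a typographic apostrophe, and each c3 xx pair is a single accented letter.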
 
Old 03-18-2010, 10:40 PM   #10
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Quote:
Originally Posted by mattca
I also ran the file through the file command, and it seems the encoding is us-ascii:
That could be part of the problem. If the accented characters were working correctly, then file should be reporting the encoding as UTF-8. Perhaps you could try running the file through iconv or recode to change the encoding. But then again, ASCII is a subset of UTF-8, and file will only report UTF-8 if it detects non-ASCII characters in the file. So it's probably just showing up as ASCII because the characters weren't rendered properly to begin with.

Also, did you follow the link that smoker provided? It explains how to change Dreamweaver's Unicode normalization settings, which I'm fairly convinced is where your problem lies.
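
To make the normalization point concrete, here is a short Python sketch of the difference between a precomposed letter and its combining-character form. The bytes c3 aa are taken from the hex dump in post #9; everything else is illustrative. On that evidence the dumped text looks precomposed, though other files may differ.

```python
import unicodedata

# "ê" exactly as it appears in the hex dump: one precomposed code point.
precomposed = bytes.fromhex("c3 aa").decode("utf-8")

# The same letter as a base "e" plus a combining circumflex (NFD form).
decomposed = unicodedata.normalize("NFD", precomposed)

print(len(precomposed), precomposed.encode("utf-8").hex())
print(len(decomposed), decomposed.encode("utf-8").hex())

# Normalizing back to NFC restores the single precomposed code point.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == precomposed)
```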
 
Old 04-08-2010, 09:38 PM   #11
mattca
Member
 
Registered: Jan 2009
Distribution: Slackware 14.1
Posts: 333

Original Poster
Rep: Reputation: 56
Turns out the files were Unicode, and Vim was saving them as ASCII. I figured out how to get it to save them as Unicode.. but I couldn't get it to display the Unicode characters properly.
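
For scripted edits outside Vim, the same "read as Unicode, save as Unicode" idea can be sketched in Python by naming the encoding explicitly on both read and write. This assumes the files are UTF-8, as the hex dump in post #9 suggests; `replace_in_utf8_file` and the sample text are made up for illustration.

```python
import os
import tempfile


def replace_in_utf8_file(path: str, old: str, new: str) -> None:
    """Replace text in a file, reading and writing it explicitly as UTF-8."""
    # newline="" keeps the file's existing line endings untouched.
    with open(path, "r", encoding="utf-8", newline="") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text.replace(old, new))


# Demo on a throwaway file built from fragments of the line in post #9.
original = "vous êtes aussi des mentors. Et tous ses membres ont e potentiel"
fd, demo_path = tempfile.mkstemp(suffix=".php")
with os.fdopen(fd, "wb") as f:
    f.write(original.encode("utf-8"))

replace_in_utf8_file(demo_path, "ont e potentiel", "ont le potentiel")

with open(demo_path, "rb") as f:
    result_bytes = f.read()
os.remove(demo_path)

print(result_bytes.decode("utf-8"))
```

The accented letter comes out with the same two bytes it went in with, because nothing in the round trip falls back to a platform default encoding.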
 
  

