![]() |
Need to scrub UNICODE NULLS from MySQL tables
Ok, I have searched the google and these forums and haven't found any guidance for someone in my predicament.
We have content that got sucked into our MySQL 5.5 database (UTF8) that came from flat files that were formatted in UNICODE. The impact is that our other software that is supposed to parse the content and do stuff like publish it in PDFs really barfs when it hits these UNICODE NULL characters. I believe when you do a query in mysql, the values appear as so: g o r g o n z o l a Can anyone recommend a select query that would locate the values in varchar columns? Ideally, I'd like some kind of stored procedure that would scrub the UNICODE nulls. All suggestions appreciated, di11rod |
Oh, been there, did that (and it hurt, too!).
For some years I was receiving files from clients that were supposed to be text only, no special characters, no junk. But, as we all know, Microsoft users don't know a slash from a back slant and... I developed a front-end filtering utility that all received files passed through to eliminate every bad character users could come up with, \, ^, tab, back space (yes, back space) and who knows what all. Kept a lot of crap out of our data bases. I believe that doing this in SQL is going to be extremely difficult, even if you use the C-API (it's probably doable but, wow, what a pain). I suspect that the easy way is to unload your affected tables to text files and use one of the editors, sed for example, to clean up the mess. You could write a little C filter program that deletes non-ASCII characters or perhaps make use of this sample program (a subset of the "big" front end filter): Code:
#include <stdio.h>Code:
cc -o filter filter.cCode:
usage: filter [-r remove] [-s sub] [-w warn] infileCode:
filter -r infile > outfileHope this helps some. |
| All times are GMT -5. The time now is 10:54 PM. |