LinuxQuestions.org
02-24-2017, 01:11 AM   #1
Xeratul
Senior Member
 
Registered: Jun 2006
Location: UNIX
Distribution: FreeBSD
Posts: 2,656

How to convert a DOCX document to UTF-8 text with German characters (üöä)?


Hello,

Do you know how to use docx2txt on the console with the right locale?
How can I convert a DOCX document to UTF-8 text so that German characters (üöä) survive?

These are the locales available on my system:

Code:
 locale -a
C
C.UTF-8
POSIX
Regards
 
02-24-2017, 08:19 PM   #2
John VV
LQ Muse
 
Registered: Aug 2005
Location: A2 area Mi.
Posts: 17,623

Open the Microsoft Office *.docx file in LibreOffice and re-save it as a .txt file.
The formatting will be lost, but you want a plain text file with no formatting anyway.
 
02-25-2017, 01:50 AM   #3
Xeratul
Senior Member
 
Registered: Jun 2006
Location: UNIX
Distribution: FreeBSD
Posts: 2,656

Original Poster
Quote:
Originally Posted by John VV View Post
Open the Microsoft Office *.docx file in LibreOffice and re-save it as a .txt file.
The formatting will be lost, but you want a plain text file with no formatting anyway.

Unfortunately I have to do it on the console, since LibreOffice is not installed.
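
For a console-only route without LibreOffice, one possible fallback is to read the text straight out of the archive: a .docx file is a ZIP container whose main body is stored as word/document.xml, normally encoded in UTF-8. The sketch below is only an illustration, not how docx2txt works internally. The function name docx_xml_to_text and the file name foreign.docx are made up, and it assumes an unzip command is installed. It treats every paragraph end (</w:p>) as a line break, drops all other tags, and ignores tabs, tables, footnotes and numeric character references.

```shell
# Hypothetical helper (not part of docx2txt): reads WordprocessingML
# (word/document.xml) on stdin and writes plain text to stdout.
docx_xml_to_text() {
    # 1. end of paragraph </w:p> -> line break (backslash-newline is the portable form)
    # 2. strip every remaining XML tag
    # 3. decode the three basic entities, &amp; last so "&amp;lt;" stays "&lt;"
    sed -e 's|</w:p>|\
|g' \
        -e 's|<[^>]*>||g' \
        -e 's|&lt;|<|g' -e 's|&gt;|>|g' -e 's|&amp;|\&|g'
}

# Usage, assuming unzip is installed and foreign.docx is the input file:
#   unzip -p foreign.docx word/document.xml | docx_xml_to_text > foreign.txt
```

Because the bytes are passed through unchanged, ü, ö and ä should come out as the same UTF-8 the document contains, regardless of which locales are installed; whether they display correctly still depends on the terminal.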
 
02-25-2017, 07:11 AM   #4
John VV
LQ Muse
 
Registered: Aug 2005
Location: A2 area Mi.
Posts: 17,623

Use Google Docs.

It is online and can convert the file.
 
02-25-2017, 01:52 PM   #5
ondoho
LQ Addict
 
Registered: Dec 2013
Posts: 19,872
Blog Entries: 12

Quote:
Originally Posted by Xeratul View Post
Do you know how to use docx2txt on the console with the right locale?
How can I convert a DOCX document to UTF-8 text so that German characters (üöä) survive?

These are the locales available on my system:

Code:
 locale -a
C
C.UTF-8
POSIX
Regards

I just ran a quick test with 'docx2txt.pl foreign.docx', and äö look fine.
However, this is what my system has:
Code:
$> locale -a
C
POSIX
en_IE.utf8
en_US.utf8
fi_FI.utf8
It seems you are missing something in the locale department: your list has no language-specific UTF-8 locale (such as en_US.utf8 on mine).
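
That said, the C.UTF-8 entry in the original poster's list is itself a UTF-8 locale, so selecting it for the conversion may already be enough. A small sketch that picks the first UTF-8 locale out of `locale -a` output follows. The helper name first_utf8_locale and the file name foreign.docx are made up for illustration, and whether docx2txt's output actually depends on the locale variables is an assumption here, not something confirmed in this thread.

```shell
# Hypothetical helper: reads `locale -a`-style output on stdin and prints
# the first entry whose name ends in UTF-8 / utf8 (case-insensitive).
first_utf8_locale() {
    grep -i -E 'utf-?8$' | head -n 1
}

# Usage (assumption: docx2txt.pl writes to stdout when the output name is "-"):
#   LC_ALL="$(locale -a | first_utf8_locale)" docx2txt.pl foreign.docx -
```

If the helper prints nothing, no UTF-8 locale is installed and one would have to be generated or installed first; how to do that differs between systems.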
 
  


All times are GMT -5.
