04-20-2009, 06:24 AM
|
#1
|
Senior Member
Registered: Jun 2007
Location: E.U., Mountains :-)
Distribution: Debian, Etch, the greatest
Posts: 2,561
Rep:
|
BASH/No X: Using google translate to convert TXT files (translate)
Any ideas on how we could use Google Translate without X?
e.g.:
Quote:
googletranslator en de file1.rtf output.rtf
|
(Because of the special characters, we should use doc or rtf.)
Hint:
The location can perhaps be built from "http://google.com/translate?langpair=en%7Ces&u="
Best regards
Last edited by frenchn00b; 04-20-2009 at 06:25 AM.
|
|
|
04-20-2009, 07:27 AM
|
#2
|
Member
Registered: Jul 2004
Location: Chennai, India
Posts: 952
|
I had a look at the site you mentioned. I think you are out of luck. Google doesn't yet provide general translation services (covering file formats and embedded control characters).
It works only for web pages.
End
|
|
|
05-04-2009, 12:18 AM
|
#3
|
Senior Member
Registered: Jun 2007
Location: E.U., Mountains :-)
Distribution: Debian, Etch, the greatest
Posts: 2,561
Original Poster
Rep:
|
Quote:
Originally Posted by AnanthaP
I had a look at the site you mentioned. I think you are out of luck. Google doesn't yet provide general translation services (covering file formats and embedded control characters).
It works only for web pages.
End
|
It is almost certainly possible. What Google offers is meant for users, so we should be able to use it.
Actually, what I would like is to:
feed a TXT file in from my console and have it translated via Google.
Why not? Is no one interested because nothing similar exists for Linux yet?
|
|
|
05-04-2009, 12:22 AM
|
#4
|
Member
Registered: Jan 2006
Distribution: debian-lenny
Posts: 37
Rep:
|
Hmm... off the top of my head, I'd say you could write the text into an HTML file, put it in your /var/www (if you have Apache installed), link to http://(your ip)/(the html file), and then remove the file once you have your result.
Of course, I don't know how to put text into a text field from bash.
Ah, OK, so you can just use the address:
http://translate.google.com/translat...ate0=es|en|foo
So you only need to replace foo and bar in that, as well as es and en.
foo is the domain, bar is any subdirectories separated by %2F (the URL-encoded '/'), es is the source language (Spanish, the default option) and en is the destination language (English, the default option).
So I guess your script only needs to write an HTML file into /var/www (or wherever your webroot is), link to it in the format above (with the appropriate URL encoding, replacing the relevant parts of the link with the address of your HTML file), and then promptly clean up the HTML file. At this point you can either just use the link to the translated page, or go a step further and extract the translated text from the resulting HTML page (which is really just the reverse of the process used to make the page, though the page might be more complicated after translation).
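The URL-building step described here can be sketched in a few lines of Python. This is only an illustration: the base address and the `langpair` and `u` parameters are taken from the hint in the first post, the helper name `build_translate_url` and the example IP address are made up, and whether the service still answers at that address is an assumption.

```python
from urllib.parse import urlencode

# Base address taken from the hint in the first post; whether the service
# still responds there is an assumption.
TRANSLATE_BASE = "http://google.com/translate"


def build_translate_url(page_url, source_lang, target_lang):
    """Return a translate URL for a publicly reachable page (hypothetical helper)."""
    query = urlencode({
        "langpair": f"{source_lang}|{target_lang}",  # '|' becomes %7C
        "u": page_url,                               # ':' becomes %3A, '/' becomes %2F
    })
    return f"{TRANSLATE_BASE}?{query}"


if __name__ == "__main__":
    # 203.0.113.5 is a documentation-only address standing in for "your ip"
    print(build_translate_url("http://203.0.113.5/tmp.html", "es", "en"))
```

Letting `urlencode` do the escaping avoids hand-writing the %7C and %2F sequences, which is where a shell one-liner tends to go wrong.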
Last edited by the_ultimate_samurai; 05-04-2009 at 12:36 AM.
|
|
|
05-04-2009, 12:33 AM
|
#5
|
LQ Newbie
Registered: May 2007
Posts: 5
Rep:
|
An alternative to setting up a web server is to look at the HTML source of the Google page and examine the form (if you need to understand how forms work, google "HTML Forms tutorial"). It's a form whose method is POST. The textarea 'text' is the field you want to fill with the input you wish to translate.
If you've not come across wget, you should read about it; it's very useful for automating web-page downloads. It's also capable of simulating form submission (for forms using either GET or POST).
After examining the Google translation page, you can use wget to request the page (try "man wget" at your prompt to check the syntax).
Finally, this can all be set up and automated in a script. I recommend your script be invoked as:
./GoogleTranslate <filename> <from_language> <to_language>
NB: be sure to pass all the *hidden* fields from the form in the POST request too; leaving them out may not give you good results.
Hope that helps.
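To make the point about hidden fields concrete, here is a rough Python sketch that collects every hidden input from a form and merges it with the text to translate. The field name `text` comes from this post; the names `sl` and `tl` and the sample form are assumptions for illustration, and the real page may use different fields.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldCollector(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            name = attrs.get("name")
            if name:
                self.hidden[name] = attrs.get("value") or ""


def build_post_body(form_html, text, source_lang, target_lang):
    """Return a urlencoded POST body: hidden fields first, then our own values."""
    collector = HiddenFieldCollector()
    collector.feed(form_html)
    fields = dict(collector.hidden)
    # "sl" and "tl" are assumed names for the source/target language fields
    fields.update({"text": text, "sl": source_lang, "tl": target_lang})
    return urlencode(fields)


# Made-up form, standing in for the HTML source of the real page
SAMPLE_FORM = (
    '<form action="/translate_t" method="post">'
    '<input type="hidden" name="hl" value="en">'
    '<input type="hidden" name="ie" value="UTF-8">'
    '<textarea name="text"></textarea></form>'
)

if __name__ == "__main__":
    print(build_post_body(SAMPLE_FORM, "good morning", "en", "de"))
```

The resulting string is the kind of value you would hand to wget's `--post-data` option.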
|
|
|
05-04-2009, 12:43 AM
|
#6
|
Member
Registered: Jan 2006
Distribution: debian-lenny
Posts: 37
Rep:
|
Yeah, mine was just the first thing that came to mind... I'm sure there are many better ways.
|
|
|
05-04-2009, 02:07 PM
|
#7
|
Senior Member
Registered: Jun 2007
Location: E.U., Mountains :-)
Distribution: Debian, Etch, the greatest
Posts: 2,561
Original Poster
Rep:
|
We can use elinks, or tell wget to identify itself as something like Mozilla 4.0...
(Sorry for the lack of info, I'm in a rush.)
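A small sketch of the "identify as Mozilla" idea, written in Python so it can be checked without touching the network. The user-agent string and the helper name `make_request` are assumptions; the thread does not say which value the site expects.

```python
from urllib.request import Request

# Assumed browser-like identification string (not taken from the thread)
USER_AGENT = "Mozilla/4.0 (compatible)"


def make_request(url, post_body=None):
    """Build (but do not send) a request carrying a browser-like User-Agent."""
    data = post_body.encode("ascii") if post_body is not None else None
    # urllib switches the method from GET to POST when data is present
    return Request(url, data=data, headers={"User-Agent": USER_AGENT})


if __name__ == "__main__":
    req = make_request("http://translate.google.com/translate_t", "text=cat")
    print(req.get_method(), req.get_header("User-agent"))
```

With wget itself, the same thing is done with the `--user-agent` option.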
|
|
|
05-05-2009, 05:21 AM
|
#8
|
Amigo developer
Registered: Dec 2003
Location: Germany
Distribution: Slackware
Posts: 4,928
|
I started to say that curl might be the best way to handle both GET and POST requests, but then I remembered you can do this with pure bash as well.
Have a look at bashbrowser:
http://www.pebble.org.uk/linux/bashbrowser
and lastbash:
http://freshmeat.net/projects/lastbash
Here's my adaptation of wget using pure bash:
Code:
#!/bin/bash
# Copyright 2008 GilbertAshley <amigo@ibiblio.org>
# BashTrix wget is a minimal implementation of wget
# written in pure BASH, with only a few options.
# The original idea and basic code for this are Copyright 2006 Ainsley Pereira.
# The idea for verify_url is from code which is Copyright 2007 Piete Sartain
# But the above code fragments both still used 'cat'.
# Copyright 2008 Noam Postavsky worked out how to
# get rid of 'cat' and provided other improvements
VERSION=0.2
# Minimum number of arguments needed by this program
MINARGS=1
show_usage() {
    # ${0##*/} strips the directory part, leaving only the program name
    echo "Usage: ${0##*/} [OPTIONS] URL"
    echo "${0##*/} [-hiOqV] URL"
    echo ""
    echo " -i FILE --input-file=FILE read URLs from FILE"
    echo " -O FILE --output-document=FILE write output to FILE"
    echo " -q --quiet Turn off wget's output"
    echo " -h --help Show this help page"
    echo " -V --version Show BashTrix wget version"
    echo
    exit
}
show_version() {
    echo "BashTrix: wget $VERSION"
    echo "BashTrix wget is a minimal implementation of wget"
    echo "written in pure BASH, with only a few options."
    exit
}
# show usage if '-h' or '--help' is the first argument or no argument is given
case $1 in
    ""|"-h"|"--help") show_usage ;;
    "-V"|"--version") show_version ;;
esac
# get the number of command-line arguments given
ARGC=${#}
# check to make sure enough arguments were given or exit
if [[ $ARGC -lt $MINARGS ]] ; then
    echo "Too few arguments given (Minimum:$MINARGS)"
    echo
    show_usage
fi
# process command-line arguments (options must come before the URL)
for WORD in "$@" ; do
    case $WORD in
        -*)
            case $WORD in
                --debug) [[ $DEBUG ]] && echo "Long Option"
                    DEBUG=1
                    shift ;;
                --input-file=*) [[ $DEBUG ]] && echo "Long FIELD Option using '='"
                    # skip the 13 characters of '--input-file='
                    INPUT_FILE=${WORD:13}
                    shift ;;
                -i) [[ $DEBUG ]] && echo "Short split FIELD Option"
                    if [[ ${2:0:1} != "-" ]] ; then
                        INPUT_FILE=$2
                        shift 2
                    else
                        echo "Missing argument"
                        show_usage
                    fi ;;
                -i*) [[ $DEBUG ]] && echo "Short FIELD Option range -Bad syntax"
                    echo "Bad syntax. Did you mean this?:"
                    echo "-i ${WORD:2}"
                    show_usage
                    shift ;;
                --output-document=*) [[ $DEBUG ]] && echo "Long FIELD Option using '='"
                    # skip the 18 characters of '--output-document='
                    DEST=${WORD:18}
                    shift ;;
                -O) [[ $DEBUG ]] && echo "Short split FIELD Option"
                    if [[ ${2:0:1} != "-" ]] ; then
                        DEST=$2
                        shift 2
                    else
                        echo "Missing argument"
                        show_usage
                    fi ;;
                -O*) [[ $DEBUG ]] && echo "Short FIELD Option range -Bad syntax"
                    echo "Bad syntax. Did you mean this?:"
                    echo "-O ${WORD:2}"
                    show_usage
                    shift ;;
                -q|--quiet) BE_QUIET=1
                    shift ;;
            esac
            ;;
    esac
done
# Starts reading from ${HOST}/${URL}. Throws away HTTP headers so
# page contents can be read from file descriptor "$1"
fetch-page()
{
    # eval's are necessary so that bash parses expansion of $1<> as a single token
    eval "exec $1<>/dev/tcp/${HOST}/80"
    # HTTP/1.0 request with a Host header; the server closes the connection
    # after the response, which is what ends the read loop in get_it
    eval "echo -en 'GET ${URL} HTTP/1.0\r\nHost: ${HOST}\r\n\r\n' >&$1"
    # read and throw away HTTP headers, the end of headers is
    # indicated by an empty line (all lines are terminated \r\n)
    OLD_IFS="$IFS"
    IFS=$'\r'$'\n'
    while read -r -u$1 i && [ "${i/$'\r'/}" != "" ]; do : ; done
    IFS="$OLD_IFS"
}
# puts contents of ${HOST}/${URL} into ${DEST}
get_it()
{
    # make sure $DEST starts empty
    : > "$DEST"
    # open the same page twice, once on descriptor 3 and once on descriptor 4
    fetch-page 3
    fetch-page 4
    # clear IFS, otherwise the bytes in it would read as empty
    OLD_IFS="$IFS"
    IFS=
    # we read a single byte at a time from 3 with delimiter 'A',
    # and from 4 with delimiter 'B'.
    while read -r -n1 -dA -u3 A && read -r -n1 -dB -u4 B ; do
        # Now $A is the empty string if the true byte is 'A' or NULL, and
        # $B is the empty string if the true byte is 'B' or NULL.
        # Therefore if either $A or $B is not empty they have the true byte
        if [ -n "$B" ] ; then
            echo -n "$B" >> "$DEST"
        elif [ -n "$A" ] ; then
            echo -n "$A" >> "$DEST"
        else
            # both are empty so the true byte is NULL
            echo -en '\0' >> "$DEST"
        fi
    done
    # restore IFS
    IFS="$OLD_IFS"
}
# prints 1 if the server answers "200 OK" for ${HOST}${URL}, otherwise 0
verify_url() {
    exec 3<>"/dev/tcp/${HOST}/80"
    echo -en "GET ${URL} HTTP/1.0\r\nHost: ${HOST}\r\n\r\n" >&3
    # only the status line (the first line of the response) is needed
    read -r -u3 i
    if [[ $i =~ "200 OK" ]]; then
        echo 1
    else
        echo 0
    fi
}
strip_url() {
    # remove the http:// or ftp:// from the RAW_URL
    RAW_URL=$1
    if [[ ${RAW_URL:0:7} = "http://" ]] ; then
        URL=${RAW_URL:7}
    elif [[ ${RAW_URL:0:6} = "ftp://" ]] ; then
        URL=${RAW_URL:6}
    else
        URL=${RAW_URL}
    fi
}
show_error_404() {
    if ! [[ $BE_QUIET ]] ; then
        # ${URL} already starts with '/'
        echo "${HOST}${URL}:"
        echo "ERROR 404: Not Found."
    fi
}
# remember a destination given with -O so each URL can fall back to its own name
USER_DEST=$DEST
if [[ $INPUT_FILE ]] ; then
    # $(< FILE) reads the file without calling 'cat'
    for RAW_URL in $(< "$INPUT_FILE") ; do
        # remove the http:// or ftp:// from the RAW_URL
        strip_url "$RAW_URL"
        # the HOST is the base name of the website
        HOST=${URL%%/*}
        # the URL is the remaining path to the file (plus the leading '/');
        # a bare host name without any path maps to '/'
        if [[ $URL = */* ]] ; then
            URL=/${URL#*/}
        else
            URL=/
        fi
        # if --output-document is not being used, then the DEST is $(basename $URL)
        DEST=${USER_DEST:-${URL##*/}}
        # fall back to index.html when the URL ends in '/'
        [[ $DEST ]] || DEST=index.html
        # make sure the URL exists
        if [[ "$(verify_url)" = 1 ]] ; then
            [[ $DEBUG ]] && echo "${HOST}${URL} - found."
            get_it
        else
            show_error_404
        fi
    done
else
    RAW_URL="$*"
    # this is the same as above, but for single files
    strip_url "$RAW_URL"
    HOST=${URL%%/*}
    if [[ $URL = */* ]] ; then
        URL=/${URL#*/}
    else
        URL=/
    fi
    DEST=${USER_DEST:-${URL##*/}}
    [[ $DEST ]] || DEST=index.html
    if [[ "$(verify_url)" = "1" ]] ; then
        get_it
    else
        show_error_404
    fi
fi
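For comparison, the two things the script does with the raw server reply (check the status line, then throw away everything up to the first empty line) can be sketched in Python on a byte string. The helper name `split_http_response` and the sample reply are made up for illustration; no connection is opened here.

```python
def split_http_response(raw):
    """Split a raw HTTP response into (status_code, body_bytes).

    status_code is None when the first line does not look like a status line.
    """
    # headers end at the first empty line; every line is terminated by \r\n
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0]          # e.g. b"HTTP/1.0 200 OK"
    parts = status_line.split(None, 2)
    if len(parts) >= 2 and parts[1].isdigit():
        status = int(parts[1].decode("ascii"))
    else:
        status = None
    return status, body


if __name__ == "__main__":
    # invented reply; the body contains a NUL byte on purpose
    sample = b"HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\nhello\x00world"
    print(split_http_response(sample))
```

Python byte strings hold NUL bytes directly, so the two-descriptor trick the bash version needs for them has no counterpart here.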
|
|
|
05-06-2009, 12:08 AM
|
#10
|
Senior Member
Registered: Aug 2006
Posts: 2,697
|
An example using Python, translating a single word only, from en to de.
Code:
import http.client
import re
import urllib.parse

# matches any HTML tag so that it can be stripped from the result
pat = re.compile("<.*?>", re.M | re.DOTALL)
# form fields for the request; translate the word "cat"
params = urllib.parse.urlencode({"hl": "en", "ie": "UTF-8", "text": "cat", "sl": "en", "tl": "de"})
headers = {
    "Content-type": "application/x-www-form-urlencoded",
    "Accept": "text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5",
    "Host": "translate.google.com"}
connection = http.client.HTTPConnection("translate.google.com:80")
connection.request("POST", "/translate_t", params, headers)
response = connection.getresponse()
# print(response.status, response.reason)
data = response.read().decode("utf-8", "replace")
# the dictionary entries sit between these two markers in the returned page
start = data.index("class=thead>Dictionary:")
end = data.index("<a class=morelink")
data = data[start + 1:end].split("<li>")
for items in data[1:]:
    print(pat.sub("", items))
output:
Code:
# ./test.py
Katze
Raubkatze
Typ
Raupe
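The scraping step of that example (cut the page between two markers, split on `<li>`, strip the tags) can be tried offline. A minimal sketch, assuming the page layout from the post above; the sample HTML is invented and the helper name `extract_entries` is hypothetical.

```python
import re

# matches any HTML tag, including ones that span several lines
TAG = re.compile(r"<.*?>", re.DOTALL)


def extract_entries(page, start_marker, end_marker):
    """Return the text of each <li> item found between the two markers.

    Raises ValueError (from str.index) when a marker is missing.
    """
    start = page.index(start_marker)
    end = page.index(end_marker, start)
    items = page[start:end].split("<li>")[1:]   # drop the text before the first item
    return [TAG.sub("", item).strip() for item in items]


# Invented page fragment shaped like the markers used in the post above
SAMPLE_PAGE = (
    "<div class=thead>Dictionary:</div><ol><li>Katze</li>"
    "<li><b>Raubkatze</b></li></ol><a class=morelink href='#'>more</a>"
)

if __name__ == "__main__":
    for entry in extract_entries(SAMPLE_PAGE, "class=thead>Dictionary:", "<a class=morelink"):
        print(entry)
```

Marker-based scraping like this breaks as soon as the page layout changes, so the ValueError is worth catching in a real script.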
|
|
|
09-13-2009, 11:55 PM
|
#11
|
Senior Member
Registered: Jun 2007
Location: E.U., Mountains :-)
Distribution: Debian, Etch, the greatest
Posts: 2,561
Original Poster
Rep:
|
By now, are there already packages we can install that ship with a distro, such as:
Code:
translatetxt en ru --format rtf myfiletotrans.rtf myfiletranslated.rtf
?
|
|
|
All times are GMT -5.