LinuxQuestions.org
Old 03-15-2010, 07:30 AM   #1
General
Member
 
Registered: Aug 2005
Distribution: Debian 6.0
Posts: 465

Rep: Reputation: 31
Python: replacing all words found on a list


I've simplified the code, for the purpose of this example:

Code:
#!/usr/bin/env python
# coding=utf-8-sig

import re

nouns = ['cow','cowboy']

text = 'thecowatethecowboy'

for x in nouns:
	if re.search(x, text):
		text = text.replace(x, 'NOUN')

print text
The result:

theNOUNatetheNOUNboy

Whereas I want:

theNOUNatetheNOUN

The fix I found was:

Code:
nouns = ['cowboy','cow']
This works in my short example, but for some reason I have been unable to discover, when implemented in my full code the shorter items are still replaced first, so I get the 'NOUNboy' problem.

In other words, I can't get this fix to work in my actual code, so I think I need a more robust solution. How can I guarantee that it will replace longer items first?
 
Old 03-15-2010, 07:35 AM   #2
troop
Member
 
Registered: Feb 2010
Distribution: gentoo, arch, fedora, freebsd
Posts: 379

Rep: Reputation: 96
Code:
# Sort longest first so that 'cowboy' is replaced before 'cow'.
# key= and reverse= work in Python 2.4+ and 3; the cmp= argument is Python 2 only.
nouns.sort(key=len, reverse=True)
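
A minimal sketch of how a longest-first sort slots into the replacement loop from post #1, assuming nouns is a plain list. It uses sorted(..., key=len, reverse=True), which runs on Python 2 and 3 alike; replace_nouns is a hypothetical name, not something from the original code.

```python
def replace_nouns(text, nouns, placeholder='NOUN'):
    # Hypothetical helper: handle longer words first so that 'cowboy'
    # is replaced before its prefix 'cow' can break it up.
    for word in sorted(nouns, key=len, reverse=True):
        text = text.replace(word, placeholder)
    return text

print(replace_nouns('thecowatethecowboy', ['cow', 'cowboy']))
```

Because sorted() returns a new list, the caller's list is left in its original order.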

Last edited by troop; 03-15-2010 at 07:38 AM.
 
Old 03-15-2010, 07:50 AM   #3
General
Member
 
Registered: Aug 2005
Distribution: Debian 6.0
Posts: 465

Original Poster
Rep: Reputation: 31
I tried implementing your suggestion, then found why I couldn't reproduce the problem: nouns is a dictionary.

Code:
#!/usr/bin/env python
# coding=utf-8-sig

import re

nouns = {'carport': 1,
	'car': 2}

text = 'thecarwentintothecarport'

for x in nouns:
	if re.search(x, text):
		text = text.replace(x, 'NOUN')

print text
It seems the keys are always processed shortest to longest, regardless of the order in which I write them.
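
Iterating over a dict yields its keys, and in the Python 2 releases of the time that order comes from hashing rather than from how the literal was written, so it cannot be relied on. A sketch that sidesteps the question, assuming only the keys matter for the replacement: sort the keys by length before looping.

```python
nouns = {'carport': 1, 'car': 2}
text = 'thecarwentintothecarport'

# sorted() over a dict iterates its keys; the longest keys come first here,
# so 'carport' is replaced before 'car'.
for word in sorted(nouns, key=len, reverse=True):
    text = text.replace(word, 'NOUN')

print(text)
```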
 
Old 03-15-2010, 09:57 AM   #4
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,654

Rep: Reputation: 1967
How about:

Code:
#!/usr/bin/env python
# coding=utf-8-sig

mylist = ['cow','cowboy']

text = 'thecowatethecowboy'

for x in reversed(mylist):
    text = text.replace(x, 'NOUN')

print text
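
Note that reversed() only helps when the list already runs from shortest to longest. As an alternative sketch, not taken from any post above, the words can be joined into one regular expression and replaced in a single pass. Python's re module tries alternatives left to right, so the longer words are listed first; replace_words is a hypothetical helper name.

```python
import re

def replace_words(text, words, placeholder='NOUN'):
    # Hypothetical helper. An empty pattern would match everywhere,
    # so return early when there is nothing to replace.
    if not words:
        return text
    # Longest first, because alternation picks the first alternative that
    # matches; re.escape keeps any regex metacharacters literal.
    ordered = sorted(words, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(w) for w in ordered))
    return pattern.sub(placeholder, text)

print(replace_words('thecowatethecowboy', ['cow', 'cowboy']))
```

A single pass also means the inserted placeholder is never scanned again, which matters if the placeholder could itself contain one of the words.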
 
  


All times are GMT -5.
