OK, so a quick overview:
I have a Python script that counts how many times each line occurs on stdin and prints each string with its number of occurrences.
This is the working version:
echo -e "username is richard\nusername is bob\nusername is phil" | awk '{print $3}' | ./lineCount1.py
1 bob
1 phil
1 richard
Code:
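#!/usr/bin/python
import sys

# Maps each distinct input line to the number of times it was seen.
names = {}
for name in sys.stdin.readlines():
    name = name.strip()  # drop the trailing newline
    if name in names:
        names[name] += 1
    else:
        names[name] = 1

# Print one "count<TAB>name" line per distinct name.
for name, count in names.iteritems():
    sys.stdout.write("%d\t%s\n" % (count, name))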
Now I wanted to avoid having to pre-format the input (with awk) before passing it to the script, so I used .split() inside the script instead, which seems to break it:
echo -e "username is richard\nusername is bob\nusername is phil" | ./lineCount2.py
Traceback (most recent call last):
  File "./lineCount2.py", line 8, in <module>
    if name in names:
TypeError: unhashable type: 'list'
Code:
#!/usr/bin/python
import sys
names = {}
for line in sys.stdin.readlines():
    word = line.split()
    name = [line[2]]
    if name in names:
        names[name] += 1
    else:
        names[name] = 1
for name, count in names.iteritems():
    sys.stdout.write("%d\t%s\n" % (count, name))
I've read up on this a bit and it seems to have something to do with it needing to be tuples? I had a play around but can't seem to get it to work. Any help would be appreciated.
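For reference, here is a minimal sketch of one possible fix, assuming the name is always the third whitespace-separated field of each line. The traceback appears to come from `name = [line[2]]`: it indexes the raw line (giving a single character) and wraps the result in a list, and lists are unhashable, so they cannot be dictionary keys. Taking the third element of the split result gives a plain string, which can be. A tuple would only be needed if the key were made of several words. The function name `count_names`, its `field` parameter, and the `sample` data are hypothetical, and the sketch uses `.items()` so that it runs on both Python 2 and Python 3.

```python
import sys


def count_names(lines, field=2):
    """Tally the whitespace-separated field at index `field` of each line."""
    names = {}
    for line in lines:
        words = line.split()
        if len(words) <= field:
            # Assumption: blank or too-short lines are simply skipped.
            continue
        name = words[field]  # a str (hashable); [words[field]] would be a list
        names[name] = names.get(name, 0) + 1
    return names


# Illustrative input only; in the real script, pass sys.stdin instead.
sample = [
    "username is richard\n",
    "username is bob\n",
    "username is phil\n",
    "username is bob\n",
]
counts = count_names(sample)
for name, count in sorted(counts.items()):
    sys.stdout.write("%d\t%s\n" % (count, name))
```

In the script itself, `count_names(sys.stdin)` would replace the sample list, which also removes the need for the awk step in the pipeline.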