LinuxQuestions.org
Old 07-06-2023, 08:14 PM   #1
sharky
Member
 
Registered: Oct 2002
Posts: 569

Rep: Reputation: 84
python merge lines


I have to use Python because this will be part of existing Python code.

Example input
Quote:
[
['dev1', 'devType', 'x1'],
['dev1', 'devType', 'x2'],
['dev1', 'devType', 'x3'],
['dev2', 'devType', 'y1'],
['dev2', 'devType', 'y2'],
['dev2', 'devType', 'y3'],
['dev2', 'devType', 'y4'],
['dev2', 'devType', 'y5'],
['dev3'],
['dev4'],
['dev5', 'devType', 'z1'],
['dev5', 'devType', 'z2'],
]
Desired output
Quote:
[
['dev1', 'devType x1:x2:x3'],
['dev2', 'devType y1:y2:y3:y4:y5'],
['dev3'],
['dev4'],
['dev5', 'devType z1:z2'],
]
The concept seems simple, just combine dev#s that have devTypes, but the solution eludes me.

Below is the monstrosity that I tried and it fails miserably. In the below example 'regressionList' corresponds to the above sample input.

Code:
def writeTest(regressionList):
  # extract all cellNames
  cellsToPlace = []
  for test in regressionList:
    if test[0] not in cellsToPlace:
      cellsToPlace.append(test[0])

  for cell in cellsToPlace:
    devTypes = []
    for test in regressionList:
      if test[0] == cell and len(test) > 1:
        devTypes.append(test[2])
        print(cell,devTypes)
 
Old 07-06-2023, 09:12 PM   #2
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,226

Rep: Reputation: 5320
Use a defaultdict where the key is the first column and the default value is an empty list.
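A minimal sketch of what this advice amounts to, assuming rows shaped like the sample input (the names `rows` and `groups` are illustrative and not from the thread). Indexing a `defaultdict(list)` with a missing key creates an empty list for that key, so there is no need to check whether a device has been seen before:

```python
from collections import defaultdict

# A shortened version of the sample input (illustrative only).
rows = [
    ["dev1", "devType", "x1"],
    ["dev1", "devType", "x2"],
    ["dev3"],
]

groups = defaultdict(list)
for row in rows:
    values = groups[row[0]]      # creates [] the first time a key is seen
    if len(row) == 3:
        values.append(row[2])    # collect the third column per device

print(dict(groups))
# {'dev1': ['x1', 'x2'], 'dev3': []}
```

Since Python 3.7 a dict (and so a defaultdict) keeps insertion order, so the devices come back in the order they first appear in the input.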
 
Old 07-06-2023, 10:08 PM   #3
sharky
Member
 
Registered: Oct 2002
Posts: 569

Original Poster
Rep: Reputation: 84
Quote:
Originally Posted by dugan
Use a defaultdict where the key is the first column and the default value is an empty list.
I'm sure there's something obvious to you in this advice but I have no idea how this moves me toward my goal. I do not know how to build a defaultdict from the input.
 
Old 07-06-2023, 10:47 PM   #4
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,226

Rep: Reputation: 5320
This looks suspiciously like a job interview question. But whatever:

Code:
import collections
import pprint

original_rows = [
    ["dev1", "devType", "x1"],
    ["dev1", "devType", "x2"],
    ["dev1", "devType", "x3"],
    ["dev2", "devType", "y1"],
    ["dev2", "devType", "y2"],
    ["dev2", "devType", "y3"],
    ["dev2", "devType", "y4"],
    ["dev2", "devType", "y5"],
    ["dev3"],
    ["dev4"],
    ["dev5", "devType", "z1"],
    ["dev5", "devType", "z2"],
]

new_rows = collections.defaultdict(list)

# column 1
for original_row in original_rows:
    if original_row[0] not in new_rows:
        new_rows[original_row[0]].append(original_row[0])

# It looks like this now:
"""
defaultdict(<class 'list'>,
            {'dev1': ['dev1'],
             'dev2': ['dev2'],
             'dev3': ['dev3'],
             'dev4': ['dev4'],
             'dev5': ['dev5']})
"""
# column 2
for original_row in original_rows:
    if len(original_row) > 1 and original_row[0] in new_rows:
        if len(new_rows[original_row[0]]) == 1:
            new_rows[original_row[0]].append(original_row[1])


# Now it looks like this:
"""
defaultdict(<class 'list'>,
            {'dev1': ['dev1', 'devType'],
             'dev2': ['dev2', 'devType'],
             'dev3': ['dev3'],
             'dev4': ['dev4'],
             'dev5': ['dev5', 'devType']})
"""

# And column 3 now
for original_row in original_rows:
    if original_row[0] in new_rows and len(original_row) == 3 and len(new_rows[original_row[0]]) == 2:
        new_rows[original_row[0]][1] += ":" + original_row[2]
    
# And now it's like this:
"""
defaultdict(<class 'list'>,
            {'dev1': ['dev1', 'devType:x1:x2:x3'],
             'dev2': ['dev2', 'devType:y1:y2:y3:y4:y5'],
             'dev3': ['dev3'],
             'dev4': ['dev4'],
             'dev5': ['dev5', 'devType:z1:z2']})
"""

# So, for the final result:

pprint.pprint(list(new_rows.values()))

# That prints:
"""
[['dev1', 'devType:x1:x2:x3'],
 ['dev2', 'devType:y1:y2:y3:y4:y5'],
 ['dev3'],
 ['dev4'],
 ['dev5', 'devType:z1:z2']]
"""
 
Old 07-07-2023, 07:17 AM   #5
sharky
Member
 
Registered: Oct 2002
Posts: 569

Original Poster
Rep: Reputation: 84
That works just as advertised.

Not an interview question - but the concern is understood. It is probably clear that I would never qualify for even an entry-level Python programming position. I work for a semiconductor design company and 95% of my work consists of SKILL programming for pcell development.

Python was needed for this particular project to extract data from an Excel file. SKILL has no built-in functions for handling Excel spreadsheets, so I decided to use Python to extract the required data from the Excel file and write the result to a CSV file.

The final CSV file would look like this.

Quote:
dev1, devType x1:x2:x3
dev2, devType y1:y2:y3:y4:y5
dev3
dev4
dev5, devType z1:z2
Thanks for the help and my apologies for basically coaxing you into doing a bit of my job for me.
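For the file-writing step described in this post, a small sketch, assuming the merged rows are already lists of strings. The helper name `write_rows` and the sample `merged_rows` are hypothetical. It joins fields with ", " by hand because the layout quoted in the post has a space after the comma, which `csv.writer` does not add with its default settings:

```python
import io

# Rows in the merged shape shown in the post (illustrative subset).
merged_rows = [
    ["dev1", "devType x1:x2:x3"],
    ["dev3"],
    ["dev5", "devType z1:z2"],
]

def write_rows(rows, handle):
    """Write one row per line, fields separated by a comma and a space."""
    for row in rows:
        handle.write(", ".join(row) + "\n")

# An in-memory buffer stands in for a real file here.
buffer = io.StringIO()
write_rows(merged_rows, buffer)
print(buffer.getvalue(), end="")
# dev1, devType x1:x2:x3
# dev3
# dev5, devType z1:z2
```

In real use the same function can be handed a file object opened for writing instead of the `StringIO` buffer.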
 
Old 07-07-2023, 07:40 AM   #6
sharky
Member
 
Registered: Oct 2002
Posts: 569

Original Poster
Rep: Reputation: 84
I did notice one small glitch.

There should be a space between 'devType' and the first x1, y1, or z1, not a colon.

I will attempt to modify the code to meet that spec. Will let you know how it goes.
 
Old 07-07-2023, 10:03 AM   #7
sharky
Member
 
Registered: Oct 2002
Posts: 569

Original Poster
Rep: Reputation: 84
Modified # column 2 to add a space after the input.

original: new_rows[original_row[0]].append(original_row[1])
change: new_rows[original_row[0]].append(original_row[1]+" ")

Modified # column 3 to place the colon after the input.
original: new_rows[original_row[0]][1] += ":" + original_row[2]
change: new_rows[original_row[0]][1] += original_row[2] + ":"

The final result still has a trailing ":" that is not needed.

Quote:
[['dev1', 'devType x1:x2:x3:'],
['dev2', 'devType y1:y2:y3:y4:y5:'],
['dev3'],
['dev4'],
['dev5', 'devType z1:z2:']]
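One way to avoid the trailing colon, sketched under the assumption that every row has either one or three columns and that all three-column rows for a device share the same second column (as in the sample input): collect the third-column values in a list and join them once at the end, rather than appending a separator on every pass. The function name `merge_rows` is hypothetical.

```python
from collections import defaultdict

def merge_rows(rows):
    """Merge rows sharing a first column into [device, 'label v1:v2:...']."""
    labels = {}                  # device -> second column, when present
    values = defaultdict(list)   # device -> third-column values, in input order
    for row in rows:
        device = row[0]
        values[device]           # touch the key so value-less devices are kept
        if len(row) == 3:
            labels.setdefault(device, row[1])
            values[device].append(row[2])

    merged = []
    for device, items in values.items():
        if items:
            # str.join puts ':' only between items, never after the last one.
            merged.append([device, labels[device] + " " + ":".join(items)])
        else:
            merged.append([device])
    return merged

sample = [
    ["dev1", "devType", "x1"],
    ["dev1", "devType", "x2"],
    ["dev1", "devType", "x3"],
    ["dev3"],
    ["dev5", "devType", "z1"],
    ["dev5", "devType", "z2"],
]

print(merge_rows(sample))
# [['dev1', 'devType x1:x2:x3'], ['dev3'], ['dev5', 'devType z1:z2']]
```

Because the separator is added by `":".join(...)`, the space after 'devType' and the colons between values come out right without any trimming afterwards.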
 
Old 07-07-2023, 10:05 AM   #8
dugan
LQ Guru
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 11,226

Rep: Reputation: 5320
I can't believe you manually tried to write a diff.
 
  

