LinuxQuestions.org
Old 02-08-2014, 07:36 AM   #1
abhishekgit
Member
 
Registered: Jan 2012
Location: India
Distribution: Ubuntu, Gentoo, Fedora, Rhel5,openSUSE
Posts: 165

Rep: Reputation: 12
A script or a program to indent a massive dataset.


Hello everyone,
I am working on my final year project which is to pre-process a massive dataset that looks like this

Quote:
009872688524152197301010000C+45500-118400SAO +1234MEH V0203005N002659999999N9999999N9-00285-00395103265ADDAA101000095AA206000091MA1102781088695QNNK11 1 00025M11 1 26190Q11 1 10326S11 1 00027X11 1 30005
004072688524152197301010100D+45500-118400SAO +1236MEH V0202801N001519999999N9999999N9-00401-00401999999ADDAA101000095AA206000091MA1102881999999
004072688524152197301010200D+45500-118400SAO +1236MEH V0201901N000519999999N9999999N9-00401-00401999999ADDAA101000095AA206000091MA1102881999999
009872688524152197301010300C+45500-118400SAO +1234MEH V0202005N001059999999N9999999N9-00505-00505103455ADDAA101000095AA206000091MA1102981088825QNNK11 1 00023M11 1 26230Q11 1 10345S11 1 00023X11 1 20002
004072688524152197301010400D+45500-118400SAO +1236MEH V0201701N001019999999N9999999N9+99999+99999999999ADDAA101000095AA206
As the first step, I want to break those long lines and delimit them with a space or a tab, say after 4 columns, then 8 columns and so on, since a specific number of columns determines a specific attribute.
I'm not expecting the entire code, but what logic would you suggest as the easiest way? Any help is appreciated. Thanks for your time.
 
Old 02-08-2014, 07:40 AM   #2
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,492

Rep: Reputation: 2867
It's nice that you provided the input, but it often helps to show the desired output too, as we may not interpret you correctly.

What have you tried?
 
Old 02-08-2014, 07:48 AM   #3
tckosvic
LQ Newbie
 
Registered: Jun 2011
Posts: 1

Rep: Reputation: Disabled
Suggest Crisp text editor program

abhishekgit,

I have used the Crisp text editor for years to deal with large files. It has a macro (record) function: you perform your operations on one line while recording the keystrokes, and the same changes can then be applied automatically to the remaining lines.

It handles unlimited file sizes in terms of rows/columns.

It is a commercial program and there is a linux version. There are many other approaches in Crisp that could also get done what you are describing.

Tom Kosvic
 
Old 02-08-2014, 07:54 AM   #4
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,353
Blog Entries: 55

Rep: Reputation: 3541
Also from looking at the format (if this represents ISD data) why would you not use or build on existing tools?
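If the data really is ISD (an assumption; the thread never confirms it), each record starts with a fixed-width control section. The sketch below is in Python and only covers the first 41 characters. The field names and widths reflect my understanding of the published ISD layout and should be checked against the official format document before relying on them. `parse_isd_prefix` is a hypothetical helper, not part of any existing tool.

```python
# Assumed layout of the start of an ISD record: (field name, width in characters).
# These widths are an assumption to verify against the official ISD format document.
ISD_CONTROL_FIELDS = [
    ("total_variable_chars", 4),
    ("usaf_id", 6),
    ("wban_id", 5),
    ("date", 8),         # YYYYMMDD
    ("time", 4),         # HHMM
    ("source_flag", 1),
    ("latitude", 6),     # signed, scaled by 1000
    ("longitude", 7),    # signed, scaled by 1000
]


def parse_isd_prefix(line):
    """Slice the leading fixed-width fields of one record into a dict."""
    record = {}
    start = 0
    for name, width in ISD_CONTROL_FIELDS:
        record[name] = line[start:start + width]
        start += width
    # Coordinates are stored as signed integers in thousandths of a degree.
    record["latitude"] = int(record["latitude"]) / 1000
    record["longitude"] = int(record["longitude"]) / 1000
    return record


if __name__ == "__main__":
    # First 41 characters of the first sample line in post #1.
    sample = "009872688524152197301010000C+45500-118400"
    print(parse_isd_prefix(sample))
```

Only the prefix is parsed here because the whitespace in the pasted sample appears to have been collapsed, so later column positions in the thread's copy cannot be trusted.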
 
Old 02-08-2014, 10:18 AM   #5
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 18,785

Rep: Reputation: 4159
Quote:
Originally Posted by abhishekgit View Post
Hello everyone,
I am working on my final year project which is to pre-process a massive dataset that looks like this

As the first step, I want to break those long lines and delimit them with a space or a tab, say after 4 columns, then 8 columns and so on, since a specific number of columns determines a specific attribute. I'm not expecting the entire code, but what logic would you suggest as the easiest way? Any help is appreciated. Thanks for your time.
Not to sound nasty, but as a final year student, shouldn't basic text-processing be something you can easily do by now? You have MANY options, like sed and awk, not to mention easily-written perl scripts to do just this. You asked for 'logic'...wouldn't the obvious first step be "Determine the criteria on which you want to act"?

You can't say you want to delimit it by a space...or a tab...maybe after 4 columns...maybe after 8. Pick a SOLID set of criteria, and move forward. If you want to base your splits on the number of columns, wouldn't it be very obvious that you'd have to COUNT the number of columns first??? And since you know you have to do this, this would also rule out sed/awk commands (in my opinion), in favor of a perl script. Perl was written specifically to process big text data sets.
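For what it's worth, the "pick your criteria, then split" logic can be sketched in a few lines. This is a minimal sketch in Python rather than Perl. The function name `split_fixed_width` and the widths 4, 6, 5 are hypothetical, made up for illustration, since the thread never states the real field widths.

```python
def split_fixed_width(line, widths, sep="\t"):
    """Cut a line into fields of the given widths and join them with sep.

    Anything left over after the last width is kept as one trailing field,
    so no input characters are silently dropped.
    """
    fields = []
    start = 0
    for width in widths:
        fields.append(line[start:start + width])
        start += width
    if start < len(line):
        fields.append(line[start:])
    return sep.join(fields)


if __name__ == "__main__":
    # Hypothetical widths; replace them with the real column definition.
    widths = [4, 6, 5]
    print(split_fixed_width("0098726885241521973", widths))
```

For a massive file, the same function could be applied line by line while iterating over the open file, so the whole dataset never has to sit in memory at once.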
 
Old 02-11-2014, 08:10 AM   #6
abhishekgit
Member
 
Registered: Jan 2012
Location: India
Distribution: Ubuntu, Gentoo, Fedora, Rhel5,openSUSE
Posts: 165

Original Poster
Rep: Reputation: 12
@TBone,
If I knew how to do it, I wouldn't be posting it here, would I? Not a single reply of yours has been of help (not to sound nasty).
 
Old 02-11-2014, 09:50 AM   #7
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 18,785

Rep: Reputation: 4159
Quote:
Originally Posted by abhishekgit View Post
@TBone,
If I knew how to do it, I wouldn't be posting it here, would I? Not a single reply of yours has been of help (not to sound nasty).
You claim to be a 'final year' student, but don't know how to process a text file? And yes, I do believe that you would be posting here, since you seem to keep asking for people to write scripts for you, and show little effort of your own. For example:
http://www.linuxquestions.org/questi...se-4175494534/
http://www.linuxquestions.org/questi...-r-4175492861/
http://www.linuxquestions.org/questi...er-4175484119/
http://www.linuxquestions.org/questi...ct-4175456863/

And it seems odd that less than a year ago, you were a total newbie...and now you're a final year student:
http://www.linuxquestions.org/questi...se-4175455656/

The reason my replies haven't been helpful to you is that I haven't spoon-fed you answers. I gave you a basic logic starting point, and even gave you some criteria to think about. If you want a script written for you, you're in the wrong place. If you want HELP, then post what you have written/tried of your own, and tell us where you're stuck. You asked what logic we recommended...you were given it. What else, exactly, were you expecting???
 
Old 03-11-2014, 11:53 PM   #8
abhishekgit
Member
 
Registered: Jan 2012
Location: India
Distribution: Ubuntu, Gentoo, Fedora, Rhel5,openSUSE
Posts: 165

Original Poster
Rep: Reputation: 12
@TBone,
if you know how to read, I've clearly mentioned in the thread that I need the direction, or the logic. For example, one could say "There's a command in Linux to split columns and fields", and I would write my own script based on the given logic. I never asked for an entire script, at least not from you. I wonder how you got so many reputation points. And as years pass, I pass semesters and take exams at college, so obviously I move on year by year, and now I am in my final year in college. It has got nothing to do with being a newbie.
 
Old 03-11-2014, 11:55 PM   #9
abhishekgit
Member
 
Registered: Jan 2012
Location: India
Distribution: Ubuntu, Gentoo, Fedora, Rhel5,openSUSE
Posts: 165

Original Poster
Rep: Reputation: 12
@tckosvic,
It's exactly what I wanted. Thanks!
 
Old 03-12-2014, 08:45 AM   #10
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 18,785

Rep: Reputation: 4159
Quote:
Originally Posted by abhishekgit View Post
@TBone,
if you know how to read, I've clearly mentioned in the thread that I need the direction, or the logic.
And if you bothered reading/understanding what I told you, you would see I DID give you 'direction' or 'logic'. Or did you miss the parts where I advised you to count the columns? To determine the criteria on which you want to act? To use Perl for such text manipulations? ALL of those things were suggestions as starting points for logic.
Quote:
For example, one could say "There's a command in Linux to split columns and fields", and I would write my own script based on the given logic.
Then you would be doing it wrong. You *COULD* use any of the Linux utilities out there (sed, awk, etc.), but that is just ONE way to do it. Since you never actually TOLD US what you were writing it in, suggesting a bash tool for a perl script is a bad idea. Even worse if you were using python.
Quote:
I never asked for an entire script, at least not from you. I wonder how you got so many reputation points. And as years pass, I pass semesters and take exams at college, so obviously I move on year by year, and now I am in my final year in college. It has got nothing to do with being a newbie.
You said "Not expecting the entire code"...which sure indicates that you were expecting SOME code. But you again didn't bother to post ANYTHING that you tried/did, provide selections of the output you wanted, or were even clear on the criteria on which you wanted to split things. So, you asked a fairly vague question, showed no effort of your own, and are upset when someone points it out.

You have done this in SEVERAL of your other threads (see just a FEW examples above), and even in this one, grail also asked you for details that you didn't provide, and asked you to show what you tried.

Last edited by TB0ne; 03-12-2014 at 08:46 AM.
 
Old 03-12-2014, 08:49 AM   #11
abhishekgit
Member
 
Registered: Jan 2012
Location: India
Distribution: Ubuntu, Gentoo, Fedora, Rhel5,openSUSE
Posts: 165

Original Poster
Rep: Reputation: 12
@TBone,
I've marked most of my threads 'solved' without your help. I am almost done with this too, without your help. I will post the code shortly.
 
Old 03-12-2014, 10:08 AM   #12
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 18,785

Rep: Reputation: 4159
Quote:
Originally Posted by abhishekgit View Post
@TBone,
I've marked most of my threads 'solved' without your help. I am almost done with this too, without your help. I will post the code shortly.
Of course..we'll all look forward to it. Before being nasty with people, you should look to what you've been posting first, and think about WHY you get the answers you do.
 
Old 03-12-2014, 12:08 PM   #13
abhishekgit
Member
 
Registered: Jan 2012
Location: India
Distribution: Ubuntu, Gentoo, Fedora, Rhel5,openSUSE
Posts: 165

Original Poster
Rep: Reputation: 12
Quote:
and think about WHY you get the answers you do.
@TBone,
I am happy with all the answers I get. I only think about "WHY" when it comes to your answers, which are mere combinations of letters and are of 0 help.
 
Old 03-12-2014, 12:19 PM   #14
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 18,785

Rep: Reputation: 4159
Quote:
Originally Posted by abhishekgit View Post
@TBone,
I am happy with all the answers I get. I only think about "WHY" when it comes to your answers, which are mere combinations of letters and are of 0 help.
Please, stop where you are. My answers are of zero help to you, because I don't spoon-feed you, and expect you to show effort of your own. This very thread is a great example; you asked a question, and I answered it...but you didn't like the answer.

You don't even ACKNOWLEDGE you got it...can you go look through the examples posted, and honestly tell me I'm wrong? I'm done even bothering with you. Good luck.
 
Old 03-12-2014, 12:31 PM   #15
abhishekgit
Member
 
Registered: Jan 2012
Location: India
Distribution: Ubuntu, Gentoo, Fedora, Rhel5,openSUSE
Posts: 165

Original Poster
Rep: Reputation: 12
@TBone,
Based on your reputation, you're obviously an expert and have been of great help to several users. Anyway, thanks for your time.
 