A script or a program to indent a massive dataset.
As the first step, I want to break those long lines and delimit them with a space or a tab, say after 4 columns, after 8 columns and so on, as a specific number of columns determines a specific attribute.
I'm not expecting the entire code, but what logic would you suggest as the easiest way to do this? Any help appreciated. Thanks for your time.
I have used the Crisp text editor for years in dealing with large files. It has a macro (record) function: you perform your operations on one line while recording the keystrokes, and the same changes can then be applied automatically to the remaining lines.
It handles unlimited file sizes in terms of rows/columns.
It is a commercial program and there is a linux version. There are many other approaches in Crisp that could also get done what you are describing.
I am working on my final year project which is to pre-process a massive dataset that looks like this
As the first step, I want to break those long lines and delimit them with a space or a tab, say after 4 columns, after 8 columns and so on, as a specific number of columns determines a specific attribute. I'm not expecting the entire code, but what logic would you suggest as the easiest way to do this? Any help appreciated. Thanks for your time.
Not to sound nasty, but as a final year student, shouldn't basic text-processing be something you can easily do by now? You have MANY options, like sed and awk, not to mention easily-written perl scripts to do just this. You asked for 'logic'...wouldn't the obvious first step be "Determine the criteria on which you want to act"?
You can't say you want to delimit it by a space...or a tab...maybe after 4 columns...maybe after 8. Pick a SOLID set of criteria, and move forward. If you want to base your splits on the number of columns, wouldn't it be very obvious that you'd have to COUNT the number of columns first??? And since you know you have to do this, this would also rule out sed/awk commands (in my opinion), in favor of a perl script. Perl was written specifically to process big text data sets.
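For anyone landing on this thread later, the "count first, then split" logic above can be sketched roughly as follows. This is only an illustration under stated assumptions, not what anyone in the thread actually wrote: it uses Python rather than Perl, it treats "columns" as character positions, and the names `line_width_counts` and `split_at_columns` as well as the cut positions 4 and 8 are made up for the example. The real dataset was never posted, so the offsets would have to come from its actual layout.

```python
from collections import Counter
from typing import Iterable, Sequence


def line_width_counts(lines: Iterable[str]) -> Counter:
    """Count how many lines have each width (in characters).

    Running this first shows whether every record really has the same
    number of columns before any splitting is attempted.
    """
    return Counter(len(line.rstrip("\n")) for line in lines)


def split_at_columns(line: str, cuts: Sequence[int], sep: str = "\t") -> str:
    """Insert `sep` at each character offset listed in `cuts`.

    The offsets are hypothetical; offsets at or beyond the end of the
    line are ignored, so short lines pass through without extra fields.
    """
    pieces = []
    start = 0
    for cut in sorted(cuts):
        if cut >= len(line):
            break
        pieces.append(line[start:cut])
        start = cut
    pieces.append(line[start:])
    return sep.join(pieces)
```

Because both functions work one line at a time, they can be fed a file object directly (for example `for line in open(path)`), so a massive file never has to be loaded into memory in one go.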
The reason my replies haven't been helpful to you is that I haven't spoon-fed you answers. I gave you a basic logic starting point, and even gave you some criteria to think about. If you want a script written for you, you're in the wrong place. If you want HELP, then post what you have written/tried of your own, and tell us where you're stuck. You asked what logic we recommended...you were given it. What else, exactly, were you expecting???
if you know how to read, I've clearly mentioned in the thread that i need the direction, or the logic. for example, one could say "There's a command in linux to split columns and fields", I would write my own script based on the given logic. I never asked for an entire script. not at least to you. Wonder how you got so many reputation points. And as years pass, I pass semesters and give exams at college, so obviously i move on year by year and now i am in final year in college. It has got nothing to do with being a newbie.
if you know how to read, I've clearly mentioned in the thread that i need the direction, or the logic.
And if you bothered reading/understanding what I told you, you would see I DID give you 'direction' or 'logic'. Or did you miss the parts where I advised you to count the columns? To determine the criteria on which you want to act? To use Perl for such text manipulations? ALL of those things were suggestions as starting points for logic.
for example, one could say "There's a command in linux to split columns and fields", I would write my own script based on the given logic.
Then you would be doing it wrong. You *COULD* use any of the Linux utilities out there (sed, awk, etc.), but that is just ONE way to do it. Since you never actually TOLD US what you were writing it in, suggesting a bash tool for a perl script is a bad idea. Even worse if you were using python.
I never asked for an entire script. not at least to you. Wonder how you got so many reputation points. And as years pass, I pass semesters and give exams at college, so obviously i move on year by year and now i am in final year in college. It has got nothing to do with being a newbie.
You said "Not expecting the entire code"...which sure indicates that you were expecting SOME code. But you again didn't bother to post ANYTHING that you tried/did, provide selections of the output you wanted, or were even clear on the criteria on which you wanted to split things. So, you asked a fairly vague question, showed no effort of your own, and are upset when someone points it out.
You have done this in SEVERAL other of your threads (see just a FEW examples above), and even in this one, grail also asked you for details that you didn't provide, and asked you to show what you tried.
I am happy with all the answers i get, I think about "WHY" to your answers are alone, which are mere combination of alphabets and are of 0 help
Please, stop where you are. My answers are of zero help to you, because I don't spoon-feed you, and expect you to show effort of your own. This very thread is a great example; you asked a question, and I answered it...but you didn't like the answer.
You don't even ACKNOWLEDGE you got it...can you go look through the examples posted, and honestly tell me I'm wrong? I'm done even bothering with you. Good luck.