LinuxQuestions.org
Old 01-23-2018, 06:43 AM   #1
nooobeee
Member
 
Registered: Oct 2004
Distribution: Debian
Posts: 72

Rep: Reputation: 1
Text Manipulation - Increment time/day


I have a file with approximately 16 million rows. I need to insert a date/time stamp into each row, in 1-hour increments across an entire year, starting over whenever a new value appears in the first column.

i.e.
Code:
A,40,30
A,53,20
A,33,10
...
A,55,49
B,11,34
B,43,79
B,59,76
...
B,18,77
C,64,29
to
Code:
A,01/01/2017 01:00,40,30
A,01/01/2017 02:00,53,20
A,01/01/2017 03:00,33,10
...
A,12/31/2017 23:00,55,49
B,01/01/2017 01:00,11,34
B,01/01/2017 02:00,43,79
B,01/01/2017 03:00,59,76
...
B,12/31/2017 23:00,18,77
C,01/01/2017 01:00,64,29
I thought about converting to epoch time and back but had difficulty with that. I also noticed that Perl has some date/time functions that might work well. Any thoughts on how this might be accomplished?
 
Old 01-23-2018, 07:16 AM   #2
keefaz
LQ Guru
 
Registered: Mar 2004
Distribution: Slackware
Posts: 6,552

Rep: Reputation: 872
If you use Linux, you can also use gawk's time functions,

e.g.:
Code:
awk '
BEGIN {
  timestamp=mktime("2017 01 01 00 00 00");
  hour=1
}
{
  print( strftime("%m/%d/%Y %R", timestamp + hour++ * 60 * 60) );
}
' file.txt
From there, test whether $1 has changed and, if so, reset hour to 1.
Format the output as you wish.

Edit: you'll need another test if you want to go from 23:00 to 1:00
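
To make the arithmetic above concrete, here is a minimal sketch that prints the stamp for the Nth hour of 2017. The helper name hour_stamp and the use of TZ=UTC are assumptions, not part of the original post. mktime() interprets its argument in the local time zone, so in a zone with daylight saving time, adding multiples of 3600 seconds would skip or repeat an hour at the transitions; pinning the zone to UTC is one way to avoid that.

```shell
#!/bin/sh
# hour_stamp is a hypothetical helper: print the stamp for the Nth hour of 2017.
# TZ=UTC is an assumption; it keeps every day exactly 24 hours long.
# Prefer gawk if present, otherwise fall back to whatever awk is installed
# (which must still provide mktime() and strftime()).
AWK=$(command -v gawk || command -v awk)

hour_stamp() {
  TZ=UTC "$AWK" -v n="$1" 'BEGIN {
    base = mktime("2017 01 01 00 00 00")   # epoch seconds at midnight, Jan 1
    print strftime("%m/%d/%Y %H:%M", base + n * 3600)
  }'
}

hour_stamp 1      # 01/01/2017 01:00
hour_stamp 24     # 01/02/2017 00:00
hour_stamp 8759   # 12/31/2017 23:00
```

Note that hour 24 comes out as 00:00 of the next day; skipping midnight, as the edit note above mentions, would need an extra test.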

Last edited by keefaz; 01-23-2018 at 07:24 AM.
 
1 members found this post helpful.
Old 01-23-2018, 08:04 AM   #3
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,665
Blog Entries: 4

Rep: Reputation: 3945
First of all, I'd stuff that data into a SQLite database file and do further manipulations with it that way.

Then, treating this file as the "database" that it in fact is, you'll need to write a short computer-program using some language other than "Bash scripting."
 
Old 01-23-2018, 10:34 AM   #4
nooobeee
Member
 
Registered: Oct 2004
Distribution: Debian
Posts: 72

Original Poster
Rep: Reputation: 1
After playing around a bit, I was able to use the following:
Code:
awk 'BEGIN {FS=OFS=",";timestamp=mktime("2017 01 01 00 00 00");hour=1}
 {print $1,( strftime("%D %R", timestamp + hour++ * 60 * 60)),$2,$3;
}' input.csv > output.csv
This works, but the timestamps keep running into the following years until the end of the file. I would like the date/time to restart when a new value appears in the first column. The first column is already sorted, so I'm thinking uniq could be used for this. Any thoughts on doing that?
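
On the uniq idea: uniq cannot insert the timestamps by itself, but it can show how many rows each key has before stamping (a series running from 01:00 on Jan 1 to 23:00 on Dec 31, as in the example in the first post, would be 8759 rows per key). A small sketch, assuming the file is comma-separated, already grouped on the first column, and that keys contain no spaces; rows_per_key is a made-up helper name:

```shell
#!/bin/sh
# rows_per_key (hypothetical helper): count consecutive rows per first-column value.
# Assumes comma-separated input that is already grouped by column 1.
rows_per_key() {
  cut -d, -f1 "$@" | uniq -c | awk '{ print $2, $1 }'
}

printf 'A,40,30\nA,53,20\nA,33,10\nB,11,34\nB,43,79\n' | rows_per_key
# A 3
# B 2
```

For the real data this would be called as rows_per_key input.csv; any key whose count differs from the others points at missing or extra rows.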
 
Old 01-23-2018, 10:53 AM   #5
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Quote:
Originally Posted by nooobeee
I would like it to restart the date/time when a new value is in the first column.... Any thoughts on doing that?
This is a code snippet to give you an idea.

With this InFile ...
Code:
2011 apple
2011 pear
2011 peach
2012 blue
2012 red
2012 yellow
2013 salty
2013 sweet
2013 bitter
2013 sour
2014 honda
2014 toyota
2014 nissan
... this awk ...
Code:
awk '{if ($1!=save) counter++; save=$1; 
      print counter,$0;}' $InFile >$OutFile
... produced this OutFile ...
Code:
1 2011 apple
1 2011 pear
1 2011 peach
2 2012 blue
2 2012 red
2 2012 yellow
3 2013 salty
3 2013 sweet
3 2013 bitter
3 2013 sour
4 2014 honda
4 2014 toyota
4 2014 nissan
Daniel B. Martin

.
 
1 members found this post helpful.
Old 01-23-2018, 11:19 AM   #6
nooobeee
Member
 
Registered: Oct 2004
Distribution: Debian
Posts: 72

Original Poster
Rep: Reputation: 1
That's good info Daniel and I'm going to take a look at that to see if I can figure out how to adjust. However my desired outcome is the following in your example:

Code:
1 2011 apple
2 2011 pear
3 2011 peach
1 2012 blue
2 2012 red
3 2012 yellow
1 2013 salty
2 2013 sweet
3 2013 bitter
4 2013 sour
1 2014 honda
2 2014 toyota
3 2014 nissan
 
Old 01-23-2018, 11:23 AM   #7
keefaz
LQ Guru
 
Registered: Mar 2004
Distribution: Slackware
Posts: 6,552

Rep: Reputation: 872
Quote:
Originally Posted by nooobeee
After playing around a bit, I was able to use the following:
This works, but the timestamps keep running into the following years until the end of the file. I would like the date/time to restart when a new value appears in the first column. The first column is already sorted, so I'm thinking uniq could be used for this. Any thoughts on doing that?
Yes, test whether $1 has changed. To do that you have to compare $1 with its previous value, so you need a variable to store that value.
On every line, compare $1 with the stored value; if it has changed, reset hour to 1.
 
1 members found this post helpful.
Old 01-23-2018, 11:42 AM   #8
nooobeee
Member
 
Registered: Oct 2004
Distribution: Debian
Posts: 72

Original Poster
Rep: Reputation: 1
Quote:
Originally Posted by keefaz
Yes, test whether $1 has changed. To do that you have to compare $1 with its previous value, so you need a variable to store that value.
On every line, compare $1 with the stored value; if it has changed, reset hour to 1.
OK, it seems you may be a few steps ahead of me (thanks for your patience). If I'm understanding you correctly, I need to store the first $1 in a user-defined variable and add a check of $1 against the stored variable. If it matches, continue; if it doesn't, reset the counter, overwrite the variable, and continue. Is that what you're saying?
 
Old 01-23-2018, 11:44 AM   #9
keefaz
LQ Guru
 
Registered: Mar 2004
Distribution: Slackware
Posts: 6,552

Rep: Reputation: 872
It sounds correct, post code if you're stuck
 
1 members found this post helpful.
Old 01-23-2018, 12:15 PM   #10
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Quote:
Originally Posted by nooobeee
I would like it to restart the date/time when a new value is in the first column.... Any thoughts on doing that?
This is a code snippet to give you an idea.

With this InFile ...
Code:
2011 apple
2011 pear
2011 peach
2012 blue
2012 red
2012 yellow
2013 salty
2013 sweet
2013 bitter
2013 sour
2014 honda
2014 toyota
2014 nissan
... this awk ...
Code:
awk '{if ($1!=save) j=0; save=$1; j++
      print j,$0}' $InFile >$OutFile
... produced this OutFile ...
Code:
1 2011 apple
2 2011 pear
3 2011 peach
1 2012 blue
2 2012 red
3 2012 yellow
1 2013 salty
2 2013 sweet
3 2013 bitter
4 2013 sour
1 2014 honda
2 2014 toyota
3 2014 nissan
Daniel B. Martin

.
 
1 members found this post helpful.
Old 01-23-2018, 06:54 PM   #11
nooobeee
Member
 
Registered: Oct 2004
Distribution: Debian
Posts: 72

Original Poster
Rep: Reputation: 1
I appreciate you forcing me to understand it and figure it out, Daniel. Though frustrating, I'm sure I'll be better off for it. Anyway, I think I understand what your code is doing: it checks whether $1 differs from save, saves the $1 value, and finally increments j, which starts at 0. It seems that mktime and strftime are throwing me off now (perhaps due to lack of understanding). I mangled together the following code, shown with the errors it produced:

Code:
awk 'BEGIN {FS=OFS=",";hour=1;if ($1!=u) timestamp=mktime("2017 01 01 00 00 00");u=$1;(strftime("%D %R", timestamp + hour++ * 60 * 60))}{print $1,(strftime("%D %R", timestamp)),$2;}' input.csv > output.csv
awk: line 2: function strftime never defined
awk: line 2: function strftime never defined
awk: line 2: function mktime never defined
I seem to recall getting the error about defining one or both of those before, but it seems they're defined here. What am I missing?
 
Old 01-23-2018, 07:05 PM   #12
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Post withdrawn. Advice was wrong.

Last edited by danielbmartin; 01-23-2018 at 08:39 PM. Reason: Advice was wrong.
 
Old 01-23-2018, 07:15 PM   #13
nooobeee
Member
 
Registered: Oct 2004
Distribution: Debian
Posts: 72

Original Poster
Rep: Reputation: 1
I don't logically see the need for them either, removed them (from both strftime instances) but no change.
 
Old 01-23-2018, 07:16 PM   #14
nooobeee
Member
 
Registered: Oct 2004
Distribution: Debian
Posts: 72

Original Poster
Rep: Reputation: 1
Oh wait, I think I need to install gawk or something like that. I'm pretty sure I had this on my system at work, and that's how I addressed it.
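
The "function strftime never defined" message is what an awk without the time extensions reports (some older mawk builds, for example; which awk a given system ships is something to check rather than assume). A hedged probe, with awk_has_time_funcs as a made-up helper name, that tests whether a given awk binary can run mktime() and strftime():

```shell
#!/bin/sh
# awk_has_time_funcs (hypothetical helper): succeed only if the given awk
# binary provides the mktime() and strftime() extensions.
awk_has_time_funcs() {
  "$1" 'BEGIN {
    t = mktime("2017 01 01 00 00 00")
    exit (strftime("%Y", t) == "2017" ? 0 : 1)
  }' >/dev/null 2>&1
}

# Report on a few common awk names; binaries that are not installed
# simply show up as missing.
for candidate in awk gawk mawk; do
  if awk_has_time_funcs "$candidate"; then
    echo "$candidate: time functions available"
  else
    echo "$candidate: missing or lacks time functions"
  fi
done
```

If plain awk fails the probe but gawk passes, calling gawk by name in the one-liner sidesteps the problem.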
 
Old 01-23-2018, 07:18 PM   #15
keefaz
LQ Guru
 
Registered: Mar 2004
Distribution: Slackware
Posts: 6,552

Rep: Reputation: 872
Quote:
Originally Posted by nooobeee
I appreciate you forcing me to understand it and figure it out, Daniel. Though frustrating, I'm sure I'll be better off for it. Anyway, I think I understand what your code is doing: it checks whether $1 differs from save, saves the $1 value, and finally increments j, which starts at 0. It seems that mktime and strftime are throwing me off now (perhaps due to lack of understanding). I mangled together the following code, shown with the errors it produced:

Code:
awk 'BEGIN {FS=OFS=",";hour=1;if ($1!=u) timestamp=mktime("2017 01 01 00 00 00");u=$1;(strftime("%D %R", timestamp + hour++ * 60 * 60))}{print $1,(strftime("%D %R", timestamp)),$2;}' input.csv > output.csv
awk: line 2: function strftime never defined
awk: line 2: function strftime never defined
awk: line 2: function mktime never defined
I seem to recall getting the error about defining one or both of those before, but it seems they're defined here. What am I missing?
Close, but you put things in the wrong places. In particular, the BEGIN block runs before any input is read, so $1 is still empty there.

awk syntax is: condition {action}
so:
Code:
awk '
BEGIN {FS=OFS=",";timestamp=mktime("2017 01 01 00 00 00")}
$1!=u {hour=1;u=$1}
{print $1, strftime("%D %R", timestamp + hour++ * 60 * 60),$2, $3}
' input.csv
BEGIN {...}: initialization block, run once before any input is read
$1 != u: condition, tested on every line
if the condition is true, its block runs: {hour=1;u=$1}
the last block has no condition, so it runs for every line
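
For reference, a sketch of the same approach wrapped in a shell function, with two changes that are assumptions rather than part of the post above: the format is spelled out as %m/%d/%Y %H:%M because %D prints a two-digit year (01/01/17) while the requested output uses 2017, and TZ=UTC is set so that daylight saving transitions in the local zone cannot skip or repeat an hour. The names stamp_rows, base and prev are made up for the example.

```shell
#!/bin/sh
# stamp_rows (hypothetical helper): insert an hourly timestamp after column 1,
# restarting at 01:00 on Jan 1 whenever the first column changes.
# Prefer gawk if present, otherwise fall back to the installed awk
# (which must still provide mktime() and strftime()).
AWK=$(command -v gawk || command -v awk)

stamp_rows() {
  TZ=UTC "$AWK" '
    BEGIN { FS = OFS = ","; base = mktime("2017 01 01 00 00 00") }
    NR == 1 || $1 != prev { hour = 1; prev = $1 }   # new key: restart the clock
    { print $1, strftime("%m/%d/%Y %H:%M", base + hour++ * 3600), $2, $3 }
  ' "$@"
}

printf 'A,40,30\nA,53,20\nB,11,34\n' | stamp_rows
# A,01/01/2017 01:00,40,30
# A,01/01/2017 02:00,53,20
# B,01/01/2017 01:00,11,34
```

On the real file this would be run as stamp_rows input.csv > output.csv. It reads the input in a single pass and keeps only two variables in memory, so the 16 million rows should not be a problem.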

Last edited by keefaz; 01-23-2018 at 07:20 PM.
 
1 members found this post helpful.
  



Tags
awk, csv, gawk



All times are GMT -5.
