LinuxQuestions.org
Old 04-10-2021, 07:04 AM   #1
nuxguy
Member
 
Registered: Oct 2009
Location: Notlob
Distribution: MX Linux
Posts: 41

Rep: Reputation: 0
Efficient file size


Hi,
I have a script which records an IP address. Every couple of hours cron starts the job, which first appends the IP and then a time stamp to a simple text file.

At what point (number of lines, MB, etc.) would it be more efficient to archive this file and start a new one?
 
Old 04-10-2021, 07:21 AM   #2
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 9,882
Blog Entries: 13

Rep: Reputation: 4930
Quote:
Originally Posted by nuxguy View Post
Hi,
I have a script which records an IP address. Every couple of hours cron starts the job, which first appends the IP and then a time stamp to a simple text file.

At what point (number of lines, MB, etc.) would it be more efficient to archive this file and start a new one?
Who can tell?

You haven't described what you need the address and time stamp data for.

Why keep a file at all? How many historical address and time stamp records do you require?

If I was looking for a trend or something I'd process inline at each event and only retain data I absolutely required.

That said, a text file of a few lines takes very little space, and even at a few hundred lines it is maybe a few kilobytes. An IPv4 address is at most 15 characters. A timestamp is, say, no more than 30. That's 45 characters per entry; times 1000 entries is 45,000 characters, so about 45 kB of data.

Every couple of hours? So that's a few months of data?
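
The estimate above can be checked with shell arithmetic. This is only a minimal sketch, assuming the figures from this post (15-character address, 30-character timestamp, one entry every two hours); the function names are made up for illustration, and separators and newlines are ignored.

```shell
#!/bin/sh
# Rough size estimate for a log of "IP + timestamp" entries.
# All figures are assumptions from the post above, not measurements.

estimate_bytes() {
    # $1 = bytes per entry, $2 = number of entries
    echo $(( $1 * $2 ))
}

days_covered() {
    # $1 = number of entries, $2 = entries per day (integer division)
    echo $(( $1 / $2 ))
}

ip_max=15      # longest dotted-quad IPv4 address: 255.255.255.255
stamp_max=30   # assumed upper bound for the timestamp
per_entry=$(( ip_max + stamp_max ))

echo "bytes per entry: $per_entry"
echo "1000 entries:    $(estimate_bytes "$per_entry" 1000) bytes"
echo "days covered:    $(days_covered 1000 12) (12 entries per day)"
```

With one entry every two hours, 1000 entries work out to a little under three months, which matches the "few months" guess.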
 
Old 04-10-2021, 08:04 AM   #3
rkelsen
Senior Member
 
Registered: Sep 2004
Distribution: slackware
Posts: 4,445
Blog Entries: 7

Rep: Reputation: 2553
Efficient file size

Is it a log of some sort? If so, how about logrotate?
 
Old 04-10-2021, 08:25 AM   #4
nuxguy
Member
 
Registered: Oct 2009
Location: Notlob
Distribution: MX Linux
Posts: 41

Original Poster
Rep: Reputation: 0
Thanks for the reply.

It was historical ...
Some time ago my ISP kept dropping my line and, more often than not, assigning a different IP when it came back. I bought a decent router and although that was better it did not cure the problem. Plus the ISP said the problem was caused by me not using their router.
When I complained they insisted that their router showed there was nothing wrong with my connection. It was a while before I noticed that the IP was changing, but they said I didn't have any proof. They also insisted I would have to pay the full amount for what remained of the 12-month contract and that a few seconds' loss of service wasn't significant enough to warrant a claim.

FYI:
The ISP in question INSISTS I was never with them during this period
- even though I have bank records showing that I was sending them regular payments.
And YES, I have issued an SAR request - same "no data" reply.

First entry dated: Mar 2016
Number of lines: 45232
Size: 791.2 kB
The file has been manually cleared down in the past.
 
Old 04-10-2021, 08:27 AM   #5
nuxguy
Member
 
Registered: Oct 2009
Location: Notlob
Distribution: MX Linux
Posts: 41

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by rkelsen View Post
Is it a log of some sort? If so, how about logrotate?
Log, effectively YES. But a "log" I've created myself.
 
Old 04-10-2021, 10:44 AM   #6
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 9,882
Blog Entries: 13

Rep: Reputation: 4930
You can rotate any file. It does not have to be a system log, or even end in .log

https://www.linuxquestions.org/quest...-system-36026/

Anyways it doesn't appear that this file, if described as in the first post, could get very large even after a year or more. Rotate or just delete will probably be fine.
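
For reference, a logrotate stanza for a home-made log like this might look roughly as follows. This is only a sketch: the path /home/nuxguy/ip-history.log and the thresholds are assumptions for illustration, not taken from the thread.

```
# Hypothetical path; adjust to wherever the cron job writes.
/home/nuxguy/ip-history.log {
    # rotate only once the file has grown past 1 MB
    size 1M
    # keep at most five old copies
    rotate 5
    # gzip the rotated copies
    compress
    # no error if the file is absent, and skip empty files
    missingok
    notifempty
}
```

Saved in a file of your own, a stanza like this can be run as a normal user from cron with a private state file, e.g. logrotate --state "$HOME/.logrotate.state" "$HOME/.logrotate.conf" (both file names are made up here). Adding -d does a dry run that only reports what would happen.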
 
Old 04-10-2021, 11:19 AM   #7
Sefyir
Member
 
Registered: Mar 2015
Distribution: Linux Mint
Posts: 634

Rep: Reputation: 316
Quote:
Every couple of hours cron starts the job, which first appends the IP and then a time stamp to a simple text file.
So for a year of running, take the number of days in a year, converted to hours and divided by the interval.
365 * 24 / 2 = 4380

Code:
$ for i in {1..4380}; do 
 echo "0.0.0.0" >> output.log; 
 echo "$(date)" >> output.log; 
done
$ du -h output.log
200K	output.log
I wouldn't worry about it. If you're still doing this in 5 years, it might hit 1 MB
 
Old 04-10-2021, 07:19 PM   #8
rkelsen
Senior Member
 
Registered: Sep 2004
Distribution: slackware
Posts: 4,445
Blog Entries: 7

Rep: Reputation: 2553
Quote:
Originally Posted by nuxguy View Post
Some time ago my ISP kept dropping my line and, more often than not, assigning a different IP when it came back.
Is this a home connection, or business connection? Quite often, ISPs will change IP addresses on home connections to prevent people from running business services. As a home user, you shouldn't even notice it.
 
Old 04-10-2021, 09:30 PM   #9
berndbausch
LQ Addict
 
Registered: Nov 2013
Location: Tokyo
Distribution: Mostly Ubuntu and Centos
Posts: 6,316

Rep: Reputation: 2002
Quote:
Originally Posted by nuxguy View Post
At what point (number of lines, MB, etc.) would it be more efficient to archive this file and start a new one?
Assuming one line is 100 bytes, your script generates 1200 bytes a day, half a megabyte per year, five megabytes per ten years. The data generated in a decade is less than a single JPEG file from your $100 camera. Even if you multiply the line size by ten, who cares? Archive it whenever you change your computer.

In general terms, here are a few factors to consider:

How much space is left in your filesystem.
The filesystem type and its configuration. Small files fit in a single block. XFS seems to be efficient for large files (it was created by Silicon Graphics with the purpose of efficient access to media files), but perhaps ext4 has caught up.
How often you look at old data. If the answer is "practically never", you can rotate often. If you often look at years-old data, keep years-old data in that file.
Whether the data is valuable enough to require a backup. Perhaps that is the only factor you need to consider.
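
The first factor in the list above (remaining space) is easy to check from the shell. A minimal sketch, assuming a POSIX shell; file_bytes and free_kb are illustrative helper names, and the log path is whatever your cron job writes to.

```shell
#!/bin/sh
# Compare a log's size with the free space on its filesystem.

file_bytes() {
    # exact size in bytes; tr strips the padding some wc versions add
    wc -c < "$1" | tr -d ' '
}

free_kb() {
    # available space, in 1024-byte units, on the filesystem holding $1
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical usage: pass the log path as the first argument.
log=${1:-}
if [ -n "$log" ] && [ -f "$log" ]; then
    echo "$log: $(file_bytes "$log") bytes"
    echo "free on its filesystem: $(free_kb "$log") KiB"
fi
```

For a file that grows by roughly half a megabyte a year, the free-space figure will almost always be several orders of magnitude larger, which is the point being made above.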

Last edited by berndbausch; 04-10-2021 at 09:33 PM.
 
Old 04-11-2021, 05:54 AM   #10
nuxguy
Member
 
Registered: Oct 2009
Location: Notlob
Distribution: MX Linux
Posts: 41

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by rtmistler View Post
I've never seen that command before, TY.
I've just thought of about a million other uses for it.
Many, many thanks
... well, maybe not quite a million, but still lots more things to play with
(I'm old, single and don't play games, but I do like to play).
 
Old 04-11-2021, 06:06 AM   #11
nuxguy
Member
 
Registered: Oct 2009
Location: Notlob
Distribution: MX Linux
Posts: 41

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by Sefyir View Post
So for a year of running, take the number of days in a year, converted to hours and divided by the interval.
365 * 24 / 2 = 4380

Code:
$ for i in {1..4380}; do 
 echo "0.0.0.0" >> output.log; 
 echo "$(date)" >> output.log; 
done
$ du -h output.log
200K	output.log
I wouldn't worry about it. If you're still doing this in 5 years, it might hit 1 MB
What can i say?
Your reply clearly goes above and beyond.
Many, many thanks Sefyir
 
Old 04-11-2021, 06:18 AM   #12
nuxguy
Member
 
Registered: Oct 2009
Location: Notlob
Distribution: MX Linux
Posts: 41

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by rkelsen View Post
Is this a home connection, or business connection? Quite often, ISPs will change IP addresses on home connections to prevent people from running business services. As a home user, you shouldn't even notice it.
Home (as in old and long since retired).
However, I've always been a Linux whore, so dl'ing my latest fancy has peed me off in the past (until I gave up on direct dl's and always torrented).
I've also continued to use DC (originally as an easy chat when we had a SETI team), and constant resets are a pain, plus they are quite embarrassing when you're the only one with a rubbish connection.

Thanks for the reply though.
 
Old 04-11-2021, 06:28 AM   #13
nuxguy
Member
 
Registered: Oct 2009
Location: Notlob
Distribution: MX Linux
Posts: 41

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by berndbausch View Post
If you often look at years-old data, keep years-old data in that file.
Is the data valuable enough to require a backup. Perhaps that is the only factor you need to consider.
Thanks berndbausch.

TBH I rarely check the file. As for backups, I'm an old mainframe guy so backups are in my blood. I've probably got a better backup system than most large companies, with the exception of a Faraday cage backup on a remote fallback site.
 
Old 04-11-2021, 06:31 AM   #14
nuxguy
Member
 
Registered: Oct 2009
Location: Notlob
Distribution: MX Linux
Posts: 41

Original Poster
Rep: Reputation: 0
[SOLVED]

TBH, everyone's reply answered my query ...
Many thanks everyone.
 
Old 04-11-2021, 11:03 AM   #15
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,789

Rep: Reputation: 1201
Code:
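log=output.log
# poor man's logrotate: compress the log once it exceeds 10000 blocks
# (find counts -size in 512-byte blocks by default, so roughly 5 MB)
find "$log" -size +10000 -exec gzip {} \;
# append the new entry: placeholder IP, then the timestamp
{
echo "0.0.0.0"
date
} >>"$log"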
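
One thing to watch with the snippet above: gzip always writes to output.log.gz, so on the second rotation it will find the archive already there and, when run from cron, leave the log uncompressed. A possible variation, as a sketch only, gives each archive a date-stamped name. The function name rotate_if_large and the 5,000,000-byte default are assumptions for illustration, not part of the original post.

```shell
#!/bin/sh
# Sketch: move a log aside under a date-stamped name and gzip it
# once it grows past a size limit given in bytes.

rotate_if_large() {
    file=$1
    max_bytes=${2:-5000000}          # assumed default, about 5 MB
    [ -f "$file" ] || return 0       # nothing to rotate yet
    size=$(wc -c < "$file" | tr -d ' ')
    [ "$size" -gt "$max_bytes" ] || return 0
    archive="$file.$(date +%Y%m%d-%H%M%S)"
    mv "$file" "$archive" && gzip "$archive"
}

# Hypothetical use inside the logging script:
#   log=output.log
#   rotate_if_large "$log"
#   { echo "0.0.0.0"; date; } >>"$log"
```

Because the append with >> recreates the file when it is missing, nothing else is needed to start the new log after a rotation.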
 
  


All times are GMT -5.
