LinuxQuestions.org
Old 03-31-2016, 10:58 AM   #16
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,649
Blog Entries: 4

Rep: Reputation: 3935

A statistics package such as R is often used for this sort of thing, including at data volumes this large.

Interestingly, if the number of distinct commands and the number of distinct users are not outrageously large, a moderate script (Perl, say) can also tackle this sort of problem using in-memory hashes. A hash table keyed by user ID could hold an integer count, and likewise a hash table keyed by command. Or use a so-called "hash of hashes" structure, where each element in a hash keyed by user is itself a hash keyed by command, holding an integer count.

In this approach, the file can be read sequentially, in situ, without being sorted at all. The only requirement is enough RAM to hold one counter per distinct user/command pair, which is probably a very safe assumption these days.
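To make the hash-of-hashes idea concrete, here is a minimal sketch in Python rather than Perl. It assumes whitespace-separated records with the user in the first field and the command in the second; the real log layout is not given in this thread, so the column positions and the name count_commands are only placeholders.

```python
from collections import Counter, defaultdict

def count_commands(lines, user_col=0, cmd_col=1, sep=None):
    """Tally commands per user in one sequential pass; no sorting needed."""
    # Hash of hashes: user -> (command -> count).
    per_user = defaultdict(Counter)
    for line in lines:
        fields = line.split(sep)
        if len(fields) <= max(user_col, cmd_col):
            continue  # skip blank or malformed records
        per_user[fields[user_col]][fields[cmd_col]] += 1
    return per_user

# Hypothetical sample records; a real run would pass an open file instead.
sample = ["alice ls", "bob cat", "alice ls", "alice vi", ""]
counts = count_commands(sample)
for user in sorted(counts):
    for cmd, n in counts[user].most_common():
        print(user, cmd, n)
```

Memory use grows with the number of distinct user/command pairs, not with the number of records, which is why a million lines a day should not be a problem.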
 
Old 03-31-2016, 01:18 PM   #17
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
If RAM ever does become the limit, you can always put the base hashes into disk files...
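One possible way to keep the counts on disk, sketched with Python's standard sqlite3 module. This swaps a small SQLite table in for the "hash in a disk file"; it is an illustration, not necessarily what was meant above. The table name, column names and helper functions are made up for the example.

```python
import sqlite3

def open_counts(path):
    """Open (or create) an on-disk table of per-user command counts."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS counts ("
        "user TEXT, cmd TEXT, n INTEGER NOT NULL DEFAULT 0, "
        "PRIMARY KEY (user, cmd))"
    )
    return db

def add_records(db, records):
    """Increment the stored count for each (user, command) pair."""
    with db:  # one transaction per batch keeps disk writes cheap
        for user, cmd in records:
            db.execute(
                "INSERT OR IGNORE INTO counts (user, cmd) VALUES (?, ?)",
                (user, cmd),
            )
            db.execute(
                "UPDATE counts SET n = n + 1 WHERE user = ? AND cmd = ?",
                (user, cmd),
            )

# ":memory:" keeps this demo self-contained; pass a file name to persist.
db = open_counts(":memory:")
add_records(db, [("alice", "ls"), ("bob", "cat"), ("alice", "ls")])
rows = db.execute(
    "SELECT user, cmd, n FROM counts ORDER BY user, cmd"
).fetchall()
print(rows)
```

With a real file name the counts survive between runs, so each day's log can be added to the running totals.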
 
Old 04-02-2016, 06:33 PM   #18
Lsatenstein
Member
 
Registered: Jul 2005
Location: Montreal Canada
Distribution: Fedora 31 and Tumbleweed (Gnome versions)
Posts: 311
Blog Entries: 1

Rep: Reputation: 59
Quote:
Originally Posted by bosong View Post
The server I am working on has about 1 million records per day. I want to sort out the usernames alongside the commands used, and also get a count for each command. Is there any way I can do it?
If the fields in your file have delimiters, I would use the cut command to create a file with only the columns you need. This pre-sort pass may strip away many megabytes of unnecessary data. Then use the Linux sort utility on the reduced file to achieve your objectives.

If possible, use/write a filter program to extract only the records you need before passing the result to the sort program.
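A rough sketch of such a filter in Python, assuming whitespace-separated fields. The sample layout (timestamp, user, terminal, command) is invented for illustration, as are the names keep_fields, wanted and predicate. It keeps only the chosen columns, much as cut would, then sorts and counts adjacent duplicates.

```python
from itertools import groupby

def keep_fields(lines, wanted=(0, 1), sep=None, predicate=None):
    """Yield only the wanted columns of each record, tab-joined."""
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        if len(fields) <= max(wanted):
            continue  # too few fields: skip malformed records
        if predicate is not None and not predicate(fields):
            continue  # optional record-level filter
        yield "\t".join(fields[i] for i in wanted)

# Hypothetical records: timestamp, user, terminal, command, arguments.
sample = [
    "1459436280 alice pts/0 ls -l",
    "1459436290 bob pts/1 cat",
    "1459436300 alice pts/0 ls /tmp",
    "bad",
]
trimmed = list(keep_fields(sample, wanted=(1, 3)))

# Sort the reduced records, then count runs of identical lines.
tallies = [(rec, sum(1 for _ in grp)) for rec, grp in groupby(sorted(trimmed))]
print(tallies)
```

For a real log, pass an open file (or sys.stdin) instead of the sample list. The trimmed output can equally be written out and handed to the sort utility, with uniq -c doing the counting.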
 
  


All times are GMT -5. The time now is 04:14 AM.
