03-31-2016, 10:58 AM   #16
LQ Guru
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 9,829
Blog Entries: 4

Rep: Reputation: 3551

A basic statistics package, such as "R", is often used for this sort of thing ... including with data volumes this big.

Interestingly, if the number of commands and the number of users are not themselves outrageously large, "a moderate [Perl?] script" can also tackle this sort of problem through the use of in-memory hashes. A hash-table keyed by "user-id" could contain an integer count, and likewise a hash-table keyed by "command." Or you could use a so-called "hash of hashes" structure, where (say ...) each element in a hash keyed by "user" is itself a hash keyed by "command," containing an integer count.

In this approach, the file can be read sequentially, in situ, without being sorted at all. The only requirement is that enough RAM is available to hold one counter per distinct user/command pair (not one per record) ... probably a very safe assumption these days.
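
A minimal sketch of the "hash of hashes" idea, in Python rather than Perl. The record layout is not given anywhere in this thread, so the code simply assumes whitespace-separated lines with the user name in the first field and the command in the second; `count_commands` and the sample lines are made up for illustration.

```python
from collections import Counter, defaultdict


def count_commands(lines, delimiter=None):
    """Tally commands per user in one sequential pass; no sorting needed."""
    per_user = defaultdict(Counter)   # "hash of hashes": user -> command -> count
    per_command = Counter()           # command -> total count across all users
    for line in lines:
        fields = line.split(delimiter)
        if len(fields) < 2:
            continue                  # skip blank or malformed records
        user, command = fields[0], fields[1]   # assumed column positions
        per_user[user][command] += 1
        per_command[command] += 1
    return per_user, per_command


# Made-up sample records; a real run would pass an open file object instead.
sample = [
    "alice ls",
    "bob cat",
    "alice ls",
    "alice grep",
]
per_user, per_command = count_commands(sample)
for user in sorted(per_user):
    for command, n in sorted(per_user[user].items()):
        print(user, command, n)
```

Because an open file yields its lines lazily, passing one to `count_commands` keeps only the counters in memory, never the whole file.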
03-31-2016, 01:18 PM   #17
Senior Member
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
You can always put the base hashes into disk files...
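
One way this might look, as a rough sketch only: Python's standard `dbm` module stores a key/value hash in a disk file, so the counts do not have to fit in RAM. The key layout ("user<TAB>command"), the function names, and the sample records are assumptions for illustration, not anything specified in this thread.

```python
import dbm
import os
import tempfile


def count_to_disk(lines, db_path):
    """Keep 'user<TAB>command' -> count in an on-disk dbm file instead of RAM."""
    with dbm.open(db_path, "c") as db:          # "c": create the file if missing
        for line in lines:
            fields = line.split()
            if len(fields) < 2:
                continue                        # skip blank or malformed records
            key = f"{fields[0]}\t{fields[1]}".encode()
            current = int(db[key]) if key in db else 0
            db[key] = str(current + 1).encode() # dbm stores bytes, not integers


def read_counts(db_path):
    """Load the stored counts back as a plain dict of str -> int."""
    with dbm.open(db_path, "r") as db:
        return {k.decode(): int(db[k]) for k in db.keys()}


# Made-up sample records written to a throwaway directory.
workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "counts")
count_to_disk(["alice ls", "bob cat", "alice ls"], db_path)
counts = read_counts(db_path)
for key in sorted(counts):
    print(key, counts[key])
```

Every increment here is a disk lookup and write, so this is likely to be much slower than the in-memory version; it mainly makes sense when the number of distinct keys really is too large for RAM.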
04-02-2016, 06:33 PM   #18
Registered: Jul 2005
Location: Montreal Canada
Distribution: Fedora 31 and Tumbleweed (Gnome versions)
Posts: 310
Blog Entries: 1

Rep: Reputation: 59
Originally Posted by bosong
The server I am working on has maybe about 1 million records per day. I want to sort out the username alongside the command used, and also get a count for each command. Is there any way I can do it?
If the fields in your file have delimiters, I would use the cut command to create a file with only the columns you need. Trimming before the sort can chop away many millions of bytes of unnecessary data. Then use the Linux sort utility on the trimmed file to achieve your objectives.

If possible, use/write a filter program to extract only the records you need before passing the result to the sort program.
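
As a rough illustration of that trim, filter, then sort sequence, here is a small Python sketch. It mimics what a shell pipeline along the lines of `cut -d: -f2,4 | sort | uniq -c` would do. The colon-delimited layout, the column positions, and the "drop root" filter are all made-up assumptions, since the real file format is not shown in this thread.

```python
from itertools import groupby


def project(lines, delimiter, columns):
    """Like `cut`: keep only the chosen 0-based columns of each record."""
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        if len(fields) > max(columns):          # skip records that are too short
            yield tuple(fields[i] for i in columns)


def sorted_counts(records):
    """Sort the trimmed records, then count each run of identical ones."""
    return [(key, sum(1 for _ in run)) for key, run in groupby(sorted(records))]


# Hypothetical colon-delimited records: timestamp:user:tty:command
sample = [
    "1001:bob:pts/0:cat",
    "1002:alice:pts/1:ls",
    "1003:alice:pts/1:ls",
    "1004:root:pts/2:reboot",
]
trimmed = project(sample, ":", (1, 3))                   # keep user and command
wanted = (rec for rec in trimmed if rec[0] != "root")    # optional filter step
result = sorted_counts(wanted)
for (user, command), n in result:
    print(user, command, n)
```

Unlike the hash approach, `sorted` here holds every trimmed record in memory at once; for files too big for that, the external sort utility (which can spill to temporary files) is the safer tool.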

