LinuxQuestions.org
Old 03-28-2019, 06:32 AM   #1
Bartonsen
Member
 
Registered: Oct 2016
Posts: 41

Rep: Reputation: Disabled
How to check if a list of numbers are between a range of numbers from file


I have two files that look like file1 and file2 below. (The original files are bigger and contain more columns, but you get the picture.)
The 10-digit numbers are epoch timestamps, and the date corresponds to the first timestamp.

Task: For every epoch timestamp in file1, I want to find which range in file2 it belongs to.
A range in file2 runs from column 3 to column 4.
The output could look something like this :

1553084280 is between A B 1553106952 1553149296 2019-03-20
1553161680 is between A B 1553366710 1553408326 2019-03-23
1553253660 is not between any range in file2
etc...

Can anyone give some hints on how to do this?
I guess it can be done with for loops and if statements that scan through the ranges, but I don't know where to start.

file1
Code:
1553075280 2019-03-20 50
1553084280 2019-03-20 50
1553161680 2019-03-21 50
1553253660 2019-03-22 50
1553265360 2019-03-22 50
1553278380 2019-03-22 50
1553285220 2019-03-22 50
1553461200 2019-03-24 50
1553533680 2019-03-25 50
1553563800 2019-03-26 50
file2
Code:
B A 1553069354 1553102191 2019-03-20
A B 1553106952 1553149296 2019-03-20
B A 1553154911 1553188655 2019-03-21
A B 1553193719 1553235582 2019-03-21
B A 1553241380 1553274996 2019-03-22
A B 1553279511 1553321916 2019-03-22
B A 1553327895 1553361459 2019-03-23
A B 1553366710 1553408326 2019-03-23
B A 1553414000 1553447801 2019-03-24
A B 1553452378 1553494663 2019-03-24
 
Old 03-28-2019, 08:37 AM   #2
Bartonsen
Member
 
Registered: Oct 2016
Posts: 41

Original Poster
Rep: Reputation: Disabled
Alternatively, I could make file2 look like this:

Code:
1  B A 1553069354 1553102191 2019-03-20
2  A B 1553106952 1553149296 2019-03-20
3  B A 1553154911 1553188655 2019-03-21
4  A B 1553193719 1553235582 2019-03-21
5  B A 1553241380 1553274996 2019-03-22
6  A B 1553279511 1553321916 2019-03-22
7  B A 1553327895 1553361459 2019-03-23
8  A B 1553366710 1553408326 2019-03-23
9  B A 1553414000 1553447801 2019-03-24
10 A B 1553452378 1553494663 2019-03-24
If the timestamp was 1553249000, how can I get the output to be "timestamp 1553249000 is in range line 5..." ?
 
Old 03-28-2019, 08:53 AM   #3
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (15 current), Slack15, Ubuntu studio, MX Linux, FreeBSD 13.1, WIn10
Posts: 10,342

Rep: Reputation: 2242
You have two inputs, and each value from one has to be checked against the entries in the other to find out whether it matches.

Two loops and a conditional are required. The messages that report what is found are up to the developer; that is where knowing how to format output comes in handy.

What language do you want to use?
Anything from a scripting language to a programming language can be used to accomplish this.

(Heck, awk might even be able to do this on its own.)
In C:
Code:
#define _POSIX_C_SOURCE 200809L  /* exposes getline() and ssize_t */
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char **argv)
{

    FILE *fptr1, *fptr2;
    char * line1 = NULL;
    char * line2 = NULL;
    size_t len1 = 0;
    size_t len2 = 0;
    ssize_t read1, read2;

    if (argc < 3)  /* program name plus two file names */
    {
        printf("need two files to open\n");
        exit(EXIT_FAILURE);
    }

    fptr1 = fopen(argv[1],"r");
    fptr2 = fopen(argv[2],"r");

    if (fptr1 == NULL)
    {
        printf("Cannot open file1\n");
        exit(EXIT_FAILURE);
    }
    if (fptr2 == NULL)
    {
        printf("Cannot open file2\n");
        exit(EXIT_FAILURE);
    }
    // Here is where the comparing would start.
    // For now this just reads each file line by line and
    // prints every line separately.
    // getline() keeps the trailing newline, so none is added here.
    while ((read1 = getline(&line1, &len1, fptr1)) != -1)
    {
        printf("%s", line1);
    }

    while ((read2 = getline(&line2, &len2, fptr2)) != -1)
    {
        printf("%s", line2);
    }

    fclose(fptr1);
    fclose(fptr2);
    free(line1);
    free(line2);
    return 0;
}

Last edited by BW-userx; 03-28-2019 at 10:10 AM.
 
Old 03-28-2019, 09:37 AM   #4
Bartonsen
Member
 
Registered: Oct 2016
Posts: 41

Original Poster
Rep: Reputation: Disabled
Thanks for your reply.
I'm not very experienced in any programming language, so I would prefer bash scripting, using awk, sed etc, and for/while loops...
 
Old 03-28-2019, 09:55 AM   #5
Guttorm
Senior Member
 
Registered: Dec 2003
Location: Trondheim, Norway
Distribution: Debian and Ubuntu
Posts: 1,453

Rep: Reputation: 447
Hi

The output is not exactly what you want, but something like this should get you started.

Read column 1 of file1 in a loop, then print the timestamp. And then print all lines in file2 where the timestamp is >= column 3 and <= column 4

Code:
awk '{print $1}' file1.txt |
while read timestamp ; do
        echo $timestamp
        awk "$timestamp >= \$3 && $timestamp <= \$4 {print}" file2.txt
done
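
The loop above starts one awk process per timestamp and does not say when nothing matches. As a rough sketch only (not something posted in the thread), a single awk pass could load the ranges first and then test each timestamp, printing the message format asked for in post #1 plus the line number asked for in post #2. The function name check_ranges and the temporary demo files are made up for illustration, and the demo uses just a few rows of the sample data.

```shell
#!/bin/sh
# check_ranges is a hypothetical helper name, not from the thread.
check_ranges() {
    # $1 = timestamps file (timestamp in column 1)
    # $2 = ranges file (range start in column 3, range end in column 4)
    awk '
        # First file given to awk (the ranges): remember every line.
        NR == FNR { lo[FNR] = $3; hi[FNR] = $4; line[FNR] = $0; n = FNR; next }
        # Second file (the timestamps): test against every stored range.
        {
            found = 0
            for (i = 1; i <= n; i++)
                if ($1 + 0 >= lo[i] + 0 && $1 + 0 <= hi[i] + 0) {
                    printf "%s is between %s (line %d)\n", $1, line[i], i
                    found = 1
                }
            if (!found) print $1, "is not between any range in file2"
        }
    ' "$2" "$1"
}

# Demo with a few rows taken from the sample data in post #1.
dir=$(mktemp -d)
cat > "$dir/file1.txt" <<'EOF'
1553075280 2019-03-20 50
1553278380 2019-03-22 50
1553285220 2019-03-22 50
EOF
cat > "$dir/file2.txt" <<'EOF'
B A 1553069354 1553102191 2019-03-20
A B 1553106952 1553149296 2019-03-20
B A 1553241380 1553274996 2019-03-22
A B 1553279511 1553321916 2019-03-22
EOF
check_ranges "$dir/file1.txt" "$dir/file2.txt"
```

Passing the timestamp as awk input rather than pasting it into the awk program text also avoids the escaped \$3 and \$4. The sketch assumes the ranges file is not empty, since the NR == FNR test relies on that.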
 
2 members found this post helpful.
Old 03-28-2019, 10:12 AM   #6
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (15 current), Slack15, Ubuntu studio, MX Linux, FreeBSD 13.1, WIn10
Posts: 10,342

Rep: Reputation: 2242
n/m

Last edited by BW-userx; 03-28-2019 at 10:15 AM.
 
Old 03-28-2019, 10:36 AM   #7
Bartonsen
Member
 
Registered: Oct 2016
Posts: 41

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Guttorm
Hi

The output is not exactly what you want, but something like this should get you started.

Read column 1 of file1 in a loop, then print the timestamp. And then print all lines in file2 where the timestamp is >= column 3 and <= column 4

Code:
awk '{print $1}' file1.txt |
while read timestamp ; do
        echo $timestamp
        awk "$timestamp >= \$3 && $timestamp <= \$4 {print}" file2.txt
done
Thanks Guttorm, I think this will do it!
 
Old 03-28-2019, 04:54 PM   #8
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Quote:
Originally Posted by Guttorm
Code:
awk '{print $1}' file1.txt |
while read timestamp ; do
        echo $timestamp
        awk "$timestamp >= \$3 && $timestamp <= \$4 {print}" file2.txt
done
How many lines should the OutFile contain? I think 20, but when I ran this code it produced 17.

Daniel B. Martin

.
 
Old 03-28-2019, 05:05 PM   #9
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
With this InFile1 ...
Code:
1553075280 2019-03-20 50
1553084280 2019-03-20 50
1553161680 2019-03-21 50
1553253660 2019-03-22 50
1553265360 2019-03-22 50
1553278380 2019-03-22 50
1553285220 2019-03-22 50
1553461200 2019-03-24 50
1553533680 2019-03-25 50
1553563800 2019-03-26 50
... and this InFile2 ...
Code:
B A 1553069354 1553102191 2019-03-20
A B 1553106952 1553149296 2019-03-20
B A 1553154911 1553188655 2019-03-21
A B 1553193719 1553235582 2019-03-21
B A 1553241380 1553274996 2019-03-22
A B 1553279511 1553321916 2019-03-22
B A 1553327895 1553361459 2019-03-23
A B 1553366710 1553408326 2019-03-23
B A 1553414000 1553447801 2019-03-24
A B 1553452378 1553494663 2019-03-24
... this code ...
Code:
 sed 's/^/~ ~ /'  $InFile1  \
|sort -nk 3     - $InFile2  \
|sed 's/^~ ~ //' >$OutFile
... produced this OutFile ...
Code:
B A 1553069354 1553102191 2019-03-20
1553075280 2019-03-20 50
1553084280 2019-03-20 50
A B 1553106952 1553149296 2019-03-20
B A 1553154911 1553188655 2019-03-21
1553161680 2019-03-21 50
A B 1553193719 1553235582 2019-03-21
B A 1553241380 1553274996 2019-03-22
1553253660 2019-03-22 50
1553265360 2019-03-22 50
1553278380 2019-03-22 50
A B 1553279511 1553321916 2019-03-22
1553285220 2019-03-22 50
B A 1553327895 1553361459 2019-03-23
A B 1553366710 1553408326 2019-03-23
B A 1553414000 1553447801 2019-03-24
A B 1553452378 1553494663 2019-03-24
1553461200 2019-03-24 50
1553533680 2019-03-25 50
1553563800 2019-03-26 50
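
The merged listing still leaves it to the reader to spot which timestamp lines fall inside the range line above them. As a sketch only, the same decorate-and-sort pipeline could feed an awk step that remembers the most recent range line and labels each timestamp. This assumes the ranges in file2 do not overlap and that no file2 line starts with "~"; the name merge_and_label and the demo files are hypothetical, and LC_ALL=C is added only to make tie-breaking predictable.

```shell
#!/bin/sh
# merge_and_label is a hypothetical helper name, not from the thread.
merge_and_label() {
    # $1 = timestamps file, $2 = ranges file
    # Prefix "~ ~ " moves the timestamp to field 3 so both files sort on the same key.
    sed 's/^/~ ~ /' "$1" |
    LC_ALL=C sort -n -k 3,3 - "$2" |
    awk '
        # Range line from the ranges file: remember it and move on.
        $1 != "~" { start = $3; end = $4; range = $0; have = 1; next }
        # Decorated timestamp line: field 3 holds the timestamp.
        {
            if (have && $3 + 0 >= start + 0 && $3 + 0 <= end + 0)
                print $3, "is between", range
            else
                print $3, "is not between any range in file2"
        }
    '
}

# Demo with a few rows taken from the sample data.
dir=$(mktemp -d)
cat > "$dir/InFile1" <<'EOF'
1553075280 2019-03-20 50
1553278380 2019-03-22 50
1553285220 2019-03-22 50
EOF
cat > "$dir/InFile2" <<'EOF'
B A 1553069354 1553102191 2019-03-20
A B 1553106952 1553149296 2019-03-20
B A 1553241380 1553274996 2019-03-22
A B 1553279511 1553321916 2019-03-22
EOF
merge_and_label "$dir/InFile1" "$dir/InFile2"
```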
Daniel B. Martin

.

Last edited by danielbmartin; 03-28-2019 at 05:07 PM. Reason: Tighten the code, slightly
 
1 members found this post helpful.
Old 04-02-2019, 08:36 PM   #10
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
An all-awk solution...

With this InFile1 ...
Code:
1553075280 2019-03-20 50
1553084280 2019-03-20 50
1553161680 2019-03-21 50
1553253660 2019-03-22 50
1553265360 2019-03-22 50
1553278380 2019-03-22 50
1553285220 2019-03-22 50
1553461200 2019-03-24 50
1553533680 2019-03-25 50
1553563800 2019-03-26 50
... and this InFile2 ...
Code:
B A 1553069354 1553102191 2019-03-20
A B 1553106952 1553149296 2019-03-20
B A 1553154911 1553188655 2019-03-21
A B 1553193719 1553235582 2019-03-21
B A 1553241380 1553274996 2019-03-22
A B 1553279511 1553321916 2019-03-22
B A 1553327895 1553361459 2019-03-23
A B 1553366710 1553408326 2019-03-23
B A 1553414000 1553447801 2019-03-24
A B 1553452378 1553494663 2019-03-24
... this awk ...
Code:
awk '{if (NF==3) a[j++]=$1" "
           else {a[j++]=$3" "$0
                 a[j++]=$4" end"}}
  END{n=asort(a,b)
      for (k=1;k<=n;k++)
        {l=length(b[k])
         if (l<12 &&  flag) print b[k]"is within",Range
         if (l<12 && !flag) print b[k]"is outside any range"
         if (l>12) {flag=!flag;
           Range=substr(b[k],1+index(b[k]," "))}}}'  \
$InFile1 $InFile2 >$OutFile
... produced this OutFile ...
Code:
1553075280 is within B A 1553069354 1553102191 2019-03-20
1553084280 is within B A 1553069354 1553102191 2019-03-20
1553161680 is within B A 1553154911 1553188655 2019-03-21
1553253660 is within B A 1553241380 1553274996 2019-03-22
1553265360 is within B A 1553241380 1553274996 2019-03-22
1553278380 is outside any range
1553285220 is within A B 1553279511 1553321916 2019-03-22
1553461200 is within A B 1553452378 1553494663 2019-03-24
1553533680 is outside any range
1553563800 is outside any range
The awk could be made (slightly) more efficient by coding nested if statements but it would be less readable.
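
A side note on portability and speed: asort() is a GNU awk extension, so the awk above needs gawk. For much bigger files, one possible alternative that sticks to POSIX awk is a binary search over the range starts. This is only a sketch under the assumption that the ranges file is already sorted by column 3 and that the ranges do not overlap, which holds for the sample data; find_ranges and the demo files are hypothetical names.

```shell
#!/bin/sh
# find_ranges is a hypothetical helper name, not from the thread.
find_ranges() {
    # $1 = timestamps file
    # $2 = ranges file, assumed sorted by column 3 with non-overlapping ranges
    awk '
        # First file given to awk (the ranges): store starts, ends and full lines.
        NR == FNR { lo[FNR] = $3 + 0; hi[FNR] = $4 + 0; line[FNR] = $0; n = FNR; next }
        {
            ts = $1 + 0
            # Binary search for the last range whose start is <= ts.
            left = 1; right = n; hit = 0
            while (left <= right) {
                mid = int((left + right) / 2)
                if (lo[mid] <= ts) { hit = mid; left = mid + 1 }
                else right = mid - 1
            }
            if (hit && ts <= hi[hit])
                print $1, "is within", line[hit]
            else
                print $1, "is outside any range"
        }
    ' "$2" "$1"
}

# Demo with a few rows taken from the sample data.
dir=$(mktemp -d)
cat > "$dir/InFile1" <<'EOF'
1553075280 2019-03-20 50
1553278380 2019-03-22 50
1553285220 2019-03-22 50
EOF
cat > "$dir/InFile2" <<'EOF'
B A 1553069354 1553102191 2019-03-20
A B 1553106952 1553149296 2019-03-20
B A 1553241380 1553274996 2019-03-22
A B 1553279511 1553321916 2019-03-22
EOF
find_ranges "$dir/InFile1" "$dir/InFile2"
```

Each lookup then costs about log2(n) comparisons instead of a scan over all n ranges, and the input order of the timestamps is preserved.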

Daniel B. Martin

.
 
  

