LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Programming (https://www.linuxquestions.org/questions/programming-9/)
-   -   C program: memory leak/ segmentation fault/ memory limit exceeded (https://www.linuxquestions.org/questions/programming-9/c-program-memory-leak-segmentation-fault-memory-limit-exceeded-4175718711/)

AlinaC 11-12-2022 08:51 AM

C program: memory leak/ segmentation fault/ memory limit exceeded
 
Hi all,

thank you for all constructive feedback.

Sorry about "annoying" thing. Haven't remarked it was still there before posting. This was a joke from the university, as our professor told us we could write both input[n]=x or n[input]=x but the latter was annoying :).

I have replaced "realloc" with "malloc", so this piece works (I hope).

But now I see the other line where error appears and it looks like the real issue.
[CODE]long int * frequency = (long int*) malloc(2 * MAX * MAX * sizeof(long int));[CODE]

This is the part where I need to go through my array (where I've saved sums of all pairs on the previous step) and find which numbers/sums are repeating and how many times. I've done it easy but ineffectively and need to change.
As of now, I check an array item, and if it contains the number X, I add +1 to the item X in a frequency array, eg frequency[x]+=1.

I've read some articles on Internet and it looks like dictionary can be a better structure to use for this purpose. (I need to check every item in an array and create a list of unique numbers/sums that are present in this array; each number/sum to be associated with the frequency).

Do you think dictionary will be a good choice for that? I basically need to store pairs int->int, specifically sum(a number stored in an array item) -> frequency (how many times this sum was contained in different array items).

Thanks
Alina

------the task ----------------
The problem is to develop a program to analyze input sequence.

Assume a sequence of numbers. We may choose a contiguous range from the input sequence, for instance, we may choose the range between the 5-th number and the 10-th number. A range must contain at least two numbers. The sum of the numbers may be computed for such range. We choose all possible ranges from the input sequence and we compute the sum for each of them. We are interested how many different ranges exist in the input sequence such that these ranges have the same sum.

For instance, the input sequence may be:
1 5 2 4 2 2 2
there is a total of 21 contiguous ranges of length ≥ 2, in particular:
0..1: 1 5 sum=6
0..2: 1 5 2 sum=8
0..3: 1 5 2 4 sum=12
0..4: 1 5 2 4 2 sum=14
0..5: 1 5 2 4 2 2 sum=16
0..6: 1 5 2 4 2 2 2 sum=18
1..2: 5 2 sum=7
1..3: 5 2 4 sum=11
1..4: 5 2 4 2 sum=13
1..5: 5 2 4 2 2 sum=15
1..6: 5 2 4 2 2 2 sum=17
2..3: 2 4 sum=6
2..4: 2 4 2 sum=8
2..5: 2 4 2 2 sum=10
2..6: 2 4 2 2 2 sum=12
3..4: 4 2 sum=6
3..5: 4 2 2 sum=8
3..6: 4 2 2 2 sum=10
4..5: 2 2 sum=4
4..6: 2 2 2 sum=6
5..6: 2 2 sum=4
Of these 21 ranges, there are 12 pairs such that the ranges in the pair have the same sum:
0..1 ~ 2..3 sum=6
0..1 ~ 3..4 sum=6
0..1 ~ 4..6 sum=6
0..2 ~ 2..4 sum=8
0..2 ~ 3..5 sum=8
0..3 ~ 2..6 sum=12
2..3 ~ 3..4 sum=6
2..3 ~ 4..6 sum=6
2..4 ~ 3..5 sum=8
2..5 ~ 3..6 sum=10
3..4 ~ 4..6 sum=6
4..5 ~ 5..6 sum=4
The input of the program is a sequence of integers (positive, zero, and negative values). The reading of the input sequence is terminated when EOF is reached. The input sequence is at most 2000 numbers long.

The output of the program is the total number of pairs of ranges such that the sums of the ranges are the same.

---------whole code of program [updated] ---------
Code:

#include <stdio.h>
#include <stdlib.h>
#define MAX 2000

//Take in the list of numbers
//As the list is input, loop within the j-arrays
//Loop through the result & get data into frequency array
//Loop through the frequency array to get the result


int choose(int n, int k){
    if (k == 0) return 1;
    return (n * choose(n - 1, k - 1)) / k;
}


int main(){
       

//***************************
//takes input into the array
int * input = (int *) malloc (MAX * sizeof(int));

int n = 0, x;
printf("Input sequence:\n");

while ( scanf ( "%d", & x ) == 1 ) {
                if (n > MAX){
                                free ( input );
                                printf ( "Invalid input.\n" );
                                return EXIT_FAILURE;
                        }

                input [n] = x;
                ++n;
        }
if ( ! feof ( stdin ) ) {
                printf ( "Invalid input.\n" );
                free ( input );
                return EXIT_FAILURE;
        }
       
if ( n == 0 ) {
                printf ( "Invalid input.\n" );
                free ( input );
                return EXIT_FAILURE;
        }
       
//****************************
//Finding sums
long int * pointer  = (long int*) malloc(((n-1) * (n-1)) * sizeof(long int));


for (int i = 0; i < n-1; i ++){
        //i is the column now
        //i + 2 is the N of elements in the sum
        for (int j = 0; j< n-1-i; j ++){
                //Add jth, ..., j+i+1 th element
                for (int k = j; k <= j+i+1; k++){
                *(pointer + (n-1)*i + j)  += input[k];
        }
        }
}

free(input);

//*******************************
//getting frequency
long int * frequency  = (long int*) malloc(2 * MAX * MAX  * sizeof(long int));


for (int i = 0; i < n-1; i ++){
        for (int j = 0; j < n-1-i; j ++){
                *(frequency + MAX + *(pointer + (n-1) * i + j)) += 1;
        }}


//*****************************
//Finding the max N of pairs
free(pointer);

int pairs = 0;

for (int i = 0; i < 2 * MAX * MAX; i++){
        if (i[frequency] == 2){
                pairs += 1;
        }
        else if (i[frequency] > 2){
                pairs += choose(i[frequency], 2);
} }





//******************************
//Printing whatever i need to print

free (frequency);

printf("Total pairs: %d\n", pairs);

return 0;}

//Maybe, it just wasn't meant to be


/*
 for (int i = 0; i < n-1; i ++){
        for (int j = 0; j < n-1-i; j ++){
        printf("%ld ", *(pointer + (n-1)*i + j));
        }
        printf("\n");}
 */


rtmistler 11-12-2022 10:18 AM

I've written blogs about how to debug core dumps and use the debugger.

Please see the link in my signature.

dugan 11-12-2022 10:43 AM

Quote:

Originally Posted by AlinaC (Post 6391863)
Debugging has shown that it relates to the following line of code:

Code:

input = (int * ) realloc ( input, sizeof ( * input) * max );

I'm going to shove a rubber duck in your face. Explain to that duck what that line is supposed to do.

pan64 11-12-2022 12:25 PM

you need to check the return value of malloc, realloc and in general any function. Just to know if they failed or not.
you can use valgrind to check memory related issues
you did not provide enough info, for example what did you exactly entered/tried when it failed.
you need also explain how did you compile/build this code (including the versions and also the OS).

Oh yes, what is this good for?
Code:

n [ input ] = x;

GazL 11-12-2022 01:03 PM

Quote:

Originally Posted by pan64 (Post 6391897)
Oh yes, what is this good for?
Code:

n [ input ] = x;

Yeah, I'm surprised that didn't throw a warning.

ntubski 11-12-2022 06:39 PM

Quote:

Originally Posted by pan64 (Post 6391897)
Oh yes, what is this good for?
Code:

n [ input ] = x;

As clearly explained by the comment on the previous line, it's good for annoying people.

(see also https://c-faq.com/aryptr/joke.html)

NevemTeve 11-13-2022 12:53 AM

Code:

if (n >= max) {
    max += max/2 + 10;
    input = (int * ) realloc (input, sizeof (*input) * max);
}

This wouldn't work in generic case (where n can be greater than max+(max/2)+10) but here n is incremented by one in every step, so 'n<=max' should be true before this part and 'n<max' after it.
Just to be sure, these checks could be added:
Code:

if (n > max) exit (60);
if (n == max) {
    max += max/2 + 10;
    input = (int * ) realloc (input, sizeof (*input) * max);
    if (input==NULL) exit (61);
}


pan64 11-13-2022 02:15 AM

Quote:

Originally Posted by ntubski (Post 6391972)
As clearly explained by the comment on the previous line, it's good for annoying people.

(see also https://c-faq.com/aryptr/joke.html)

That clearly will not annoy me. It has definitely different consequences.

AlinaC 11-13-2022 07:05 AM

Quote:

Originally Posted by rtmistler (Post 6391879)
I've written blogs about how to debug core dumps and use the debugger.

Please see the link in my signature.

Thank you for reference, will read it to learn more about debugging

AlinaC 11-13-2022 07:07 AM

Quote:

Originally Posted by dugan (Post 6391883)
I'm going to shove a rubber duck in your face. Explain to that duck what that line is supposed to do.

I didn't get what you intended to say with this phrase. Was it intentionally rude - or you just was asking to explain what this line of code was intended to do?

AlinaC 11-13-2022 07:16 AM

Quote:

Originally Posted by pan64 (Post 6391897)
you need to check the return value of malloc, realloc and in general any function. Just to know if they failed or not.
you can use valgrind to check memory related issues
you did not provide enough info, for example what did you exactly entered/tried when it failed.
you need also explain how did you compile/build this code (including the versions and also the OS).

Oh yes, what is this good for?
Code:

n [ input ] = x;

Thank you for your questions and advice, this was the most constructive answer!
I've used valgrind, it doesn't find anything.

First I compiled the code with g++ v11.3.0 in Ubuntu, no error. When I tried uploading my code to the university environment, the error appeared. I've tested then this code in Vs Code and then Visual Studio in Windows and got error messages with more info about where the problem could be.

I've tried with 2 examples from below, both failed with the same error
Example program output:
Input sequence:
1 5 2 4
Total pairs: 1

Input sequence:
1 5 2 4 2 2 2
Total pairs: 12

AlinaC 11-13-2022 07:18 AM

Quote:

Originally Posted by NevemTeve (Post 6392004)
Code:

if (n >= max) {
    max += max/2 + 10;
    input = (int * ) realloc (input, sizeof (*input) * max);
}

This wouldn't work in generic case (where n can be greater than max+(max/2)+10) but here n is incremented by one in every step, so 'n<=max' should be true before this part and 'n<max' after it.
Just to be sure, these checks could be added:
Code:

if (n > max) exit (60);
if (n == max) {
    max += max/2 + 10;
    input = (int * ) realloc (input, sizeof (*input) * max);
    if (input==NULL) exit (61);
}


Thank you for advice. After reading more about malloc/realloc, I've decided malloc should work better for my case and changed the code. Now debugging shows an error on another line: long int * frequency = (long int*) malloc(2 * MAX * MAX * sizeof(long int));
I've written about it in the updated text of my question.

NevemTeve 11-13-2022 07:35 AM

At that point, realloc is perfectly valid, as it keeps the already filled values. It's the later malloc-s where 'n' should be used instead of 'MAX'

Anyways, I tried the latest state of your program with valgrind; it showed that 'frequency' is uninitialized: 'malloc' allocates memory, but doesn't clear it; 'calloc' does that:
Code:

- long int * frequency  = (long int*) malloc(2 * MAX * MAX  * sizeof(long int));
+ long int * frequency  = (long int*) calloc(2 * n * n, sizeof(long int));


boughtonp 11-13-2022 07:55 AM

Quote:

Originally Posted by AlinaC (Post 6392034)
I didn't get what you intended to say with this phrase. Was it intentionally rude - or you just was asking to explain what this line of code was intended to do?

It's a reference to a common debugging technique - instead of asking a colleague for help, you first describe the issue to a rubber duck and imagine it responding with questions as that co-worker might, which you then answer.

Frequently, the action of describing a problem and considering common questions is enough to resolve an issue, but even when it's not it means you've done some basic diagnostic steps that allow you to rule certain things out before taking the issue to someone who isn't a duck.


NevemTeve 11-13-2022 08:05 AM

A note: I'd start with ensuring that all elements are positive: if the minimum of all elements is not positive, then add 1-minimum to every elements.

After that we have n-1 series of nested intervals (only the sum of the elements is relevant; as the elements are positive, these sum-series are increasing).
Code:

input:    1 2 4  3  5  7  1 (n=7)
series#1:  3 7 10 15 22 23
series#2:    6  9 14 21 22
series#3:        7 12 19 20
series#4:          8 15 16
series#5:            12 13
series#6:                8

Now we only have to find the common elements in these list.

rtmistler 11-13-2022 08:07 AM

The points people are trying to drive home are:
  • Debug techniques are important to learn, there are multiple methods and they each benefit you. The old printf() can only take you so far.
  • Actually read the code you've written. Consider that you may possibly be making assumptions about how certain sections and library calls work. Review your code to determine if you fully understand every line you wrote and every return from library calls as well as have called those functions properly.
  • There are several sources to check and confirm your interim work: valgrind, clang, compiler warnings and errors and increased scrutiny by optional compilation flags, and the ability to run, freeze execution, and examine variables.

dugan 11-13-2022 10:58 AM

Quote:

Originally Posted by AlinaC (Post 6392034)
I didn't get what you intended to say with this phrase. Was it intentionally rude - or you just was asking to explain what this line of code was intended to do?

https://en.wikipedia.org/wiki/Rubber_duck_debugging

NevemTeve 11-13-2022 11:22 AM

Off: now I think my idea to make elements positive by adding a constant value was wrong... So right now I think we have to calculate the n(n-1)/2 interval-sums, then perform a sort to find the duplications.

GazL 11-13-2022 01:37 PM

I find the exercise a little conceptually odd, and I don't fully get this concept of "pairs" rather than just reporting the number of duplicate results, but my first run using the input from the example gives the same results as described in OPs description, so I assume I've understood the goal correctly.
Code:

$ time -p (echo "1 5 2 4 2 2 2" | ./sum_range )
Input items: 7
Range:  0..1: sum = 6
Paired: 0..1 ~ 2..3: sum = 6
Paired: 0..1 ~ 3..4: sum = 6
Paired: 0..1 ~ 4..6: sum = 6
Range:  0..2: sum = 8
Paired: 0..2 ~ 2..4: sum = 8
Paired: 0..2 ~ 3..5: sum = 8
Range:  0..3: sum = 12
Paired: 0..3 ~ 2..6: sum = 12
Range:  0..4: sum = 14
Range:  0..5: sum = 16
Range:  0..6: sum = 18
Range:  1..2: sum = 7
Range:  1..3: sum = 11
Range:  1..4: sum = 13
Range:  1..5: sum = 15
Range:  1..6: sum = 17
Range:  2..3: sum = 6
Paired: 2..3 ~ 3..4: sum = 6
Paired: 2..3 ~ 4..6: sum = 6
Range:  2..4: sum = 8
Paired: 2..4 ~ 3..5: sum = 8
Range:  2..5: sum = 10
Paired: 2..5 ~ 3..6: sum = 10
Range:  2..6: sum = 12
Range:  3..4: sum = 6
Paired: 3..4 ~ 4..6: sum = 6
Range:  3..5: sum = 8
Range:  3..6: sum = 10
Range:  4..5: sum = 4
Paired: 4..5 ~ 5..6: sum = 4
Range:  4..6: sum = 6
Range:  5..6: sum = 4
Total Ranges: 21
Total Paired: 12
real 0.00
user 0.00
sys 0.00

However, my simple solution doesn't scale well:
Code:

$ time -p ( seq 1 100| ./sum_range | grep Total )
Total Ranges: 4950
Total Paired: 4342
real 0.02
user 0.02
sys 0.00
$ time -p ( seq 1 200| ./sum_range | grep Total )
Total Ranges: 19900
Total Paired: 21302
real 0.24
user 0.24
sys 0.00
$ time -p ( seq 1 400| ./sum_range | grep Total )
Total Ranges: 79800
Total Paired: 100903
real 3.50
user 3.49
sys 0.01
$


dugan 11-13-2022 06:09 PM

Quote:

Originally Posted by AlinaC (Post 6391863)
I've read some articles on Internet and it looks like dictionary can be a better structure to use for this purpose. (I need to check every item in an array and create a list of unique numbers/sums that are present in this array; each number/sum to be associated with the frequency).

Do you think dictionary will be a good choice for that? I basically need to store pairs int->int, specifically sum(a number stored in an array item) -> frequency (how many times this sum was contained in different array items).

Yes :) Although you'd probably call your implementation a hash table (means the same thing).

AlinaC 11-17-2022 02:40 PM

Quote:

Originally Posted by boughtonp (Post 6392046)
It's a reference to a common debugging technique - instead of asking a colleague for help, you first describe the issue to a rubber duck and imagine it responding with questions as that co-worker might, which you then answer.

Frequently, the action of describing a problem and considering common questions is enough to resolve an issue, but even when it's not it means you've done some basic diagnostic steps that allow you to rule certain things out before taking the issue to someone who isn't a duck.


Thank you for explaining! However before posting a question, I've talked already to myself, my friends from the Uni, and even my mom ;) so this condition was fulfilled )

boughtonp 11-17-2022 05:22 PM


 
Yeah, but how many of those are ducks? :)


Anyway, not clear on the current status of the thread... do you still have an outstanding memory leak, or a different issue, or is it solved now?



All times are GMT -5. The time now is 10:24 PM.