[SOLVED] C - Odd results from file parsing function
ProgrammingThis forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Notices
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
You are currently viewing LQ as a guest. By joining our community you will have the ability to post topics, receive our newsletter, use the advanced search, subscribe to threads and access many other special features. Registration is quick, simple and absolutely free. Join our community today!
Note that registered members see fewer ads, and ContentLink is completely disabled once you log in.
If you have any problems with the registration process or your account login, please contact us. If you need to reset your password, click here.
Having a problem logging in? Please visit this page to clear all LQ-related cookies.
Get a virtual cloud desktop with the Linux distro that you want in less than five minutes with Shells! With over 10 pre-installed distros to choose from, the worry-free installation life is here! Whether you are a digital nomad or just looking for flexibility, Shells can put your Linux machine on the device that you want to use.
Exclusive for LQ members, get up to 45% off per month. Click here for more info.
So at this point the memory originally allocated to token is lost, and token now points to an area inside buffer.
Then you free both:
Code:
free(token);
free(buffer);
If token is not NULL at that point you will produce a double free() error, and either way the memory allocated to token at the top of the function is lost.
Maybe you just want token to be declared as an unallocated char*?
My goal is to parse a config file by delimiter. This compiles now without error and I think I got the leak. it also does what I want it to, namely return the found field value to main. Thank you for pointing the leak.
Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// declare functions
void file_search_func(FILE * file, const char * searchstr,
const char * delimiter, char ** pointer,
const int field);
int main(void)
{
// set up file and delimiter
FILE * file;
file = fopen("testfile.txt", "r");
char delimiter[1] = ",";
// if file fail to open bail out
if (!file)
{
printf("unable to open file. exiting!\n");
return(-1);
}
// set up pass by reference and return value. free when done
char * retval;
file_search_func(file, "lol", delimiter, &retval, 1);
printf("fail - %s\n", retval);
free(retval);
//close and clean up
fclose(file);
file = NULL;
}
void file_search_func(FILE * file, const char * searchstr,
const char * delimiter, char ** pointer,
const int field)
{
// declare and initialize
char * token;
char * buffer = (char*)calloc(24,sizeof(char));
char * tmp = (char*)malloc(24*sizeof(char));
int counter = 0;
// loop until find identifier in string. typically first field
// tokenize once found
while (fgets(buffer, 24, file)) {
if (strstr(buffer, searchstr)) {
token = strtok(buffer, delimiter);
while (token) {
// stops when field found of specified string
// return and clean up
if (counter == field) {
printf("token = %s\n", token);
strcpy(tmp, token);
*pointer = tmp;
free(buffer);
break;
} else {
// continue to next token until found
counter++;
token = strtok(NULL, delimiter);
}
}
}
}
}
You should at least initialize retval to NULL, otherwise if fgets() in your function fails or the file is empty or the token is not found you will try to free a random location in memory.
Also, it looks like you may try to use buffer after it has been freed if your input is longer than expected or multi-line.
Buffer may also leak if the function is passed an empty file or if the search is not found or if the field value is greater than the number of tokens in the buffer.
All the little duckies have to be in a row at the same time!
As already noted, as written there is no guarantee that file_search_func(...) will set retval, so you may attempt to reference random memory with both the printf(...) function and free(). retval should be initialized and then tested before use.
But the logic here is still confusing because only on a successful search will retval will be set to a found token, but you always report that as a "fail". I cannot tell what you intend with that.
Next, the outer loop in your function may not work as you apparently intend for two reasons:
Code:
while (fgets(buffer, 24, file)) {
if (strstr(buffer, searchstr)) {
...
}
}
First, fgets() will read a maximum of 23 characters into the buffer, but if the string of tokens is longer than this it may break a token in the middle so that it will never be found by strtok().
Also, as written you may read multiple lines via the enclosing while(){...}, but your field and counter values will lose their most obvious meaning after the first line. Counter will continue to increment and what would field mean in a multi-line stream of tokens?
Next, you test the buffer to see if your search string is found in it, and only if found do you tokenize the buffer string. I infer that you are searching for a specific token and you expect strstr() to test if it is there, then use strtok() to find what field it is in. But strtok() may not produce what strstr() matches. In other words strstr() may match something not returned by strtok(), so the match test seems flawed to me. You should ask yourself what you intended here.
Then, inside your tokenizing loop:
Code:
token = strtok(buffer, delimiter);
while (token) {
// stops when field found of specified string
// return and clean up
if (counter == field) {
printf("token = %s\n", token);
strcpy(tmp, token);
*pointer = tmp;
free(buffer);
break;
}
You say it "stops when field found of specified string ... return", which we must take to be your intent, but that is not what it does at all.
It silently tokenizes the buffer until it either reaches the end or until counter == field. But that test is not related at all to the outer searchstr test so it is difficult to see what you might have intended that to do for you.
But you also free the buffer when counter == field, but you only break the inner loop, you do not return, and if there are more characters in the input string you will load them into the buffer you just freed.
My point is to prompt you to think more clearly about how you expect that all to work together by writing a simple spec (i.e. comments in the file or function explaining what it should do), then write the code to perform that purpose under all possible conditions - then look for bugs.
Ok. I need to write comments because it is a bit involved. For reference this is where I am now. It works, at least seems to give me what I want. I've tested it on multiple line files with a few of them having the same identifier. each time it stops after the first line that matches.
Code:
int main()
{
// declare and initialize
FILE * file;
file = fopen("datafile.txt", "r");
// set up pass by reference and return value. free when done
char * retval = NULL;
file_search_func(file, "test", ",", &retval, 2, 48);
printf("test = %s\n", retval);
free(retval);
fclose(file);
file = NULL;
return 0;
}
void file_search_func(FILE * file, const char * searchstr,
const char * delimiter, char ** pointer,
const int field, const int stringmax)
{
// declare and initialize
char * token;
char * buffer = (char*)calloc(stringmax,sizeof(char));
char * tmp = (char*)calloc(stringmax,sizeof(char));
int counter = 0;
int toggledone = 0;
// confirm file opened successfully
file_open_check(file, 'r', 999);
// loop until find identifier in string. typically first field
// tokenize once found
while (fgets(buffer, stringmax, file)) {
if (toggledone == 1) {
break;
}
if (strstr(buffer, searchstr)) {
token = strtok(buffer, delimiter);
while (token) {
// stops when field found of specified string
// return and clean up
if (counter == field) {
strcpy(tmp, token);
*pointer = tmp;
toggledone = 1;
break;
} else {
// continue to next token until found
counter++;
token = strtok(NULL, delimiter);
}
}
}
}
free(buffer);
}
void file_open_check(const FILE *file, const char a, const int errcode)
{
if (!file) {
if (a == 'r') { // read error
printf("error. unable to read file!");
} else if (a == 'w') { // write error
printf("error. unable to write file!");
} else if (a == 't') { // temp file error
printf("error. unable to create temp file!");
}
printf("\n\nfind the error %d in the man page.\n", errcode);
printf("correct the problem before you try again.\n");
// bail out
exit(1);
}
}
Last edited by jmgibson1981; 03-04-2023 at 08:47 PM.
Ultimately I'm trying to build a simple banking type of thing. fields separated by comma in the file. I had googled for "small projects" to build my skillset. This was one of the first suggestions i found.
These file functions in particular i can see being useful in the future so I'm trying to make them for library purposes. Make them lgpl for anyone to use.
Last edited by jmgibson1981; 03-04-2023 at 05:46 PM.
LinuxQuestions.org is looking for people interested in writing
Editorials, Articles, Reviews, and more. If you'd like to contribute
content, let us know.