Splitting a string into tokens and putting tokens into array - strtok










1















char** token_arr(char* str, int n_tokens)

char**arr = malloc((n_tokens+1)*sizeof(char*));
char str2[n_tokens + 1];
strcpy(str2,str);
int i = 0;
char *p = strtok (str2, " ");

while (p != NULL)

arr[i] = p;
//printf("%sn", arr[i]);
p = strtok (NULL, " ");
i++;

return arr;



The purpose of token_arr is to get a string and a number of tokens, then put the tokens into an array. The array of tokens is returned.



int main(void) 
char*str1 = "( 8 + ( 41 - 12 ) )";
char**expression = token_arr(str1, 9);
for(int i = 0; i < 9; i++)
printf("expression[%d] = %cn", i, *expression2[i]);
return 0;



Output:



expression2[0] = (
expression2[1] =
expression2[2] =
expression2[3] =
expression2[4] =
expression2[5] =
expression2[6] =
expression2[7] =
expression2[8] =


Why is only the first value being printed? What's wrong with my code?










share|improve this question

















  • 1





    You are allocating the array of pointers, but the pointers point to a local variable, str2, in the function, and that variable is out of scope once you're back in main(), so you're accessing invalid data that has been reused and presumably zeroed. You'll need to allocate the string where you copy the string literal, and you'll need to find a way to pass both allocated pointers back to the main() so you can free the allocated memory accurately. As it stands, you have 'undefined behaviour'. That means that what you see is a valid response, as are many other possible behaviours.

    – Jonathan Leffler
    Nov 11 '18 at 5:47












  • Unless you are using calloc or memset to preserve a sentinel NULL at the end of arr, there is no need for +1 in char**arr = malloc((n_tokens+1)*sizeof(char*));. arr is simply the number of pointers you are allocating. If you only have n_tokens you only need that number of pointers unless you are ensuring you have the final pointer NULL to allow iteration with a while loop -- but in that case you need to set ALL pointers NULL to begin with.

    – David C. Rankin
    Nov 11 '18 at 5:57











  • You also need to avoid writing to more pointers than you have. So you need to ensure i never exceeds n_tokens+1 as you currently have it, e.g. while (i < n_tokens + 1 && p != NULL). You can allocate for each token with strdup, e.g. arr[i] = strdup (p);. Now the contents of arr[n] will survive the return.

    – David C. Rankin
    Nov 11 '18 at 6:02















1















char** token_arr(char* str, int n_tokens)

char**arr = malloc((n_tokens+1)*sizeof(char*));
char str2[n_tokens + 1];
strcpy(str2,str);
int i = 0;
char *p = strtok (str2, " ");

while (p != NULL)

arr[i] = p;
//printf("%sn", arr[i]);
p = strtok (NULL, " ");
i++;

return arr;



The purpose of token_arr is to get a string and a number of tokens, then put the tokens into an array. The array of tokens is returned.



int main(void) 
char*str1 = "( 8 + ( 41 - 12 ) )";
char**expression = token_arr(str1, 9);
for(int i = 0; i < 9; i++)
printf("expression[%d] = %cn", i, *expression2[i]);
return 0;



Output:



expression2[0] = (
expression2[1] =
expression2[2] =
expression2[3] =
expression2[4] =
expression2[5] =
expression2[6] =
expression2[7] =
expression2[8] =


Why is only the first value being printed? What's wrong with my code?










share|improve this question

















  • 1





    You are allocating the array of pointers, but the pointers point to a local variable, str2, in the function, and that variable is out of scope once you're back in main(), so you're accessing invalid data that has been reused and presumably zeroed. You'll need to allocate the string where you copy the string literal, and you'll need to find a way to pass both allocated pointers back to the main() so you can free the allocated memory accurately. As it stands, you have 'undefined behaviour'. That means that what you see is a valid response, as are many other possible behaviours.

    – Jonathan Leffler
    Nov 11 '18 at 5:47












  • Unless you are using calloc or memset to preserve a sentinel NULL at the end of arr, there is no need for +1 in char**arr = malloc((n_tokens+1)*sizeof(char*));. arr is simply the number of pointers you are allocating. If you only have n_tokens you only need that number of pointers unless you are ensuring you have the final pointer NULL to allow iteration with a while loop -- but in that case you need to set ALL pointers NULL to begin with.

    – David C. Rankin
    Nov 11 '18 at 5:57











  • You also need to avoid writing to more pointers than you have. So you need to ensure i never exceeds n_tokens+1 as you currently have it, e.g. while (i < n_tokens + 1 && p != NULL). You can allocate for each token with strdup, e.g. arr[i] = strdup (p);. Now the contents of arr[n] will survive the return.

    – David C. Rankin
    Nov 11 '18 at 6:02













1












1








1








char** token_arr(char* str, int n_tokens)

char**arr = malloc((n_tokens+1)*sizeof(char*));
char str2[n_tokens + 1];
strcpy(str2,str);
int i = 0;
char *p = strtok (str2, " ");

while (p != NULL)

arr[i] = p;
//printf("%sn", arr[i]);
p = strtok (NULL, " ");
i++;

return arr;



The purpose of token_arr is to get a string and a number of tokens, then put the tokens into an array. The array of tokens is returned.



int main(void) 
char*str1 = "( 8 + ( 41 - 12 ) )";
char**expression = token_arr(str1, 9);
for(int i = 0; i < 9; i++)
printf("expression[%d] = %cn", i, *expression2[i]);
return 0;



Output:



expression2[0] = (
expression2[1] =
expression2[2] =
expression2[3] =
expression2[4] =
expression2[5] =
expression2[6] =
expression2[7] =
expression2[8] =


Why is only the first value being printed? What's wrong with my code?










share|improve this question














char** token_arr(char* str, int n_tokens)

char**arr = malloc((n_tokens+1)*sizeof(char*));
char str2[n_tokens + 1];
strcpy(str2,str);
int i = 0;
char *p = strtok (str2, " ");

while (p != NULL)

arr[i] = p;
//printf("%sn", arr[i]);
p = strtok (NULL, " ");
i++;

return arr;



The purpose of token_arr is to get a string and a number of tokens, then put the tokens into an array. The array of tokens is returned.



int main(void) 
char*str1 = "( 8 + ( 41 - 12 ) )";
char**expression = token_arr(str1, 9);
for(int i = 0; i < 9; i++)
printf("expression[%d] = %cn", i, *expression2[i]);
return 0;



Output:



expression2[0] = (
expression2[1] =
expression2[2] =
expression2[3] =
expression2[4] =
expression2[5] =
expression2[6] =
expression2[7] =
expression2[8] =


Why is only the first value being printed? What's wrong with my code?







c arrays output strtok






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 11 '18 at 5:37









TreavorTreavor

85




85







  • 1





    You are allocating the array of pointers, but the pointers point to a local variable, str2, in the function, and that variable is out of scope once you're back in main(), so you're accessing invalid data that has been reused and presumably zeroed. You'll need to allocate the string where you copy the string literal, and you'll need to find a way to pass both allocated pointers back to the main() so you can free the allocated memory accurately. As it stands, you have 'undefined behaviour'. That means that what you see is a valid response, as are many other possible behaviours.

    – Jonathan Leffler
    Nov 11 '18 at 5:47












  • Unless you are using calloc or memset to preserve a sentinel NULL at the end of arr, there is no need for +1 in char**arr = malloc((n_tokens+1)*sizeof(char*));. arr is simply the number of pointers you are allocating. If you only have n_tokens you only need that number of pointers unless you are ensuring you have the final pointer NULL to allow iteration with a while loop -- but in that case you need to set ALL pointers NULL to begin with.

    – David C. Rankin
    Nov 11 '18 at 5:57











  • You also need to avoid writing to more pointers than you have. So you need to ensure i never exceeds n_tokens+1 as you currently have it, e.g. while (i < n_tokens + 1 && p != NULL). You can allocate for each token with strdup, e.g. arr[i] = strdup (p);. Now the contents of arr[n] will survive the return.

    – David C. Rankin
    Nov 11 '18 at 6:02












  • 1





    You are allocating the array of pointers, but the pointers point to a local variable, str2, in the function, and that variable is out of scope once you're back in main(), so you're accessing invalid data that has been reused and presumably zeroed. You'll need to allocate the string where you copy the string literal, and you'll need to find a way to pass both allocated pointers back to the main() so you can free the allocated memory accurately. As it stands, you have 'undefined behaviour'. That means that what you see is a valid response, as are many other possible behaviours.

    – Jonathan Leffler
    Nov 11 '18 at 5:47












  • Unless you are using calloc or memset to preserve a sentinel NULL at the end of arr, there is no need for +1 in char**arr = malloc((n_tokens+1)*sizeof(char*));. arr is simply the number of pointers you are allocating. If you only have n_tokens you only need that number of pointers unless you are ensuring you have the final pointer NULL to allow iteration with a while loop -- but in that case you need to set ALL pointers NULL to begin with.

    – David C. Rankin
    Nov 11 '18 at 5:57











  • You also need to avoid writing to more pointers than you have. So you need to ensure i never exceeds n_tokens+1 as you currently have it, e.g. while (i < n_tokens + 1 && p != NULL). You can allocate for each token with strdup, e.g. arr[i] = strdup (p);. Now the contents of arr[n] will survive the return.

    – David C. Rankin
    Nov 11 '18 at 6:02







1




1





You are allocating the array of pointers, but the pointers point to a local variable, str2, in the function, and that variable is out of scope once you're back in main(), so you're accessing invalid data that has been reused and presumably zeroed. You'll need to allocate the string where you copy the string literal, and you'll need to find a way to pass both allocated pointers back to the main() so you can free the allocated memory accurately. As it stands, you have 'undefined behaviour'. That means that what you see is a valid response, as are many other possible behaviours.

– Jonathan Leffler
Nov 11 '18 at 5:47






You are allocating the array of pointers, but the pointers point to a local variable, str2, in the function, and that variable is out of scope once you're back in main(), so you're accessing invalid data that has been reused and presumably zeroed. You'll need to allocate the string where you copy the string literal, and you'll need to find a way to pass both allocated pointers back to the main() so you can free the allocated memory accurately. As it stands, you have 'undefined behaviour'. That means that what you see is a valid response, as are many other possible behaviours.

– Jonathan Leffler
Nov 11 '18 at 5:47














Unless you are using calloc or memset to preserve a sentinel NULL at the end of arr, there is no need for +1 in char**arr = malloc((n_tokens+1)*sizeof(char*));. arr is simply the number of pointers you are allocating. If you only have n_tokens you only need that number of pointers unless you are ensuring you have the final pointer NULL to allow iteration with a while loop -- but in that case you need to set ALL pointers NULL to begin with.

– David C. Rankin
Nov 11 '18 at 5:57





Unless you are using calloc or memset to preserve a sentinel NULL at the end of arr, there is no need for +1 in char**arr = malloc((n_tokens+1)*sizeof(char*));. arr is simply the number of pointers you are allocating. If you only have n_tokens you only need that number of pointers unless you are ensuring you have the final pointer NULL to allow iteration with a while loop -- but in that case you need to set ALL pointers NULL to begin with.

– David C. Rankin
Nov 11 '18 at 5:57













You also need to avoid writing to more pointers than you have. So you need to ensure i never exceeds n_tokens+1 as you currently have it, e.g. while (i < n_tokens + 1 && p != NULL). You can allocate for each token with strdup, e.g. arr[i] = strdup (p);. Now the contents of arr[n] will survive the return.

– David C. Rankin
Nov 11 '18 at 6:02





You also need to avoid writing to more pointers than you have. So you need to ensure i never exceeds n_tokens+1 as you currently have it, e.g. while (i < n_tokens + 1 && p != NULL). You can allocate for each token with strdup, e.g. arr[i] = strdup (p);. Now the contents of arr[n] will survive the return.

– David C. Rankin
Nov 11 '18 at 6:02












1 Answer
1






active

oldest

votes


















3














While I think you have probably got most of the issues sorted based on the comments, let's look at a way to address both the validation/return of expressions and a way to return the number of tokens to protect against an error in tokenization resulting in less than n_tokens being found.



As you have learned, when you declare str2 local to token_arr, it has automatic storage duration and is only valid within the scope where it is declared. When token_arr returns, the memory holding str2 is released for re-use and any attempt to reference that memory back in main() invokes Undefined Behavior.



What are your options? (1) use strdup to dynamically allocate storage for each token, copy the token to the new memory allocated, and then assign the starting address for the new block of memory containing the token to arr[i], e.g.



 arr[i] = strdup (p);


or (2) do the same thing manually using strlen, malloc & memcpy, e.g.



 size_t len = strlen(p);
arr[i] = malloc (len + 1);
/* validate - here */
memcpy (arr[i], p, len + 1);


Now each arr[i] points to a block of memory having allocated storage duration which remains valid until free is called on that block -- or the program ends.



What If Less Than n_tokens Are Found?



If less than n_tokens are found within token_arr and you attempt to use n_tokens through expressions back in main() you will likely invoke Undefined Behavior again. To ensure you only use the tokens found in token_arr and made available in main() by the assignment to expression -- Pass A Pointer To n_tokens as the second parameter and update it will the value of i before you return arr;, e.g.



char **token_arr (const char *str, int *n_tokens)

char **arr = malloc(*n_tokens * sizeof *arr);
...
i++;

*n_tokens = i; /* assign i to make tokes assigned available */

return arr;
}


Now n_tokens back in main() contains only the number of tokens actually found and allocated for and assigned to arr[i] in token_arr.



Validate Every Allocation



It is critical that you validate every call to malloc, calloc, realloc, strdup or any other function that allocates memory for you. Allocation can, and does, fail. When it does, it let's you know by returning NULL instead of a pointer containing the beginning address for the new block of memory. Check every allocation.



Putting it altogether, you could do something like:



#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char **token_arr (const char *str, int *n_tokens)

char **arr = malloc(*n_tokens * sizeof *arr);
char str2 [strlen(str) + 1];
int i = 0;

if (!arr) /* validate every allocation */
perror ("malloc-n_tokens");
return NULL;


strcpy (str2, str);

char *p = strtok (str2, " ");

while (i < *n_tokens && p != NULL) /* check used pointers */
arr[i] = strdup (p);
if (!arr[i]) /* strdup allocates -> you must validate */
perror ("strdup-arr[i]");
if (i) /* if tokens stored, break an return */
break;
else /* if no tokes stored, free pointers */
free (arr);
return NULL;


p = strtok (NULL, " ");
i++;

*n_tokens = i; /* assign i to make tokes assigned available */

return arr;


int main (void)

char *str1 = "( 8 + ( 41 - 12 ) )";
int n_tokens = 9;
char **expression = token_arr (str1, &n_tokens);

if (expression) /* validate token_arr succeeded */
for (int i = 0; i < n_tokens; i++) /* n_tokens times */
printf ("expression[%d] = %sn", i, expression[i]);
free (expression[i]); /* free mem allocated by strdup */

free (expression);


return 0;



(note: likewise check the return of token_arr before making use of the return)



Example Use/Output



$ ./bin/token_arr
expression[0] = (
expression[1] = 8
expression[2] = +
expression[3] = (
expression[4] = 41
expression[5] = -
expression[6] = 12
expression[7] = )
expression[8] = )


Memory Use/Error Check



In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.



It is imperative that you use a memory error checking program to insure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.



For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.



$ valgrind ./bin/token_arr
==8420== Memcheck, a memory error detector
==8420== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==8420== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==8420== Command: ./bin/token_arr
==8420==
expression[0] = (
expression[1] = 8
expression[2] = +
expression[3] = (
expression[4] = 41
expression[5] = -
expression[6] = 12
expression[7] = )
expression[8] = )
==8420==
==8420== HEAP SUMMARY:
==8420== in use at exit: 0 bytes in 0 blocks
==8420== total heap usage: 10 allocs, 10 frees, 92 bytes allocated
==8420==
==8420== All heap blocks were freed -- no leaks are possible
==8420==
==8420== For counts of detected and suppressed errors, rerun with: -v
==8420== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


Always confirm that you have freed all memory you have allocated and that there are no memory errors.



Look things over and let me know if you have further questions.






share|improve this answer






















    Your Answer






    StackExchange.ifUsing("editor", function ()
    StackExchange.using("externalEditor", function ()
    StackExchange.using("snippets", function ()
    StackExchange.snippets.init();
    );
    );
    , "code-snippets");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "1"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );













    draft saved

    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53246140%2fsplitting-a-string-into-tokens-and-putting-tokens-into-array-strtok%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    3














    While I think you have probably got most of the issues sorted based on the comments, let's look at a way to address both the validation/return of expressions and a way to return the number of tokens to protect against an error in tokenization resulting in less than n_tokens being found.



    As you have learned, when you declare str2 local to token_arr, it has automatic storage duration and is only valid within the scope where it is declared. When token_arr returns, the memory holding str2 is released for re-use and any attempt to reference that memory back in main() invokes Undefined Behavior.



    What are your options? (1) use strdup to dynamically allocate storage for each token, copy the token to the new memory allocated, and then assign the starting address for the new block of memory containing the token to arr[i], e.g.



     arr[i] = strdup (p);


    or (2) do the same thing manually using strlen, malloc & memcpy, e.g.



     size_t len = strlen(p);
    arr[i] = malloc (len + 1);
    /* validate - here */
    memcpy (arr[i], p, len + 1);


    Now each arr[i] points to a block of memory having allocated storage duration which remains valid until free is called on that block -- or the program ends.



    What If Less Than n_tokens Are Found?



    If less than n_tokens are found within token_arr and you attempt to use n_tokens through expressions back in main() you will likely invoke Undefined Behavior again. To ensure you only use the tokens found in token_arr and made available in main() by the assignment to expression -- Pass A Pointer To n_tokens as the second parameter and update it will the value of i before you return arr;, e.g.



    char **token_arr (const char *str, int *n_tokens)

    char **arr = malloc(*n_tokens * sizeof *arr);
    ...
    i++;

    *n_tokens = i; /* assign i to make tokes assigned available */

    return arr;
    }


    Now n_tokens back in main() contains only the number of tokens actually found and allocated for and assigned to arr[i] in token_arr.



    Validate Every Allocation



    It is critical that you validate every call to malloc, calloc, realloc, strdup or any other function that allocates memory for you. Allocation can, and does, fail. When it does, it let's you know by returning NULL instead of a pointer containing the beginning address for the new block of memory. Check every allocation.



    Putting it altogether, you could do something like:



    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char **token_arr (const char *str, int *n_tokens)

    char **arr = malloc(*n_tokens * sizeof *arr);
    char str2 [strlen(str) + 1];
    int i = 0;

    if (!arr) /* validate every allocation */
    perror ("malloc-n_tokens");
    return NULL;


    strcpy (str2, str);

    char *p = strtok (str2, " ");

    while (i < *n_tokens && p != NULL) /* check used pointers */
    arr[i] = strdup (p);
    if (!arr[i]) /* strdup allocates -> you must validate */
    perror ("strdup-arr[i]");
    if (i) /* if tokens stored, break an return */
    break;
    else /* if no tokes stored, free pointers */
    free (arr);
    return NULL;


    p = strtok (NULL, " ");
    i++;

    *n_tokens = i; /* assign i to make tokes assigned available */

    return arr;


    int main (void)

    char *str1 = "( 8 + ( 41 - 12 ) )";
    int n_tokens = 9;
    char **expression = token_arr (str1, &n_tokens);

    if (expression) /* validate token_arr succeeded */
    for (int i = 0; i < n_tokens; i++) /* n_tokens times */
    printf ("expression[%d] = %sn", i, expression[i]);
    free (expression[i]); /* free mem allocated by strdup */

    free (expression);


    return 0;



    (note: likewise check the return of token_arr before making use of the return)



    Example Use/Output



    $ ./bin/token_arr
    expression[0] = (
    expression[1] = 8
    expression[2] = +
    expression[3] = (
    expression[4] = 41
    expression[5] = -
    expression[6] = 12
    expression[7] = )
    expression[8] = )


    Memory Use/Error Check



    In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.



    It is imperative that you use a memory error checking program to insure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.



    For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.



    $ valgrind ./bin/token_arr
    ==8420== Memcheck, a memory error detector
    ==8420== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
    ==8420== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
    ==8420== Command: ./bin/token_arr
    ==8420==
    expression[0] = (
    expression[1] = 8
    expression[2] = +
    expression[3] = (
    expression[4] = 41
    expression[5] = -
    expression[6] = 12
    expression[7] = )
    expression[8] = )
    ==8420==
    ==8420== HEAP SUMMARY:
    ==8420== in use at exit: 0 bytes in 0 blocks
    ==8420== total heap usage: 10 allocs, 10 frees, 92 bytes allocated
    ==8420==
    ==8420== All heap blocks were freed -- no leaks are possible
    ==8420==
    ==8420== For counts of detected and suppressed errors, rerun with: -v
    ==8420== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


    Always confirm that you have freed all memory you have allocated and that there are no memory errors.



    Look things over and let me know if you have further questions.






    share|improve this answer



























      3














      While I think you have probably got most of the issues sorted based on the comments, let's look at a way to address both the validation/return of expressions and a way to return the number of tokens to protect against an error in tokenization resulting in less than n_tokens being found.



      As you have learned, when you declare str2 local to token_arr, it has automatic storage duration and is only valid within the scope where it is declared. When token_arr returns, the memory holding str2 is released for re-use and any attempt to reference that memory back in main() invokes Undefined Behavior.



      What are your options? (1) use strdup to dynamically allocate storage for each token, copy the token to the new memory allocated, and then assign the starting address for the new block of memory containing the token to arr[i], e.g.



       arr[i] = strdup (p);


      or (2) do the same thing manually using strlen, malloc & memcpy, e.g.



       size_t len = strlen(p);
      arr[i] = malloc (len + 1);
      /* validate - here */
      memcpy (arr[i], p, len + 1);


      Now each arr[i] points to a block of memory having allocated storage duration which remains valid until free is called on that block -- or the program ends.



      What If Less Than n_tokens Are Found?



      If less than n_tokens are found within token_arr and you attempt to use n_tokens through expressions back in main() you will likely invoke Undefined Behavior again. To ensure you only use the tokens found in token_arr and made available in main() by the assignment to expression -- Pass A Pointer To n_tokens as the second parameter and update it will the value of i before you return arr;, e.g.



      char **token_arr (const char *str, int *n_tokens)

      char **arr = malloc(*n_tokens * sizeof *arr);
      ...
      i++;

      *n_tokens = i; /* assign i to make tokes assigned available */

      return arr;
      }


      Now n_tokens back in main() contains only the number of tokens actually found and allocated for and assigned to arr[i] in token_arr.



      Validate Every Allocation



      It is critical that you validate every call to malloc, calloc, realloc, strdup or any other function that allocates memory for you. Allocation can, and does, fail. When it does, it let's you know by returning NULL instead of a pointer containing the beginning address for the new block of memory. Check every allocation.



      Putting it altogether, you could do something like:



      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>

      char **token_arr (const char *str, int *n_tokens)

      char **arr = malloc(*n_tokens * sizeof *arr);
      char str2 [strlen(str) + 1];
      int i = 0;

      if (!arr) /* validate every allocation */
      perror ("malloc-n_tokens");
      return NULL;


      strcpy (str2, str);

      char *p = strtok (str2, " ");

      while (i < *n_tokens && p != NULL) /* check used pointers */
      arr[i] = strdup (p);
      if (!arr[i]) /* strdup allocates -> you must validate */
      perror ("strdup-arr[i]");
      if (i) /* if tokens stored, break an return */
      break;
      else /* if no tokes stored, free pointers */
      free (arr);
      return NULL;


      p = strtok (NULL, " ");
      i++;

      *n_tokens = i; /* assign i to make tokes assigned available */

      return arr;


      int main (void)

      char *str1 = "( 8 + ( 41 - 12 ) )";
      int n_tokens = 9;
      char **expression = token_arr (str1, &n_tokens);

      if (expression) /* validate token_arr succeeded */
      for (int i = 0; i < n_tokens; i++) /* n_tokens times */
      printf ("expression[%d] = %sn", i, expression[i]);
      free (expression[i]); /* free mem allocated by strdup */

      free (expression);


      return 0;



      (note: likewise check the return of token_arr before making use of the return)



      Example Use/Output



      $ ./bin/token_arr
      expression[0] = (
      expression[1] = 8
      expression[2] = +
      expression[3] = (
      expression[4] = 41
      expression[5] = -
      expression[6] = 12
      expression[7] = )
      expression[8] = )


      Memory Use/Error Check



      In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.



      It is imperative that you use a memory error checking program to insure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.



      For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.



      $ valgrind ./bin/token_arr
      ==8420== Memcheck, a memory error detector
      ==8420== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
      ==8420== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
      ==8420== Command: ./bin/token_arr
      ==8420==
      expression[0] = (
      expression[1] = 8
      expression[2] = +
      expression[3] = (
      expression[4] = 41
      expression[5] = -
      expression[6] = 12
      expression[7] = )
      expression[8] = )
      ==8420==
      ==8420== HEAP SUMMARY:
      ==8420== in use at exit: 0 bytes in 0 blocks
      ==8420== total heap usage: 10 allocs, 10 frees, 92 bytes allocated
      ==8420==
      ==8420== All heap blocks were freed -- no leaks are possible
      ==8420==
      ==8420== For counts of detected and suppressed errors, rerun with: -v
      ==8420== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


      Always confirm that you have freed all memory you have allocated and that there are no memory errors.



      Look things over and let me know if you have further questions.






      share|improve this answer

























        3












        3








        3







        While I think you have probably got most of the issues sorted based on the comments, let's look at a way to address both the validation/return of expressions and a way to return the number of tokens to protect against an error in tokenization resulting in less than n_tokens being found.



        As you have learned, when you declare str2 local to token_arr, it has automatic storage duration and is only valid within the scope where it is declared. When token_arr returns, the memory holding str2 is released for re-use and any attempt to reference that memory back in main() invokes Undefined Behavior.



        What are your options? (1) use strdup to dynamically allocate storage for each token, copy the token to the new memory allocated, and then assign the starting address for the new block of memory containing the token to arr[i], e.g.



         arr[i] = strdup (p);


        or (2) do the same thing manually using strlen, malloc & memcpy, e.g.



         size_t len = strlen(p);
        arr[i] = malloc (len + 1);
        /* validate - here */
        memcpy (arr[i], p, len + 1);


        Now each arr[i] points to a block of memory having allocated storage duration which remains valid until free is called on that block -- or the program ends.



        What If Less Than n_tokens Are Found?



        If less than n_tokens are found within token_arr and you attempt to use n_tokens through expressions back in main() you will likely invoke Undefined Behavior again. To ensure you only use the tokens found in token_arr and made available in main() by the assignment to expression -- Pass A Pointer To n_tokens as the second parameter and update it will the value of i before you return arr;, e.g.



        char **token_arr (const char *str, int *n_tokens)

        char **arr = malloc(*n_tokens * sizeof *arr);
        ...
        i++;

        *n_tokens = i; /* assign i to make tokes assigned available */

        return arr;
        }


        Now n_tokens back in main() contains only the number of tokens actually found and allocated for and assigned to arr[i] in token_arr.



        Validate Every Allocation



        It is critical that you validate every call to malloc, calloc, realloc, strdup or any other function that allocates memory for you. Allocation can, and does, fail. When it does, it let's you know by returning NULL instead of a pointer containing the beginning address for the new block of memory. Check every allocation.



        Putting it altogether, you could do something like:



        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        char **token_arr (const char *str, int *n_tokens)

        char **arr = malloc(*n_tokens * sizeof *arr);
        char str2 [strlen(str) + 1];
        int i = 0;

        if (!arr) /* validate every allocation */
        perror ("malloc-n_tokens");
        return NULL;


        strcpy (str2, str);

        char *p = strtok (str2, " ");

        while (i < *n_tokens && p != NULL) /* check used pointers */
        arr[i] = strdup (p);
        if (!arr[i]) /* strdup allocates -> you must validate */
        perror ("strdup-arr[i]");
        if (i) /* if tokens stored, break an return */
        break;
        else /* if no tokes stored, free pointers */
        free (arr);
        return NULL;


        p = strtok (NULL, " ");
        i++;

        *n_tokens = i; /* assign i to make tokes assigned available */

        return arr;


        int main (void)

        char *str1 = "( 8 + ( 41 - 12 ) )";
        int n_tokens = 9;
        char **expression = token_arr (str1, &n_tokens);

        if (expression) /* validate token_arr succeeded */
        for (int i = 0; i < n_tokens; i++) /* n_tokens times */
        printf ("expression[%d] = %sn", i, expression[i]);
        free (expression[i]); /* free mem allocated by strdup */

        free (expression);


        return 0;



        (note: likewise check the return of token_arr before making use of the return)



        Example Use/Output



        $ ./bin/token_arr
        expression[0] = (
        expression[1] = 8
        expression[2] = +
        expression[3] = (
        expression[4] = 41
        expression[5] = -
        expression[6] = 12
        expression[7] = )
        expression[8] = )


        Memory Use/Error Check



        In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.



        It is imperative that you use a memory error checking program to insure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.



        For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.



        $ valgrind ./bin/token_arr
        ==8420== Memcheck, a memory error detector
        ==8420== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
        ==8420== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
        ==8420== Command: ./bin/token_arr
        ==8420==
        expression[0] = (
        expression[1] = 8
        expression[2] = +
        expression[3] = (
        expression[4] = 41
        expression[5] = -
        expression[6] = 12
        expression[7] = )
        expression[8] = )
        ==8420==
        ==8420== HEAP SUMMARY:
        ==8420== in use at exit: 0 bytes in 0 blocks
        ==8420== total heap usage: 10 allocs, 10 frees, 92 bytes allocated
        ==8420==
        ==8420== All heap blocks were freed -- no leaks are possible
        ==8420==
        ==8420== For counts of detected and suppressed errors, rerun with: -v
        ==8420== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


        Always confirm that you have freed all memory you have allocated and that there are no memory errors.



        Look things over and let me know if you have further questions.






        share|improve this answer













        While I think you have probably got most of the issues sorted based on the comments, let's look at a way to address both the validation/return of expressions and a way to return the number of tokens to protect against an error in tokenization resulting in less than n_tokens being found.



        As you have learned, when you declare str2 local to token_arr, it has automatic storage duration and is only valid within the scope where it is declared. When token_arr returns, the memory holding str2 is released for re-use and any attempt to reference that memory back in main() invokes Undefined Behavior.



        What are your options? (1) use strdup to dynamically allocate storage for each token, copy the token to the new memory allocated, and then assign the starting address for the new block of memory containing the token to arr[i], e.g.



         arr[i] = strdup (p);


        or (2) do the same thing manually using strlen, malloc & memcpy, e.g.



         size_t len = strlen(p);
        arr[i] = malloc (len + 1);
        /* validate - here */
        memcpy (arr[i], p, len + 1);


        Now each arr[i] points to a block of memory having allocated storage duration which remains valid until free is called on that block -- or the program ends.



        What If Less Than n_tokens Are Found?



        If less than n_tokens are found within token_arr and you attempt to use n_tokens through expressions back in main() you will likely invoke Undefined Behavior again. To ensure you only use the tokens found in token_arr and made available in main() by the assignment to expression -- Pass A Pointer To n_tokens as the second parameter and update it will the value of i before you return arr;, e.g.



        char **token_arr (const char *str, int *n_tokens)

        char **arr = malloc(*n_tokens * sizeof *arr);
        ...
        i++;

        *n_tokens = i; /* assign i to make tokes assigned available */

        return arr;
        }


        Now n_tokens back in main() contains only the number of tokens actually found and allocated for and assigned to arr[i] in token_arr.



        Validate Every Allocation



        It is critical that you validate every call to malloc, calloc, realloc, strdup or any other function that allocates memory for you. Allocation can, and does, fail. When it does, it let's you know by returning NULL instead of a pointer containing the beginning address for the new block of memory. Check every allocation.



        Putting it altogether, you could do something like:



        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        char **token_arr (const char *str, int *n_tokens)

        char **arr = malloc(*n_tokens * sizeof *arr);
        char str2 [strlen(str) + 1];
        int i = 0;

        if (!arr) /* validate every allocation */
        perror ("malloc-n_tokens");
        return NULL;


        strcpy (str2, str);

        char *p = strtok (str2, " ");

        while (i < *n_tokens && p != NULL) /* check used pointers */
        arr[i] = strdup (p);
        if (!arr[i]) /* strdup allocates -> you must validate */
        perror ("strdup-arr[i]");
        if (i) /* if tokens stored, break an return */
        break;
        else /* if no tokes stored, free pointers */
        free (arr);
        return NULL;


        p = strtok (NULL, " ");
        i++;

        *n_tokens = i; /* assign i to make tokes assigned available */

        return arr;


        int main (void)

        char *str1 = "( 8 + ( 41 - 12 ) )";
        int n_tokens = 9;
        char **expression = token_arr (str1, &n_tokens);

        if (expression) /* validate token_arr succeeded */
        for (int i = 0; i < n_tokens; i++) /* n_tokens times */
        printf ("expression[%d] = %sn", i, expression[i]);
        free (expression[i]); /* free mem allocated by strdup */

        free (expression);


        return 0;



        (note: likewise check the return of token_arr before making use of the return)



        Example Use/Output



        $ ./bin/token_arr
        expression[0] = (
        expression[1] = 8
        expression[2] = +
        expression[3] = (
        expression[4] = 41
        expression[5] = -
        expression[6] = 12
        expression[7] = )
        expression[8] = )


        Memory Use/Error Check



        In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.



        It is imperative that you use a memory error checking program to insure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.



        For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.



        $ valgrind ./bin/token_arr
        ==8420== Memcheck, a memory error detector
        ==8420== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
        ==8420== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
        ==8420== Command: ./bin/token_arr
        ==8420==
        expression[0] = (
        expression[1] = 8
        expression[2] = +
        expression[3] = (
        expression[4] = 41
        expression[5] = -
        expression[6] = 12
        expression[7] = )
        expression[8] = )
        ==8420==
        ==8420== HEAP SUMMARY:
        ==8420== in use at exit: 0 bytes in 0 blocks
        ==8420== total heap usage: 10 allocs, 10 frees, 92 bytes allocated
        ==8420==
        ==8420== All heap blocks were freed -- no leaks are possible
        ==8420==
        ==8420== For counts of detected and suppressed errors, rerun with: -v
        ==8420== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


        Always confirm that you have freed all memory you have allocated and that there are no memory errors.



        Look things over and let me know if you have further questions.







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Nov 11 '18 at 6:37









        David C. RankinDavid C. Rankin

        40.9k32648




        40.9k32648



























            draft saved

            draft discarded
















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid


            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.

            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53246140%2fsplitting-a-string-into-tokens-and-putting-tokens-into-array-strtok%23new-answer', 'question_page');

            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            𛂒𛀶,𛀽𛀑𛂀𛃧𛂓𛀙𛃆𛃑𛃷𛂟𛁡𛀢𛀟𛁤𛂽𛁕𛁪𛂟𛂯,𛁞𛂧𛀴𛁄𛁠𛁼𛂿𛀤 𛂘,𛁺𛂾𛃭𛃭𛃵𛀺,𛂣𛃍𛂖𛃶 𛀸𛃀𛂖𛁶𛁏𛁚 𛂢𛂞 𛁰𛂆𛀔,𛁸𛀽𛁓𛃋𛂇𛃧𛀧𛃣𛂐𛃇,𛂂𛃻𛃲𛁬𛃞𛀧𛃃𛀅 𛂭𛁠𛁡𛃇𛀷𛃓𛁥,𛁙𛁘𛁞𛃸𛁸𛃣𛁜,𛂛,𛃿,𛁯𛂘𛂌𛃛𛁱𛃌𛂈𛂇 𛁊𛃲,𛀕𛃴𛀜 𛀶𛂆𛀶𛃟𛂉𛀣,𛂐𛁞𛁾 𛁷𛂑𛁳𛂯𛀬𛃅,𛃶𛁼

            ữḛḳṊẴ ẋ,Ẩṙ,ỹḛẪẠứụỿṞṦ,Ṉẍừ,ứ Ị,Ḵ,ṏ ṇỪḎḰṰọửḊ ṾḨḮữẑỶṑỗḮṣṉẃ Ữẩụ,ṓ,ḹẕḪḫỞṿḭ ỒṱṨẁṋṜ ḅẈ ṉ ứṀḱṑỒḵ,ḏ,ḊḖỹẊ Ẻḷổ,ṥ ẔḲẪụḣể Ṱ ḭỏựẶ Ồ Ṩ,ẂḿṡḾồ ỗṗṡịṞẤḵṽẃ ṸḒẄẘ,ủẞẵṦṟầṓế

            ⃀⃉⃄⃅⃍,⃂₼₡₰⃉₡₿₢⃉₣⃄₯⃊₮₼₹₱₦₷⃄₪₼₶₳₫⃍₽ ₫₪₦⃆₠₥⃁₸₴₷⃊₹⃅⃈₰⃁₫ ⃎⃍₩₣₷ ₻₮⃊⃀⃄⃉₯,⃏⃊,₦⃅₪,₼⃀₾₧₷₾ ₻ ₸₡ ₾,₭⃈₴⃋,€⃁,₩ ₺⃌⃍⃁₱⃋⃋₨⃊⃁⃃₼,⃎,₱⃍₲₶₡ ⃍⃅₶₨₭,⃉₭₾₡₻⃀ ₼₹⃅₹,₻₭ ⃌