Anyone know C? Quick question...

Discussion in 'General Off-Topic Chat' started by 877, Sep 8, 2018.

  1. 877
    OP

    877 GBAtemp Regular

    Member
    3
    Mar 8, 2017
    United Kingdom
    I'm following a YouTube guide and learning about arrays. The code is:

    char name[8] = "Shopping";
    printf("My array name is %s" ,name);


    But when it prints out, it says:

    My array name is Shopping)

    Why is the end bracket showing?

    Thx
     
  2. antiNT

    antiNT a.k.a Johnny El Pollo Loco

    Member
    6
    Sep 14, 2015
    Qatar
    Doha - Qatar
    I know C#, not C, but what happens if you replace "Shopping" with "Shoppin"?
     
    877 likes this.
  3. Ryccardo

    Ryccardo and his tropane alkaloids

    Member
    13
    Feb 13, 2015
    Italy
    Imola
    You have created an array of 8 bytes, but it should have been 9 (one for each letter plus the terminator). The extra ")" is, by chance, the consequence of displaying your unterminated string: printf goes on reading memory until it finds a terminator.

    When using strings set directly in the source code at the time of creation, you don't need to specify their size - any modern compiler will auto calculate it, avoiding this kind of error
     
  4. 877
    OP

    877 GBAtemp Regular

    Member
    3
    Mar 8, 2017
    United Kingdom
    It shows:

    My array name is Shoppin
     
  5. Durelle

    Durelle Acadien Einnnn~!

    Member
    4
    GBAtemp Patron
    Dec 22, 2016
    Canada
    NB, Canada
    I think you forgot a " at the end
     
    877 likes this.
  6. 877
    OP

    877 GBAtemp Regular

    Member
    3
    Mar 8, 2017
    United Kingdom
    Thanks, if I set it to 9 bytes it displays correctly:
    My array name is Shopping

    That makes sense that it carries on reading memory and by chance found the ) symbol.

    I am following this series of Youtube tutorials here and it does say that the byte length is not required, but I thought it may be required as the array got longer and I would run out of variable names. Probably talking sh*t as I am just learning :)

    — Posts automatically merged - Please don't double post! —

    The " is after the %s
     
  7. Ryccardo

    Ryccardo and his tropane alkaloids

    Member
    13
    Feb 13, 2015
    Italy
    Imola
    If you plan on editing the string later, you should indeed NOT write:
    char lol[] = "Viagra";
    (Some text between double quotes is a "string literal" placed somewhere in memory, which is NOT guaranteed to be editable later; when you assign a string literal to a string, the only thing written to the variable is the memory position of that constant.)
    (You will likely only understand that sentence once you are fluent with pointers, don't worry too much about it... yet.)
    (The proper way to manually reserve some memory, for example for a string of a certain length that is guaranteed to be rewritable, or for a file to be loaded/saved, is also something that needs knowledge of both malloc/free and pointers...)
     
    CuriousTommy and 877 like this.
  8. 877
    OP

    877 GBAtemp Regular

    Member
    3
    Mar 8, 2017
    United Kingdom
    Correct, I do not understand it yet, thanks for the help I will no doubt be back :)
     
  9. Cyan

    Cyan GBATemp's lurking knight

    Global Moderator
    22
    Oct 27, 2002
    France
    Engine room, learning
    Understanding pointers was a little hard for me at first; I needed a visual representation of the memory (RAM).
    The RAM is a continuous set of "banks" holding values. To access a value you use variable names (it's easier for humans), but the CPU uses memory addresses.

    Each memory bank has a value and an address.


    Code:
    Memory value...: 0x07 | 0x20 | 0x56 | 0x00 | 0x00 | ...... | 0x53     | 0x68     | 0x6F     | 0x70     | 0x70     | 0x69     | 0x6E     | 0x67     | 0x00     | 0x00     |
    Memory address : 0x00 | 0x01 | 0x02 | 0x03 | 0x04 | ...... | 0x145823 | 0x145824 | 0x145825 | 0x145826 | 0x145827 | 0x145828 | 0x145829 | 0x14582A | 0x14582B | 0x14582C | etc.
                             ^
    this memory bank has address 0x01 and value 0x20
    
    As you know, a letter is not stored as a letter, but as a value. You use the ASCII table to convert a value to its corresponding letter.
    A string is a run of such values in adjacent banks, and C code reaches it through the memory address of its first character.

    When you create a new array:
    char name[8] = "Shopping";
    the start of "name" is assigned to a memory address (0x145823 in the example above), and each adjacent memory bank is filled with one letter's value (one letter per bank).

    You made an array of size 8, but you can store shorter text in it. Sometimes you don't know how big a string will be.
    For example, when creating an array to store a path to a file, you need to allow the end user to have a long path, path[255]; but it can store shorter strings too, like "c:\file.txt". The array is still 255 characters long even if the string it contains is shorter.

    Creating
    char name[8] = "test";
    also reserves 8 bytes for "name", but it doesn't mean it contains 8 letters. It just means that no other variable will overlap those 8 bytes; a new variable can only be placed outside of them.

    When you run the command
    printf("%s", name);
    it doesn't print the full array, because if you reserved a big array (name[255]), you don't want printf to print 255 characters when you only wrote "test" in it. You would get a lot of junk after that word and before the next word to print.
    To determine the end of a string starting at a specific memory position, printf relies on a "terminating character", the null character \0 (0x00 in hexadecimal).
    So the memory needs to contain "test\0", or "Shopping\0".


    So, when you run
    printf("%s", name);
    the CPU first locates the memory position of that variable. A string is an array of "char" values (8 bits, i.e. 1 byte each), so the name points to the first memory address used by your array. It's located at 0x145823 in the example I wrote above.
    It then reads each successive bank's value in order, without knowing in advance how big the array or the string is, and converts each value into a letter using the ASCII table, until it finds the end of the string (0x00).
    0x53 -> S
    0x68 -> h
    etc.
    until it finds a 0x00, located at memory address 0x14582B.


    If you define your string as name[8], then the address 0x14582B could be assigned to another variable (int nbr=41; for example).
    If the memory at 0x14582B contains the value 41 (0x29 in hex), printf converts it into its ASCII character and displays it as ")".
    By chance, the bank after that held the value 0x00; otherwise you would have had a longer string in your result, or even a crash, since reading past the end of an array is undefined behaviour in C.

    To fix this problem, you need to create the array one byte bigger than your text.
    char name[9]="Shopping"; // 8 letters, array size 9
    The compiler adds the terminating null character automatically at the 9th position (Shopping\0).

    If the terminating character ever goes missing, you can have a buffer overflow bug. You can force it manually:
    char name[9]="Shopping";
    name[8]=0; // force 0x00 manually on the 9th position (index 8, since indexes start at 0). prevent buffer overflow ! Should be unneeded if the compiler takes care of it.



    Sorry, it was a big text, maybe hard to understand.
    I tried to make it as clear as I could, but I'm not always the best at explanations :P
    I didn't really explain pointers, just how strings are stored, and the importance of the end character.


    For now, remember that a "pointer" is used to work with a memory address.
    You can access a memory address directly if needed. It'll be useful later, when you have to make bigger programs.


    If you are ready for more headache:
    When you use a variable, you let the CPU locate the memory bank for you, and read/write the value in that bank.
    When you use a pointer, you give the bank number yourself, and read/write the value in that bank.
    To work with pointers, you often use the "*" character.

    The "*" was the problem for me.
    I didn't understand when it meant "pointer" and when it meant "value".

    to define a pointer, you use *
    to access the value a pointer points to, you use *
    to access the memory address stored in a pointer, you don't use anything
    to access a variable's memory address, you use &

    3 examples:

    u8 *ptr=(u8 *)0x145823; // define a pointer and give it a memory address. "ptr" does not hold a value, but a memory location. When you define a variable, put the * between the type and the variable name to specify that you want a pointer. u8* ptr, u8 * ptr, or u8 *ptr are all identical. A raw number needs the (u8 *) cast, and picking an address by hand like this is for illustration only: on a PC the program will likely crash if you write there.
    u8 var=41; // you let the compiler choose where to store "var", and it sets that memory bank to value 41. u8 is not standard C; it is assumed here to be a typedef of "unsigned char" (same as uint8_t), as many console homebrew libraries define it. It's "8 bits" long data, 1 byte, 1 memory bank. You can also use "unsigned char var=41;" because "char" does not mean you'll store an ASCII letter, only that the variable has the size one character takes in memory.
    ptr=(u8 *)0x145825; // change the memory address stored in the pointer. ptr is a pointer, so you can assign it a different memory address.
    *ptr=var; // set value 41 at memory address 0x145825. When NOT defining a variable, using * accesses the value of the memory bank the pointer points to, not the memory address.


    u8 *mem; // create a pointer variable. let the compiler choose where it's stored in memory.
    mem=&var; // store the address of "var" in mem.
    printf("%p\n", (void *)mem); // will print the address of var. We don't know that address in advance: we defined var as "u8" and let the compiler choose where it's stored in the RAM.


    var=12;
    ptr=&var; // store the memory address of the "var" variable in the pointer "ptr". ptr now points to var's bank.
    *ptr=50; // set the value of the bank located at the address held in ptr to 50. Since ptr points to var's bank, editing "*ptr" is like editing the "var" variable.
    printf("%d\n", var); // prints 50. var has been edited using the pointer to its address.
     
    Last edited by Cyan, Sep 9, 2018
    DarkDengar, 877, Lucifer666 and 3 others like this.
  10. Attila13

    Attila13 Praise the Creep!

    Member
    6
    Oct 11, 2010
    Romania
    Zalau,Romania
    Sorry for my language, but this is the most n00b friendly answer I can give you.
    Always set your character array size 1 bigger than your input's length.
    You have:
    Code:
    char name[8] = "Shopping";
    
    where "Shopping" is 8 in length, so if you make your character array 9 in size, then it should be alright. :)

    Believe me: when I was messing with data structures, I had to make a phone book structure where the numbers were stored in character arrays 10 in size, and my numbers were 10 in length as well... The code worked well enough, but the phone numbers never showed properly. It gave me lots of headaches, and I felt like I'd had a major brain fart when I finally realized what my problem was. :lol:

    Hope it was helpful enough. I'm not a very good "professor", so sorry. :P
     
    Last edited by Attila13, Sep 8, 2018
    877 likes this.
  11. jimmyj

    jimmyj Official founder of altariaism. Copyright jimmyj

    Member
    7
    May 26, 2017
    Italy
    Hyrule
    Although this has been solved, it is always good to add \n right before the final "
     
  12. Cyan

    Cyan GBATemp's lurking knight

    Global Moderator
    22
    Oct 27, 2002
    France
    Engine room, learning
    you mean \0
    if you add \n it only adds a new line and the string is not terminated yet.
     
  13. jimmyj

    jimmyj Official founder of altariaism. Copyright jimmyj

    Member
    7
    May 26, 2017
    Italy
    Hyrule
    I mean in printf
     
  14. Cyan

    Cyan GBATemp's lurking knight

    Global Moderator
    22
    Oct 27, 2002
    France
    Engine room, learning
    Ohh, sorry. Yes, it's usually good practice to add it at the end.

    printf("this is a %s to print a text.\n", txt);
    It ensures your next sentence will be on a new line.
     
  15. jimmyj

    jimmyj Official founder of altariaism. Copyright jimmyj

    Member
    7
    May 26, 2017
    Italy
    Hyrule
    yep :P
     
  16. fuyukaidesu

    fuyukaidesu Member

    Newcomer
    1
    Mar 2, 2015
    France
    This is incorrect.

    char var[] = "anything";
    The compiler will insert instructions to reserve enough space (9 bytes) on the stack to hold the whole array and copy values from a possibly read only memory location, but in the end **every addressable element from "var" is guaranteed to be read/writable**.

    char *var = "anything";
    On the other hand, this will allocate space on the stack to store a single pointer to the string (4/8/other bytes depending on the target architecture) and assign the address of that string to this pointer.
    Typically, in a Linux binary, this string will be stored in the same section as the compiled instructions, and thus be readable/executable, but not writable.
     
    Ryccardo likes this.
  17. Ryccardo

    Ryccardo and his tropane alkaloids

    Member
    13
    Feb 13, 2015
    Italy
    Imola
    Yep, double checked and it works the way you said:
    Capture.PNG
    http://www.open-std.org/JTC1/SC22/wg14/www/docs/n1124.pdf
     
  18. kuwanger

    kuwanger GBAtemp Maniac

    Member
    8
    Jul 26, 2006
    United States
    I'm pretty sure that's wrong. If you do sizeof(name) on char name[]="Shopping\0" vs sizeof(name) on char name[]="Shopping", you get different values. Perhaps the preprocessor is dumb or the C compiler is smart enough to truncate to 9? If there's some part of the spec that specifies this, I'd be interested to know.
     
    Lucifer666 likes this.
  19. Cyan

    Cyan GBATemp's lurking knight

    Global Moderator
    22
    Oct 27, 2002
    France
    Engine room, learning
    I might be wrong.
    I don't know any specs, I did that by logic.

    But look at the picture posted by Ryccardo.
    If you don't specify the array size and store 3 chars, it actually creates 4 and adds the \0 automatically.
    If you specify an array of size 3, it doesn't create the final character.

    char name[]="Shopping\0"
    char name[]="Shopping"
    Running sizeof obviously gives different sizes: the first has 10, the second 9.


    So, I thought adding the terminating character directly in a determined size array was fine.

    Maybe it's bad practice. I don't know all the good practices very well.
    Usually, I use the second method, array[array_size-1]=0;

    Maybe it's not even needed and the compiler knows it has to init the unused positions to 0x00?
    I just prefer to force the terminating character myself, to be sure I have what I want, or in case I use another compiler one day.
    If I do something wrong, let me know; I wouldn't want to explain something wrong to someone who wants to learn.
     
    Last edited by Cyan, Sep 8, 2018
  20. kuwanger

    kuwanger GBAtemp Maniac

    Member
    8
    Jul 26, 2006
    United States
    AFAIK, it allocates the string literal, allocates the buffer, then does a memcpy* of the specified size or the auto-generated size. Basically, you're adding a superfluous '\0';

    So:

    Code:
    char str[8] = "foo\0bar";
    
    str[3] = ' ';
    printf("%s\n", str);
    
    prints "foo bar"

    * And just to be clear, I don't mean actually calling memcpy. That'd be silly. :)
     
    Last edited by kuwanger, Sep 8, 2018