In-depth Understanding of Structures in C Language


In C, a structure (struct) is a powerful tool for organizing data: it lets you combine values of different types into a single entity. Whether you are handling complex data, designing data models, or optimizing memory use, structures help you manage and organize data. This article takes an in-depth look at structures in C.

1. Definition and Basic Usage of Structures

What is a Structure?

A structure is a user-defined data type that allows us to group logically related data together. Each item of data is called a member of the structure. The members of a structure can be basic data types (like int, float, char, etc.), or other composite data types (like arrays, pointers, or even other structures).

  1. Structure Declaration

In C, a structure declaration defines a new data type composed of multiple data members, which may have different types. The basic syntax for declaring a structure is as follows:

struct StructureName {
    DataType Member1;
    DataType Member2;
    // More members
};

Example:

#include <stdio.h>
 
// Declare a structure type "Student"
struct Student {
   char name[20];  // Name
   int age;        // Age
   char sex[5];    // Gender
   char id[20];    // Student ID
};

In the above code, Student is a named structure (the name is called its tag), so the type struct Student can be used anywhere in the program to create any number of structure variables.

Note: When defining a named structure, you can directly create variables at the end of the structure declaration. For example:

struct Student {
   char name[20];  // Name
   int age;        // Age
   char sex[5];    // Gender
   char id[20];    // Student ID
} student1, student2;

In this case, Student is the structure type name, and student1 and student2 are the variables created directly in the structure declaration.

  2. Creating and Initializing Structure Variables

After declaring the structure type, you can create structure variables and initialize them. A structure variable is an instance of a structure type, which can be initialized at the time of declaration or assigned values at runtime.

#include <stdio.h>
struct Stu {
    char name[20];  // Name
    int age;        // Age
    char sex[10];   // Gender (large enough for "Female" plus the terminating '\0')
    char id[20];    // Student ID
};

int main() {
    // Initialize according to the order of structure members
    struct Stu s = { "Zhang San", 20, "Male", "20230818001" };
    printf("name: %s\n", s.name);
    printf("age : %d\n", s.age);
    printf("sex : %s\n", s.sex);
    printf("id : %s\n", s.id);
 
    // Designated initializers (C99): members can be listed in any order
    struct Stu s2 = { .age = 18, .name = "Li Si", .id = "20230818002", .sex = "Female" };
    printf("name: %s\n", s2.name);
    printf("age : %d\n", s2.age);
    printf("sex : %s\n", s2.sex);
    printf("id : %s\n", s2.id);
    return 0; 
}



  3. Structure Member Access Operators

C provides two operators to access the members of a structure:

  • Dot operator (.): Used to access members through a structure variable.

  • Arrow operator (->): Used to access members through a structure pointer.

Example:

#include <stdio.h>
struct Stu {
    char name[20];  // Name
    int age;        // Age
    char sex[5];    // Gender
    char id[20];    // Student ID
};

int main() {
    struct Stu s = { "Zhang San", 20, "Male", "20230818001" };
    struct Stu* ptr = &s;
    printf("name: %s\n", ptr->name);
    printf("age : %d\n", ptr->age);
    printf("sex : %s\n", ptr->sex);
    printf("id : %s\n", ptr->id);
    return 0; 
}



  4. Special Declarations of Structures

    1. Anonymous Structures

An anonymous structure has no name, so variables of that type can only be created as part of the structure definition itself. Because the type has no name, it cannot be used to create new variables elsewhere.

struct {
    int x;
    int y;
} point;

Here, point is a structure variable, but the structure itself has no name.

    2. Nested Structures

A nested structure is a structure that contains another structure as a member. The nested member can be a named structure type or an anonymous structure.

struct Date {
    int day;
    int month;
    int year;
};
 
struct Person {
    char name[50];
    struct Date birthday;  // Nested structure
    float height;
};

In this example, the Person structure includes the Date structure as one of its members.

    3. Self-referencing Structures

A self-referencing structure is one where one or more members are pointers to the same structure type. The member must be a pointer: a structure cannot contain an instance of itself, because its size would then be unbounded.

struct Node {
    int value;
    struct Node* next;  // Self-referencing: pointer to the same structure type
};

In this example, the Node structure contains a pointer named next, which points to another instance of the Node structure.

    4. typedef Declaration

Using the typedef keyword, you can define a new type name for a structure, making the structure declaration more concise.

typedef struct {
    char* name;
    int age;
} Person;
Person p1, p2;  // Create two structure variables

In this example, Person becomes an alias for the structure type struct { char* name; int age; }, allowing the creation of multiple structure variables like Person p1, p2;.


2. Struct Memory Alignment

What is Memory Alignment?

Memory alignment refers to storing data at specific memory addresses in a way that the starting address of the data meets certain alignment requirements. These alignment requirements are usually related to the size of the data type. For example, a 4-byte integer is typically required to be stored at an address that is a multiple of 4.

  1. Alignment Rules

First, let's understand the alignment rules for structures:

  1. The first member of a structure is aligned to the address where the structure variable starts, with an offset of 0.

  2. Every other member is placed at an offset that is an integer multiple of its alignment number.

    • Alignment number = the smaller value between the compiler's default alignment number and the size of the member.

    • In VS (MSVC), the default alignment number is 8.

    • In Linux with GCC, there is no default alignment number, so the alignment number is the member's own alignment requirement, which for basic types usually equals the size of the member.

  3. The overall size of the structure is a multiple of the largest alignment number among its members.

  4. If a structure contains another structure as a member, the nested structure is placed at an offset that is a multiple of the largest alignment number among its own members, and the overall size of the outer structure is a multiple of the largest alignment number among all of its members (including the members of the nested structure).

Example:

#include <stdio.h>

struct S1
{
    char c1; // 1 byte
    int i;   // 4 bytes
    char c2; // 1 byte
};

int main()
{
    printf("%zu\n", sizeof(struct S1)); // sizeof yields a size_t, so use %zu; the result is 12 with a 4-byte int
    return 0;
}

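Memory Distribution:

With a 4-byte int, c1 occupies offset 0, offsets 1 to 3 are padding, i occupies offsets 4 to 7, c2 occupies offset 8, and offsets 9 to 11 are trailing padding so that the total size (12) is a multiple of 4.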


  2. Why Does Memory Alignment Exist?

    1. Platform Reasons (Portability Issues): Not all hardware platforms can access data at arbitrary addresses. Some hardware platforms can only access data of specific types at certain addresses; otherwise, a hardware exception occurs.

    2. Performance Reasons: Data structures (especially stacks) should be aligned to natural boundaries as much as possible, because accessing misaligned memory may require the processor to perform two memory accesses, while an aligned access requires only one. For example, suppose a processor always fetches 8 bytes at a time from addresses that are multiples of 8. If we ensure that all double values are aligned to multiples of 8, one memory operation is sufficient to read or write a value. Otherwise, two memory accesses may be needed, because the object may be split across two 8-byte blocks in memory.

Overall, structure memory alignment is a trade-off, sacrificing space for speed.

So, when designing structures, we need to meet the alignment requirements while also saving space. How can we achieve this? By grouping smaller members together:

#include <stdio.h>

struct S1
{
    char c1; // 1 byte
    int i;   // 4 bytes
    char c2; // 1 byte
};

struct S2 // Members with small space usage are grouped together
{
    char c1; // 1 byte
    char c2; // 1 byte
    int i;   // 4 bytes
};

int main()
{
    printf("Size of S1: %zu\n", sizeof(struct S1)); // %zu is the correct format for size_t
    printf("Size of S2: %zu\n", sizeof(struct S2));
    return 0;
}

S1 and S2 have the same member types, but the sizes differ: with a 4-byte int, S1 occupies 12 bytes, while S2 occupies only 8 bytes because c1 and c2 share the space before i.


  3. Modifying the Default Alignment Number

The #pragma pack preprocessor directive can be used to change the compiler's default alignment number.

#include <stdio.h>

#pragma pack(1) // Set alignment to 1 byte
struct MyStruct {
    char a;  // 1 byte
    int b;   // 4 bytes
    double c; // 8 bytes
};
#pragma pack() // Restore default alignment

int main() {
    printf("Size of MyStruct: %zu\n", sizeof(struct MyStruct));
    return 0;
}

Output:

Size of MyStruct: 13


The effect of #pragma pack(1) is limited to the code between it and the following #pragma pack(). After #pragma pack(), the alignment number is restored to the default setting. However, this does not affect the definition of MyStruct, as it was defined under the #pragma pack(1) directive.

Thus, the size of MyStruct is calculated as follows:

  • char a: 1 byte

  • int b: Since the alignment is set to 1, it immediately follows char a, occupying 4 bytes.

  • double c: Again, due to the alignment of 1, it follows int b, occupying 8 bytes.

Therefore, the total size of MyStruct is 1 + 4 + 8 = 13 bytes. No padding bytes are added because the alignment is set to 1, which means the members are stored contiguously.


3. Passing Structures as Arguments

  1. Passing by Value vs. Passing by Pointer

#include<stdio.h>

struct S {
    int data[1000];
    int num;
};

struct S s = { {1, 2, 3, 4}, 1000 };

// Passing structure by value
void print1(struct S s)
{
    printf("%d\n", s.num);
}

// Passing structure address (pointer)
void print2(struct S* ps)
{
    printf("%d\n", ps->num);
}

int main()
{
    print1(s); // Pass by value
    print2(&s); // Pass by address
    return 0;
}

Which function is better, print1 or print2?

Answer: The preferred function is print2.

Reason: When a function is called, its arguments are copied into the function's parameters (typically by pushing them onto the stack), which incurs both time and space overhead. If a large structure is passed by value, the entire structure is copied on every call (here about 4 KB for the 1001 int values), which can significantly degrade performance. Passing a pointer copies only an address.

Conclusion: When passing a structure as an argument, it's better to pass its address (pointer).


4. Implementing Bit Fields in Structures

  1. Defining Bit Fields

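The declaration of a bit field is similar to that of an ordinary structure member, with two differences:

  1. The standard only guarantees bit fields of type int, unsigned int, or signed int (and _Bool since C99). Whether other integer types such as char are accepted is implementation-defined, although most compilers allow them.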

  2. After the member name, there is a colon and a number, indicating the width of the bit field.

The bit field in a structure is defined as follows:

struct bit_field_struct {
    type member_name : width;
};

Where type is the data type of the bit field, usually unsigned int or int, member_name is the name of the bit field, and width is the width (in bits) of the bit field.

  2. Memory Allocation for Bit Fields

    1. Bit field members are usually declared as int, unsigned int, or signed int; many compilers also accept other integer types such as char.

    2. Memory for bit fields is allocated one storage unit at a time according to the declared type, for example 4 bytes at a time for int or 1 byte at a time for char.

    3. Many details of bit fields are implementation-defined, so they are not portable. Programs that require portability should avoid using bit fields.

#include<stdio.h>

struct S {
    char a : 3;
    char b : 4;
    char c : 5;
    char d : 4;
};

int main() {
    struct S s = { 0 };
    s.a = 10;
    s.b = 12;
    s.c = 3;
    s.d = 4;
    return 0;
}
//How is space allocated?



  3. Important Considerations

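  • Bit Field Types: The standard guarantees int, unsigned int, and signed int (plus _Bool since C99). Other integer types, such as the char used in the example above, are accepted by most compilers but are implementation-defined.

  • Bit Field Width: The width of a bit field must be a non-negative integer constant expression that does not exceed the number of bits in its type. A width of 0 is only allowed for unnamed bit fields.

  • Bit Field Alignment: Whether a bit field that does not fit in the remaining bits of a storage unit spans into the next unit or starts a new one depends on the specific compiler.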

  • Unnamed Bit Fields: Unnamed bit fields (e.g., unsigned int : 0;) can be used to force the next bit field to start at the next storage unit, aiding in alignment.

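  • Accessing Bit Fields: Bit fields can be accessed using the structure variable name and the dot operator, just like normal structure members. However, the address-of operator (&) cannot be applied to a bit field, because a bit field does not necessarily start on a byte boundary.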

  • Bit Field Size: The total size of the structure with bit fields might be larger than the sum of all the bit field widths because the compiler may add padding bits for alignment.

Bit fields are an effective way to save memory, especially in embedded systems or when many boolean flags are needed. However, due to implementation details and portability issues, bit fields should be used with caution.


Summary

Through a detailed exploration of C language structures, we have learned about structure declaration, creation, initialization, member access, anonymous structures, self-referential structures, memory alignment, passing structures as arguments, and implementing bit fields. This knowledge will help you organize and manage data more efficiently in C programming and write clearer, more efficient code. Mastering these concepts is essential for any C developer. If you have any questions or want to discuss further, feel free to leave a comment, and we will explore together.