In C, a structure (struct) is a powerful tool for organizing data: it lets you combine values of different types into a single entity. Whether you are dealing with complex data, designing data models, or optimizing memory use, structures help you manage and organize data. This article takes a detailed look at structures in C.
1. Definition and Basic Usage of Structures
What is a Structure?
A structure is a user-defined data type that allows us to group logically related data together. Each item of data is called a member of the structure. The members of a structure can be basic data types (like int, float, char, etc.) or other composite data types (like arrays, pointers, or even other structures).
Structure Declaration
In C, a structure declaration defines a new data type composed of multiple data members, which may have different types. The basic syntax for declaring a structure is as follows:
struct StructureName {
    DataType Member1;
    DataType Member2;
    // More members
};
Example:
#include <stdio.h>

// Declare a structure type "Student"
struct Student {
    char name[20];  // Name
    int age;        // Age
    char sex[5];    // Gender
    char id[20];    // Student ID
};
In the above code, Student is a named structure, so the type struct Student can be used to create any number of structure variables. An anonymous structure, by contrast, has no explicit name and cannot be used to create other variables elsewhere.
Note: When defining a named structure, you can directly create variables at the end of the structure declaration. For example:
struct Student {
    char name[20];  // Name
    int age;        // Age
    char sex[5];    // Gender
    char id[20];    // Student ID
} student1, student2;
In this case, Student is the structure type name, and student1 and student2 are the variables created directly in the structure declaration.
Creating and Initializing Structure Variables
After declaring the structure type, you can create structure variables and initialize them. A structure variable is an instance of a structure type, which can be initialized at the time of declaration or assigned values at runtime.
#include <stdio.h>

struct Stu {
    char name[20];  // Name
    int age;        // Age
    char sex[8];    // Gender (room for "Female" plus the terminating '\0')
    char id[20];    // Student ID
};

int main(void)
{
    // Initialize according to the order of structure members
    struct Stu s = { "Zhang San", 20, "Male", "20230818001" };
    printf("name: %s\n", s.name);
    printf("age : %d\n", s.age);
    printf("sex : %s\n", s.sex);
    printf("id  : %s\n", s.id);

    // Initialize in any order using designated initializers (C99)
    struct Stu s2 = { .age = 18, .name = "Li Si", .id = "20230818002", .sex = "Female" };
    printf("name: %s\n", s2.name);
    printf("age : %d\n", s2.age);
    printf("sex : %s\n", s2.sex);
    printf("id  : %s\n", s2.id);
    return 0;
}
Structure Member Access Operators
C provides two operators to access the members of a structure:
Dot operator (.): Used to access members through a structure variable.
Arrow operator (->): Used to access members through a structure pointer.
Example:
#include <stdio.h>

struct Stu {
    char name[20];  // Name
    int age;        // Age
    char sex[5];    // Gender
    char id[20];    // Student ID
};

int main(void)
{
    struct Stu s = { "Zhang San", 20, "Male", "20230818001" };
    struct Stu* ptr = &s;  // Pointer to the structure variable

    // Access each member through the pointer with ->
    printf("name: %s\n", ptr->name);
    printf("age : %d\n", ptr->age);
    printf("sex : %s\n", ptr->sex);
    printf("id  : %s\n", ptr->id);
    return 0;
}
Special Declarations of Structures
Anonymous Structures
An anonymous structure has no name, so its variables must be declared as part of the structure definition itself. The type cannot be used to create new variables elsewhere.
struct { int x; int y; } point;
Here, point is a structure variable, but the structure itself has no name.
Nested Structures
A nested structure is when one structure contains another structure as a member. Structures can nest other structures, including anonymous structures.
struct Date {
    int day;
    int month;
    int year;
};

struct Person {
    char name[50];
    struct Date birthday;  // Nested structure
    float height;
};
In this example, the Person structure includes the Date structure as one of its members.
Self-referencing Structures
A self-referencing structure is one where one or more members of the structure are pointers that point to the same structure type.
struct Node {
    int value;
    struct Node* next;  // Self-referencing: pointer to the same structure type
};
In this example, the Node structure contains a pointer named next, which points to another instance of the Node structure.
typedef Declaration
Using the typedef keyword, you can define a new type name for a structure, making the structure declaration more concise.
typedef struct {
    char* name;
    int age;
} Person;

Person p1, p2;  // Create two structure variables
In this example, Person becomes an alias for the structure type struct { char* name; int age; }, allowing the creation of multiple structure variables like Person p1, p2;.
2. Struct Memory Alignment
What is Memory Alignment?
Memory alignment refers to storing data at specific memory addresses in a way that the starting address of the data meets certain alignment requirements. These alignment requirements are usually related to the size of the data type. For example, a 4-byte integer is typically required to be stored at an address that is a multiple of 4.
Alignment Rules
First, let's understand the alignment rules for structures:
The first member of a structure is placed at offset 0, the address where the structure variable starts.
Every other member must be placed at an offset that is an integer multiple of its alignment number.
Alignment number = the smaller of the compiler's default alignment number and the size of the member.
In Visual Studio (MSVC), the default alignment number is typically 8.
With GCC on Linux, there is no default alignment number, so the alignment number is the member's own natural alignment (for basic types, usually its size).
The overall size of the structure is a multiple of the largest alignment number among its members.
If a structure contains a nested structure, the nested structure is aligned to a multiple of the largest alignment number among its own members, and the overall size of the outer structure is a multiple of the largest alignment number among all members, including those of the nested structure.
Example:
#include <stdio.h>

struct S1 {
    char c1;  // 1 byte
    int i;    // 4 bytes
    char c2;  // 1 byte
};

int main(void)
{
    // sizeof yields a size_t, so print it with %zu
    printf("%zu\n", sizeof(struct S1));  // Result is 12
    return 0;
}
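Memory Distribution (on a typical platform where int is 4 bytes): c1 occupies offset 0, offsets 1 to 3 are padding, i occupies offsets 4 to 7, c2 occupies offset 8, and offsets 9 to 11 are trailing padding, for a total of 12 bytes.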
Why Does Memory Alignment Exist?
Platform Reasons (Portability Issues): Not all hardware platforms can access data at arbitrary addresses. Some hardware platforms can only access data of specific types at certain addresses; otherwise, a hardware exception occurs.
Performance Reasons: Data structures (especially stacks) should be aligned to natural boundaries as much as possible. This is because accessing misaligned memory may require the processor to perform two memory accesses, whereas an aligned access requires only one. For example, if a processor always fetches 8 bytes from memory at a time, each fetch starts at an address that is a multiple of 8. If we ensure that all double values are aligned to multiples of 8, one memory operation is sufficient to read or write a value. Otherwise, we might need two memory accesses, as the object might be split across two 8-byte blocks in memory.
Overall, structure memory alignment is a trade-off, sacrificing space for speed.
So, when designing structures, we need to meet the alignment requirements while also saving space. How can we achieve this? By grouping smaller members together:
#include <stdio.h>

struct S1 {
    char c1;  // 1 byte
    int i;    // 4 bytes
    char c2;  // 1 byte
};

struct S2 {   // Members with small space usage are grouped together
    char c1;  // 1 byte
    char c2;  // 1 byte
    int i;    // 4 bytes
};

int main(void)
{
    // sizeof yields a size_t, so print it with %zu
    printf("Size of S1: %zu\n", sizeof(struct S1));
    printf("Size of S2: %zu\n", sizeof(struct S2));
    return 0;
}
S1 and S2 have the same member types, but their sizes differ: on a typical platform where int is 4 bytes, S1 occupies 12 bytes while S2 occupies only 8.
Modify the Default Alignment Number
The #pragma preprocessor directive can be used to change the compiler's default alignment number.
#include <stdio.h>

#pragma pack(1)  // Set alignment to 1 byte
struct MyStruct {
    char a;    // 1 byte
    int b;     // 4 bytes
    double c;  // 8 bytes
};
#pragma pack()   // Restore default alignment

int main(void)
{
    printf("Size of MyStruct: %zu\n", sizeof(struct MyStruct));
    return 0;
}
Output (assuming a 4-byte int and an 8-byte double):
Size of MyStruct: 13
The effect of #pragma pack(1) is limited to the code between it and the following #pragma pack(). After #pragma pack(), the alignment number is restored to the default setting. However, this does not affect the definition of MyStruct, as it was defined under the #pragma pack(1) directive.
Thus, the size of MyStruct is calculated as follows:
char a: 1 byte, at offset 0.
int b: Since the alignment is set to 1, it immediately follows char a (offset 1), occupying 4 bytes.
double c: Again, due to the alignment of 1, it immediately follows int b (offset 5), occupying 8 bytes.
Therefore, the total size of MyStruct is 1 + 4 + 8 = 13 bytes. No padding bytes are added because the alignment is set to 1, which means the members are stored contiguously.
3. Passing Structures as Arguments
Passing by Value vs. Passing by Pointer
#include <stdio.h>

struct S {
    int data[1000];
    int num;
};

struct S s = { {1, 2, 3, 4}, 1000 };

// Passing structure by value
void print1(struct S s)
{
    printf("%d\n", s.num);
}

// Passing structure address (pointer)
void print2(struct S* ps)
{
    printf("%d\n", ps->num);
}

int main(void)
{
    print1(s);   // Pass by value
    print2(&s);  // Pass by address
    return 0;
}
Which function is better, print1 or print2?
Answer: The preferred function is print2.
Reason: When a structure is passed by value, the entire structure is copied for the call (typically onto the stack), which incurs both time and space overhead. For a large structure such as struct S above (about 4 KB when int is 4 bytes), that overhead can be significant and degrade performance. Passing a pointer copies only an address.
Conclusion: When passing a structure as an argument, it's better to pass its address (pointer).
4. Implementing Bit Fields in Structures
Defining Bit Fields
The declaration of a bit field is similar to that of an ordinary structure member, with two differences:
Bit field members must be of type int, unsigned int, or signed int. Since C99, _Bool is also allowed, and a compiler may accept other types (such as char) as an implementation-defined extension.
After the member name, there is a colon and a number, indicating the width of the bit field.
The bit field in a structure is defined as follows:
struct bit_field_struct { type member_name : width; };
Where type is the data type of the bit field, usually unsigned int or int, member_name is the name of the bit field, and width is the width (in bits) of the bit field.
Memory Allocation for Bit Fields
Bit field members can be of types like int, unsigned int, signed int, or, where the compiler supports it, char.
Memory for bit fields is allocated one storage unit at a time as needed, sized according to the declared type. For example, 4 bytes at a time for int or 1 byte at a time for char.
Bit fields are platform-dependent and not portable. Programs that require portability should avoid using bit fields.
#include <stdio.h>

struct S {
    char a : 3;
    char b : 4;
    char c : 5;
    char d : 4;
};

int main(void)
{
    struct S s = { 0 };
    s.a = 10;  // 10 does not fit in 3 bits, so only part of the value is stored
    s.b = 12;
    s.c = 3;
    s.d = 4;
    return 0;
}
// How is space allocated?
Important Considerations
Bit Field Types: The portable bit field types are int, unsigned int, and signed int (plus _Bool since C99). Support for other types, such as char, is implementation-defined.
Bit Field Width: The width of a bit field must be a non-negative integer constant expression that does not exceed the number of bits in the declared type. A width of 0 is allowed only for unnamed bit fields.
Bit Field Alignment: Whether a bit field that does not fit in the remaining space of a storage unit spans into the next unit or starts a new one depends on the specific compiler.
Unnamed Bit Fields: An unnamed bit field of width 0 (e.g., unsigned int : 0;) can be used to force the next bit field to start at the next storage unit, aiding in alignment.
Accessing Bit Fields: Bit fields can be accessed using the structure variable name and the dot operator, just like normal structure members. However, you cannot take the address of a bit field with &.
Bit Field Size: The total size of the structure with bit fields might be larger than the sum of all the bit field widths because the compiler may add padding bits for alignment.
Bit fields are an effective way to save memory, especially in embedded systems or when many boolean flags are needed. However, due to implementation details and portability issues, bit fields should be used with caution.
Summary
Through a detailed exploration of C language structures, we have learned about structure declaration, creation, initialization, member access, anonymous structures, self-referential structures, memory alignment, passing structures as arguments, and implementing bit fields. This knowledge will help you organize and manage data more efficiently in C programming and write clearer, more efficient code. Mastering these concepts is essential for any C developer. If you have any questions or want to discuss further, feel free to leave a comment, and we will explore together.