universal character name (arbitrary Unicode value); may result in several characters
code point U+nnnn
\Unnnnnnnn
universal character name (arbitrary Unicode value); may result in several characters
code point U+nnnnnnnn
For example, the following source skips to the start of the next page. (This applies
mostly to terminals where the output device is a printer rather than a
VDU, or Visual Display Unit.)
Global variables are powerful, but they risk being altered carelessly. In most cases, we can add the static
modifier to a global variable so that it can only be
altered within its own file. However, there are situations in which we
have to use global variables across different files. In that case, we
often run into an error in the linking phase, i.e., error: multiple definition.
Some of these errors are obvious and easy to debug, while others can
be really puzzling. Here is an example that I encountered.
Suppose we have two source files with the following content:
// main.c
#include <stdio.h>

int gvar;  // defined but not initialized

int main(void) {
    printf("shared var is %d\n", gvar);
    return 0;
}
// aux.c
int gvar = 5;
Then we run the command
gcc -o test main.c aux.c
and everything goes smoothly. That is, the code is accepted and gvar is not reported as a multiple definition. Note that there is no extern modifier on gvar and the program prints 5, which means that gvar
is shared across the two files. This is quite strange, since we know that
two global variables with the same name in one project normally cause a multiple definition error.
What is more, I found something even more interesting. When we change the extension of these two files,
mv main.c main.cpp
mv aux.c aux.cpp
gcc -o test aux.cpp main.cpp   # error: multiple definition
the linker reports a multiple definition error for the variable gvar.
How does this happen? The content of the files has not changed; we
only changed the file names. An intuitive explanation is that the
difference between C++ and C causes this puzzling behavior, since .cpp is the file extension for C++ and .c is the one for C.
Strong and weak symbols
Actually, these strange phenomena are all caused by one feature provided by
GCC, called strong and weak symbols. Global variables are
divided into three types:
initialized to a non-zero value
initialized to zero
not initialized, just defined
In GCC, the first two types of global variables are called strong symbols and are stored in the .data and .bss sections, respectively. The third type is called a weak symbol and is placed in the COMMON section.
There are three rules that these symbols must follow:
only one strong symbol is allowed with a given name
if there is one strong symbol and several weak symbols with the same name, the weak symbols are overridden by the strong symbol
if there are several weak symbols with the same name, GCC chooses the one with the largest size (memory occupation)
Now we can clarify why the C version of the program builds without any errors. In aux.c, we define a strong symbol gvar initialized to 5. In main.c, we only define the variable gvar without initializing it, so it is a weak symbol. When we build the program with GCC, the gvar in main.c is overridden by the gvar in aux.c according to the second rule. Therefore, the program runs smoothly and prints 5. If we change main.c as follows, it will also cause a multiple definition error.
// main.c
#include <stdio.h>

int gvar = 0; // this is a strong symbol

int main(void) {
    printf("shared var is %d\n", gvar);
    return 0;
}
Wait, there is still one puzzling problem left. Why does the program cause a multiple definition error when the file name is changed? When we change the file extension from .c to .cpp, GCC applies the rules for C++
programs when compiling this C program. Therefore, to answer this question,
we need to investigate how GCC handles strong and weak
symbols differently for .cpp and .c files.
Here is my conclusion. In a C program, if you define a global
variable without initializing it, GCC treats it as a weak symbol.
In a C++ program, however, such a definition is a strong symbol by default. That is, the line int gvar; in main.cpp defines a strong symbol. Since there is another strong symbol with the same name in aux.cpp, the linker reports the error.
If you want a weak symbol in a C++ program, you need to declare the variable weak explicitly. For example, if we write a C++ program like this,
// main.cpp
#include <stdio.h>

int __attribute__((weak)) gvar = 2;

int main() {
    printf("shared var is %d\n", gvar);
    return 0;
}

// aux.cpp
int gvar = 5;
the program will behave the same as the C version.
To avoid bugs like this, we can use the -fno-common
option provided by GCC, which treats all such variables as strong symbols. (GCC 10 and later enable -fno-common by default, so the C example above only links there if -fcommon is passed.)
However, in some cases we have to use weak symbols (see the next section).
Therefore, we should develop good coding habits. There are three rules
we can follow:
eliminate all global variables (hard)
add the static modifier to global variables and provide interfaces for access (medium)
initialize all global variables, for example to zero (easy)
Function of strong and weak symbols
It seems that we should use strong symbols instead of weak symbols in
programming, so why does GCC provide weak symbols? As far as I know,
weak symbols are useful for library functions. For example, if the
symbols in a library are weak, users can easily override some
library functions for their own purposes. What's more, programmers can
declare library functions as weak symbols. If the program is linked
with the library, it can provide more powerful features;
otherwise, it can still run without any errors. Here is a
simple example.
#include <stdio.h>
#include <pthread.h>

__attribute__((weak)) int pthread_create(pthread_t*, const pthread_attr_t*,
                                         void* (*)(void*), void*);

int main()
{
    if (pthread_create)
    {
        printf("This is multi-thread version!\n");
    }
    else
    {
        printf("This is single-thread version!\n");
    }
    return 0;
}
If the program is not linked with the pthread library, it will run in single-thread mode. Otherwise, it can run in multi-thread mode.
Manage your global variables
If you have to use global variables, here is a way to manage them
comfortably. Create two files called global_var.h and global_var.c. Declare all global variables with the extern modifier in global_var.h, and define and initialize them in global_var.c. For instance,
// global_var.h
#ifndef GLOBAL_VAR
#define GLOBAL_VAR
extern int g_A;
extern char g_B;
#endif
// global_var.c
int g_A = 0;
char g_B = 'g';
When you need to use the global variables in another file, such as main.c, simply include global_var.h and you will be able to access all of them.
// main.c
#include <stdio.h>
#include "global_var.h"

int main(void) {
    printf("var is %d\n", g_A);
    return 0;
}
In this way, you can easily manage your global variables. However, be sure to use global variables as little as possible.
Instructions in a pipelined processor are performed in several stages, so that at
any given time several instructions are being processed in the various
stages of the pipeline, such as fetch and execute. There are many
different instruction pipeline microarchitectures, and instructions may be executed out-of-order. A hazard occurs when two or more of these simultaneous (possibly out of order) instructions conflict.
Types
Data hazards
Data hazards occur when instructions that exhibit data dependence modify data in different stages of a pipeline. Ignoring potential data hazards can result in race conditions (also termed race hazards). There are three situations in which a data hazard can occur:
read after write (RAW), a true dependency
write after read (WAR), an anti-dependency
write after write (WAW), an output dependency
Consider two instructions i1 and i2, with i1 occurring before i2 in program order.
Read after write (RAW)
(i2 tries to read a source before i1
writes to it) A read after write (RAW) data hazard refers to a
situation where an instruction refers to a result that has not yet been
calculated or retrieved. This can occur because even though an
instruction is executed after a prior instruction, the prior instruction
has been processed only partly through the pipeline.
Example
For example:
i1. R2 <- R1 + R3
i2. R4 <- R2 + R3
The first instruction is calculating a value to be saved in register R2, and the second is going to use this value to compute a result for register R4. However, in a pipeline,
when operands are fetched for the 2nd operation, the results from the
first will not yet have been saved, and hence a data dependency occurs.
A data dependency occurs with instruction i2, as it is dependent on the completion of instruction i1.
Write after read (WAR)
(i2 tries to write a destination before it is read by i1) A write after read (WAR) data hazard represents a problem with concurrent execution.
Example
For example:
i1. R4 <- R1 + R5
i2. R5 <- R1 + R2
In any situation with a chance that i2 may finish before i1 (i.e., with concurrent execution), it must be ensured that the result of register R5 is not stored before i1 has had a chance to fetch the operands.
Write after write (WAW)
(i2 tries to write an operand before it is written by i1) A write after write (WAW) data hazard may occur in a concurrent execution environment.
Example
For example:
i1. R2 <- R4 + R7
i2. R2 <- R1 + R3
The write back (WB) of i2 must be delayed until i1 finishes executing.
Structural hazards
A structural hazard occurs when a part of the processor's hardware is
needed by two or more instructions at the same time. A canonical example
is a single memory unit that is accessed both in the fetch stage, where
an instruction is retrieved from memory, and in the memory stage, where data
is written and/or read from memory.[3] Structural hazards can often be resolved by separating the component into orthogonal units (such as separate caches) or bubbling the pipeline.
Branching hazards (also termed control hazards) occur with branches.
On many instruction pipeline microarchitectures, the processor will not
know the outcome of the branch when it needs to insert a new
instruction into the pipeline (normally the fetch stage).