Wednesday, 31 May 2017
How to create a new user as root?
If you are signed in as the root user, you can create a new user at any time by typing:
adduser newuser
If you are signed in as a non-root user who has been given sudo privileges, as demonstrated in the initial server setup guide, you can add a new user by typing:
sudo adduser newuser
Tuesday, 30 May 2017
Store VSX Scalar as Integer Halfword Indexed (stxsihx/STXSIHX)
X-form
stxsihx XS,RA,RB
Let XS be the value 32×SX + S.
Let the effective address (EA) be the sum of the contents of
GPR[RA], or 0 if RA is equal to 0, and the contents of
GPR[RB].
The contents of halfword element 3 of VSR[XS] are
placed into the halfword in storage addressed by EA.
Special Registers Altered:
None
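Instruction encoding for stxsibx (X-form), shown for comparison:
field:     31 | S | RA | RB | 909 | SX
first bit:  0 | 6 | 11 | 16 |  21 | 31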
if SX=0 & MSR.VSX=0 then VSX_Unavailable()
if SX=1 & MSR.VEC=0 then Vector_Unavailable()
EA <= ((RA=0) ? 0 : GPR[RA]) + GPR[RB]
MEM(EA,1) <= VSR[32×SX+S].byte[7]
VSR Data Layout for stxsibx
src = VSR[XS]
bits 0:55 unused | bits 56:63 .byte[7] | bits 64:127 unused
Instruction encoding for stxsihx (X-form):
field:     31 | S | RA | RB | 941 | SX
first bit:  0 | 6 | 11 | 16 |  21 | 31
if SX=0 & MSR.VSX=0 then VSX_Unavailable()
if SX=1 & MSR.VEC=0 then Vector_Unavailable()
EA <= ((RA=0) ? 0 : GPR[RA]) + GPR[RB]
MEM(EA,2) <= VSR[32×SX+S].hword[3]
VSR Data Layout for stxsihx
src = VSR[XS]
bits 0:47 unused | bits 48:63 .hword[3] | bits 64:127 unused
STXSIBX works the same way, except that it stores byte 7 of VSR[XS] into the byte in storage addressed by EA.
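The pseudocode for both stores can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the names `vsr_field`, `effective_address`, `stxsihx` and `stxsibx` are made up for illustration, registers are plain Python integers, memory is a `bytearray`, the halfword is written in big-endian byte order, and the MSR availability checks are left out.

```python
MASK64 = (1 << 64) - 1

def vsr_field(vsr_value, first_bit, width):
    """Extract `width` bits from a 128-bit value, starting at bit `first_bit`.

    Bits are numbered as in the ISA: bit 0 is the most significant bit.
    """
    shift = 128 - first_bit - width
    return (vsr_value >> shift) & ((1 << width) - 1)

def effective_address(gpr, ra, rb):
    """EA = ((RA == 0) ? 0 : GPR[RA]) + GPR[RB], wrapped to 64 bits."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + gpr[rb]) & MASK64

def stxsihx(mem, gpr, vsr, xs, ra, rb):
    """Store halfword element 3 (bits 48:63) of VSR[XS] at EA."""
    ea = effective_address(gpr, ra, rb)
    half = vsr_field(vsr[xs], 48, 16)
    # Assumption: big-endian byte order in storage.
    mem[ea:ea + 2] = half.to_bytes(2, "big")

def stxsibx(mem, gpr, vsr, xs, ra, rb):
    """Store byte element 7 (bits 56:63) of VSR[XS] at EA."""
    ea = effective_address(gpr, ra, rb)
    mem[ea] = vsr_field(vsr[xs], 56, 8)
```

With VSR[XS] = 0x00112233445566778899AABBCCDDEEFF, `stxsihx` writes the bytes 0x66 0x77 and `stxsibx` writes 0x77, since both elements sit at the low end of the upper doubleword.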
llvm PPC Register classes [Implementation based]
98 Register classes?
Class name   Spill (bits)
VSSRC        32
VSFRC        64
VSRC         128
VRRC         128
F4RC         32
F8RC         64
VSHRC        128
We have 64 VSRs (128-bit vector registers). The first 32 of them overlap with the FPRs (floating point registers, which occupy the most significant 64 bits of each).
The last 32 of them overlap with the VRs (128-bit Altivec/VMX registers).
- VSSRC (Vector-Scalar, Single-precision) is a scalar register class consisting of 64 registers that can each hold an f32
- VSFRC (Vector-Scalar, Floating-point) is a scalar register class consisting of 64 registers that can each hold an f64
- VSRC is a register class consisting of 64 registers that can each hold a vector. This is currently restricted to `v4i32, v4f32, v2i64, v2f64` since those are the only types that we have meaningful operations on in the ISA.
VSSRC and VSFRC model the upper 64 bits of VSR0-VSR63, i.e. of all 64 VSRs.
- VRRC is a register class consisting of 32 registers that each hold an Altivec (VMX) vector type; these are VSR32-VSR63
- F4RC is just the 32 floating point registers used for f32 values
So does F4RC model the upper 64 bits of VSR0-VSR31, used for f32 values?
- F8RC are the 32 floating point registers that each hold an f64
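The overlap described above can be written down as a small lookup. This is an illustrative sketch, not LLVM code: `vsr_alias` is a hypothetical helper, and it assumes the usual assembler-style names `f0`-`f31` (FPRs), `v0`-`v31` (VRs) and `vs0`-`vs63` (VSRs).

```python
def vsr_alias(reg):
    """Return (VSR index, portion of that VSR) for a register name.

    FPR i overlaps the upper 64 bits of VSR i.
    VR i is the same register as VSR 32 + i.
    """
    if reg.startswith("vs"):          # check "vs" before the shorter "v"
        index = int(reg[2:])
        if not 0 <= index < 64:
            raise ValueError("VSR index out of range: " + reg)
        return index, "all 128 bits"
    if reg.startswith("f"):
        index = int(reg[1:])
        if not 0 <= index < 32:
            raise ValueError("FPR index out of range: " + reg)
        return index, "upper 64 bits"
    if reg.startswith("v"):
        index = int(reg[1:])
        if not 0 <= index < 32:
            raise ValueError("VR index out of range: " + reg)
        return 32 + index, "all 128 bits"
    raise ValueError("unknown register name: " + reg)
```

For example, `vsr_alias("f5")` gives VSR 5 (upper 64 bits) and `vsr_alias("v3")` gives VSR 35.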
Then we have a bit of complexity when it comes to GPRC/G8RC. On the one hand, it's pretty simple: 32-bit and 64-bit integer registers respectively.
R0-R31 are the 32-bit ones and X0-X31 are the 64-bit ones. However, a number of instructions treat R0/X0 as special: a zero in that operand means the constant zero, not the contents of register zero.
For such instructions we use `GPRC_NOR0` and `G8RC_NOX0`. They contain a special register called `ZERO` and `ZERO8` respectively, which we mark as reserved. Then we scan the instructions that use these register classes to see whether they are fed by a register that contains zero. If so, we get rid of the instruction that defines that register and allocate `ZERO`/`ZERO8` instead,
because these instructions need the constant 0 rather than R0.
So we'll catch stuff like this:
```
li 4, 0
lfsx 2, 4, 3
```
We can freely get rid of the first instruction and put `ZERO` instead of 4 in the `lfsx`.
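The idea of that scan can be shown with a toy model. This is not how LLVM implements it; it is a sketch that assumes each virtual register is defined once, represents instructions as tuples, and uses a hypothetical `fold_zero` function and a hand-picked set of indexed loads whose RA operand reads as constant 0.

```python
# Toy instruction format (an assumption for this sketch):
#   ("li",   dest, immediate)
#   (opcode, dest, ra, rb)      for indexed loads
RA_ZERO_MEANS_ZERO = {"lfsx", "lfdx", "lwzx"}

def fold_zero(instrs):
    """Replace RA operands fed by `li reg, 0` with ZERO, then drop dead `li`s."""
    # Registers known to hold zero (each register is defined once here).
    zero_regs = {ins[1] for ins in instrs if ins[0] == "li" and ins[2] == 0}

    rewritten = []
    for ins in instrs:
        if ins[0] in RA_ZERO_MEANS_ZERO and ins[2] in zero_regs:
            ins = (ins[0], ins[1], "ZERO", ins[3])
        rewritten.append(ins)

    # Anything still read as a source operand must keep its definition.
    still_used = {op for ins in rewritten for op in ins[2:] if isinstance(op, str)}
    return [
        ins for ins in rewritten
        if not (ins[0] == "li" and ins[2] == 0 and ins[1] not in still_used)
    ]
```

Feeding it `[("li", "r4", 0), ("lfsx", "f2", "r4", "r3")]` leaves only the `lfsx`, with `ZERO` in the RA slot. If `r4` is also used as an RB operand somewhere, the `li` is kept, because RB has no special zero meaning.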
If you use the wrong register class, you could get a weird SDAG selection error (cannot map REGA to REGB, or something similar).
FPR(i) = upper VSR(i)
isCodeGenOnly means the instruction is for code generation only; it is not suitable for the assembler, the disassembler, etc.
Error for using wrong register class:
: error: In XXBRH: Type inference contradiction found, merging '{v4i32:v2i64:v4f32:v2f64}' into 'v8i16'
def XXBRH : XX2_XT6_XO5_XB6<60, 7, 475, "xxbrh", vsrc,
COPY_TO_REGCLASS must be used if you want to copy a value to a register class that doesn't support that type.
E.g. $A is v1i128, which is not supported by VSRC, so we use COPY_TO_REGCLASS to get it into VSRC. We also copy the result of XXBRQ
back to VRRC, because after executing XXBRQ the result is still v1i128 even though $A went through COPY_TO_REGCLASS,
and VRRC is the only register class that supports v1i128.
(v1i128 (COPY_TO_REGCLASS (XXBRQ (COPY_TO_REGCLASS $A, VSRC)), VRRC))>;
Wednesday, 24 May 2017
How to disable only the LLVM opt optimizations, while leaving the clang FE and llc optimizations in place?
`-Xclang -disable-llvm-passes`
This is the option that disables just the LLVM opt passes.
Tuesday, 23 May 2017
How to rename a group of files on Unix?
a) Put the file names into a .sh file, one file name per line, and make sure there are no empty lines.
b) In vi/vim, run :%s#\(.*\)#rename \1 prefix_\1_suffix#g
or b) :%s#\(.*\)#mv \1 prefix_\1_suffix#g (use this one if rename fails with the error: Bareword "some_word" not allowed while "strict subs" in use at (eval 1) line number.)
c) Edit the names to what you like. E.g. here I had to change *_119793 to *, so I first changed them all to *_1197938888, then searched for _1197938888 and deleted that suffix everywhere.
d) Then make the script executable and run it. (chmod 777 rename.sh)
$ cat ~/bin/rename.sh
mv bindings_119793 bindings
mv cmake_119793 cmake
mv CMakeCache.txt_119793 CMakeCache.txt
mv CMakeFiles_119793 CMakeFiles
mv CMakeLists.txt_119793 CMakeLists.txt
mv CODE_OWNERS.TXT_119793 CODE_OWNERS.TXT
mv configure_119793 configure
mv CPackConfig.cmake_119793 CPackConfig.cmake
mv CPackSourceConfig.cmake_119793 CPackSourceConfig.cmake
mv CREDITS.TXT_119793 CREDITS.TXT
mv docs_119793 docs
mv examples_119793 examples
mv include_119793 include
mv lib_119793 lib
mv LICENSE.TXT_119793 LICENSE.TXT
mv LLVMBuild.txt_119793 LLVMBuild.txt
mv llvm.spec.in_119793 llvm.spec.in
mv projects_119793 projects
mv README.txt_119793 README.txt
mv RELEASE_TESTERS.TXT_119793 RELEASE_TESTERS.TXT
mv resources_119793 resources
mv runtimes_119793 runtimes
mv temp_52476 temp_524768888
mv test_119793 test
mv tools_119793 tools
mv unittests_119793 unittests
mv utils_119793 utils
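The same list of `mv` lines can also be generated instead of edited by hand. The following is a sketch, not part of the original workflow: `strip_suffix_commands` is a hypothetical helper that only builds the command strings, and it leaves names that do not end in the given suffix untouched.

```python
import shlex

def strip_suffix_commands(names, suffix):
    """Build one `mv old new` line per name that ends with `suffix`."""
    commands = []
    for name in names:
        # Skip names without the suffix, and names that are only the suffix.
        if name.endswith(suffix) and len(name) > len(suffix):
            new_name = name[:-len(suffix)]
            commands.append(f"mv {shlex.quote(name)} {shlex.quote(new_name)}")
    return commands

if __name__ == "__main__":
    import os
    # Print the commands for the current directory; review before running them.
    for line in strip_suffix_commands(sorted(os.listdir(".")), "_119793"):
        print(line)
```

Printing the commands rather than executing them keeps the review step from the manual approach: redirect the output into rename.sh, check it, then run it.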
91. LLVM architecture (partial, to be extended)
IR --> SDAG Construction --> Type Legalization --> Operation Legalization --> Lowering --> ISel (Instruction Selection) --> MIR (Machine IR)
Thursday, 18 May 2017
BE vs LE data layout (& xxsldwi)
Say you have vector int a = {1,2,3,4} and vector int b = {5,6,7,8}.
The layout of a in memory on a BE machine is pretty simple, just as written,
because BE means the high word is saved at the low address (which is also the natural order for array access):
Addr0x 0 1 2 3
BE: [1,2,3,4]
On LE it is the opposite (the high word is saved at the high address):
Addr0x 0 1 2 3
LE: [4,3,2,1]
Say you have xxsldwi(a,b,3).
On BE it is just: 0 1 2 3 4 5 6 7
[1,2,3,4] [5,6,7,8]
The result is RT = {4,5,6,7}.
If you want to use vec_shuffle to implement xxsldwi, you have to pass in
shuffle_vector((vector int)a, (vector int)b, 3,4,5,6); you can simply imagine that BE
is accessed from left to right.
If you do
for (int i = 0; i < 4; i++)
{
    printf("RT[%d]=%d\n", i, RT[i]);
}
it is just:
RT[0]=4
RT[1]=5
RT[2]=6
RT[3]=7
But on LE, for the same arrays a and b,
it is just: 3 2 1 0 7 6 5 4
[4,3,2,1] [8,7,6,5]
The result is RT = [1,8,7,6] in the register.
So no matter what, xxsldwi(a,b,3) puts word[3] of A, followed by
word[0], word[1] and word[2] of B, into the result vector register.
If you want to use vec_shuffle to implement xxsldwi, you have to pass in
shuffle_vector((vector int)a, (vector int)b, 5,6,7,0); you can simply imagine that LE
is accessed from right to left.
If you do
for (int i = 0; i < 4; i++)
{
    printf("RT[%d]=%d\n", i, RT[i]);
}
it is just:
RT[0]=6
RT[1]=7
RT[2]=8
RT[3]=1
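Both cases can be checked with a small model. This is a sketch under the picture used in this post: a register is a list of four words with word 0 on the left, and on LE the array elements sit in the register in reverse order. The function names `xxsldwi`, `to_register`, `from_register` and `shuffle` are illustrative only and are not real intrinsics.

```python
def to_register(elements, little_endian):
    """Array elements -> register words (word 0 first)."""
    return list(reversed(elements)) if little_endian else list(elements)

def from_register(words, little_endian):
    """Register words -> array elements as seen by RT[i]."""
    return list(reversed(words)) if little_endian else list(words)

def xxsldwi(a_words, b_words, shw):
    """Concatenate A and B, shift left by `shw` words, keep four words."""
    return (a_words + b_words)[shw:shw + 4]

def shuffle(a, b, mask):
    """Element-indexed shuffle: indices 0-3 pick from a, 4-7 pick from b."""
    both = list(a) + list(b)
    return [both[i] for i in mask]

a = [1, 2, 3, 4]
b = [5, 6, 7, 8]

for little_endian, mask in ((False, [3, 4, 5, 6]), (True, [5, 6, 7, 0])):
    rt_words = xxsldwi(to_register(a, little_endian),
                       to_register(b, little_endian), 3)
    rt = from_register(rt_words, little_endian)
    # The shuffle mask from the text must give the same elements.
    assert rt == shuffle(a, b, mask)
    print("LE" if little_endian else "BE", "register:", rt_words, "RT:", rt)
```

The register contents come out as [4, 5, 6, 7] on BE and [1, 8, 7, 6] on LE, and reading the LE result element by element gives 6, 7, 8, 1, matching the loop output above.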
Saturday, 6 May 2017
XXPERMDI/xxpermdi (VSX Permute Doubleword Immediate ) and its extended mnemonics XX3-form
22. Extended Mnemonic for xxpermdi
Extended Mnemonic Equivalent To
xxspltd T,A,0 <=> xxpermdi T,A,A,0b00
xxspltd T,A,1 <=> xxpermdi T,A,A,0b11
xxmrghd T,A,B <=> xxpermdi T,A,B,0b00
xxmrgld T,A,B <=> xxpermdi T,A,B,0b11
xxswapd T,A <=> xxpermdi T,A,A,0b10
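The table can be sanity-checked with a tiny model of the instruction. In this sketch a register is a pair (dword 0, dword 1), the high bit of the 2-bit immediate selects the doubleword taken from A and the low bit selects the one taken from B. The Python function names mirror the mnemonics but are otherwise made up for illustration.

```python
def xxpermdi(a, b, dm):
    """Result dword 0 comes from A, result dword 1 comes from B."""
    return (a[(dm >> 1) & 1], b[dm & 1])

def xxspltd(a, index):
    """Splat doubleword `index` (0 or 1) of A into both halves."""
    return xxpermdi(a, a, 0b00 if index == 0 else 0b11)

def xxmrghd(a, b):
    """Merge the high (dword 0) halves of A and B."""
    return xxpermdi(a, b, 0b00)

def xxmrgld(a, b):
    """Merge the low (dword 1) halves of A and B."""
    return xxpermdi(a, b, 0b11)

def xxswapd(a):
    """Swap the two doublewords of A."""
    return xxpermdi(a, a, 0b10)
```

With A = ("a0", "a1") and B = ("b0", "b1"), `xxmrghd` yields ("a0", "b0"), `xxmrgld` yields ("a1", "b1") and `xxswapd` yields ("a1", "a0").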