Tuesday, 31 January 2017

subfc or sf (Subtract from Carrying) instruction


Purpose
Subtracts the contents of a general-purpose register from the contents of another general-purpose register and places the result in a third general-purpose register.
Syntax
Bits      Value
0 - 5     31
6 - 10    RT
11 - 15   RA
16 - 20   RB
21        OE
22 - 30   8
31        Rc
PowerPC®
subfc RT, RA, RB
subfc. RT, RA, RB
subfco RT, RA, RB
subfco. RT, RA, RB
POWER® family
sf RT, RA, RB
sf. RT, RA, RB
sfo RT, RA, RB
sfo. RT, RA, RB
Description
The subfc and sf instructions add the ones complement of the contents of general-purpose register (GPR) RA and 1 to the contents of GPR RB and store the result in the target GPR RT.
The subfc instruction has four syntax forms. Each syntax form has a different effect on Condition Register Field 0 and the Fixed-Point Exception Register.
The sf instruction has four syntax forms. Each syntax form has a different effect on Condition Register Field 0 and the Fixed-Point Exception Register.
Syntax Form   Overflow Exception (OE)   Fixed-Point Exception Register   Record Bit (Rc)   Condition Register Field 0
subfc         0                         CA                               0                 None
subfc.        0                         CA                               1                 LT,GT,EQ,SO
subfco        1                         SO,OV,CA                         0                 None
subfco.       1                         SO,OV,CA                         1                 LT,GT,EQ,SO
sf            0                         CA                               0                 None
sf.           0                         CA                               1                 LT,GT,EQ,SO
sfo           1                         SO,OV,CA                         0                 None
sfo.          1                         SO,OV,CA                         1                 LT,GT,EQ,SO
The four syntax forms of the subfc instruction, and the four syntax forms of the sf instruction, always affect the Carry bit (CA) in the Fixed-Point Exception Register. If the syntax form sets the Overflow Exception (OE) bit to 1, the instruction affects the Summary Overflow (SO) and Overflow (OV) bits in the Fixed-Point Exception Register. If the syntax form sets the Record (Rc) bit to 1, the instruction affects the Less Than (LT) zero, Greater Than (GT) zero, Equal To (EQ) zero, and Summary Overflow (SO) bits in Condition Register Field 0.
Parameters
Item   Description
RT     Specifies the target general-purpose register where the result of the operation is stored.
RA     Specifies the source general-purpose register for the operation.
RB     Specifies the source general-purpose register for the operation.
Examples
  1. The following code subtracts the contents of GPR 4 from the contents of GPR 10, stores the result in GPR 6, and sets the Carry bit to reflect the result of the operation:
    # Assume GPR 4 contains 0x8000 7000.
    # Assume GPR 10 contains 0x9000 3000.
    subfc 6,4,10
    # GPR 6 now contains 0x0FFF C000.
  2. The following code subtracts the contents of GPR 4 from the contents of GPR 10, stores the result in GPR 6, and sets Condition Register Field 0 and the Carry bit to reflect the result of the operation:
    # Assume GPR 4 contains 0x0000 4500.
    # Assume GPR 10 contains 0x8000 7000.
    subfc. 6,4,10
    # GPR 6 now contains 0x8000 2B00.
  3. The following code subtracts the contents of GPR 4 from the contents of GPR 10, stores the result in GPR 6, and sets the Summary Overflow, Overflow, and Carry bits in the Fixed-Point Exception Register to reflect the result of the operation:
    # Assume GPR 4 contains 0x8000 0000.
    # Assume GPR 10 contains 0x0000 4500.
    subfco 6,4,10
    # GPR 6 now contains 0x8000 4500.
  4. The following code subtracts the contents of GPR 4 from the contents of GPR 10, stores the result in GPR 6, and sets the Summary Overflow, Overflow, and Carry bits in the Fixed-Point Exception Register and Condition Register Field 0 to reflect the result of the operation:
    # Assume GPR 4 contains 0x8000 0000.
    # Assume GPR 10 contains 0x0000 7000.
    subfco. 6,4,10
    # GPR 6 now contains 0x8000 7000.
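
The "ones complement of RA, plus 1, plus RB" rule from the Description can be replayed in C. The following is only a sketch: it assumes 32-bit registers (as in the examples above), and the names subfc32, subfc_result, and check_example_1 are made up for illustration. The sum is done in 64 bits so that the carry out of the 32-bit result, which is what ends up in CA, stays visible.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative result type: the value written to RT plus the CA bit. */
struct subfc_result {
    uint32_t rt;  /* value stored in the target register */
    int ca;       /* carry out of the most significant bit */
};

/* Emulates subfc RT,RA,RB for 32-bit registers: RT = ~RA + RB + 1. */
struct subfc_result subfc32(uint32_t ra, uint32_t rb)
{
    /* Widen to 64 bits so the carry out of bit 31 is not lost. */
    uint64_t sum = (uint64_t)(uint32_t)~ra + (uint64_t)rb + 1u;
    struct subfc_result r;

    r.rt = (uint32_t)sum;     /* low 32 bits are the difference RB - RA */
    r.ca = (int)(sum >> 32);  /* 1 when no borrow occurred */
    return r;
}

/* Replays Example 1: GPR 4 = 0x8000 7000, GPR 10 = 0x9000 3000. */
void check_example_1(void)
{
    struct subfc_result r = subfc32(0x80007000u, 0x90003000u);

    assert(r.rt == 0x0FFFC000u);
    assert(r.ca == 1);  /* RB >= RA as unsigned values, so no borrow */
}
```

With the operands of Example 3 (RA = 0x8000 0000, RB = 0x0000 4500) the same function yields 0x8000 4500 with CA = 0, because the unsigned subtraction borrows.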

subf : Subtract From


The Subtract From instructions subtract the second operand (RA) from the third (RB). Extended mnemonics are provided
that use the more “normal” order, in which the third operand is subtracted from the second. Both these mnemonics
can be coded with a final “o” and/or “.” to cause the OE and/or Rc bit to be set in the underlying instruction.
sub Rx,Ry,Rz (equivalent to: subf Rx,Rz,Ry)
subc Rx,Ry,Rz (equivalent to: subfc Rx,Rz,Ry)

Monday, 30 January 2017

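Using GNU C getopt to parse arguments at non-fixed positions

Several global variables are involved when getopt parses arguments:
optind: int; the index of the next argv element to be parsed. Initially 1.
optarg: char *; points to the argument of an option that requires one. For example, with -nxzz, optarg points to the string "xzz".
opterr: int; if set to 0, getopt does not print error messages.

Function prototype:  int getopt(int argc, char * const argv[], const char *optstring);

Parameters: the first two are the same as the parameters of main: argc holds the number of arguments and argv[] holds the argument array. The third parameter, optstring, is a string made up of the option characters you define. For example, "abc" means the command has three options: -a, -b, and -c. A colon after an option character means the option requires an argument, which is stored in optarg. For example, "n:t" means option n takes an argument and option t does not, as in -n xzz -t or -nxzz -t. Two colons mean the option takes an optional argument (a GNU extension): if the argument is attached to the option, as in -nxzz, it is stored in optarg; otherwise optarg is set to NULL.

Return value: if the argument currently being processed is an option and its option character appears in optstring (that is, it is an option you defined), getopt returns that option character. If the option character is not one you defined, getopt returns '?'. In both cases it updates the global variable optind to point to the next argument in the argv array. If the current argument is not an option, GNU getopt skips ahead to the next option and handles it as described above (by default it permutes argv so that the skipped non-option arguments end up at the end). When no more options can be found, parsing is finished and getopt returns -1.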



Examples:

$ cat getopt.c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main (int argc, char **argv)
{
  int aflag = 0;
  int bflag = 0;
  char *cvalue = NULL;
  int index;
  int c;

  opterr = 0;

  while ((c = getopt (argc, argv, "abc:")) != -1)
    switch (c)
      {
      case 'a':
        aflag = 1;
        break;
      case 'b':
        bflag = 1;
        break;
      case 'c':
        cvalue = optarg;
        break;
      case '?':
        if (optopt == 'c')
          fprintf (stderr, "Option -%c requires an argument.\n", optopt);
        else if (isprint (optopt))
          fprintf (stderr, "Unknown option `-%c'.\n", optopt);
        else
          fprintf (stderr,
                   "Unknown option character `\\x%x'.\n",
                   optopt);
        return 1;
      default:
        abort ();
      }

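  /* cvalue stays NULL when -c is not given, so never pass it to %s directly */
  printf ("aflag = %d, bflag = %d, cvalue = %s\n",
          aflag, bflag, cvalue ? cvalue : "(null)");

  for (index = optind; index < argc; index++)
    printf ("Non-option argument %s\n", argv[index]);
  return 0;
}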

$ gcc -o opttest getopt.c
$ ./opttest -a -b -c
Option -c requires an argument.
$ ./opttest -a -b -c:foo
aflag = 1, bflag = 1, cvalue = :foo
$ ./opttest -a -b: -c:foo
Unknown option `-:'.


Sunday, 29 January 2017

A quick introduction to simple functions


The PowerPC ABI is fairly complex, and will be covered in much greater detail in the next article. However, for functions which do not themselves call any functions and follow a few easy rules, the PowerPC ABI provides a greatly simplified function-call mechanism.
In order to qualify for the simplified ABI, your function must obey the following rules:
  • It must not call any other function.
  • It may only modify registers 3 through 12.
  • It may only modify condition register fields cr0, cr1, cr5, cr6, and cr7.
  • It must not alter the link register, unless it restores it before executing blr to return.
When functions are called, parameters are passed in registers, starting with register 3 and going through register 10, depending on the number of parameters. When the function returns, the return value must be stored in register 3.
So let's rewrite our maximum value program as a function, and call it from C.
The parameters we should pass are the pointer to the array as the first parameter (register 3), and the size of the array as the second parameter (register 4). Then, the maximum value will be placed into register 3 for the return value.
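
Seen from the C side, that contract looks roughly like the sketch below. The assembly version is not shown in this post, so the function name find_max, the long element type, and the sample caller demo_call are all assumptions made for illustration. The C body is only a reference for what the assembly function would compute; like the assembly version, it is a leaf function that calls nothing else.

```c
#include <assert.h>
#include <stddef.h>

/* Reference version of the function: first parameter (register 3) is the
 * array pointer, second parameter (register 4) is the element count, and
 * the result is returned in register 3. count must be at least 1. */
long find_max(const long *values, size_t count)
{
    long max = values[0];

    for (size_t i = 1; i < count; i++) {
        if (values[i] > max)
            max = values[i];
    }
    return max;
}

/* A C caller: the compiler sets up registers 3 and 4 before the call and
 * reads the result back from register 3 afterwards. */
long demo_call(void)
{
    static const long sample[] = { 3, 9, -4, 7 };
    long m = find_max(sample, sizeof sample / sizeof sample[0]);

    assert(m == 9);  /* sanity check on the sample data */
    return m;
}
```

If the function were written in assembly instead, only the declaration long find_max(const long *values, size_t count); would remain on the C side, and the caller would stay unchanged.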

Saturday, 28 January 2017

shift command

shift is a bash built-in that removes arguments from the beginning of the argument list. If the script was given three arguments, available as $1, $2, and $3, then a call to shift makes the old $2 the new $1 (and the old $3 the new $2). A call to shift 2 shifts by two, making the old $3 the new $1.

Thursday, 26 January 2017

What is OPerf and Perf?

97. What is Operf?
  operf - Performance profiler tool for Linux
   Operf is the profiler tool provided with OProfile. Operf uses the
     Linux Performance Events Subsystem and, thus, does not require the
     obsolete oprofile kernel driver.

     By default, operf uses <current_dir>/oprofile_data as the session-dir
     and stores profiling data there.  You can change this by way of the
     --session-dir option. The usual post-profiling analysis tools such as
     opreport(1) and opannotate(1) can be used to generate profile
     reports. Unless a session-dir is specified, the post-processing
     analysis tools will search for samples in <current_dir>/oprofile_data
     first. If that directory does not exist, the post-processing tools
     use the standard session-dir of /var/lib/oprofile.

     Statistics, such as total samples received and lost samples, are
     written to the operf.log file that can be found in the
        <session_dir>/samples directory

       
98. What is Perf?
Perf is a profiler tool for Linux 2.6+ based systems that abstracts away
CPU hardware differences in Linux performance measurements and presents
a simple command-line interface. Perf is based on the perf_events interface
exported by recent versions of the Linux kernel.
An example of how to use Perf on Linux (on the machine genoa here):
perf stat -d ./executable_name
perf stat -d ./gcc-ld-global-var-benchmark

Monday, 23 January 2017

Global Variable access using GOT (Global Offset Table)



The ".bss" section contains the uninitialized global data, while the ".data" section contains all the initialized data.
In general, the data segment of the executable contains initialized global/static variables and the BSS segment contains uninitialized global/static variables.

Figure 1. Memory access via the GOT


The GOT is private to each process, and the process must have write permissions to it. Conversely the library code is shared and the process should have only read and execute permissions on the code; it would be a serious security breach if the process could modify code.

------------------------------------------------------------------------------------------------
jtony@genoa:~/scrum/s7$ cat got.c
extern int i;
void test(void)
{
  i = 100;
}
------------------------------------------------------------------------------------------------

Above we create a simple shared library which refers to an external symbol. We do not know the address of this symbol at compile time, so we leave it for the dynamic linker to fix up at runtime. But we want our code to remain sharable, in case other processes want to use our code as well.
------------------------------------------------------------------------------------------------
$ clang -nostdlib  -shared -o got.so ./got.c
$  readelf --sections ./got.so
 Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .gnu.hash         GNU_HASH         0000000000000158  00000158
       0000000000000034  0000000000000000   A       2     0     8
  [ 9] .got              PROGBITS         0000000000020000  00010000
       0000000000000010  0000000000000008  WA       0     0     8
  [10] .comment          PROGBITS         0000000000000000  00010010
       00000000000000a5  0000000000000001  MS       0     0     1
------------------------------------------------------------------------------------------------

If we look at the readelf output above, we can see that the .got section starts 0x20000 bytes past where the library is loaded into memory. Thus, if the library were loaded at address 0x6000000000000000, the .got would be at 0x6000000000020000. The disassembly shows how the code uses the .got.

-----------------------------------
objdump --disassemble ./got.so 
Disassembly of section .text:
0000000000000290 <test>:
 290:   02 00 4c 3c     addis   r2,r12,2
 294:   70 7d 42 38     addi    r2,r2,32112
 298:   64 00 60 38     li      r3,100
 29c:   00 00 00 60     nop
 2a0:   08 80 82 e8     ld      r4,-32760(r2)
 2a4:   00 00 64 90     stw     r3,0(r4)
 2a8:   20 00 80 4e     blr
        ...

On ppc64le, register r2 is known as the TOC pointer and always points to the TOC base address.
The symbol .TOC. may be used to access the GOT or, in TOC-relative addressing, other data constructs. The symbol may be offset by 0x8000 bytes, or another offset, from the start of the .got section. Here the offset is 0x8000: under the ELFv2 ABI, r12 holds the address of test when it is entered through its global entry point, that is, load base + 0x290. The addis/addi pair at 0x290 and 0x294 therefore sets r2 = load base + 0x290 + 0x20000 + 0x7d70 = load base + 0x28000, which is .got + 0x8000 (32768). The ld at 0x2a0 then loads r4 from .got + (32768 - 32760), i.e. from the doubleword at .got + 8.
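
The address arithmetic in the disassembly can be checked with a short C sketch. It assumes the ELFv2 convention that r12 holds the address of test on entry, and it reuses the example load address 0x6000000000000000 from above. The function names toc_pointer, got_slot_address, and check_listing are made up for illustration; the constants come from the readelf and objdump listings.

```c
#include <assert.h>
#include <stdint.h>

#define TEST_ENTRY_OFFSET 0x290u    /* offset of <test> in got.so */
#define GOT_OFFSET        0x20000u  /* offset of .got in got.so */

/* addis r2,r12,2 followed by addi r2,r2,32112 */
uint64_t toc_pointer(uint64_t r12)
{
    uint64_t r2 = r12 + ((uint64_t)2 << 16);  /* addis adds 2 * 0x10000 */

    r2 += 32112;                              /* addi adds 0x7d70 */
    return r2;
}

/* Effective address used by ld r4,-32760(r2) */
uint64_t got_slot_address(uint64_t r2)
{
    return r2 - 32760;                        /* 32760 is 0x7ff8 */
}

/* Checks the offsets against the listings for one example load address. */
void check_listing(void)
{
    uint64_t base = 0x6000000000000000u;
    uint64_t r2 = toc_pointer(base + TEST_ENTRY_OFFSET);

    assert(r2 == base + GOT_OFFSET + 0x8000u);              /* .got + 0x8000 */
    assert(got_slot_address(r2) == base + GOT_OFFSET + 8);  /* .got + 8 */
}
```

Both checks hold for any load address, since 0x290 + 0x7d70 is exactly 0x8000.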

Table 4:  Relocation
-----------------------------------
jtony@genoa:~/scrum/s7$ readelf --relocs ./got.so
Relocation section '.rela.dyn' at offset 0x270 contains 1 entries:
000000020008  000300000026 R_PPC64_ADDR64    0000000000000000 i + 0
-----------------------------------


The relocation says "replace the value at offset 0x20008 with the memory location that symbol i is stored at".
We know from the previous output that the .got starts at offset 0x20000. We have also seen that the code loads from an address 0x8 past this (using ld r4,-32760(r2)), giving an address of 0x20000 + 0x8 = 0x20008, which is exactly the address the relocation applies to.
So before the program begins, the dynamic linker will have fixed up the relocation to ensure that the value of the memory at offset 0x20008 is the address of the global variable i!
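
As a toy model of that fix-up, the sketch below treats the .got as an array of two pointer slots (the section is 0x10 bytes, assuming 8-byte pointers) and lets an ordinary function play the role of the dynamic linker. The names shared_i, got, apply_relocation, and test_via_got are hypothetical; shared_i stands in for the external variable i.

```c
#include <assert.h>
#include <stddef.h>

int shared_i;  /* stands in for the external symbol i */
int *got[2];   /* toy .got: two slots, all empty before relocation */

/* The dynamic linker's job for R_PPC64_ADDR64 at .got + 8: store the
 * run-time address of the symbol in the second slot. */
void apply_relocation(void)
{
    got[1] = &shared_i;
}

/* What test() does: ld r4,-32760(r2); li r3,100; stw r3,0(r4). */
void test_via_got(void)
{
    int *r4 = got[1];  /* load the address of the variable from the GOT */

    assert(r4 != NULL);  /* the relocation must have been applied first */
    *r4 = 100;           /* store through the loaded address */
}
```

Calling apply_relocation() and then test_via_got() leaves 100 in shared_i. The code itself never contains the address of the variable, which is why it can stay read-only and shared between processes.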