Wednesday, November 7, 2012

weblog


The Problem:

When the LLVM code generator's optimization level is set to llvm::CodeGenOpt::None, some of the generated code segfaults when executed on an ARM host.

Reduced Test Case:

Found in the 483.xalanc benchmark:

    movd   %edi,%xmm1
    pshufd $0x0,%xmm1,%xmm0
    mov    0x24(%esp),%ebx
    lea    (%ebx,%ecx,4),%ecx
    mov    %ecx,0x14(%esp)
    xor    %ecx,%ecx
    mov    0x14(%esp),%ebx
    movdqa %xmm0,(%ebx)
    add    $0x1,%ecx
    add    $0x10,%ebx
    cmp    %ebp,%ecx
    jb     _end

Reason:

The generated ARM code contains the instruction:
vld1.64 {d0-d1}, [sp, :128]
which requires $sp to be 16-byte (128-bit) aligned.
However, $sp is not 16-byte aligned at that point.
Interestingly, executing this instruction does not raise an exception. Instead, the value of $sp changes, so every subsequent instruction that accesses the stack causes a segfault.
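To make the alignment requirement concrete, here is a minimal sketch of the check that the `:128` qualifier implies. The function name and the sample addresses are hypothetical and chosen only for illustration; they are not taken from the actual crash. The sample shows a stack pointer that is 8-byte aligned but not 16-byte aligned, which is the kind of value that would trip the instruction above.

```python
def is_aligned(addr: int, alignment: int) -> bool:
    """Return True if addr is a multiple of alignment.

    alignment must be a positive power of two, so the test reduces to
    checking that the low log2(alignment) bits of addr are all zero.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (addr & (alignment - 1)) == 0


# Hypothetical stack pointer values, for illustration only.
SP_OK = 0xBEFFE7A0   # low four bits are zero: 16-byte aligned
SP_BAD = 0xBEFFE7A8  # low four bits are 0x8: only 8-byte aligned
```

With these values, `is_aligned(SP_BAD, 8)` holds while `is_aligned(SP_BAD, 16)` does not, so a `[sp, :128]` access would be invalid for `SP_BAD`.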

Solution:

Make sure that $sp is at least 32-byte aligned in the prologue, which also satisfies the 16-byte requirement of vld1.64.

Prologue code generated by TCG ARM:

Before:

---------------------------------------------------------
push    {r4, r5, r6, r8, r9, r10, r11, lr}
sub     sp, sp, #128  ; 0x80 # reserve some space
bx      r0 # go to code cache
pop     {r4, r5, r6, r8, r9, r10, r11, pc}
---------------------------------------------------------

After:

---------------------------------------------------------
push    {r4, r5, r6, r8, r9, r10, r11, lr}
str     sp, [r7, xxx] # save the original stack pointer
sub     sp, sp, #65536  ; 0x10000 # reserve some space
bic     sp, sp, #0x1f # align down to a 32-byte boundary
bx      r0 # go to code cache
ldr     sp, [r7, xxx] # restore the original stack pointer
pop     {r4, r5, r6, r8, r9, r10, r11, pc}
CP1
======================================================
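The arithmetic of the fixed prologue can be sketched as follows. This is only a model of the reserve-then-align steps under stated assumptions: the function names and the entry value of the stack pointer are hypothetical, and the constants 0x10000 and 0x1f are the ones from the listing above.

```python
def align_down(addr: int, alignment: int) -> int:
    """Clear the low bits of addr, like `bic sp, sp, #(alignment - 1)`."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return addr & ~(alignment - 1)


def aligned_prologue_sp(sp: int, reserve: int = 0x10000, alignment: int = 32):
    """Model the stack pointer updates of the fixed prologue.

    Returns (saved_sp, new_sp): saved_sp is the value stored before the
    adjustment, new_sp is the value used while running the code cache.
    """
    saved_sp = sp                    # save the original stack pointer
    sp -= reserve                    # sub sp, sp, #0x10000
    sp = align_down(sp, alignment)   # bic sp, sp, #0x1f
    return saved_sp, sp


# Hypothetical entry value, for illustration only.
saved, new = aligned_prologue_sp(0xBEFFE7A8)
```

Aligning down moves $sp by an amount between 0 and 31 bytes that depends on its value at entry, so the epilogue cannot undo the adjustment with a fixed `add`. That is presumably why the fixed prologue saves the original $sp and reloads it before the final `pop`.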








