Tuesday, October 30, 2012

x86-to-ARM LnQ Status

Block mode, test inputs

400.perlbench                               NR
401.bzip2          --   180              -- S
403.gcc            --   466              -- S
429.mcf            --    53.7            -- S
445.gobmk          --   874              -- S
456.hmmer                                   NR                              
458.sjeng          --   129              -- S
462.libquantum     --    15.7            -- S                            
464.h264ref        --     0.0248            RE                        
471.omnetpp        --    62.1               RE                        
473.astar          --   126              -- S                            
483.xalancbmk      --   218              -- S
------------------------------------------------------------------------------------
Summary:
  • 400.perlbench and 456.hmmer cannot run due to floating-point precision errors
  • 464.h264ref: runtime error (RE)
  • 471.omnetpp: runtime error (RE)
================================================================
Trace Mode, test inputs
400.perlbench                               NR
401.bzip2          --   140              -- S                               
403.gcc            --   335                 RE                              
429.mcf            --    53.5            -- S                               
445.gobmk          --   533                 RE
456.hmmer                                   NR                             
458.sjeng          --   106              -- S                           
462.libquantum     --    14.3            -- S                            
464.h264ref        --     0.0257            RE                             
471.omnetpp        --    50.3               RE
473.astar          --   104              -- S
483.xalancbmk      --   169              -- S
------------------------------------------------------------------------------------
Summary:
  • Errors inherited from block mode
  • 403.gcc: new runtime error (RE) in trace mode
  • 445.gobmk: new runtime error (RE) in trace mode
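
To put the two tables side by side, the sketch below computes the block-mode to trace-mode time ratio for the benchmarks that report S in both runs. The numbers are copied from the tables above; reading them as run times in the same unit, and the helper name `speedup`, are assumptions made for illustration only.

```python
# Times copied from the block-mode and trace-mode tables above.
# Assumption: both columns are run times in the same unit.
block = {"401.bzip2": 180, "429.mcf": 53.7, "458.sjeng": 129,
         "462.libquantum": 15.7, "473.astar": 126, "483.xalancbmk": 218}
trace = {"401.bzip2": 140, "429.mcf": 53.5, "458.sjeng": 106,
         "462.libquantum": 14.3, "473.astar": 104, "483.xalancbmk": 169}

def speedup(name):
    """Block-mode time divided by trace-mode time (>1 means trace is faster)."""
    return block[name] / trace[name]

for name in sorted(block):
    print(f"{name:<16} {speedup(name):.2f}x")
```

403.gcc and 445.gobmk are left out on purpose: they end with RE in trace mode, so their trace-mode times are not comparable.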
===============================================================
Guest applications are recompiled with gcc 4.7 to make sure they really use SSE instructions
  • Perlbench is still stuck in arith.t due to the precision problem
  • bzip OK
  • gcc: KILLILL, illegal instruction due to incorrect encoding of VST1LNd32
    • fix alignment encoding for these instructions in getAddrMode6AddressOpValue() in ARMCodeEmitter.cpp
  • Re-test all but perlbench and hmmer
    • Hangs on leslie3d; try to find out why...
      • Trapped in an infinite loop; maybe due to a precision error?
      • Skip it.
    • Omnetpp: fails to run; try to find out why...
      • QEMU cannot run it either.
    • Bwaves: mis-compare; does QEMU get the same result?

    • Gamess: mis-compare; does QEMU get the same result?
      • NOTE: unlike bwaves, there are minus signs where they should not appear
    • Milc: mis-compare;
    • Zeus: segfault.
      • Reason: On ARM Linux, shared libraries such as ld-2.13.so and libpthread-2.13.so are loaded starting at 0x40000000, and the image of the x86 guest starts at 0x08048000. Zeus asks for 0x45efa000 bytes of memory for its image, which cannot fit in the ``hole'' between 0x08048000 and 0x40000000.
      • So I moved the QEMU image to 0x90000000.
      • The guest image is then finally placed between 0x40000000 and 0x90000000.
      • However, during execution the guest asks for more memory, and eventually things go wrong...
    • Gromacs: mis-compare
    • Leslie3d: mis-compare
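
The address-space arithmetic behind the Zeus segfault can be checked with a few lines. This is a minimal sketch using only the addresses quoted in the Zeus notes above; the constant names are made up for illustration, and the sizes are assumed to be in bytes.

```python
# Addresses taken from the Zeus notes above; names are illustrative.
GUEST_BASE = 0x08048000   # start of the x86 guest image
LIBS_BASE = 0x40000000    # where ARM Linux starts loading shared libraries
QEMU_BASE = 0x90000000    # new location of the QEMU image
ZEUS_SIZE = 0x45efa000    # memory Zeus asks for (assumed to be bytes)

hole = LIBS_BASE - GUEST_BASE      # space below the shared libraries
window = QEMU_BASE - LIBS_BASE     # upper bound on space after moving QEMU
headroom = window - ZEUS_SIZE      # best case left over for later requests

print(f"hole     = {hole:#x}  fits Zeus: {ZEUS_SIZE <= hole}")
print(f"window   = {window:#x}  fits Zeus: {ZEUS_SIZE <= window}")
print(f"headroom = {headroom:#x}")
```

The hole is 0x37fb8000 bytes, which is smaller than the 0x45efa000 that Zeus needs. The window between 0x40000000 and 0x90000000 is 0x50000000 bytes, leaving at most 0x0a106000 bytes of headroom. The shared libraries sit in that same window, so the real headroom is smaller still, which would be consistent with later guest allocations failing.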
--------------------------------------------------------------------------------------
Summary:
  • QEMU FP precision problem
    • Perlbench and hmmer get trapped in an infinite loop due to the FP precision problem
    • Omnetpp fails to run
    • Only cactusADM and namd succeed among the floating-point benchmarks...
  • 9 CINT2006 benchmarks run successfully.
  • Measure timing...
400.perlbench                               NR
401.bzip2          --      163           -- S
403.gcc            --      507           -- S
429.mcf            --       56.1         -- S
445.gobmk          --      955           -- S
456.hmmer                                   NR
458.sjeng          --      126           -- S
462.libquantum     --       16.3         -- S
464.h264ref        --      387              VE
471.omnetpp                                 NR
473.astar          --      114           -- S
483.xalancbmk      --      230           -- S
======================================================================
Trace:

401.bzip2: NOT OK

  • Infinite loop


403.gcc: mis-compare
429.mcf: OK
445.gobmk: OK
458.sjeng: OK
462.libquantum: OK
473.astar: OK

464.h264ref: NOT OK

*** longjmp causes uninitialized stack frame ***: /home/tk/lnq/install/bin/qemu-i386 terminated
Aborted (core dumped)
483.xalancbmk: NOT OK
  • Terminates without printing anything
400.perlbench                               NR
401.bzip2                                   NR
403.gcc            --      460              VE
429.mcf            --       56.5         -- S
445.gobmk          --      874           -- S
456.hmmer                                   NR
458.sjeng          --      112           -- S
462.libquantum     --       14.7         -- S
464.h264ref                                 NR
471.omnetpp                                 NR
473.astar          --       98.4         -- S
483.xalancbmk                               NR

===============================================================
Debug Trace Mode:
===============================================================
First, try to find out whose fault it is, and always tackle the easiest bug first.
===============================================================
GCC: Mis-compare

  •  When in trace mode, the blocks are compiled with IFastEnable and Opt::None options. So, check whether this error comes from fast instruction selection mode.
    • Set optimization options to ``Default'' and disable IFastEnable.
      • This error is gone.
      • Confirmed!
  • Now debugging becomes simple: run $l gcc twice, once with FastISel enabled and once with it disabled, and compare the logs to locate where it went wrong.
    • FAIL!
  • Another approach:
    • EnableFastISel does not affect correctness.
    • The LLVM optimization level does!
      • When opt is set to None, gcc segfaults!
      • When opt is set to Less, gcc runs successfully.
  • Run block with None, and compare the MIs used.
    • Run experiment!
      • Nothing found!
  • Still don't know why llvm::CodeGenOpt::None causes the fault; try to find a reduced example.
    • Run CINT2006 with llvm::CodeGenOpt::None
      • Wait for result...
      • Error: 1x401.bzip2 1x403.gcc 1x445.gobmk 1x464.h264ref 1x483.xalancbmk
      • Success: 1x429.mcf 1x458.sjeng 1x462.libquantum 1x473.astar 1x999.specrand
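
The "run twice and compare logs" step above amounts to finding the first line where two logs diverge. The sketch below shows one way to do that; the function name `first_divergence` and the sample log lines are hypothetical and do not reflect the actual LnQ log format.

```python
from itertools import zip_longest

def first_divergence(log_a, log_b):
    """Return (line_number, line_a, line_b) for the first differing line, or None."""
    for n, (a, b) in enumerate(zip_longest(log_a, log_b), start=1):
        if a != b:
            return n, a, b
    return None

# Hypothetical log contents, only to show the call.
with_fastisel = ["block 0x1000", "block 0x1040", "block 0x10a0"]
without_fastisel = ["block 0x1000", "block 0x1040", "block 0x10c8"]
print(first_divergence(with_fastisel, without_fastisel))
```

Any iterable of lines works, so two open log files can be passed in directly. If the two runs differ in harmless ways (for example host addresses), the lines may need to be normalized first, otherwise the first reported difference will not be the interesting one.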













