Tuesday, July 24, 2012

Work Flow


  • opt0: rearrange assembly instructions
  • opt1: sink all miss blocks
  • opt2: move redundant dirty stores to slow path
  • opt3: victim TLB cache
    • could have variations
  • opt4: cross-page block linking
  • opt5: indirect branch target caching
  • opt6: enlarge TLB table size
  • opt7: TLB mini buffer to reduce fast-path cycles; probably failed
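
The victim TLB idea (opt3) can be sketched roughly as follows. This is only a minimal standalone model, not QEMU code: the names (Tlb, tlb_lookup, tlb_fill), the table sizes, and the round-robin replacement are all assumptions made for illustration. An entry evicted from the direct-mapped table is parked in a small fully associative buffer, and a later conflict miss swaps it back instead of taking the full slow path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS   12          /* assumed 4 KiB pages */
#define TLB_SIZE    256         /* assumed size of the direct-mapped table */
#define VTLB_SIZE   8           /* assumed size of the victim buffer */
#define INVALID_TAG UINT64_MAX  /* can never equal a real page number */

typedef struct {
    uint64_t vpage;   /* virtual page number, INVALID_TAG when empty */
    uint64_t addend;  /* hypothetical payload (e.g. guest-to-host offset) */
} TlbEntry;

typedef struct {
    TlbEntry primary[TLB_SIZE];   /* direct-mapped, checked on the fast path */
    TlbEntry victim[VTLB_SIZE];   /* fully associative, checked on a miss */
    unsigned next_victim;         /* round-robin replacement cursor */
} Tlb;

enum { TLB_HIT = 0, TLB_VICTIM_HIT = 1, TLB_MISS = 2 };

void tlb_reset(Tlb *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < TLB_SIZE; i++) {
        t->primary[i].vpage = INVALID_TAG;
        t->primary[i].addend = 0;
    }
    for (size_t i = 0; i < VTLB_SIZE; i++) {
        t->victim[i].vpage = INVALID_TAG;
        t->victim[i].addend = 0;
    }
    t->next_victim = 0;
}

/* Install a translation; the entry it displaces goes to the victim buffer. */
void tlb_fill(Tlb *t, uint64_t vaddr, uint64_t addend)
{
    uint64_t vpage = vaddr >> PAGE_BITS;
    TlbEntry *slot = &t->primary[vpage % TLB_SIZE];

    if (slot->vpage != INVALID_TAG) {
        t->victim[t->next_victim] = *slot;
        t->next_victim = (t->next_victim + 1) % VTLB_SIZE;
    }
    slot->vpage = vpage;
    slot->addend = addend;
}

/* Look up vaddr; on a victim hit, swap the entry back into the primary table. */
int tlb_lookup(Tlb *t, uint64_t vaddr, uint64_t *addend)
{
    uint64_t vpage = vaddr >> PAGE_BITS;
    TlbEntry *slot = &t->primary[vpage % TLB_SIZE];

    if (slot->vpage == vpage) {
        *addend = slot->addend;
        return TLB_HIT;
    }
    for (size_t i = 0; i < VTLB_SIZE; i++) {
        if (t->victim[i].vpage == vpage) {
            TlbEntry tmp = *slot;
            *slot = t->victim[i];
            t->victim[i] = tmp;
            *addend = slot->addend;
            return TLB_VICTIM_HIT;
        }
    }
    return TLB_MISS;  /* caller walks the page table, then calls tlb_fill */
}
```

In this model two pages that map to the same primary slot keep swapping through the victim buffer rather than refilling on every access. The "variations" could be things like the buffer size or the replacement policy; those choices are guesses here, not something the notes specify.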
=============================================================
  • performance reduction?
    • baseline performance?
  • code_read TLB access defined in exec-all.h
    • quick thought as follows
    • split code access and data access, similar to i-Cache and d-Cache
    • Not worth it, too much work, too little gain.
  • move redundant stores to miss block:
    • restore the clobber flag for qemu_ld and qemu_st
    • so before qemu_ld/st, dirty state will now be stored back to memory,
    • and we only need to store what is NOT redundant.
    • This is true for globals; what about temporaries?
      • About temporaries: we only need to store temporaries in the miss block.
    • We store temporary variables in the miss block.
    • So we only need to consider global variables.
  • DO IT!
  • Optimization 2: Boot Test:
    • opt2: OK
    • opt2+opt4: OK
    • opt1+opt2+opt4: OK
    • opt1+opt2+opt3+opt4: OK, NOT SO SURE...
    • opt1+opt2+opt3+opt4+opt5: NOT OK
    • opt3: OK
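
The "move redundant stores to miss block" idea above can be illustrated with a toy runtime model. This is not TCG code: GlobalState, write_global, and the mem_access_* functions are hypothetical names, and the model only counts write-backs. It assumes that globals live in host registers with a dirty bit, that the baseline writes dirty globals back before every qemu_ld/st, and that opt2 does so only when the TLB lookup misses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NB_GLOBALS 4   /* assumed number of global variables */

typedef struct {
    uint32_t mem[NB_GLOBALS];    /* CPU state as seen in memory */
    uint32_t reg[NB_GLOBALS];    /* values cached in host registers */
    bool     dirty[NB_GLOBALS];  /* register copy is newer than memory */
    unsigned stores;             /* number of write-backs executed */
} GlobalState;

/* Update a global in its register and mark it dirty. */
void write_global(GlobalState *s, int idx, uint32_t val)
{
    assert(idx >= 0 && idx < NB_GLOBALS);
    s->reg[idx] = val;
    s->dirty[idx] = true;
}

/* Write every dirty global back to memory. */
void sync_globals(GlobalState *s)
{
    for (int i = 0; i < NB_GLOBALS; i++) {
        if (s->dirty[i]) {
            s->mem[i] = s->reg[i];
            s->dirty[i] = false;
            s->stores++;
        }
    }
}

/* Baseline: dirty globals are stored before every memory access. */
void mem_access_baseline(GlobalState *s, bool tlb_hit)
{
    (void)tlb_hit;   /* the access itself is not modeled */
    sync_globals(s);
}

/* opt2: the stores happen only in the miss block (slow path). */
void mem_access_opt2(GlobalState *s, bool tlb_hit)
{
    if (!tlb_hit) {
        sync_globals(s);   /* the slow-path helper may read CPU state */
    }
}

/* Whatever is still dirty at the end of the block is written back once. */
void end_of_block(GlobalState *s)
{
    sync_globals(s);
}
```

In this model, a global that is rewritten between three accesses that all hit the TLB costs three stores in the baseline but only one (at the end of the block) with opt2, which is the sense in which the earlier stores are redundant on the fast path.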
