[IEEE]
A hardware/software co-designed processor transparently supports a ubiquitous ISA (e.g. ×86) with diversified and innovative microarchitectural implementations. It leverages co-designed HW features and dynamic binary translation (DBT) SW to morph existing binary programs to scale performance and save power. On such systems, the portable bytecode of modern dynamic languages (e.g. Java, JavaScript, etc.) is first translated into the code in the architecture ISA by the just-in-time (JIT) compilation in the bytecode virtual machine, and then into the code in the internal implementation ISA by the DBT. This not only incurs the translation overheads twice, but also brings significant emulation inefficiency as the DBT does not have the high level bytecode information. In this paper, we present AccelDroid, which accelerates the Android Dalvik bytecode execution on the HW/SW co-designed processor through direct bytecode translation in the DBT. Our experiments on a HW/SW co-designed Transmeta Efficeon machine show that AccelDroid can improve performance by 78% and save energy by 40% for the CaffeineMark 3.0 benchmark suite.