Improve google android user experience with regional garbage collection

Y. He, C. Yang, X. Li. IFIP 2011

[Springer]

Google Android is a popular software stack for smart phone, where the user experience is critical to its success. The pause time of its garbage collection in DalvikVM should not be too long to stutter the game animation or webpage scrolling. Generational collection or concurrent collection can be the effective approaches to reducing GC pause time. As of version 2.2, Android implements a non-generational stop-the-world (STW) mark-sweep algorithm. In this paper we present an enhancement called Regional GC for Android that can effectively improve its user experience. During the system bootup period, Android preloads the common runtime classes and data structures in order to save the user applications’ startup time. When Android launches a user application the first time, it starts a new process with a new DalvikVM instance to run the application code. Every application process has its separate managed heap; while the system preloaded data space is shared across all the application processes. The Regional GC we propose is similar to a generational GC but actually partitions the heap according to regions instead of generations. One region (called the class region) is for the preloaded data, and the other region (called the user region) is for runtime dynamic data. A major collection of regional GC is the same as DalvikVM’s normal STW collection, while a minor collection only marks and sweeps the user region. In this way, the regional GC effectively improves Android in both application performance and user experience. In the evaluation of an Android workload suite (AWS), 2D graphic workload Album Slideshow is improved by 28%, and its average pause time is reduced by 73%. The average pause time reduction across all the AWS applications is 55%. The regional GC can be combined with a concurrent GC to further reduce the pause time. This paper also describes two alternative write barrier designs in the Regional GC. One uses page fault to catch the reference writes on the fly; the other one scans the page table entries to discover the dirty pages. We evaluate the two approaches with the AWS applications, and discuss their respective pros and cons.