Sun Microsystems' Java 2, Micro Edition (J2ME) Mobile Information Device Profile is gaining momentum in the wireless market. It is finding its way onto many high-end and midrange systems, from embedded devices that use a wireless link to pass both control and data, to small-footprint, Internet-centric information and communications appliances.
The high-end wireless appliances, such as PDAs with voice capability, are characterized by running a fully featured operating system, such as Pocket PC, Smartphone 2002 or Symbian OS. Here the K virtual machine (KVM) typically runs as a separate OS process. Midtier wireless devices run a monolithic OS, possibly with a phone-feature-aware network application environment running on top of it, such as Qualcomm's Binary Runtime Environment for Wireless (Brew). Here the KVM runs as a task (thread), or possibly within a thread.
Hosting a fully featured operating system requires large amounts of RAM and flash. For example, the Kyocera 7135 phone comes equipped with 16 Mbytes of RAM on board. Midtier phones pack considerably less flash and RAM. The manufacturers generally keep the actual memory sizes under wraps, and are also very sensitive to memory utilization (that is, how best to populate flash and RAM), since they need to supply as many services as possible to the end user within these constraints.
Device manufacturers must at least honor the minimum memory requirements for Java contained in the J2ME Mobile Information Device Profile (MIDP) specification. However, the reference implementation (RI) for MIDP 1.0 far exceeds those requirements, and because MIDP 2.0 adds functionality, its size is considerably larger, placing an additional burden on the memory footprint.
So what can the programmer do to reduce this footprint? One effective approach uses heap optimizations and a spatially optimizing prelinker to make space for a lightweight dynamic adaptive compiler (DAC). This scheme provides a high-performance MIDP product capable of running within very tight memory constraints, in both RAM and flash.
At the most basic level, one must consider how the handset's system software supports applications such as the virtual machine (VM). Essentially there are two scenarios, depending on the presence or absence of an application loader, and confusion can arise over the memory footprint of the Java run-time environment if the scenario isn't made clear. Yet VM suppliers often fail to make this clarification when quoting memory footprint figures.
In the first scenario no application loader exists in the OS. Instead, applications are statically linked with the handset image, which is typical of monolithic OSes such as Nucleus. The code can execute in place in flash. Initialized read-only data is accessed from flash and initialized writable data is copied into RAM when the device boots.
In the second scenario, an application loader service is available. The application may reside in a discontiguous file system in flash. All code and data for the application are loaded into RAM for execution. Loading may all occur up front before the application executes (as in Qualcomm's Brew), or it may entail page faulting to populate RAM on the fly (as in the Windows CE family).
This scenario doubles the overall memory footprint for the application, since two copies exist during execution: the original in flash and the loaded version in RAM. Some systems may offer a contiguous file system in flash as a spatial optimization, in which case the code can execute in place. This is atypical, however.
Meanwhile, there are two distinct approaches for storing the MIDP system classes. At first glance, the first one, a jar-file approach, seems to be the easiest method for assessing memory footprint. Its flash footprint is simply the file size, which for MIDP 2.0 is on the order of 0.5 Mbyte.
However, every system class accessed at run-time must be internalized by the VM's class loader into RAM, usually within the Java heap. Analysis shows this approach rapidly consumes RAM, so considerably more RAM must be allocated to allow MIDlets to run. The heap occupancy for a MIDlet is then a function of both its own classes and the system classes it uses.
The other approach, which uses a prelinker, typically converts the system classes into a set of C source files containing C structures that model the class-file information, some of it marked with the "const" qualifier to indicate flash-based storage at run-time. Prelinking is highly beneficial to VM startup time, since a large amount of computation can be avoided at run-time.
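To make the idea concrete, here is a hypothetical sketch of the kind of C structure such a prelinker might emit for one system-class method. The field names and byte codes are illustrative assumptions, not the actual JavaCodeCompact output; the point is that the "const" qualifier lets the linker place the data in flash, where it consumes no RAM.

```c
#include <stdint.h>

/* Hypothetical model of prelinked method data; field names are
 * illustrative, not the actual JavaCodeCompact output. */
typedef struct {
    const char    *name;        /* method name, e.g. "charAt"     */
    const char    *descriptor;  /* type signature, e.g. "(I)C"    */
    uint16_t       access_flags;
    uint16_t       max_stack;
    uint16_t       max_locals;
    uint16_t       code_length;
    const uint8_t *code;        /* byte code, resident in flash   */
} PrelinkedMethod;

/* "const" steers these into the read-only (flash) segment, so they
 * are accessed in place rather than copied into RAM at startup.
 * The byte values here are placeholders, not real byte code. */
static const uint8_t charAt_code[] = { 0x2a, 0x1b, 0xb1 };

static const PrelinkedMethod string_charAt = {
    "charAt", "(I)C", 0x0001, 2, 2, sizeof charAt_code, charAt_code
};
```

At system build time, hundreds of such structures are generated and linked directly into the handset image, which is why no class-file parsing is needed when the VM starts.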
The use of a prelinker, and the design of the data structures it emits, can have a large effect on the overall memory footprint, both flash and RAM. For example, with Sun's MIDP RI, the flash footprint is slightly reduced through the use of the JavaCodeCompact prelinker, as compared with using a compressed jar file. To enable reasonable byte code execution performance with the RI, the system classes' byte code must be copied at startup into RAM as part of the writable, initialized data. The interpreter can then quicken it on the fly. With the RI, the size of the VM and libraries becomes insignificant in relation to the size of the system classes' representation.
For memory footprint, one must also allow for the Java heap in RAM. It is used to store classes loaded at run-time, such as internalized MIDlet classes, along with Java objects and system objects. Flash space is also required for storing the MIDlets and for providing persistent data storage for the MIDlets' RMS. System resources, such as files and bitmaps, may also reside externally to the heap, depending upon the OS. For example, Insignia's enhanced Connected Limited Device Configuration (CLDC)/MIDP uses one-word object headers, giving approximately 14 percent better heap occupancy compared with the RI. A separate database is required for monitor and hash accesses, but it is typically tiny.
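The following sketch shows one plausible way a one-word object header could be packed; it is an illustrative assumption, not Insignia's actual layout. The idea is that the class is identified by a small index into the prelinked class table, while the hash code and monitor, which most objects never need, live in a separate side table, so the header fits in 32 bits instead of two or more words.

```c
#include <stdint.h>

/* Illustrative one-word (32-bit) object header; the field widths
 * are assumptions, not Insignia's actual layout. */
typedef struct {
    uint32_t class_index  : 20; /* index into prelinked class table */
    uint32_t gc_bits      : 2;  /* mark / forwarding state          */
    uint32_t has_sidedata : 1;  /* hash/monitor entry exists        */
    uint32_t size_words   : 9;  /* small-object size in words       */
} ObjectHeader;                 /* 20 + 2 + 1 + 9 = 32 bits         */
```

By contrast, a two-word header costs eight bytes per object. On a heap dominated by small objects, saving one word per object is where improvements on the order of the cited 14 percent come from.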
Research indicates that run-time compilation can reduce power consumption compared with an interpreter-only build. This is especially true on architectures without on-chip caches, since off-chip memory accesses consume more power than fetching cached data or instructions. As such, developers must ensure that the DAC technology can be hosted in a MIDP environment without enlarging the footprint compared with the RI.
The DAC itself occupies 35 kbytes of flash as 32-bit ARM code (or less if implemented as Thumb code). It uses a build-time-configurable amount of RAM taken from the Java heap. The total footprint, both flash and RAM, consumed by the DAC and its buffer space is still far less than keeping a copy of the system classes' byte code in RAM.
The design of the data structures emitted by the prelinker is critical to the size reductions. For example, running JavaCodeCompact and enhanced CLDC/MIDP's spatially optimizing prelinker on the same RI MIDP 2.0 jar file, the method data is 34 kbytes for the enhanced CLDC/MIDP and 106 kbytes for the RI. (The figures are purely for the system-class representation and do not include the VM, libraries, the Java heap, the DAC, RMS storage or MIDlet storage.)
The data structures emitted by JavaCodeCompact are used as input to the optimizing prelinker, which analyzes and optimizes the data, and generates new data structures. It discards redundant information, performs byte code conversions and generates a global constant pool to allow a large amount of data to be deleted. The byte code conversion exposes further redundancies, which are removed automatically.
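The global constant pool works because per-class constant pools contain many duplicate entries, such as the same class names and method signatures repeated across hundreds of classes. A simplified sketch of the merging step, with made-up names and a fixed-size table for brevity:

```c
#include <string.h>
#include <stddef.h>

/* Simplified sketch of global constant-pool merging: each class's
 * string constants are interned into one shared table, and per-class
 * references become indices into it. Names and sizes are
 * illustrative; a real prelinker would size this dynamically. */
#define MAX_POOL 256

static const char *global_pool[MAX_POOL];
static size_t global_pool_len = 0;

/* Return the global index for s, adding a new entry only if the
 * string has not been seen before. */
static size_t intern(const char *s)
{
    for (size_t i = 0; i < global_pool_len; i++)
        if (strcmp(global_pool[i], s) == 0)
            return i;               /* duplicate: share the entry */
    global_pool[global_pool_len] = s;
    return global_pool_len++;
}
```

When every class that mentions "java/lang/Object" resolves to the same single entry, the per-class duplicates can simply be deleted, which is how figures like 44 kbytes shrinking to 10 kbytes become possible.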
To illustrate the point further, where the RI generates 44 kbytes of flash data for the constant-pool data, the optimizing prelinker generates 10 kbytes. The byte code conversions include byte code quickening where there are no class initialization dependencies. The majority of byte codes potentially requiring class initialization can be proven to be convertible, allowing the byte code to be stored completely in flash. This saves approximately 130 kbytes of RAM that would otherwise have been consumed by a byte code copy. The generated data structures are linked in with the VM.
At run-time, the DAC compiles only worthwhile byte code into target processor code, which is stored in RAM on a buffer-reusable basis, depending on byte code execution behavior. A very small (less than 0.5-kbyte) statistically based execution history database is maintained for this purpose. Compiled code mitigates any memory access speed discrepancies between flash and RAM. Such a discrepancy is sometimes cited as the justification for keeping a copy of the byte code in RAM in an interpreter-only system.
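An execution history database that small can be nothing more than an array of counters. The sketch below shows one plausible scheme, with a hypothetical slot count and threshold chosen so the table fits in 256 bytes; the actual DAC's heuristics are not published, so this is an assumption about the general technique, not Insignia's implementation.

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative execution-history database: a fixed array of
 * counters hashed by method id. 128 slots x 2 bytes = 256 bytes,
 * within the cited sub-0.5-kbyte budget. The threshold is a
 * made-up tuning value. */
#define HISTORY_SLOTS     128
#define COMPILE_THRESHOLD 50

static uint16_t exec_count[HISTORY_SLOTS];

/* Called on each interpreted invocation; returns true once the
 * method has run often enough to justify spending RAM buffer
 * space on compiled code for it. */
static bool should_compile(uint32_t method_id)
{
    uint16_t *slot = &exec_count[method_id % HISTORY_SLOTS];
    if (*slot < COMPILE_THRESHOLD)
        (*slot)++;
    return *slot >= COMPILE_THRESHOLD;
}
```

Cold methods stay interpreted straight out of flash, so only hot code competes for the reusable RAM buffer, which is what keeps the DAC's total footprint so small.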
Clearly, an optimizing prelinker running at system build time can save space as well as providing much faster VM startup. The MIDP 2.0 version of enhanced CLDC/MIDP contains less Java than the RI, so the actual figures are even smaller. The net result of using heap optimizations, an optimizing prelinker and a lightweight DAC is a system-class footprint approximately half the size of the RI's, leaving more than enough headroom. This means MIDP 2.0-based handsets have more room for other services, not to mention better performance in a much smaller footprint.