JSTAR: Practical Java Acceleration
For Information Appliances

by
Hank Shiffman
Java Technologist
JEDI Technologies Inc.
February, 2000

Introduction
This paper will discuss Java and its application to the growing class of devices known as information appliances. It will consider the reasons for Java's success with developers of these devices, the challenges these developers face and some of the approaches being taken to address these challenges. Finally, it will introduce JSTAR, a new product from JEDI Technologies, Inc. that delivers high performance Java to embedded applications.

The Java Phenomenon
The rise of Java is unprecedented in the computer industry. Never before has a single technology captured so much attention from companies both large and small, from investors and developers, from journalists and columnists and writers of every kind. That all this attention is focused on something as esoteric as a programming language is even more remarkable.

But Java is far more than a language. It has characteristics that go beyond the definition of typical programming languages into the realms of operating systems, networking and other application services. This wider definition of the Java environment has sparked fundamental changes in our thinking about software and about the computers that run it. Java is forcing us to reconsider the ways we develop applications, the ways we distribute them and even how they are used.

That there is excitement about Java is obvious. That this excitement is justified should be clear as well; after several years of development and tremendous investment in both money and talent, the interest in using Java to solve real problems is only growing. What may not be quite so obvious is that not everyone who sees value in Java wants the same thing from it. Java offers many benefits; which of these matter will vary with the needs of its users.

  • Java is portable. Most programming languages use a compiler to convert the instructions written by a programmer into the low level language of the computer that will run them. This machine language version of a program is specific to a kind of microprocessor and to the operating system that runs on it. As a result, software written for one kind of computer will not run on a different computer unless they use the same microprocessor and the same operating system and have access to the same set of system libraries.
     
    Moving a working application to an incompatible kind of system is not an easy task. For one thing, not every service the program may use is available on every system. Where these services are available, they may sport subtle differences that must be worked around. A small example is the symbol used to separate directory names in a file path. UNIX systems use the slash (/) character. Microsoft Windows relies on the backslash (\). And MacOS uses a colon, although most Macintosh owners are probably unaware of that fact.
     
    Even in cases where porting an application to a new platform might be easy and inexpensive (for example, moving from one system running UNIX to another), one must consider the cost to test, market and support the new version. This cost consideration is the reason that many applications that could be made to run on multiple systems are rarely made available on those systems. For too many developers, any cost associated with a new version is too much.
     
    Java applications don't have such problems; the same piece of code will run on any computer that supports the language. Portability is achieved by not compiling programs for a specific computer. Instead, the target of the compilation is something called the Java Virtual Machine, a conceptual computer whose machine language, known as Java byte code, bears a striking resemblance to the Java language itself. Since real computers can't run this Java machine language directly, a piece of software has to be inserted between the computer and the application. This Java Virtual Machine emulator reads the compiled byte code and instructs the computer to perform the desired tasks.
     
    What this means is that any computer, armed with a Java Virtual Machine emulator (which we'll refer to as a JVM from now on), can run any Java application built for the JVM. The processor doesn't matter; the operating system doesn't matter. In the words of the Sun Microsystems marketing department, "Write Once, Run Anywhere." Initially this meant "run on any well equipped desktop computer for which someone has written a JVM." But as we will see, the meaning of "Run Anywhere" has grown almost beyond measure.
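Java's class libraries absorb many of the platform differences described above. As a small sketch (the class and method names here are invented for illustration), java.io.File exposes the platform's own separator character, so the same source builds a valid path whether it runs on UNIX, Windows or MacOS:

```java
import java.io.File;

public class PathDemo {
    // File.separator hides the platform difference: "/" on UNIX,
    // "\" on Windows, ":" on classic MacOS. The same byte code
    // produces the right path everywhere.
    static String join(String dir, String name) {
        return dir + File.separator + name;
    }
}
```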
     
  • Java is productive. Although "Write Once, Run Anywhere" became Sun's rallying cry for Java almost from the beginning, not everyone has a requirement for that kind of portability. For many programmers the value of Java lies in improved productivity, their ability to write programs in less time and with higher quality than in other languages. Quality has many metrics: the number and severity of bugs, the ability of new developers to understand the structure and details of the code, the ease with which they can add new features and reuse portions of the code in new applications.
     
    Java is more productive than popular languages like C and C++. But why should this be so? To understand the difference between C/C++ and Java it helps to know a little history. The C language was developed at Bell Labs in the early days of UNIX. Like every other operating system of its day, the first version of UNIX was programmed in assembly language. This made it very efficient, since assembly language lets a programmer get as close to the hardware of a computer as anyone could possibly want. But assembly language is specific to a computer, the DEC PDP-7 in the case of that first UNIX. UNIX could never be ported to another kind of computer without rewriting everything from scratch.
     
    Enter C. C was designed as a systems programming language, so it still gave the programmer very precise control of the hardware. But as a high level compiled language, it gave programmers much higher productivity than assembly language. And by abstracting all the concepts shared by individual computers, it let a programmer create portable source code. UNIX was rewritten in C. And that led to UNIX being ported to many different kinds of computers. C became the language of choice not just for UNIX but for much of the software that would run on top of it. In time C would become the systems programming language not just for UNIX but also for the Macintosh, where it displaced Pascal, and for Microsoft Windows.
     
    But traditional C (often referred to as K&R C after its authors, Brian Kernighan and Dennis Ritchie) had its flaws. For one, it was extremely forgiving and would compile even the most obviously broken code without complaint. Getting a C program to compile was easy; getting it to work was another matter entirely. And C had scalability problems. As applications got very large it became harder and harder for programmers to keep track of all the relationships among their code and data.
     
    An answer to these problems arrived in the form of Bjarne Stroustrup's C++. A mostly compatible superset of C, C++ let programmers catch their more common mistakes at compile time. And it added a set of object-oriented programming features that made it easier for programmers to manage the code and data complexity of large programs. What it did not do, or at least did its best not to do, was give up the precise control and efficiency it inherited from C. Properly written C++ code should be every bit as fast as its equivalent in C, which made C++ a popular replacement for C for most applications.
     
    But in some respects C++ is a giant step backward. One of C's virtues is its simplicity; the first C programming text is a surprisingly thin book. And it's possible for the typical programmer to learn every nuance of C within a reasonable period of time. C++ adds so many new features that interact in so many subtle ways that most programmers give up trying to understand more than a subset of the language.
     
    Java code looks a lot like C++. But don't be fooled; these two languages were designed to solve vastly different problems. C++, like C, is intended for developers who need tight control over computing resources. Java is designed for developers who don't need such control and are willing to let the system handle more low level details for them.
     
    A few examples: Both C/C++ and Java let programs allocate chunks of memory at run time for new objects. In C/C++ it is the programmer's job to release the memory when it's no longer needed so it can be used later for other objects. Forget to do so and you have a memory leak; leak enough memory and eventually the program will fail. By contrast, Java uses an automatic Garbage Collector (GC) to identify and reclaim memory that's no longer in use. Making a GC a standard part of the language eliminates a whole class of logic errors and performance problems from Java code.
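The difference is easy to see in a sketch (the class and method names are ours, not from any standard API). The loop below churns through far more memory than any reasonable heap holds; because each buffer becomes unreachable at the end of its iteration, the garbage collector reclaims it automatically, where the equivalent C code would leak until the program failed:

```java
public class GcSketch {
    // Allocate a large buffer; the caller never releases it explicitly.
    static byte[] makeBuffer() {
        return new byte[1024 * 1024]; // one megabyte
    }

    // Each iteration drops its reference to the previous buffer, making
    // it eligible for collection. A C program doing the same without
    // calling free() would leak a megabyte per pass and eventually die.
    static int churn(int iterations) {
        int touched = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] buffer = makeBuffer();
            buffer[0] = 1; // use the buffer
            touched += buffer[0];
        }
        return touched;
    }
}
```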
     
    Java and C/C++ also differ in the way they handle common run time errors. C and C++ treat error detection as a programmer problem. It is the programmer's responsibility to check the value returned by each system or library call to determine if an error occurred. (The specific error is identified by the value of a variable called errno.) Similarly, the programmer must check subscript values against the size of an array before using them. Such out-of-range subscript errors are both common and extremely hard to detect. (The usual symptom is a bad value in an unrelated piece of data, with no clue to what part of the code wrote that value.) Java avoids both of these problems by detecting errors automatically and then reporting them to the application using its exception mechanism. If the application handles the exception, it will take whatever action the programmer deemed appropriate. If it doesn't, Java will terminate the application. Better to fail than to continue with bad information.
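A brief sketch of the exception mechanism at work (the helper is hypothetical): an out-of-range subscript raises an ArrayIndexOutOfBoundsException at the exact point of the error, which the caller can handle explicitly instead of silently reading bad memory as C would:

```java
public class BoundsCheck {
    // Returns data[i], or fallback when i is out of range. In C the
    // out-of-range read would return garbage or corrupt unrelated data;
    // Java detects it automatically and raises an exception.
    static int elementOr(int[] data, int i, int fallback) {
        try {
            return data[i];
        } catch (ArrayIndexOutOfBoundsException e) {
            return fallback; // the action this programmer deemed appropriate
        }
    }
}
```

Had the caller not handled the exception, Java would terminate the application rather than continue with bad information.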
     
    Java eliminates one other common source of problems in C/C++ code: the incorrect use of pointer arithmetic. It does so in the most direct fashion possible: by not letting the programmer manipulate pointers at all. This certainly eliminates a significant source of bugs. However, it does so at the expense of utility. You can't write operating systems or device drivers in Java. Without the ability to address arbitrary locations in memory, Java can't be used for the kind of low level tasks for which C was designed.
     
    Still, for applications that don't need low level control Java does provide tremendous productivity benefits. And even those that do need such control may benefit from a hybrid approach, using Java for most of the work and C or C++ native methods where Java can't do the job.
     
  • Java is flexible and dynamic. Most programming languages are designed around a static model of compilation, linking and execution. You compile your source and link your compiled module or modules with code from a series of libraries. The assumption is that all of the procedures an application will use are known at link time. It is possible to load new code during execution that wasn't identified at link time. However, this is not supported directly by most languages. Layering a dynamic object model like Microsoft's COM on top of C or C++ tends to produce strange-looking, hard-to-understand source code.
     
    Java was designed for a much more dynamic style of linking. Java programs do have a link phase, where individual object classes and their methods (procedures) are pulled together into an executable program. What is different is that this link operation takes place every time the program is run. This late binding mechanism makes it much easier to incorporate changes, enhancements and new capabilities into existing applications.
     
    Java supports several mechanisms that take advantage of this dynamism to produce especially flexible software. JavaBeans, Enterprise JavaBeans and Jini serve different kinds of developers building different kinds of software. But they all work by letting applications know about new kinds of components: what they are, what operations they support, how they can be integrated, how they can be customized and so on.
     
    (A brief explanation of these three technologies: JavaBeans is a specification that lets programs learn about new object classes. It was designed to permit graphical application-building tools to adapt to new components and let their users connect them together to make applications. Enterprise JavaBeans is similar in concept; the big difference is that EJB components exist on a server system and are shared among multiple applications. EJB is oriented around large objects that make up a corporate multitier architecture. Finally we have Jini: a mechanism for devices on a shared network to learn about each other and to interact in a sort of open community. Put another way, Jini is intended for smart appliances, Enterprise JavaBeans is for in-house client/server business applications and JavaBeans is for objects within a single application running on a single Java Virtual Machine.)
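For a concrete flavor of the JavaBeans convention, here is a minimal hypothetical bean: a public no-argument constructor plus matched get/set method pairs. A builder tool discovers the "volume" property purely by reflecting on these method names; no registration step is required:

```java
public class SpeakerBean {
    private int volume = 5;

    // The no-argument constructor lets tools instantiate the bean.
    public SpeakerBean() {}

    // The getVolume/setVolume pair defines a "volume" property that
    // application-building tools can display and customize.
    public int getVolume() { return volume; }
    public void setVolume(int v) { volume = v; }
}
```
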

Java & Information Appliances

Information appliances are a new class of specialized devices that are designed to interact with the Internet or a local network. There is incredible momentum behind such smart devices, with many industry analysts predicting that such appliances will outnumber PCs on the Internet within the next two years. What is equally interesting is that in every application category, manufacturers who are building information appliances are making Java an integral part of their devices. A few examples:

  • Mobile communications: More and more cellular telephone manufacturers are integrating Internet services into their products. Symbian is a partnership formed by Psion, Ericsson, Nokia, Motorola and Panasonic to develop a standard platform for wireless information devices. In March, 1999 Symbian announced the adoption of Java as part of its standard platform. At the same time, NTT of Japan announced the adoption of Java, Jini and Java Card for their digital cellular telephone service.
  • Screen phones: Screen phones combine traditional telephone capabilities with Internet access services like email and web browsing. Alcatel is already shipping such a product, their WebTouch One, based on Sun's PersonalJava and JavaOS platform. Many others are working on similar Java-based multifunction telephone/terminal products.
  • Set-top boxes: The set-top box is a broad category for devices that connect a television to a source of programming or an interactive network. Examples include cable boxes, Direct Broadcast Satellite (DBS) receivers and Internet terminal services like Microsoft's WebTV. In early 1998, TCI (now AT&T Cable Services) announced plans to adopt PersonalJava as the basis of future set-top boxes. By taking advantage of Java's portability, TCI will be able to deliver one set of standard software to all of its customers, regardless of the manufacturer of their equipment.
  • Digital televisions: As televisions make the move from current analog to high definition digital technology, many of the industry's largest firms are working together to create a standard application programming interface (API) for digital TV applications. Sony, Philips, Matsushita, Toshiba, Motorola, OpenTV, LG Electronics and Hong Kong Telecom have partnered with Sun to create the JavaTV specification.
  • Automotive: More and more automobile makers are offering computerized systems for navigation, communications and passenger entertainment. In May, 1999 the Automotive Multimedia Interface Collaboration, a joint venture of Ford, General Motors, Daimler-Chrysler, Renault and Toyota, proposed standardizing on Java for in-vehicle multimedia systems and announced a partnership with Sun and IBM to develop the technology.
  • Thin servers: Thin servers are special purpose computers that provide file, print, web and/or application services or shared devices. Most of these devices emphasize compact size, limited expansion and easy installation and administration. Java is an essential part of such devices, whether to implement their services (e.g. Encanto Networks' Java-based Encanto Web Server), to provide a unified management tool (Axis Technologies' ThinWizard software) or to provide an environment for customer applications (Cobalt Networks' RaQ, Rebel.Com's NetWinder LC, JES Hardware Solutions' NetRaptor and Sun's own Netra servers).
  • Smart cards: Smart cards are credit card sized devices for information exchange or financial transactions. Unlike traditional credit and identification cards like driver's licenses, smart cards use embedded electronics to provide secure and reliable transactions even in the absence of a network connection. The smart card industry has moved to standardize on the Java Card API as the basis for their devices.

Each of the first six categories listed above represents tens of millions of devices that will rely on Java. The market for Java-based smart cards is far larger, with estimates as high as three billion units worldwide. Taken together, they represent a huge demand for Java solutions.

Embedded Java: A Paradigm Shift
All of this interest in Java among embedded developers represents a dramatic move away from their traditional model of development. The embedded world has always focused on the practical aspects of the devices it builds. Typically these involve some combination of cost, size/weight and power consumption. Reducing these means finding low performance components that are just good enough to do the job. If a device will be sold in volume, cheap components mean lower costs and higher profits. Software development costs are not nearly so important. Reducing the per-unit development cost for a device is easy: just sell more units.

This fact has kept embedded projects from enjoying the kinds of productivity gains seen by virtually every other class of software developer. But all this is beginning to change, with Java at the forefront of that change. Here are some of the ways embedded development is being turned on its head:

  • Dedicated & standalone → flexible & networked: Traditionally, embedded devices were designed for a single purpose. These devices interacted with their user but rarely had a need for complex interaction with other devices, especially devices that had yet to be invented. That will change with the development of smarter products. These products must be able to be used in ways not imagined by their designers and to interact with a variety of other products in convenient ways. Smarter interconnects among devices will be as important to the user as the capabilities of the individual devices.
  • Custom & proprietary → universal & standards-based: Where efficiency of the end product is the ultimate goal, it is better to create a custom solution that does exactly what you want and nothing you don't need. Every implementation, every algorithm can be implemented and tuned to the specific requirements of the device under construction. But flexible, networked devices demand a different approach. Standards are a vital part of that approach. Each device can't dictate its communication with other devices; the only answer is to agree on a set of protocols that everyone will use. And if there is benefit in a device's software being customizable, it is essential that it make use of a universal software platform. The more we standardize the environment the easier it becomes to adapt it to new situations.
  • Monolithic → layered: A device's software was generally designed as a single monolithic structure. Such a design is more efficient for the end product, although it tends to take far longer to develop. But increasing market pressure requires product teams to deliver their products faster. And the best way to write code faster is not to write it at all: reuse instead of recoding. Monolithic applications are giving way to layered approaches that write new code only where it's unavoidable. And programmer productivity becomes much more important when it is reflected in reduced time to market.

Much of Java's success has occurred on web pages (Java applets), as a programming language for database client applications, in a few simple embedded devices and as a server-side programming language (Java servlets). These are at the low end of a range of applications, based on their size and complexity. Further up the scale we find potential applications like smart clients, more sophisticated embedded devices, shrink wrapped software, scientific & engineering computing and large enterprise applications. And although there are many developers working with Java in these areas, their success to date has been limited.

The problem is that Java's flexibility and portability come at too high a price for these applications. Java's memory needs are often much larger than those of C, once the requirements for the virtual machine and all the class libraries are added to the application. And remember that the JVM is an interpreter. Between the overhead of byte code interpretation and all of Java's extra error checking, it takes a much more powerful processor to get to the required level of performance. More memory and faster processors mean higher cost. Worse, for wireless applications they also mean higher power consumption and shorter battery life.

The Issue of Java Performance
Sun released its first implementation of Java in 1995. That version ran applications much more slowly than their native code equivalents, with typical applications reporting slowdowns of 20X and some applications running more than fifty times slower. Since then, huge investments have been made by Sun and others in an effort to close the gap between Java performance and that of native code.

Today, there are many different approaches to executing Java code. What follows is a survey of the different techniques commonly in use, each of which has benefits and tradeoffs. Keep in mind that these approaches are not mutually exclusive; a single Java solution may benefit from a combination of approaches.

  • The classic interpreter: Most Java implementations continue to ship with the kind of interpreter that appeared in the first JVM. At the heart of an interpreter is the instruction processing loop: fetch an instruction; figure out what to do with it; do it; figure out where to get the next instruction. The JVM's instruction loop has been rewritten countless times to reduce its overhead and has even been implemented in assembly language to eliminate even the smallest inefficiency introduced by the C compiler. In a loop that executes hundreds of thousands of times before the first bit of user code gets to execute, even a tiny inefficiency becomes significant.
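The shape of that loop can be sketched in a few lines. The toy interpreter below (its three opcodes and names are invented for illustration; a real JVM dispatches over roughly 200 byte codes and maintains an operand stack, locals and a frame per method call) shows the fetch/decode/execute cycle and the per-instruction overhead it implies:

```java
public class TinyInterpreter {
    // Invented opcodes for this sketch.
    static final int PUSH = 0, ADD = 1, HALT = 2;

    static int run(int[] code) {
        int[] stack = new int[16]; // operand stack
        int sp = 0;                // stack pointer
        int pc = 0;                // program counter
        while (true) {
            int op = code[pc++];                 // fetch
            switch (op) {                        // decode
                case PUSH: stack[sp++] = code[pc++]; break;        // execute
                case ADD:  stack[sp - 2] += stack[sp - 1]; sp--; break;
                case HALT: return stack[sp - 1];
                default:   throw new IllegalStateException("bad opcode " + op);
            }
        }
    }
}
```

Every user-level operation pays the cost of the fetch and dispatch that surround it, which is why so much effort has gone into shaving cycles from this loop.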

But not all of a program's time is spent in interpretation. Java's powerful run-time environment reduces the demand on programmers by taking on many of their responsibilities. Features like garbage collection and thread synchronization, significant sources of overhead in early implementations, have seen their contribution to total execution time reduced as more sophisticated algorithms have been applied to them. And these improvements are being applied to every Java implementation, not just to interpreters.

  • Just-in-time (JIT) compilers: If interpreters are so slow, the best way to make things faster is to get rid of them. A JIT compiler does just that, replacing portable Java byte code with the equivalent native machine instructions the first time a piece of code is run. By doing the translation at run time we keep the benefit of portability. We also maintain an important aspect of Java security: the ability to examine code before we execute it to determine that it doesn't attempt to use areas of memory in inappropriate ways. This kind of validation is essential in the rough and ready world of the Internet, where we can't always know where our software is coming from and want to be protected from flawed or malicious software. Validation is also impossible to achieve using native code, regardless of the language used to write it.

JIT compilers provide dramatically better performance than interpreters, often reducing run time by a factor of ten. But they can be a mixed blessing. Using a JIT compiler means adding some translation overhead up front to get better performance later in the execution. And a JIT means a big increase in memory footprint. Depending on the processor, translated code will be between five and ten times the size of the original byte code. In server applications where large memory configurations are the norm a tradeoff of memory for time may well be acceptable. But in an embedded application more memory means higher cost and more drain on limited power.

There are variations on the JIT theme. Sun has released a translation engine it calls HotSpot. HotSpot is a dynamic JIT. Instead of translating every routine as it's used, HotSpot monitors the program's execution and identifies the most often used and most expensive routines. It translates only those routines it deems worthwhile, letting the interpreter deal with the rest. In theory this will offer better performance than a JIT and lower memory usage, since less time and memory are taken up with translated code. But HotSpot's monitoring adds its own overhead. Experiments show that HotSpot produces better performance for some applications but slows down others. And its memory requirements are no better than those of JIT solutions.

  • Native code compilers: A JIT compiler has limits on the quality of the code it can produce. Generating optimal code takes time, time a JIT cannot afford to spend while a user waits impatiently for results. An additional consideration is the dynamic nature of Java; since classes are loaded as they are used it isn't possible to have all the information required to generate the best possible set of instructions.

A Java native compiler violates some of the rules of Java in exchange for better performance. It treats Java byte code the way other language compilers treat source, analyzing it, optimizing it and then converting it into native code. This object code is linked with Java run time support and other system libraries to create an executable program. Such a program gives up many of the advantages of Java. It isn't portable, it can no longer be validated for safety and, depending on the implementation, it may not support run-time integration of new classes into dynamic networks of components.

Why would someone give up so much of what makes Java special? Native code compilers will be of interest primarily to developers who choose Java for its productivity rather than its flexibility. For them Java is a better language, what C++ might have been if its designers had worried more about the needs of the programmer than about those of the computer. Developers of compute-intensive software may fall into this category, as would developers of in-house enterprise applications.

Java native applications will have better performance than JIT compiled code, although the difference may not be significant. They may also have somewhat lower memory requirements, since they don't need to carry around the compiler at run time. The benefit provided by a native code compiler will depend heavily on the specifics of the application.

  • Faster processors: The usual solution for slow software is to throw more hardware at it. Processors continue to improve both their performance and their price/performance; eventually they will reach a stage where our slow applications run fast enough. The questions are: How long must we wait? And what will the hardware cost when it arrives? Faster processors demand more power and generate more heat and radio frequency emissions, making them a real challenge for cell phones and other portable wireless devices. And even desktop and server applications can't wait for the 20X performance improvement that would make interpreted Java as fast as existing native solutions. When faster processors arrive they won't completely eliminate the need for some kind of JIT solution and the increase in memory that comes with it.

  • Java-specific processors: Instead of attempting to emulate the Java Virtual Machine on an existing processor, a Java chip implements that conceptual computer in real silicon. A Java processor uses Java byte code as its machine language. There is no need to translate to a different instruction set or to layer the JVM's stack-based architecture on top of the register architecture used by most microprocessors. Java processors seem the simplest and most natural way to run Java efficiently.

Although Java chips are designed around their ability to run Java, they must be able to do more than that. Java byte code does not permit the arbitrary manipulation of memory you need to write operating systems and device drivers. Java chips need these capabilities to run system software, so they reserve some unassigned byte codes for such privileged operations. Any attempt to use these byte codes from a Java application will be reported as a security violation.

The challenge for a Java chip is the same as for any new processor with a new instruction set: software. One can assume that a supplier of a Java chip will provide a real-time operating system (RTOS) and a Java Virtual Machine for their chip, whether as custom software or ported versions of existing packages. Note that any new processor, even a Java processor, needs someone to implement a JVM for it before it can run Java applications. This can be a significant effort: while Java applications are portable, the underlying Java platform software is not.

This raises important questions each embedded device designer must answer: Is the software my device will need available on this processor? Is my first choice of RTOS (either because of its feature set or my organization's familiarity with it) available? What about the availability and quality of JVM(s) for that chip? And what about other software: applications, libraries, etc.? In short, how much of the software already runs on the chip and how much time and money will it take to get the missing pieces in place?

Designers of embedded devices have large investments in hardware, software and, even more significant, in the experience and expertise of their people. Java processors run counter to that investment, requiring designers to start again with new hardware, new and potentially less stable software and a possibly short but still not insignificant learning curve. It is uncertain how many developers will consider the new investment worthwhile.

It is clear that each approach to running Java has its strengths and its weaknesses. None emerges as the clear winner for embedded devices, as designers are forced to trade off processor speed against memory requirements against the availability of software.

JEDI Technologies & The JSTAR Accelerator
What would an ideal Java solution look like? Obviously, it would run Java applications at least as well as the alternatives available today, offering performance five to ten times higher than interpreters. It would achieve this level of performance without the need for a faster CPU clock or extra memory. And it would be compatible with existing processor architectures, to take advantage both of existing expertise and a wide variety of available software.

It is from this problem definition that JEDI Technologies began. Founded in late 1998, JEDI set itself the task of eliminating the technical barriers to widespread use of Java in embedded devices. JEDI's approach was to develop a Java accelerator that could be added to existing microprocessor designs. Acting like an on-the-fly JIT compiler, this JSTAR accelerator provides similar levels of performance without any need for memory to hold the translated code. And because JSTAR enhances an existing processor it gets all the benefits of using that processor, including the catalogue of software that supports it.

Architecturally, JSTAR is a coprocessor that interfaces to the native microprocessor core and its cache or memory subsystem. JSTAR acts as a Java interpreter in silicon, retrieving byte code instructions from memory and executing them in conjunction with the native processor. JSTAR operates directly on Java byte code, eliminating the extra memory JIT compilers need to hold the native code they generate.
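The fetch/decode/execute loop that JSTAR moves into hardware can be sketched as a tiny software interpreter. This is a simplified illustration, not any actual JVM's code; the opcode values happen to match real JVM bytecodes, but the stack layout and dispatch structure are assumptions for clarity:

```java
// Simplified sketch of the fetch/decode/execute loop a software Java
// interpreter runs for every bytecode -- the loop JSTAR implements in
// silicon. Layout and dispatch here are illustrative only.
public class InterpreterSketch {
    static final int OP_ICONST_1 = 0x04; // push the constant 1
    static final int OP_IADD     = 0x60; // pop two ints, push their sum
    static final int OP_RETURN   = 0xB1; // stop execution

    public static int run(int[] code) {
        int[] stack = new int[16]; // operand stack
        int sp = 0;                // stack pointer
        int pc = 0;                // program counter
        while (true) {
            int opcode = code[pc++];       // fetch
            switch (opcode) {              // decode
                case OP_ICONST_1:          // execute
                    stack[sp++] = 1;
                    break;
                case OP_IADD:
                    sp--;
                    stack[sp - 1] += stack[sp];
                    break;
                case OP_RETURN:
                    return sp > 0 ? stack[sp - 1] : 0;
                default:
                    throw new IllegalStateException("bad opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        int[] program = {OP_ICONST_1, OP_ICONST_1, OP_IADD, OP_RETURN};
        System.out.println(run(program)); // prints 2
    }
}
```

Every bytecode a software interpreter executes pays this dispatch overhead; executing the bytecodes directly in hardware removes it without generating (and storing) native code the way a JIT compiler does.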

[Figure: The JSTAR-enabled processor]

Adding JSTAR to a microprocessor requires no modification to the native core. In particular, the native instruction set and pipeline architecture of the processor are unchanged. Operating systems and native applications, software components and tools run on a JSTAR-enabled processor just as they do on the original chip.

Even Java native methods compiled for the native processor run without modification. JSTAR was designed to integrate with existing Java Virtual Machine implementations from Sun, HP and others. The JVM is modified to initialize JSTAR and then to give it control of the main fetch/decode/execute instruction processing loop. Making a JVM work with JSTAR is greatly simplified by JSTAR's ability to adapt to the internals of the JVM. In particular, JSTAR does not impose any specific layout for local variables, call stack frames or the Java operand stack. (The operand stack stores intermediate results from Java computations. Where a register-oriented processor would implement a simple expression like C = A + B as load A into r1, load B into r2, add r2 to r1, store r1 into C, the stack-based Java byte code would use something more like push A, push B, add, pop C.)
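The stack-oriented style described in the parenthetical above is easy to see in practice. Compiling the small method below and disassembling it with `javap -c` produces the bytecode shown in the comment (exact local-variable slots depend on the method signature):

```java
// The expression C = A + B as a Java method. Its compiled bytecode,
// as shown by `javap -c`, is stack-based rather than register-based:
//
//   iload_0    // push A
//   iload_1    // push B
//   iadd       // pop two operands, push their sum
//   ireturn    // return the result (here, in place of a store to C)
//
// A register-oriented CPU would instead load A and B into registers,
// add them, and store the result.
public class AddExample {
    public static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // prints 5
    }
}
```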

JSTAR also works with the JVM's implementation of garbage collection, native method invocation and thread switching and synchronization. It is designed to work with the variety of thread schedulers used by JVMs, including both native threads and cooperative thread schemes like Sun's green threads. It also supports multiple JVMs running concurrently on the same processor. JSTAR's flexibility minimizes the work required to convert a new JVM to take advantage of it. It also permits JSTAR to benefit from all the work being done to optimize specific Java run-time implementations.

JSTAR Application Performance
The first JSTAR implementation makes use of an R3000 class processor core, the VxWorks real-time operating system from Wind River Systems and Sun's PersonalJava 3.0 virtual machine environment. Performance was measured using a set of industry standard benchmarks on a field-programmable gate array (FPGA) clocked at ten and twelve megahertz (MHz). These benchmarks were run using the same software environment with JSTAR enabled and then disabled, giving an accurate picture of its effect on the performance of each application.

Pendragon Software Corporation's Embedded CaffeineMark suite consists of six tests of basic Java execution, whose results are combined using a geometric mean calculation. In addition to the R3000 and R3000/JSTAR tests, results were obtained for a MIPS R4600 processor running at 200 MHz and a StrongARM processor running at 166 MHz, both using the same combination of PersonalJava and VxWorks, as well as an Intel Pentium-based PC running Microsoft Windows 98 and Sun's JVM 1.2.2. The JIT compiler in the latter system was deactivated to provide an accurate comparison of interpreters and JSTAR. Benchmark performance was divided by processor clock rate to provide a common measurement of CaffeineMarks per MHz. Experiments with the same processor and software environment at varying clock rates indicate that the benchmark results scale linearly with increases in processor speed.
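The two calculations mentioned above, the geometric-mean composite score and the per-MHz normalization, can be sketched as follows. The method names are mine for illustration, not Pendragon's, and the input values are made up:

```java
public class CaffeineMath {
    // Geometric mean: the nth root of the product of n scores.
    // Computed via logarithms to avoid overflow on large products.
    public static double geometricMean(double[] scores) {
        double logSum = 0.0;
        for (double s : scores) {
            logSum += Math.log(s);
        }
        return Math.exp(logSum / scores.length);
    }

    // Normalize a composite score by clock rate, giving the
    // CaffeineMarks-per-MHz figure used in the comparison above.
    public static double perMHz(double compositeScore, double clockMHz) {
        return compositeScore / clockMHz;
    }

    public static void main(String[] args) {
        double[] scores = {2.0, 8.0};              // illustrative values
        System.out.println(geometricMean(scores)); // prints 4.0
        System.out.println(perMHz(120.0, 200.0));  // prints 0.6
    }
}
```

The geometric mean is the natural choice for a composite benchmark because it rewards balanced performance: a large score on one test cannot mask a poor score on another the way it would in an arithmetic mean.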


[Figure: Java Interpreter Performance, Embedded CaffeineMark 3.0]

As we can see from the graph, the MIPS and StrongARM processors produce similar performance of between 0.52 and 0.61 CM/MHz. The more complex Pentium processor gets between 50% and 74% better performance than these other processors, for reasons we will discuss shortly. Adding JSTAR to the R3000 improves its performance by 5.5 times, giving the R3000/JSTAR combination Java throughput more than three times that of the larger and more power-hungry Pentium running at the same clock rate.

A large portion of the Pentium processor's advantage in these tests comes from a single benchmark. The CaffeineMark Float benchmark performs a large number of double precision floating point calculations, giving a significant edge to the only processor in this group with floating point hardware. Removing the Float benchmark from the set places the Pentium processor more in line with the others, and gives the JSTAR-enabled R3000 nearly seven times the throughput of the R3000 alone and four times that of the Pentium.

[Figure: Java Interpreter Performance, Embedded CaffeineMark 3.0, without Float benchmark]

Other benchmark applications show similar results. JSTAR improves an R3000 processor's Dhrystone performance by 5.2 times, increasing a 12 MHz processor from 256.52 Dhrystones per second to 1333.33. The Tak benchmark, a measure of recursive procedure calling, shows accelerations of 7.3 times and 2.1 times for integer and single precision floating point, respectively. Once again, note that this last result was achieved using a software implementation of floating point; having floating point hardware would produce even greater improvement. Also be aware that all of these results were obtained from a particular virtual machine implementation. Greater efficiencies in the VM would produce higher overall acceleration.
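The Tak benchmark referred to above is the Takeuchi function, a small but deeply recursive routine in which nearly all of the work is procedure-call overhead. A standard Java version is shown below; the exact harness and arguments used in the measurements are not given in the source:

```java
public class Tak {
    // McCarthy's variant of the Takeuchi function. Its value is
    // trivial to compute, but evaluating it recursively generates an
    // enormous number of calls, making it a pure test of procedure
    // call and return performance.
    public static int tak(int x, int y, int z) {
        if (y >= x) {
            return z;
        }
        return tak(tak(x - 1, y, z),
                   tak(y - 1, z, x),
                   tak(z - 1, x, y));
    }

    public static void main(String[] args) {
        System.out.println(tak(18, 12, 6)); // prints 7
    }
}
```

A floating point variant simply substitutes double for int, which on a processor without floating point hardware exercises the software emulation library as well as the call mechanism.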

JSTAR Hardware Specs
A best-performance JSTAR core requires approximately 30,000 gates to support a 32-bit, single-issue processor. It can be implemented using a unified cache or memory system, or using the separate instruction and data caches of Harvard-architecture processors like the ARM9, either with or without a memory management unit.

Power requirements for JSTAR execution are estimated to be 18 milliwatts for a 1.5 volt, 100 megahertz processor. This represents less than 15% of the power required by a typical native processor at the same clock rate. With appropriate hardware and software support, JSTAR will draw no more than leakage power when idle.

The JSTAR Advantage
Building an embedded device involves making choices: technical, economic and practical. Each choice is defined by a series of tradeoffs and implicit decisions regarding other choices. Choose a processor and you either limit yourself to software that already runs on that processor or to software that can be moved to it within the time available and at an acceptable cost. Add an operating system and you restrict your options further. And so it goes with each new layer of software and each new component.

Developers working with embedded Java must balance their need for Java performance against all of their other requirements: cost, power, size, the ability to run native code, the ability to interact with the outside world and so on. Java technology that compromises a device's ability to satisfy its non-Java requirements is not a viable solution.

JSTAR represents the low risk, low cost Java solution most developers need. It offers a high performance engine for running Java applications without placing unacceptable demands on scarce resources. And it offers this benefit without giving up the high C/C++ performance of a native processor, its large catalogue of available software or the years of expertise built up by embedded developers. This symbiosis between JSTAR and a native core means the best of both Java and traditional computing.

