tangentum technologies    
 


Advantages

Representing Java bytecode in XML has several advantages over previous approaches such as other assembly languages and bytecode libraries. These advantages arise from combining XML with the specific problems that the features of XJBC solve. The resulting benefits are listed below.

1. XJBC not bound to Java language
2. XJBC not bound to Java platform
3. Bytecode-representation not bound to single Java VM
4. Significant set of tools available or emerging
5. XJBC has query capabilities built-in
6. Round-trip engineering without loss of metadata
7. No symbol tables necessary
8. XJBC forms basis of programming in XML

1. XJBC not bound to Java language

Problem: Implementations of many different languages are currently available on the Java platform (for an excellent list, please see Languages for the Java VM). Each language solves particular problems, and in larger and more complex applications it is not unusual for several of these languages to be used side by side to get the job done. Analysing and transforming Java source code alone is therefore not sufficient in these projects, and building fully functional analysers and transformers for every single language is nearly impossible.

Solution: On the Java platform only one language really counts: Java bytecode. To be able to analyse or transform software that runs on the Java platform but is not necessarily written in the Java language, it is therefore crucial to use Java bytecode as the starting point. This is the driving idea behind XJBC.

2. XJBC not bound to Java platform

Problem: Some people argue that software running on a Java VM is slow compared with natively compiled software written in, say, C++. This is debatable for several reasons. But let's suppose they are right and that, despite the widespread use of Java, it becomes more efficient to analyse, transform or generate Java bytecode on other platforms, say Microsoft's ".NET". An automation technology whose implementation is entirely bound to Java, like the bytecode libraries, would lose ground in this case, because tool chains built on it could not use the fastest execution platform available.

Solution: XJBC would not suffer severely in this constructed scenario, because it is mainly a language and not a library or API. XML is widely adopted and has by now become a platform of its own, so it is safe for XJBC to depend on it. The tools of XJBC, i.e. the assembler and disassembler, do depend on Java, but this is no real limitation, because Java is the platform for which XJBC was originally built. Manipulating XJBC on machines other than the Java VM is easily achieved, and tool chains are neither dependent on nor limited by the Java platform.

3. Bytecode-representation not bound to single Java VM

Problem: Analysing, generating and transforming the Java bytecode of large applications and software systems requires the automation system itself to scale up accordingly. The automation system must therefore be distributable and able to use network resources in parallel. Software artifacts have to be storable in remote and shared repositories, and efficiently serializable to exploit the capacity of fast networking devices. Representing Java bytecode as instances of Java classes inside a single Java VM, as the bytecode libraries currently do, is bound to a single machine and will not scale up to the scenario described.

Solution: Because of its sequential nature, XML data like XJBC is easy to serialize and deserialize. It is therefore well suited to streaming and fits well into a distributed system. The overhead of streaming XJBC is low, making scalable, parallel, distributed automation feasible.
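To illustrate the streaming argument, here is a minimal sketch in Java using the standard StAX pull parser. It processes an XML stream element by element without ever building the whole document in memory. The element and attribute names (`class`, `method`, `insn`, `op`) are invented for this example and are not taken from the actual XJBC format.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class XjbcStreamSketch {
    // Hypothetical XJBC-like document; the vocabulary is an assumption, not the real XJBC schema.
    static final String SAMPLE =
        "<class name='Hello'>"
      + "<method name='main'>"
      + "<insn op='getstatic'/><insn op='ldc'/><insn op='invokevirtual'/><insn op='return'/>"
      + "</method>"
      + "<method name='helper'>"
      + "<insn op='invokestatic'/><insn op='return'/>"
      + "</method>"
      + "</class>";

    // Counts instructions per method while the data streams in.
    // Only the current event is held in memory, so the source could
    // just as well be a socket or a file in a shared repository.
    static Map<String, Integer> countInstructions(Reader in) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        Map<String, Integer> counts = new LinkedHashMap<>();
        String currentMethod = null;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                String element = reader.getLocalName();
                if (element.equals("method")) {
                    currentMethod = reader.getAttributeValue(null, "name");
                    counts.put(currentMethod, 0);
                } else if (element.equals("insn") && currentMethod != null) {
                    counts.merge(currentMethod, 1, Integer::sum);
                }
            }
        }
        reader.close();
        return counts;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countInstructions(new StringReader(SAMPLE)));
    }
}
```

Because the reader only needs a `Reader`, the same method works unchanged whether the bytes come from a string, a file, or a network connection.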

4. Significant set of tools available or emerging

Problem: Developing a new system from scratch is nearly impossible today. It is necessary to reuse already developed and tested software as much as possible, to keep one's own effort small and the whole undertaking manageable. Furthermore, selecting a technology base that is mature on the one hand and has strong momentum on the other seems crucial for the development of a complex and demanding software automation system.

Solution: XJBC is directly based on XML and inherits all of its advantages on the tool front:

  • Serialization and deserialization are simple and efficient using standard tools.

  • In-memory representations like DOM enable programmatic traversing of XML-structures.

  • Selection languages like XPath and XQL are already here.

  • Transformation systems like XSLT are also available.

  • Search engines are able to index XML-data and to find structure- and text-information fast.

  • XML is not bound to the Java platform. Any tool on any platform will do the job if necessary.

These technologies are already available or are emerging, because there is much interest in them from other fields. A future software automation system based on XML can therefore reuse this plethora of technologies and is not forced to reinvent the wheel.
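As an illustration of reusing a standard transformation tool, the following sketch applies an XSLT 1.0 stylesheet through the JDK's built-in `javax.xml.transform` API. The stylesheet copies a document unchanged except that it drops every `nop` instruction. The XML vocabulary (`method`, `insn`, `op`, `value`) is a hypothetical stand-in, since the real XJBC element names are not shown here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XjbcXsltSketch {
    // Hypothetical XJBC-like method body; names are assumptions for illustration only.
    static final String SAMPLE =
        "<method name='answer'>"
      + "<insn op='nop'/><insn op='bipush' value='42'/><insn op='nop'/><insn op='ireturn'/>"
      + "</method>";

    // Identity transform plus one empty template that removes 'nop' instructions.
    static final String STRIP_NOPS =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match=\"insn[@op='nop']\"/>"
      + "</xsl:stylesheet>";

    // Applies the given stylesheet to the given XML text and returns the result as text.
    static String transform(String stylesheet, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(stylesheet)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(STRIP_NOPS, SAMPLE));
    }
}
```

No bytecode-specific library is involved: the transformation is plain XSLT, so any conforming processor on any platform could run the same stylesheet.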

5. XJBC has query capabilities built-in

Problem: To modify a data representation, one must be able to identify the points to modify. The ability to select a sub-structure or the like must therefore be available. This can be done operationally using a programming language, or declaratively using a query language. Both have their own advantages and disadvantages, and they are often mixed. The operational select has the advantage of being efficient and the disadvantage of being inflexible; for the declarative select the opposite holds. With generative techniques the two approaches can be unified, and this is sometimes already done. Unfortunately these techniques are difficult to apply to the other assembly languages or to the bytecode libraries, and doing so takes considerable effort.

Solution: Like every other XML representation, XJBC has the capability to be queried built in. This may be done operationally using DOM or other in-memory representations, or declaratively using XPath, XQL, or other query languages. Both approaches are supported "out of the box", effectively and efficiently, with little effort.
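The two styles of selection can be sketched side by side in Java with the standard DOM and XPath APIs. Both methods below find the targets of all invoke instructions in a small document. The XML vocabulary (`class`, `method`, `insn`, `op`, `target`) is made up for this example and should not be read as the actual XJBC format.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XjbcQuerySketch {
    // Hypothetical XJBC-like document; element and attribute names are assumptions.
    static final String SAMPLE =
        "<class name='Hello'>"
      + "<method name='main'>"
      + "<insn op='getstatic'/><insn op='ldc'/>"
      + "<insn op='invokevirtual' target='println'/><insn op='return'/>"
      + "</method>"
      + "<method name='helper'>"
      + "<insn op='invokestatic' target='currentTimeMillis'/><insn op='return'/>"
      + "</method>"
      + "</class>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Operational select: traverse the DOM and filter in program code.
    static List<String> invokeTargetsByDom(Document doc) {
        List<String> targets = new ArrayList<>();
        NodeList insns = doc.getElementsByTagName("insn");
        for (int i = 0; i < insns.getLength(); i++) {
            Element insn = (Element) insns.item(i);
            if (insn.getAttribute("op").startsWith("invoke")) {
                targets.add(insn.getAttribute("target"));
            }
        }
        return targets;
    }

    // Declarative select: state what is wanted as an XPath expression.
    static List<String> invokeTargetsByXPath(Document doc) throws Exception {
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
            "//insn[starts-with(@op, 'invoke')]/@target", doc, XPathConstants.NODESET);
        List<String> targets = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            targets.add(hits.item(i).getNodeValue());
        }
        return targets;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println(invokeTargetsByDom(doc));   // prints [println, currentTimeMillis]
        System.out.println(invokeTargetsByXPath(doc)); // prints [println, currentTimeMillis]
    }
}
```

The DOM version spells out the traversal step by step, while the XPath version changes only by editing the query string, which mirrors the efficiency versus flexibility trade-off described in the problem statement.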

6. Round-trip engineering without loss of metadata

Problem: Converting a design artifact into running code normally means losing the data that describes the design. This holds true for generated Java bytecode. Full round-trip engineering is therefore difficult, if not impossible, with bytecode today. The attributes provided by standard bytecode are too low-level to work with directly in a flexible and dynamic design environment.

Solution: With XJBC the metadata can be represented in XML and embedded into Java bytecode without explicit knowledge of "bytecode attributes" and their peculiarities, to say nothing of programming against APIs of special bytecode-libraries. With XJBC homogeneous integration of design data and bytecode is easily achieved.

7. No symbol tables necessary

Problem: Traditional compilers are built around data structures called "symbol tables". From parsing the source language to generating the target language, the compiler components of each phase refer to the symbol table, because nearly all necessary information is collected there. This makes the symbol table the heart of a compiler, but also a hindrance to its parallelization.

Solution: Because by construction all information about Java bytecode can be directly stored in XML documents, XJBC turns the symbol table inside out, making parallelization possible. It will be very exciting to see this fly.

8. XJBC forms basis of programming in XML

Problem: Writing programs in languages like Java, Python, C++, et al. uses textual representations which are simple sequences of characters. This has been the basis of software for nearly forty years now. The process has become more and more sophisticated since then: compiler construction has made significant progress, interactive environments have simplified some development tasks, and modelling software designs has become standard. But one fundamental problem has not been solved: the tools do not fit together well. Why? Different languages: different syntax and different semantics. But what is the basis of all these different languages? The executables, i.e. the bytecode. In the end the semantics of a program, independent of the language it is written in, is defined by the behaviour of the running machine.

Solution: Using XJBC, all generative and transformative techniques are based on XML, because not only may the output of generators be XJBC, but the generators themselves are representable as XJBC and may therefore be the input and output of XJBC transformers, which are themselves representable as XJBC, and so on. Step by step, all the different languages will therefore become representable in XML. XML will thus close the gap between models and metamodels, because it becomes executable by means of XJBC. To put it in mathematical terms, XJBC may become the "induction base" of a software automation system and process that is completely based on XML.