Interoperating with Windows HPC SOA from Java (or Other Non-.NET Environments)

While working with our customers to adopt Windows HPC clusters, some of them asked for interop capability so that they could integrate legacy Linux or Java-based applications into the Windows HPC infrastructure. After a major release (Windows HPC Server 2008 R2) and two service packs (SP1 & SP2), I'm happy to introduce the Java interop library for Windows HPC SOA.

Quick Start In 30 Seconds

If you know what you need, just download https://github.com/MicrosoftHPC/Java-Interop-Library/tarball/master, unpack it, and follow the README.

So What Is It?

"Java Interop Library For Windows HPC Server" is an open-source, production-quality, redistributable, and commercial-software-friendly library that is fully supported by the Microsoft HPC team. The library enables a Java client application to allocate resources and run parallel computing code on a Windows HPC cluster. It also includes a Java-based "service host" that hosts Java computation code so it can run on the HPC cluster.

The library is open source (see below), and the source code can be downloaded from www.github.com. It is distributed under an Apache-like license, so you can modify, compile, and redistribute it in both binary and source form, provided it is used with Windows HPC Server. That also means you can use it internally, give it away for free, or make money from it as part of commercial software.

The library is backed by the Microsoft HPC team and has been fully tested by the team. The HPC team is committed to fully supporting the library. This means that we will listen to feedback from our users, fix bugs, and potentially add features to keep it aligned with the .NET library.

The client library talks to the Windows HPC cluster using SOAP over HTTP. HPC SOA is based on WCF and exposes its services on both SOAP/HTTP and Net.TCP binding endpoints. All those acronyms just mean that all functionality on the HPC cluster can be accessed from a SOAP/HTTP client (i.e., the Java client) as well as from a Net.TCP-based client (i.e., the .NET/C# client that comes with the HPC Client Pack). In fact, the Java client implements almost all of the C# client's functionality.
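To make "SOAP over HTTP" a little more concrete, here is a minimal sketch that builds a SOAP 1.1 envelope around an Echo request and reads it back, using only JDK classes. It is an illustration, not the library's wire format: the Echo/input element names come from the sample service, the use of the WCF default namespace http://tempuri.org/ and of SOAP 1.1 are assumptions, and the real messages carry additional headers that are not shown here. The class and method names are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SoapEnvelopeSketch {
    // SOAP 1.1 envelope namespace; the service namespace is an assumption (WCF default).
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String SERVICE_NS = "http://tempuri.org/";

    /** Builds a bare-bones envelope with a single Echo request in the body. */
    static String buildEchoEnvelope(String input) {
        return "<s:Envelope xmlns:s=\"" + SOAP_NS + "\">"
             + "<s:Body>"
             + "<Echo xmlns=\"" + SERVICE_NS + "\">"
             + "<input>" + escapeXml(input) + "</input>"
             + "</Echo>"
             + "</s:Body>"
             + "</s:Envelope>";
    }

    /** Escapes the characters that are not allowed in XML element content. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Parses an envelope and returns the text of the first input element. */
    static String readEchoInput(String envelope) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(envelope)));
            NodeList inputs = doc.getElementsByTagNameNS(SERVICE_NS, "input");
            return inputs.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("envelope is not well-formed", e);
        }
    }

    public static void main(String[] args) {
        String envelope = buildEchoEnvelope("hello world!");
        System.out.println(envelope);
        System.out.println(readEchoInput(envelope));
    }
}
```

Because the payload is text-encoded XML like this, any platform with an HTTP stack and an XML parser can talk to the cluster, which is the whole point of the SOAP/HTTP endpoint.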

Here is a quick list of what the Java library can do:

  • Work on both Windows and Linux. (We have tested on RHEL 5.5.)
  • Create a session (Durable or Non-Durable) on the HPC cluster, and control all session-related properties.
  • Send batches of requests to the session.
  • Use the async send/receive pattern.
  • Flush, cancel, and clean up request batches.
  • Use the common data API to stage large data in and out of the cluster efficiently.
  • Use all scheduling policies. That means first come first served, priority, preemption, dynamic grow/shrink of allocated resources, graceful killing, automatic error retry, and high availability.

Besides the client library, the package also comes with a Java-based service host, which means that a user's Java-based calculation engine can be hosted natively on Windows HPC without a C++ or C# wrapper.

 

So It's Working. How's the Performance?

The real throughput number depends on a lot of factors: message size, network connection, broker node CPU/memory configuration, and the number of clients and services running. The following is the measured end-to-end message throughput of a reasonably configured cluster. (Both the broker and the client were running on AMD Opteron 1.9GHz machines with 24 cores and 32GB of memory. Both clients and services were connected to the broker with 1 Gb Ethernet.)

Throughput (messages/sec) by message size

Metrics                              0B    1KB    2KB    4KB    8KB   16KB   32KB   64KB  128KB
Interactive Session - 16 cores      593    542    556    572    524    522    447    338    211
Interactive Session - 256 cores    1867   1699   1610   1694   1443   1227    855    457    248
Durable Session - 16 cores          422    422    443    427    405    368    313    254    166
Durable Session - 256 cores        1526   1515   1269   1119    895    448    253    128     80

 

As a reference point, the throughput of a Durable Session with the Java client is roughly the same as with the C#/.NET client. For a Durable Session, all messages are persisted to disk, so the bottleneck is CPU and disk I/O and there is no significant performance difference between the Java and C# clients. The Interactive Session, however, is about 50% slower than with the C#/.NET client (a C# client will have throughput around 5000 msg/sec). The main reason is that for an Interactive Session the bottleneck is the network, since there is no disk I/O involved. The .NET client uses the Net.TCP binding, which uses binary encoding and compression, so it is faster than the HTTP binding used by the Java client. This is the price we pay for interoperability.
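To relate the message rates in the table to link capacity, it can help to convert them into payload bandwidth. The sketch below does that arithmetic for a few of the "Interactive Session - 256 cores" figures. It assumes 1 KB = 1024 bytes and counts only the request payload in one direction, so SOAP/HTTP framing and the response traffic are not included; treat the output as a rough lower bound on what is actually on the wire. The class and method names are made up for this example.

```java
class ThroughputMath {
    /**
     * Converts a message rate and a message size into payload bandwidth.
     * Assumes 1 KB = 1024 bytes and reports decimal megabits per second.
     */
    static double payloadMbps(int messagesPerSecond, int messageSizeKb) {
        double bitsPerSecond = (double) messagesPerSecond * messageSizeKb * 1024 * 8;
        return bitsPerSecond / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Message sizes and rates copied from the Interactive Session - 256 cores row.
        int[] sizesKb = {1, 8, 64, 128};
        int[] messagesPerSecond = {1699, 1443, 457, 248};

        for (int i = 0; i < sizesKb.length; i++) {
            System.out.printf("%4d KB at %5d msg/s -> %.1f Mbit/s of payload%n",
                    sizesKb[i], messagesPerSecond[i],
                    payloadMbps(messagesPerSecond[i], sizesKb[i]));
        }
    }
}
```

For example, 248 msg/sec at 128KB works out to roughly 260 Mbit/s of request payload, while 1699 msg/sec at 1KB is only about 14 Mbit/s, which suggests that for small messages the per-message cost matters far more than raw bytes.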

 

OK, Great. Now Where Do I Get It?

Interested? Good, because it's easy to get! Just open https://github.com/MicrosoftHPC/Java-Interop-Library, go to the download section, and choose the package that's right for you.

Or, if you prefer git, clone the repository to get a local copy. The website has full instructions on how to do this. Please note that at this stage we will *NOT* accept any changes from the community.

I Got It. I Want To Use It! Right Now!

The README in the package explains how to use the library. The "test/" directory contains the test suite we use to validate the library. And "sample/helloworld" is always a good place to start.

The following is an excerpt from the hello world client program, with comments. The code is simple enough to be self-explanatory.

    // Step 1. Create a session.
    SessionStartInfo info = new SessionStartInfo(headnode, serviceName, username, password);
    DurableSession session = DurableSession.createSession(info);

    System.out.printf("new session id = %d\n", session.getId());

    // Step 2. Create a client.
    BrokerClient<CcpEchoSvc> client = new BrokerClient<CcpEchoSvc>(session, CcpEchoSvc.class);
    for (int i = 0; i < 10; i++)
    {
      // Step 3. Create the message and send it to the broker.
      // The second argument is user data that comes back with the response.
      ObjectFactory of = new ObjectFactory();
      Echo request = of.createEcho();
      request.setInput(of.createEchoInput("hello world!"));
      client.sendRequest(request, i);
    }
    // Step 4. endRequests() must be called to start processing.
    client.endRequests();

    // Step 5. The for-each loop iterates through all the responses.
    int nerrs = 0; // number of requests that failed
    for (BrokerResponse<EchoResponse> response : client.<EchoResponse>getResponses(EchoResponse.class))
    {
      try
      {
        String reply = response.getResult().getEchoResult().getValue();
        System.out.printf("\tReceived response for request %s: %s%n", response.getUserData(), reply);
      }
      // If the message hasn't been processed correctly, an Exception is thrown.
      // Other messages might still have been processed, so keep going.
      catch (Exception ex)
      {
        nerrs++;
        System.out.printf("Error: processing request %s: %s%n", response.getUserData(), ex.toString());
      }
    }

    // Step 6. Remember to close the client and the session.
    client.close();
    session.close();
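The catch block in step 5 reflects a pattern worth isolating: one failed request should not stop the rest of the batch from being read. The sketch below reproduces that pattern with plain JDK types so it can run without a cluster. The Response class here is a hypothetical stand-in for illustration only, not the library's BrokerResponse, and the simulated "service fault" is invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

class BatchResultSketch {
    /** Hypothetical stand-in for a broker response: user data plus a result that may throw when read. */
    static final class Response {
        final int userData;
        final Callable<String> result;

        Response(int userData, Callable<String> result) {
            this.userData = userData;
            this.result = result;
        }
    }

    /** Reads every response, collecting replies and counting failures instead of stopping at the first one. */
    static int process(List<Response> responses, List<String> repliesOut) {
        int nerrs = 0;
        for (Response response : responses) {
            try {
                repliesOut.add(response.userData + ": " + response.result.call());
            } catch (Exception ex) {
                // A failed request is recorded, and the loop moves on to the next response.
                nerrs++;
                System.out.printf("Error: processing request %d: %s%n", response.userData, ex);
            }
        }
        return nerrs;
    }

    public static void main(String[] args) {
        List<Response> responses = new ArrayList<>();
        responses.add(new Response(0, () -> "hello world!"));
        responses.add(new Response(1, () -> { throw new IllegalStateException("simulated service fault"); }));
        responses.add(new Response(2, () -> "hello world!"));

        List<String> replies = new ArrayList<>();
        int nerrs = process(responses, replies);
        System.out.printf("%d replies, %d errors%n", replies.size(), nerrs);
    }
}
```

Tagging each request with user data when it is sent, as the sample does with the loop index, is what lets the error message name the request that failed.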

 

The above is how you consume a service already deployed on a Windows HPC cluster, regardless of whether the service is based on .NET or Java. But how do you host a service written in Java? It turns out to be pretty easy too. Basically, there are two things you need to do. First, modify the service registration file to indicate that it's a Java-based service. Second, use the Java service host library to create the Java service host.

The following is a fragment of JavaEchoSvc.config; the rest of the configuration is the same as the service config for a .NET-based service. There are two things to note here. First, the service assembly field now points to a JAR file instead of a .NET assembly. This is the actual Java service that will be loaded. In this case, we assume it has been put under C:\JavaSvcHostTest\. Second, the service host is customized to replace the default .NET service host. Microsoft-HpcServiceHost-3.0.jar should be compiled from the source code you get from the GitHub project. There are no other changes to the service registration.

    <microsoft.Hpc.Session.ServiceRegistration>
      <service assembly="C:\JavaSvcHostTest\JavaEchoSvc.jar" ... > ... </service>
      <!-- Using Java Service Host -->
      <host hostType="Customize" exeFileName="java -jar &quot;%CCP_HOME%bin\Microsoft-HpcServiceHost-3.0.jar&quot;" />
    </microsoft.Hpc.Session.ServiceRegistration>
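As a quick sanity check before deploying, the two fields discussed above can be read back with the JDK's DOM parser. The sketch below works on a simplified copy of the fragment: the elided attributes and child elements are simply left out, so it is not a complete registration file. The class name, the method names, and the "looks like a Java service" rule are made up for illustration and are not part of the library.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class RegistrationCheck {
    // Simplified copy of the fragment above; elided parts are omitted, not filled in.
    static final String SAMPLE =
          "<microsoft.Hpc.Session.ServiceRegistration>"
        + "<service assembly=\"C:\\JavaSvcHostTest\\JavaEchoSvc.jar\"/>"
        + "<host hostType=\"Customize\" exeFileName=\"java -jar "
        + "&quot;%CCP_HOME%bin\\Microsoft-HpcServiceHost-3.0.jar&quot;\"/>"
        + "</microsoft.Hpc.Session.ServiceRegistration>";

    /** Parses the XML text, failing loudly if it is not well-formed. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("registration is not well-formed XML", e);
        }
    }

    /** Returns the named attribute of the first matching element, or null if the element is missing. */
    static String attribute(Document doc, String elementName, String attributeName) {
        Element element = (Element) doc.getElementsByTagName(elementName).item(0);
        return element == null ? null : element.getAttribute(attributeName);
    }

    /** Illustrative rule: the assembly is a JAR and the custom host command starts with "java". */
    static boolean looksLikeJavaService(Document doc) {
        String assembly = attribute(doc, "service", "assembly");
        String hostCommand = attribute(doc, "host", "exeFileName");
        return assembly != null && assembly.toLowerCase().endsWith(".jar")
            && hostCommand != null && hostCommand.startsWith("java ");
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("assembly    = " + attribute(doc, "service", "assembly"));
        System.out.println("exeFileName = " + attribute(doc, "host", "exeFileName"));
        System.out.println("Java service? " + looksLikeJavaService(doc));
    }
}
```

Note that the parser turns the &quot; entities back into plain double quotes, so the host command comes out with the JAR path quoted, which is what keeps a path containing spaces in one piece.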

The second part is to create the actual Java service. It's not that complicated either, as you can see from CcpEchoSvc under the "sample/" directory. The listing below is the JAXB-annotated Echo request type from the sample.

    package org.tempuri;

    import javax.xml.bind.JAXBElement;
    import javax.xml.bind.annotation.XmlAccessType;
    import javax.xml.bind.annotation.XmlAccessorType;
    import javax.xml.bind.annotation.XmlElementRef;
    import javax.xml.bind.annotation.XmlRootElement;
    import javax.xml.bind.annotation.XmlType;


    /**
     * <p>Java class for anonymous complex type.
     *
     * <p>The following schema fragment specifies the expected content contained within this class.
     *
     * <pre>
     * &lt;complexType>
     *   &lt;complexContent>
     *     &lt;restriction base="{http://www.w3.org/2001/XMLSchema}anyType">
     *       &lt;sequence>
     *         &lt;element name="input" type="{http://www.w3.org/2001/XMLSchema}string" minOccurs="0"/>
     *       &lt;/sequence>
     *     &lt;/restriction>
     *   &lt;/complexContent>
     * &lt;/complexType>
     * </pre>
     */
    @XmlAccessorType(XmlAccessType.FIELD)
    @XmlType(name = "", propOrder = {
        "input"
    })
    @XmlRootElement(name = "Echo")
    public class Echo {

        @XmlElementRef(name = "input", namespace = "http://tempuri.org/", type = JAXBElement.class, required = false)
        protected JAXBElement<String> input;

        /**
         * Gets the value of the input property.
         *
         * @return
         *     possible object is
         *     {@link JAXBElement }{@code <}{@link String }{@code >}
         */
        public JAXBElement<String> getInput() {
            return input;
        }

        /**
         * Sets the value of the input property.
         *
         * @param value
         *     allowed object is
         *     {@link JAXBElement }{@code <}{@link String }{@code >}
         */
        public void setInput(JAXBElement<String> value) {
            this.input = value;
        }

    }

 

Looks Simple Enough! What Can I Expect In The Future?

This is not the end of the road for the library. There are still plenty of things we are planning. Besides adding support for new functionality in future releases of Windows HPC SOA, we are also looking at supporting the new RESTful API and integrating with the Windows Azure HPC Scheduler (WAHS). In fact, there is an open source project today (https://github.com/MicrosoftHPC/REST-Client-Sample) that implements a Java library which talks to the HPC cluster over REST. That project also includes a C++ implementation, and it is developed by the Microsoft HPC team as well. However, it is intended as a sample only, so it is provided as-is and should not be regarded as production-level code. We'll keep working on RESTful API support for Java and will probably release it with the Java interop library, depending on customer requirements. When that library is released, it will also be able to communicate with WAHS, so the Java client will get a taste of the cloud.

Here is Java, What About Other Languages?

For now, the library is based on Java because that is what our customers have asked for the most. However, we understand that a large portion of our customers still work with unmanaged code, so a C/C++-based library is also in the pipeline.

 

So far, I've covered the basic features, performance and roadmap of the Java interop library. If you want to try it out, you know where to get it. And there is more to explore in the package. If you still have questions, the Windows HPC Server Developers forum is always a good place to post them.