Thursday, February 19, 2009

Web-services. Edge case performance issue.

Think twice before you choose web-services as the only way of communication. People say they are slow, but in general it's not true.

Web services are fast enough and scale pretty well if you send a small, simple set of parameters and expect a small result back. By small I mean, say, 10-50 KB of data. Actually, even if you send back more than 50 KB it will work fine, except in the case that I'm going to discuss here.

Task:
We need to transfer a huge (say 1 megabyte) XML document from a .Net client to a Java server. Web services look very attractive from the performance point of view, and from the scalability one as well. Why did we choose XML as the data structure? It is quite native for web services and provides a well-structured, human-readable data format.

1st and simplest solution:
Spring-WS. Of course this one. Really simple (especially if you have Spring based server side application). Yep. 20 minutes and you have cool web-service. On .Net side in Visual Studio we create proxy by means of corresponding wizard and "voila!" it works.

Hmm, but not everything is so good. For some reason an empty web-service takes 10-20 seconds to receive the embedded XML that we sent from the client.

Explanation:
Right. You did face this performance issue. Let me explain what happened.

Web services use SOAP to transfer data from client to server and back. SOAP is XML with a certain structure: it has a so-called envelope, body, header and so on. When you send XML from the client to the server, the web-services implementation has to process the embedded data and escape all tags in the SOAP message body with &lt; and &gt;, as if it were normal text.

So if you sent
<myTag>value</myTag>
from client, on the server side you should get
&lt;myTag&gt;value&lt;/myTag&gt;
instead. Not a big deal actually, but keep in mind that we have 1 MB of XML here.
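
To make the escaping step concrete, here is a minimal sketch in plain Java. It is not what Spring-WS or the .Net proxy actually runs internally; the class name SoapEscapeDemo and the escape method are made up for illustration. It only shows how every tag in the payload turns into entity references that the receiving parser later has to resolve one by one.

```java
final class SoapEscapeDemo {
    /** Escapes the characters that matter when XML is embedded as plain text. */
    static String escape(String xml) {
        StringBuilder out = new StringBuilder(xml.length() * 2);
        for (int i = 0; i < xml.length(); i++) {
            char c = xml.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String payload = "<myTag>value</myTag>";
        String escaped = escape(payload);
        // Every angle bracket becomes a four-character entity reference.
        System.out.println(escaped);
        System.out.println(payload.length() + " -> " + escaped.length() + " chars");
    }
}
```

With a 1 MB document the number of such references runs into the tens of thousands, which is where the parsing cost discussed next comes from.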

By default Java uses Xalan for transformations and Xerces for XML parsing. When it receives a SOAP message it tries to parse the ENTIRE message body, taking into account all those "tags". But you can say "They were escaped!". Yes, they were. Do you know what the & character means in XML? Yep, it starts a reference, and Xerces tries to handle every one of them as well. That's why it takes dozens of seconds simply to receive this text in your web-service.

I personally did not find any way to configure this parser, because it all happens somewhere inside Spring+Java and I could not find a Java system property that would speed it up.

2nd a bit more complicated solution:
Well, if it's just a parsing issue, we need to avoid such a SOAP body and send the XML as an attachment.
Works great in the case of Java-to-Java communication, but what killed me is that .Net does not have a WS-Attachments implementation in its Web Services Enhancements (WSE) v3!!! Yep, they say use MTOM instead.

Not sure if it's possible in Visual Basic but anyway even if it's possible we cannot change communication protocol on-the-fly. And this change requires some refactoring, planning and so on.

That's all folks. Sad but unfortunately this performance bottleneck was not solved in time. We postponed it. Have a good performance guys

Tuesday, February 17, 2009

Fighting with Terracotta

Recently I wasted some time fighting with the Terracotta server and I'd like to share my thoughts about it.

First of all, nobody has solved the issue of clustering database connections, or any other network connections either. But to my understanding this is quite important and plausible. This is the only issue that does not allow clustering to come true.

Now let's get back to Terracotta. The task was to cluster a pretty big application tightly integrated with Spring 2.5 (most of the database, web-services and executor functionality was written using Spring's WS and JDBC calls). The application runs asynchronous tasks with Quartz. To sum it up, it looks more or less good from the software design point of view.

The initial idea was to provide server side failover and thus:
1) Somehow share asynchronous tasks and guarantee re-execution in case one server fails.

2) Share managed beans exposed outside and their states


3) Since server has complicated business logic that requires some synchronization provide distributed locks (ideally they need to be moved on the database level indeed)


Terracotta is a really great product to deal with. I got excited and confused at the same time because:

1) Terracotta provides integration with Quartz and allows sharing tasks with re-execution, BUT no one tested it with Quartz+Spring (as you probably know, Spring gives some good wrappers for Quartz to simplify and unify the code). Thus after wasting 3-4 hours I had to use pure Quartz without Spring.

2) Shared locks, like any other shared (distributed) objects, work perfectly and out of the box, BUT as soon as I started distributing more or less complicated data with some inheritance and fields like Loggers in parent classes, I found myself in trouble. I faced the fact that Terracotta cannot exclude a field from distribution if it is declared in a parent class (superclass). I had a lot of Spring stuff whose code I was not able to change, so I had to find a lot of workarounds for this.

3) Spring. Yep. Wasted days to understand that it's just impossible to distribute most of the beans (especially those ones that had some JDBC stuff injected). All database connections need to be established when you need them. The same should be done in asynch. tasks.


To sum up this post:

1) Terracotta works fine but be ready to refactor your code and keep in mind all "product features" :)
2) Forget about database stuff injection into distributed beans or asynchronous tasks.
3) Be ready to feel "magic inside and consequences outside". Terracotta is a "black box" with some magic.

Thursday, January 29, 2009

Oracle tips

A couple of Oracle tips/tricks that amused me today.

1) I ran into the issue of creating a very specific unique constraint on a couple of fields in a certain table. The uniqueness should apply only when one of the fields has the value 1.

So let's assume that we have 2 fields FK, VALUE (0|1). This pair must be unique in case of value 1 (FK, 1) and should not if value is 0 (FK, 0).

There is no way to do it with a "check constraint" or a "unique constraint", nor can you write a trigger for it, because a trigger is not allowed to access the same table. Digging a bit, I found a cool index that helped me with this issue. Oracle calls it a function-based index, and in my particular case it looks like the following and works like a charm:

create unique index uix_single on table (
case when value = 1 then fk else null end,
case when value = 1 then value else null end
);
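
Why this works: as far as I know, Oracle does not store a B-tree index entry for a row when every indexed expression evaluates to NULL, so rows with value 0 never take part in the uniqueness check. The rule can be modelled in a few lines of Java. This is only a sketch of the semantics, not of Oracle internals, and the class ConditionalUniqueIndex is a made-up name.

```java
import java.util.HashSet;
import java.util.Set;

final class ConditionalUniqueIndex {
    // Keys that made it into the "index"; rows with value != 1 never get here.
    private final Set<Long> indexedFks = new HashSet<>();

    /** Returns true if the row is accepted, false if it violates uniqueness. */
    boolean insert(long fk, int value) {
        if (value != 1) {
            return true;            // both CASE expressions yield NULL: row is not indexed
        }
        return indexedFks.add(fk);  // a second (fk, 1) for the same fk is rejected
    }

    public static void main(String[] args) {
        ConditionalUniqueIndex index = new ConditionalUniqueIndex();
        System.out.println(index.insert(7, 0));  // accepted
        System.out.println(index.insert(7, 0));  // accepted again, zeros may repeat
        System.out.println(index.insert(7, 1));  // accepted, first (7, 1)
        System.out.println(index.insert(7, 1));  // rejected, duplicate (7, 1)
    }
}
```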

2) Another interesting task that I have been solving is keeping graphs in Oracle and loading them as fast as possible. There are a number of graph kinds, but in my case I used DAGs (directed acyclic graphs), and what actually surprised me is that Oracle supports them. It can even handle cyclic graphs and has special keywords for that: connect by and start with. But the most interesting things happen when you need not just a single edge that connects two nodes, but rather a set of edges and so-called connection points (as separate entities). So every node could have several connection points, and edges combine all this stuff together.

Connect by does not work in this case because it expects the child's reference to match the parent node id, but here we have connection points, and their ids can be quite different (say you have a set of input connection points and another set of output ones, and only an input can be connected with an output).

The solution is elegant and, I would say, trivial. Edges should still connect nodes, not connection points, but the connection point ids need to be included in the edge. This allows using connect by and at the same time keeps the relations between connection points (despite the denormalization).
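
The same data model can be sketched in Java to show why traversal stays simple. Everything here is hypothetical (the PortGraph class, its field names and the ids): the edge carries the connection point ids as extra payload, while the walk itself follows node ids only, which is what a hierarchical query needs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class PortGraph {
    /** An edge joins two nodes and records which connection points it uses. */
    static final class Edge {
        final long parentNode;
        final long childNode;
        final long outputPoint;  // connection point on the parent side
        final long inputPoint;   // connection point on the child side

        Edge(long parentNode, long childNode, long outputPoint, long inputPoint) {
            this.parentNode = parentNode;
            this.childNode = childNode;
            this.outputPoint = outputPoint;
            this.inputPoint = inputPoint;
        }
    }

    private final List<Edge> edges = new ArrayList<>();

    void connect(long parentNode, long outputPoint, long childNode, long inputPoint) {
        edges.add(new Edge(parentNode, childNode, outputPoint, inputPoint));
    }

    /** Walks the graph by node ids only; connection points ride along on the edges. */
    Set<Long> descendants(long start) {
        Set<Long> seen = new LinkedHashSet<>();
        Deque<Long> todo = new ArrayDeque<>();
        todo.push(start);
        while (!todo.isEmpty()) {
            long node = todo.pop();
            for (Edge e : edges) {
                // seen.add() returns false for already visited nodes, so cycles terminate
                if (e.parentNode == node && seen.add(e.childNode)) {
                    todo.push(e.childNode);
                }
            }
        }
        return seen;
    }
}
```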

Thursday, December 18, 2008

Simple way to get your Java application internal state

I think everyone who wrote Java applications had some thoughts like this "It would be great to periodically check internal application states from the command line or script to see if everything is OK".

Right, a command line interface is a simple and natural way to trace what is going on inside the application: it can provide access to, for example, internal application variables.

Here is my answer that probably can help, a kind of alternative to JMX ;) The main idea is to open a socket on a certain port and send commands from the command line to this port. The application catches those commands, parses them and writes back the result, which we can see on STDOUT.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.ServerSocket;
import java.net.Socket;

public class TcpTest {
    // List of supported commands
    enum Command {
        GET,
        SET,
        TEST
    }

    public static void main(String[] args) throws Exception {
        // Create server socket
        ServerSocket serverSocket = new ServerSocket(8000);

        while (true) {
            // Wait for connection
            Socket sc = serverSocket.accept();

            try {
                // Read one line. readLine() returns null if the client sent nothing;
                // the resulting exception is reported below as an unsupported command.
                BufferedReader sr = new BufferedReader(
                        new InputStreamReader(sc.getInputStream()));
                String command = sr.readLine().toUpperCase();

                // Parse command
                Command cmd = Command.valueOf(command);

                // Write answer
                switch (cmd) {
                    case GET: sc.getOutputStream().write(
                            new byte[]{'A','N','S','W','E','R','\n'});
                        break;
                    case SET: sc.getOutputStream().write(
                            new byte[]{'S','E','T','\n'});
                        break;
                    case TEST: sc.getOutputStream().write(
                            new byte[]{'P','A','S','S','E','D','\n'});
                        break;
                    default: sc.getOutputStream().write(
                            new byte[]{'U','N','S','U','P','P','O','R','T',
                                    'E','D','\n'});
                }
            }
            catch (Exception e) {
                // Unknown command name or empty request
                sc.getOutputStream().write(
                        new byte[]{'U','N','S','U','P','P','O','R','T',
                                'E','D','\n'});
            }
            finally {
                // Always release the connection, one command per connection
                sc.close();
            }
        }
    }
}
This code is not exhaustive, but it explains the approach. And of course once you start it you should be able to communicate with this program by sending commands to port 8000.

On Linux it's echo "GET"|netcat localhost 8000 executed from command line. Also you can use it from shell script and get back command result from standard output, parse it and handle.
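
One possible refinement, sketched below under my own naming (CommandHandler and handle are not part of the code above): keep the command parsing in a pure function that maps a request line to a response line. The socket loop then only reads a line, calls the function and writes the result, and the protocol can be checked without opening a port.

```java
import java.util.Locale;

final class CommandHandler {
    enum Command { GET, SET, TEST }

    /** Maps one request line to one response line; never throws. */
    static String handle(String line) {
        if (line == null) {
            return "UNSUPPORTED";       // client closed the connection without sending
        }
        try {
            switch (Command.valueOf(line.trim().toUpperCase(Locale.ROOT))) {
                case GET:  return "ANSWER";
                case SET:  return "SET";
                case TEST: return "PASSED";
                default:   return "UNSUPPORTED";
            }
        } catch (IllegalArgumentException e) {
            return "UNSUPPORTED";       // unknown command name
        }
    }

    public static void main(String[] args) {
        System.out.println(handle("get"));
        System.out.println(handle("reboot"));
    }
}
```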

Monday, December 15, 2008

javax.sql.DataSource? Forget about it.

Well, how much time do you usually waste fighting with some stupid inconsistency in implementation? I think quite a lot if you are in charge of some investigations and integrations. So do I and here is a story to cheer you up.

We all know about Tomcat and their way to configure data source for the application. Just to refresh your memory here is a snapshot of
context.xml

<Resource name="jdbc/mydb"
auth="Container"
type="javax.sql.DataSource"
driverClassName="oracle.jdbc.driver.OracleDriver"
url="bla-bla-bla"
username="bla"
password="bla-bla"/>

Basically this means that we are going to establish connection with Oracle database. But Tomcat in this particular case will use DBCP and give us a wrapper to the Oracle data source and thus some useful Oracle features won't work.

Simple and native way to get all features is to ask Oracle to provide the connection by adding "factory" property to the Resource tag. It should look like this:


<Resource name="jdbc/mydb"
auth="Container"
type="javax.sql.DataSource"
factory="oracle.jdbc.pool.OracleDataSourceFactory"
driverClassName="oracle.jdbc.driver.OracleDriver"
url="bla-bla-bla"
username="bla"
password="bla-bla"/>

Looks good. Simple. Yeah. Wait man it does not work! I got strange message in Tomcat console:
SEVERE: Null component Catalina:type=DataSource, path=/bla, host=localhost, class=javax.sql.DataSource, name="jdbc/mydb.
Guys I need help! Please!

So that's what you can find if you Google this issue. Also there is a recommendation to replace "javax.sql.DataSource" with "oracle.jdbc.pool.OracleDataSource" like below.


<Resource name="jdbc/mydb"
auth="Container"
type="oracle.jdbc.pool.OracleDataSource"
factory="oracle.jdbc.pool.OracleDataSourceFactory"
driverClassName="oracle.jdbc.driver.OracleDriver"
url="bla-bla-bla"
username="bla"
password="bla-bla"/>

Right. Cool stuff. It works like a charm, but you don't see this data source in the Tomcat console. Tomcat does not recognize it as a DataSource and does not publish it as a DataSource MBean, so you cannot monitor it and its connections.

What the f.... is wrong with it? you could ask, and here is the answer. The ObjectFactory interface has only one method


public Object getObjectInstance(Object obj, Name name, Context nameCtx, Hashtable environment)

and in our particular case the very first parameter passed into it is a ... what do you think ... right reference to the type. Guys from Oracle are smart enough to handle it and process some cases like
  • oracle.jdbc.pool.OracleDataSource
  • oracle.jdbc.xa.client.OracleXADataSource
  • oracle.jdbc.pool.OracleConnectionPoolDataSource
  • oracle.jdbc.pool.OracleOCIConnectionPool
but they don't care about javax.sql.DataSource. They just return NULL in this case instead of default implementation. Guys, have you ever read about Null Object pattern? What the hell are you doing there.
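
The behaviour described above can be reproduced with a toy factory. This is not Oracle's source code: PickyDataSourceFactory is a made-up class, the list of supported types is shortened, and a string stands in for the real data source. It only illustrates how a factory that dispatches on the class name of the JNDI Reference ends up returning null for javax.sql.DataSource.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Hashtable;
import java.util.Set;
import javax.naming.Context;
import javax.naming.Name;
import javax.naming.Reference;
import javax.naming.spi.ObjectFactory;

final class PickyDataSourceFactory implements ObjectFactory {
    // Hypothetical short list; the real factory knows more types.
    private static final Set<String> SUPPORTED = new HashSet<>(Arrays.asList(
            "oracle.jdbc.pool.OracleDataSource",
            "oracle.jdbc.xa.client.OracleXADataSource"));

    @Override
    public Object getObjectInstance(Object obj, Name name, Context nameCtx,
                                    Hashtable<?, ?> environment) {
        if (!(obj instanceof Reference)) {
            return null;
        }
        // The reference carries the value of the "type" attribute from context.xml.
        String type = ((Reference) obj).getClassName();
        if (!SUPPORTED.contains(type)) {
            return null;   // "javax.sql.DataSource" ends up here
        }
        return "data source of type " + type;   // stand-in for a real data source
    }
}
```

Returning null is legal under the ObjectFactory contract (it means "I cannot create this object"), which is why the container reports a null component rather than an exception.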

That's it basically and there is no way to get rid of it. Wasting time is our job.

Thursday, December 11, 2008

Java monitoring tools/suites

Recently I spent some time investigating what sort of monitoring tools can help in solving production issues. Here are some thoughts about it.


Issues I usually run into in production and that I'd like to solve

  • Memory footprint/leaks (caches, collections).
  • Threads/pools/executors. They should have limited, possibly configured number of threads.
  • Database connections/throughput.
  • Number of different requests coming into the system.
  • The most CPU consuming tasks.
  • Application availability (whether or not it is started and reachable)
So these are the issues that happen in production and which are usually hard to solve.


Monitoring tools

Here is a list of frameworks/tools/suites I found and looked at so far

  • JConsole
  • Visual VM
  • Lambda probe
  • GlassBox
  • JAMon
  • Spring AMS/Hyperic HQ
  • JXInsight

Let me describe them and point some key features in context of issues listed above.


JConsole


This one provides access to a running VM through different kinds of MBeans/MXBeans. The following information can be extracted from the running VM:

System information
  • Threads (Peaks, current number and so on),
  • Memory (heap/non-heap)
  • Loaded classes
  • OS state (operating system vendor, system properties and paths and so on)
  • Garbage collector info.

Mbeans:
  • Custom application Mbeans
  • VM Mbeans (runtime, threading, memory pools, garbage collector)

In most cases this information is quite enough to identify the problem in general, i.e. whether the issue is a memory leak or maybe an over-threaded application. Say, this is a basis that one can get fast and for free, but it won't give you any details about application bottlenecks.

JConsole supports pluggable modules that are simple to write and integrate into it.

Looking ahead, we could say that this tool currently gives enough data, and together with some applications provided by Sun it could be the best one for monitoring and finding all kinds of bottlenecks.
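
The custom application MBeans mentioned above take very little code. Here is a minimal sketch using only the standard javax.management API; the names QueueMonitoring, QueueStats and the ObjectName string are invented for the example. Once registered with the platform MBean server, the Pending attribute should show up in the JConsole MBeans tab.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.JMException;
import javax.management.ObjectName;

final class QueueMonitoring {
    /** Standard MBean contract: the interface name is the class name plus "MBean". */
    public interface QueueStatsMBean {
        int getPending();
    }

    public static final class QueueStats implements QueueStatsMBean {
        private final AtomicInteger pending = new AtomicInteger();

        void taskQueued() { pending.incrementAndGet(); }

        @Override
        public int getPending() { return pending.get(); }
    }

    /** Registers the bean with the platform MBean server that JConsole reads. */
    static ObjectName register(QueueStats stats, String name) {
        try {
            ObjectName objectName = new ObjectName(name);
            ManagementFactory.getPlatformMBeanServer().registerMBean(stats, objectName);
            return objectName;
        } catch (JMException e) {
            throw new IllegalStateException("Cannot register " + name, e);
        }
    }

    /** Reads an attribute back the same way a JMX client would. */
    static Object read(ObjectName name, String attribute) {
        try {
            return ManagementFactory.getPlatformMBeanServer().getAttribute(name, attribute);
        } catch (JMException e) {
            throw new IllegalStateException("Cannot read " + attribute, e);
        }
    }
}
```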


Visual VM


A heavyweight framework based on the NetBeans API, which it therefore requires in order to start. At the same time it is very flexible in how it represents data. Pluggable modules allow depicting any monitored data the way you like most (charts/histograms/textual view). But this approach gives you yet another representation layer for the same information that you can get with JConsole.

To my personal understanding, VisualVM won't give much in comparison with JConsole, except maybe some CPU/memory profiling features integrated into it, and it won't help you find database bottlenecks.


JAMon

JAMon is not a tool but rather a monitoring framework that wraps your code with proxy objects and logs execution time. Basically it can wrap almost all calls and objects, even the database ones and thus provide comprehensive view on what happens inside the application. Also it has very simple user interface based on some servlets wrapped into the WAR file.

One has to either change the code and wrap every monitored place with JAMon classes, or use aspect pointcuts to instrument the code at runtime. Both ways have pros and cons. But it would be great not to change the code (avoiding any dependency on JAMon) and at the same time not to lose performance (and memory) by instrumenting the code at runtime.

This framework can be used instead of a profiler even in production, but only in very exceptional cases when we know exactly where the issue happens.

JAMon coding example:

import com.jamonapi.*;
...
Monitor mon=MonitorFactory.start("myMonitor");
...Code Being Timed...
mon.stop();
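
To show what "wrapping with proxy objects" means without pulling in JAMon itself, here is a sketch built on the JDK's java.lang.reflect.Proxy. It is not JAMon's API; TimingProxy, HITS and TOTAL_NANOS are names I made up. The idea is the same: the caller keeps using the interface, and the proxy records a hit and the elapsed time around every call.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

final class TimingProxy {
    /** Number of calls per method name. */
    static final Map<String, LongAdder> HITS = new ConcurrentHashMap<>();
    /** Total nanoseconds spent per method name. */
    static final Map<String, LongAdder> TOTAL_NANOS = new ConcurrentHashMap<>();

    /** Returns an object implementing iface that times every call to target. */
    @SuppressWarnings("unchecked")
    static <T> T monitor(Class<T> iface, T target) {
        return (T) Proxy.newProxyInstance(
                target.getClass().getClassLoader(),
                new Class<?>[] {iface},
                (proxy, method, args) -> {
                    long start = System.nanoTime();
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause();   // rethrow what the real method threw
                    } finally {
                        String key = method.getName();
                        HITS.computeIfAbsent(key, k -> new LongAdder()).increment();
                        TOTAL_NANOS.computeIfAbsent(key, k -> new LongAdder())
                                .add(System.nanoTime() - start);
                    }
                });
    }

    public static void main(String[] args) {
        Runnable job = monitor(Runnable.class, () -> System.out.println("working"));
        job.run();
        System.out.println("run() hits: " + HITS.get("run").sum());
    }
}
```

The trade-off is the one described above: no change to the monitored class, but a reflective call and some bookkeeping on every invocation.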


Lambda probe.

This one is much better suited for (and even oriented to) application servers. Lambda Probe can be easily integrated into a web container or application server and shows some additional container-specific data like database connections, running servlets, thread pools and even a particular thread in one of them.
So it looks like it could help us with database issues, but in practice it just gives us the number of active connections that the application has. It does not show MBeans and should be used together with JConsole.


GlassBox

A simple and probably useful monitoring application, but the lack of documentation does not allow digging into it.

It runs as a wrapper around the web-container/appServer and as I see instruments everything using AspectJ in runtime. The main idea behind it is to get access to all possible Java calls and then filter out those ones that are not really interesting from performance point of view. But it makes execution slower (mostly at startup but at runtime as well, when application gets access to the particular class the first time). Also it consumes a lot of memory to instrument all classes.

The tool developers make some performance assumptions based on some internal criteria, and there is no way to configure them. This framework has only a few maintainers (3-5), and the last commit to their svn was about a month ago. So I wouldn't recommend using it.


Spring AMS

The most powerful suite (application management suite) based on Hyperic HQ - world-class leader monitoring framework.

Features:

  • Joins together all application information under the same roof.
  • Different applications can be grouped to provide useful views.
  • Physical box availability with a lot of operating system specific parameters.
  • Depicts comprehensive Spring based details (contexts, executor services, db connections – everything that can be declared in Spring configuration).
  • Has integration with almost all application servers (Tomcat, WebLogic, WebSphere, GlassFish, Spring DM)
  • Configurable alerts could inform in time about failures, lack of memory or overloaded CPUs

At the same time it is:
  • Heavy-weighted (AMS requires server to be installed along with Progress database, agents set up on every monitored box)
  • Proprietary
  • Complicated, details overloaded web-based console.

Besides the basics provided by JConsole, Spring AMS gives some additional useful details about a Spring-based application, like database commits/rollbacks (overall, average). The better your integration with Spring, the more information can be shown on the console.

One can even get all application Mbeans by changing MBeans domain to the “spring.application” (this is a requirement of Spring and the way they define what should be shown in console) and thus adding application specific metrics.

Another requirement is to use compile-time instrumented Spring files. Spring lets you download all the libraries from their site in an instrumented version. They also provide instrumented logging, Hibernate, collections and Ehcache. So it is oriented to Spring applications, but IMHO we have a lot of them.


JXInsight

This product is mostly oriented to the development phase but can be used at production as well.

Pros:

  • Integration with a lot of frameworks and products
  • Support for distributed environment
  • Probes to meter and traces to get paths (stack traces). Common ways to detect CPU consumption and hotspots, but very feature-rich ones.
  • True JDBC monitoring on transaction level with long-running statement detection.
  • Allows off-line analysis by taking snapshots at runtime.
  • High resolution clock.

Cons:
  • Proprietary.
  • JDBC monitoring is not recommended for the production.
  • Adds overhead by own Java agent and instrumentation
  • Does not support alerts and thus need to be monitored all the time.

So it won't give database activity monitoring at production. The only useful things are probes that meter resources consumption across different customizable groups (read packages/classes).


Profiling


Memory footprints/leaks

Why is memory so important? Simply because system resources are limited, but even if you have enough hardware resources, the Java GC can take time and thus slow down your application.
Let's assume that we know the issue (whatever tool we used, it gave us enough information to make the decision) and it's a memory footprint/leak or application over-threading. The next step is to determine where exactly this problem occurs: what point in the code, or at least what class, causes it.
Starting from Java 5, Sun provides a set of very convenient tools to identify it. First of all there is a memory dumper. A tool called “jmap” allows getting a memory dump by process id (the only issue is that it fails up to JDK 5.0._14). It works very fast, especially if it traces “live” objects only, and can produce a 4G heap dump in approximately a couple of minutes. This dump is a memory snapshot with all objects and the references between them, and thus can be analyzed later to find the outstanding number of threads or other objects.
Another case is an out of memory error (OOME). In this case it's recommended to start the production application with the Java parameter -XX:+HeapDumpOnOutOfMemoryError, which takes a memory snapshot right after the error happens, so we still have a dump file to analyze and find leaks.
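
Before it comes to a dump, the application can watch its own heap through the standard java.lang.management API, the same data JConsole shows on its memory tab. A small sketch follows; the class name HeapCheck and the idea of reporting a single ratio are my own, so treat it as one possible check rather than a recommended threshold mechanism.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class HeapCheck {
    /** Fraction of the heap in use, relative to max if defined, else to committed. */
    static double heapUsedRatio() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() returns -1 when the maximum is undefined
        long limit = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        return (double) heap.getUsed() / limit;
    }

    public static void main(String[] args) {
        System.out.printf("heap used: %.1f%%%n", heapUsedRatio() * 100);
    }
}
```

A ratio that stays high after full collections is a hint that it is time to take a dump with jmap and look for the leak.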

CPU overloading


This is the most complicated issue because it usually lasts only a few seconds and is hard to catch.
But let's assume that the application consumes 100% of CPU and we see it in our monitoring tool. In this case we can go through the list of active threads and find the ones that are responsible for that. The thread name should give us the point in the code that caused this problem.
In most cases this means that the application uses all hardware resources and needs to be either optimized or scaled. Talking about scalability, we should remember its two types. One can scale up an application by adding more power to the same box, or scale out by moving some calculations outside the original box and thus making grids (of both types, computational and data ones).
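
Going through the list of active threads can also be done from code with the standard ThreadMXBean. The sketch below (the BusyThreads and Usage names are made up) orders live threads by the CPU time they have consumed so far; it assumes the JVM supports per-thread CPU time measurement and returns an empty list otherwise.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class BusyThreads {
    static final class Usage {
        final String threadName;
        final long cpuNanos;

        Usage(String threadName, long cpuNanos) {
            this.threadName = threadName;
            this.cpuNanos = cpuNanos;
        }
    }

    /** Live threads ordered by consumed CPU time, busiest first. */
    static List<Usage> byCpuTime() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<Usage> result = new ArrayList<>();
        if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            return result;   // this JVM cannot report per-thread CPU time
        }
        for (long id : threads.getAllThreadIds()) {
            long cpu = threads.getThreadCpuTime(id);      // -1 if the thread is gone
            ThreadInfo info = threads.getThreadInfo(id);  // null if the thread is gone
            if (cpu >= 0 && info != null) {
                result.add(new Usage(info.getThreadName(), cpu));
            }
        }
        result.sort(Comparator.comparingLong((Usage u) -> u.cpuNanos).reversed());
        return result;
    }

    public static void main(String[] args) {
        for (Usage u : byCpuTime()) {
            System.out.println(u.cpuNanos / 1_000_000 + " ms  " + u.threadName);
        }
    }
}
```

This only pays off if threads carry meaningful names, which is one more reason to name pool threads after the work they do.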


Database monitoring

It happens very often that an application does almost nothing and has a small memory footprint, and at the same time works very slowly. The possible bottleneck could be a database that consumes a lot of resources on a remote box and cannot handle all the application's requests.
Usually databases provide tools for their own monitoring and optimization, but on the application side it could be worthwhile to trace database activity as well.
One of the ways is to use J2EE data sources registered in the application server, which can show all SQL statements, the slowest ones, and all connections with their activity.


IO stat

The last major performance issue that should be taken into consideration is input-output throughput. This means both network and hardware activity, which can be easily monitored with the Linux “iostat” and “netstat” utilities.
The solution in this case is to fix it on the hardware level or to change the application code to reduce the amount of data sent/saved, if that's possible.


Conclusion

So as I see it, all bottlenecks can be found with JConsole and some useful tools like jmap, SAP MAT, operating system tools (iostat, netstat) and database-specific applications. The only thing left to solve is application availability. But this could be resolved with Apache/shell scripts.

Sunday, December 7, 2008

JPPF and GridGain. Two Java computational grids.

It took me a month to get back to my blog, and today I'd like to talk about JPPF (Java Parallel Processing Framework). As usual I will compare it with GridGain, just because GridGain to my personal understanding is the best one in some aspects.
Over the last couple of years JPPF has grown very intensively and gained some new features that made it flexible and robust.

Differences in architecture.

In general we can say that GridGain has only one "layer". Every node ("master" and "worker") can execute tasks either from another node or from itself. Whenever you start GridGain, whether from your code, as a standalone application or as a service integrated into an application server, you should know that you start a new node, and to avoid any calculation running on this node you have to make some configuration changes. Normally it should be a node attribute that tells the topology SPI not to include this node in the calculation.
On one hand this is very flexible because you don't need to change your code or start additional services to involve this node in the calculation, but on the other hand it muddles the concept for newcomers, who usually expect some clients and servers (just because of the common multi-tier approach).
Another consequence of this approach is a peer-to-peer architecture. All nodes are connected to each other. Obviously this limits the number of nodes (because of the network traffic). I know that GridGain jumped on this issue and is going to solve it quickly.


Unlike GridGain, JPPF divides the framework into client, driver and executor parts. The client layer provides an API and communication tools to use the framework to submit tasks for parallel execution. The service layer (driver) is responsible for the communication between the clients and the nodes, along with the management of the execution queue, the load-balancing and recovery features, and the dynamic loading of both framework and application classes onto the appropriate nodes. And the execution layer is the node: it executes individual tasks, returns the execution results, and dynamically requests, from the JPPF driver, the code it needs to execute the client tasks.
This approach simplifies the initial understanding and in some cases makes "master-worker" implementation less complicated.
Another benefit of such division is that it overcomes the limitation of the maximum number of nodes. Only servers are connected to each other, not nodes and this segmentation allows having a lot of nodes with a few connected servers. On the other hand if you need simple 5-10 nodes grid, server could be potentially a single-point-of-failure.


Features and features.

Both products are very featured and cover a lot of edge cases. Let me mention some of them:
  1. On demand class loading. To my understanding (just because I suggested this feature and was responsible for the implementation ;) ) GridGain was the first one who supported transparent class loading between nodes. JPPF supports it as well.
  2. Load balancing. Both frameworks support it, but GridGain gives you about 5-7 strategies out-of-the-box.
  3. J2EE integrations. Both products can be integrated into the application servers, but the integration ways are different. GridGain starts either as a service (JBoss, WebLogic, WebSphere) and uses application server resources (executor service, logs, etc) or as a servlet (Tomcat, GlassFish). JPPF registers in JNDI tree and provides its functionality as a standard J2EE component (JCA). As far as I know GridGain has a ticket to support JNDI lookup as well.
  4. Both frameworks are task/job based.
  5. Both frameworks have annotation based execution (@Gridify in GridGain and @JPPFRunnable in JPPF).
  6. JPPF has DataProvider to exchange data between tasks/nodes. GridGain gives distributed TaskSession for that.

Despite some common features, they have a lot of differences:

  1. Communication. JPPF is a TCP/Multicast based approach. GridGain supports various protocols (TCP, JGroups, Mule, JMS, Mail, JBoss and so on).
  2. Extension. JPPF is much more closed than GridGain. The latter gives you SPI interfaces, so one can extend or write new functionality and integrate it into GridGain.
  3. Monitoring. While GridGain is still writing their cool monitoring console, JPPF already provided one and as far as I see it's pretty good.
  4. Node information. I did not find any node attributes in JPPF which is not very good because it's very useful when you send your tasks/jobs into the grid. Very often you need to control, which node should execute this particular task. Simple example is executing task on Linux nodes just because it loads some native libraries or uses node specific resources. GridGain supports node attributes and even custom (user defined) ones.
  5. Tasks rescheduling in case of overload. JPPF keeps tasks on server side (note that execution layer is not the same as server one) and all nodes execute tasks as soon as they have free resources for that. This is great but what if one server got overloaded and another one has nothing to do? Nothing in case of JPPF. GridGain will redistribute work (taking into account user wishes) if job stealing is on.

Coding.

Coding approach is more or less similar with some differences.

In JPPF one should start client and submit set of tasks into the grid like this:

JPPF Code

JPPFClient client = new JPPFClient();

List<JPPFTask> tasks = new ArrayList<JPPFTask>();

tasks.add(new HelloTask());

try {
// execute the tasks
List<JPPFTask> results = client.submit(tasks, null);
} catch (Exception e) {
e.printStackTrace();
}
The JPPFJob interface can be used as a task container and provides some additional functionality. But anyway it's quite far away from the Map/Reduce way.

GridGain requires a kind of Map/Reduce implementation and forces you to understand this concept. Your task should implement the method "map", where you split it into small jobs, and "reduce", to collect the results from all jobs and reduce them into one task result. But from the execution standpoint they look alike.

GridGain Code

Grid grid = GridFactory.start();

try {
// Execute Hello World task.
GridTaskFuture<Integer> future = grid.execute(GridHelloWorldTask.class,
task_param);
}
finally {
GridFactory.stop(true);
}
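
For readers who have not met the map/reduce split before, here is the idea on a single machine, using only java.util.concurrent. This is neither GridGain's nor JPPF's API; LocalMapReduce and its methods are invented for the illustration, and a thread pool plays the role of the grid. The task (count the letters in a phrase) is split into one job per word, the jobs run in parallel, and reduce folds their results into one value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class LocalMapReduce {
    /** "map": split the task into one small job per word. */
    static List<Callable<Integer>> map(String phrase) {
        List<Callable<Integer>> jobs = new ArrayList<>();
        for (String word : phrase.trim().split("\\s+")) {
            jobs.add(word::length);
        }
        return jobs;
    }

    /** "reduce": fold the job results into a single task result. */
    static int reduce(List<Integer> results) {
        int total = 0;
        for (int r : results) {
            total += r;
        }
        return total;
    }

    /** Runs the jobs on a local pool, standing in for grid nodes. */
    static int execute(String phrase) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : pool.invokeAll(map(phrase))) {
                results.add(f.get());
            }
            return reduce(results);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(execute("Hello World"));
    }
}
```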

Conclusion.

Both frameworks are very friendly and reliable. They are simple and flexible. But in this particular case I would say that GridGain wins just because of simple common Map/Reduce approach and its SPIs that provide incredible flexibility.