[DJUG] Building an Open Source ESB and Ruby on Rails
Managing Chaos
Building an Open Source Enterprise Service Bus from Scratch
Bruce Tate got rained out in Texas, so Bruce Snyder and Jeff Genender are talking about building an ESB with open source software. Bruce and Jeff created this product in the last project they were on. I was lucky enough to work with them on it the first half of this year - but only after the ESB was created. The final product was very stable and the client loved it.
Bruce is a Geronimo committer and one of the founders of the Castor project. Jeff is also a Geronimo guy and is currently working on a JBoss Live book for SourceBeat. The problem that the ESB was trying to solve was fixing a horrendous data flow. A lot of the data flow was occurring between people's PCs, shared drives, FTP servers, HTTP Servers, web servers - and rarely were things automated. Bruce recalls one of his first days when he heard the trading guys yelling at each other to "close the spreadsheet".
The solution to the problem was building an Enterprise Service Bus (ESB) that performed the following:
- Centralized Management of Activities
- Powerful Scheduler
- Guaranteed Event and Activity Execution
- Durable Transactions
- Pluggable ESB Components for Activities
- Staging Database for Single Common Data Location
- Logging and Notification of Activities
- Proactive Response to Failed Activities
- J2EE Architecture - Provides for True 24/7 Uptime
The pluggable components were called transformers and were standalone JARs that lived on their own, but could be managed by the ESB. Notifications were key so the traders would be notified when something went wrong.
Architecture
Scheduler (Quartz) » Workflow (jBPM) » Persisted Guaranteed Messaging (JMS). JMS talked to Activities (a.k.a. transformers).
The Quartz Java Scheduler is an open source project from OpenSymphony. It's a persisted scheduling engine, so it'll live through app server restarts. It also has millisecond granularity.
For workflow, the Java Business Process Manager (jBPM) was used. It doesn't use BPEL, and was used to track multiple activities and make decisions based on an activity's completion or failure status. Other functionality included tracking the activity state (running, cancelled, completed) and sending/managing notifications. Workflow was very important because the previous system had no way of detecting where things failed in a process. With the new system, downstream dependencies were handled, the escalation path was based on success or failure - and automatic retry occurred on failures if the failure reason was a known and configured expectation.
For messaging, JMS was used - implemented with EJB and MDB. This provided guaranteed and persisted messages. Events were automatically recovered if the server failed.
Activities/Transformers were pluggable components (wrapped with EJBs for transactionality). The nice thing about using EJBs was it was easy to create JARs for each transformer, drop them into JBoss and they'd immediately become available.
The system was all managed with a management console, that was a webapp implemented in Struts, Spring and Hibernate. Most of the Spring and Hibernate classes and configuration was generated with Middlegen and XDoclet. The management console allowed you to kick off activities, monitor their progress, as well as manage users with Active Directory and single sign-on with NTLM and jCIFS.
Solution Facts
Since December 2004, over 200,000 jobs have been run through this system. Of those, 500K activities have been run, with < 10,000 failed activities (2% failure rate). Nearly all failures were due to external issues, such as unavailability of remote systems and databases, or source files not available.
Tools and APIs Used
When Jeff showed up, it was a Microsoft shop using ASP, Visual Studio and some PowerBuilder. They brought Jeff in to help them use and adopt open source. The first thing they did was install and begin to use CVS (previously source control was done on shared drives). They also used Maven to build everything and produce a project site - which the managers and C-types loved. One thing they mentioned is they often got questions from higher-ups like "How much does it cost?" They did have problems with Maven, but it was mainly due to the poor documentation. They found a lot of Maven solutions by cracking open plugins and looking at their Jelly files.
Development Lessons Learned
Configuring EJBs and MDBs as singletons helped solve some problems (a JBoss setting allows you to configure this). As for running Spring in a heavily-managed environment, they found that setting singleton="false" solved a lot of problems. The next problem they had was mixing Hibernate and JTA Transactions. No details, just that they had an interesting time and it took them a few days to get it working. The last problem they encountered was using Hibernate and/or Spring JDBC to manage hundreds of thousands of records. Since these O/R tools create objects for each record, OOM errors occurred with large resultsets.
Business Lessons Learned
All notification messages came from the ESB, leading many to believe the problem was the ESB - rather than the data sources it was talking to. By using open source, they saved the company $500K in licenses fees. The interesting part was the company had a 3rd-generation agreement with IBM, and owned $12 million worth of licenses for WebSphere and WSAD. The reason this group used open source was because there weren't enough licenses.
Bruce and Jeff's presentation was good, but they looked like a couple of goofballs standing up there in their t-shirts and shorts (standard Virtuas gear). I guess that's what happens when you get a 2-hour notice. The worst part about the presentation was the fact that the A/C doesn't work and it's about 85° in here. I told Geary he'd better keep in short or I'm outta here.
Ruby on Rails
David Geary
David got into Rails by reading an article called "Rolling with Rails". In the article, the author claimed that you can develop webapps in Rails at least 10 times faster than in Java. David responded to the article on his blog with an entry called the Ruby on Rails Koolaid. He experienced quite a butt-whooping from various folks, including Rails' founder - and realized afterwards that the claims might be valid. After working with Rails, David believes that Rails is probably 5-10 times faster.
Ruby
Potent mix of SmallTalk, Perl and Python. Everything is an object. No static type checking. Duck Typing (talks like a duck, walks like a duck, it probably is a duck). Testing usually solves the lack of static types. Blocks - like anonymous inner classes, but retain state. Mixins - a cheap way of doing multiple inheritance. Dynamic classes - can modify at runtime (add methods, renaming methods, etc.). Rails takes great advantage of the dynamic attributes as part of its framework.
David is now showing a ContactsController that has 5 methods for CRUDing a Contact object. 4 of the 5 methods are one-liners. It kinda reminds me of my Hibernate DAOs after integrating Spring.
Rails
Ruby-based MVC framework. Convention over configuration. Scaffolding - builds pieces of your application for you. ActiveRecord does O/R Mapping. Has a built-in testing framework. Near-zero configuration (no XML). Zero-second deploy time (development environment).
David's first demo is being done by audience member Kirk. David asked for a volunteer from the audience with MySQL experience, and the ability to type in a few commands. David said his 6-year old daughter was able to do this demo last night successfully - so Kirk's gonna look pretty bad if he can't pull it off.
Using scaffolding, Rails generates 1 controller, a test for it, a helper class, a css file and 5 rhtml templates. Kirk pulled off the demo, even though he had a bit of trouble with the Mac environment. This is a lot like AppFuse's AppGen in a sense - except AppGen has to parse a bunch of XML files and reconfigure them.
David keeps hammering that the most productive feature of Rails is that there is zero deploy time. Save. Refresh. It looks very similar to developing a static HTML site.
Rails has ActiveRecord, ActionPack (MVC), ActionMailer, ActionWebService, Ajax Support, Transactions and Security. Currently at version 0.13.1 - the last version before 1.0.
David is delivering an excellent presentation, but it's too damn hot without A/C - I'm outta here.
Posted by Patrick Peak on July 14, 2005 at 03:36 PM MDT #
Posted by Jin Lee on July 14, 2005 at 03:59 PM MDT #
Posted by 62.173.110.66 on July 15, 2005 at 09:11 AM MDT #
Do you know how it was solved? It's for real exists, and I forecast getting into that in my project very soon. Better to have workaround before it happens. At the moment the only idea is to do partial fetching ("paging"). Any other ideas? Any other books or resources to read about that? Thanks!
Posted by Olexiyy Prokhorenko on July 16, 2005 at 05:34 AM MDT #
RE: So they created an ESB from scratch - what's wrong with Mule?
At the time of our development, Mule was not ready for primetime. Today we would look at Mule and ServiceMix if we had to do it over again.
Do you know how it was solved? It's for real exists, and I forecast getting into that in my project very soon. Better to have workaround before it happens. At the moment the only idea is to do partial fetching ("paging"). Any other ideas? Any other books or resources to read about that? Thanks!
Paging really wasn't a possibility, especially when transforming data, as that data could change midway through the transformation, and mess up the results. The cursor had to remain. So we wrote an AbstractJDBC class for Spring which was essentially the same as a Spring JDBC DAO, but it kept the ResultSet open, allowing you to retrieve 100 records at a time. This allowed us to query huge result sets with a very small memory footprint, while leveraging Spring's infrastructure.
Posted by Jeff Genender on July 16, 2005 at 02:39 PM MDT #
Posted by perlista on July 16, 2005 at 03:36 PM MDT #