Building LinkedIn's Next Generation Architecture with OSGi by Yan Pujante
This week, I'm attending the Colorado Software Summit in Keystone, Colorado. Below are my notes from Yan Pujante's presentation on OSGi at LinkedIn.
LinkedIn was created in March of 2003. Today there are close to 30M members. In the first 6 months, 60,000 members signed up. Now, 1 million sign up every 2-3 weeks. LinkedIn is profitable with 5 revenue lines and there are 400 employees in Mountain View.
Technologies: 2 datacenters (~600 machines). SOA with Java, Tomcat, Spring Framework, Oracle, MySQL, Servlets, JSP, Cloud/Graph, OSGi. Development is done on Mac OS X, production is on Solaris.
The biggest challenges for LinkedIn are:
- Growing Engineering Team on a monolithic code base (it's modular, but only one source tree)
- Growing Product Team wanting more and more features faster
- Growing Operations Team deploying more and more servers
The front-end has many BL services in one webapp (in Tomcat). The backend is many WARs in 5 containers (in Jetty) with 1 service per WAR. Production and development environments are very different. The backend has close to 100 services in total; the front-end has 30-40.
Container Challenges
1 WAR with N services does not scale for developers (conflicts, monolithic). N WARs with 1 service each does not scale for containers (no shared JARs). You can add containers, but there's only 12GB of RAM available.
Upgrading a back-end service to a new version requires downtime (the hardware load-balancer does not account for versions). Upgrading a front-end service to a new version requires a redeploy. Adding new backend services is also painful because there's lots of configuration (load-balancer, IPs, etc.).
Is there a solution to all these issues? Yan believes that OSGi is a good solution. OSGi stands for Open Services Gateway initiative. Today that term doesn't really mean anything. Today it's a spec with several implementations: Equinox (Eclipse), Knopflerfish and Felix (Apache).
OSGi has some really, really good features. These include smart class loading (multiple versions of JARs is OK), it's highly dynamic (deploy/undeploy built-in), it has a service registry and is highly extensible/configurable. An OSGi bundle is simply a JAR file with an OSGi manifest.
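To make that last point concrete, here's a minimal sketch that builds the kind of manifest that turns a plain JAR into a bundle, using only the JDK's java.util.jar classes. The header names (Bundle-SymbolicName, Bundle-Version, Export-Package, Import-Package) are standard OSGi headers; the bundle and package names are made up for illustration and have nothing to do with LinkedIn's code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

class BundleManifestSketch {
    // Builds a manifest carrying the OSGi headers that mark a JAR as a bundle.
    // Every name and version below is hypothetical.
    static Manifest build() {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Manifest-Version must be set, otherwise Manifest.write() emits no headers.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.putValue("Bundle-ManifestVersion", "2");
        main.putValue("Bundle-SymbolicName", "com.example.profile");
        main.putValue("Bundle-Version", "1.0.0");
        // Packages this bundle offers to others, and packages it needs from others.
        main.putValue("Export-Package", "com.example.profile.api;version=\"1.0.0\"");
        main.putValue("Import-Package", "com.example.search.api");
        return manifest;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        build().write(out);
        // Prints the text that would live in META-INF/MANIFEST.MF.
        System.out.print(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

In practice you'd rarely write this by hand; a tool such as bnd generates the headers from the compiled classes.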
In LinkedIn's current architecture, services are exported with Spring/RPC and services in the same WAR can see each other. The problem with this architecture comes to light when you want to move services to a 2nd web container. You cannot share JARs and can't talk directly to the other web app. With OSGi, the bundles (JARs) are shared, services are shared and bundles can be dynamically replaced. OSGi solves the container challenge.
One thing missing from OSGi is allowing services to live on multiple containers. To solve this, LinkedIn has developed a way to have Distributed OSGi. Multicast is used to see what's going on in other containers. Remote servers use the OSGi lifecycle and create dynamic proxies to export services using Spring RPC over HTTP. Then this service is registered in the service registry on the local server.
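Here's a rough sketch of the proxy-and-register idea using plain JDK dynamic proxies. This is not LinkedIn's code: ProfileService, Transport, REGISTRY and importRemote are all names I invented, the map is a toy stand-in for the OSGi service registry, and the Transport lambda stands in for the Spring RPC over HTTP call.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class RemoteProxySketch {
    // Hypothetical service interface shared by the client and server bundles.
    interface ProfileService {
        String headline(String memberId);
    }

    // Stand-in for the remote call (HTTP in the talk); hypothetical.
    interface Transport {
        Object call(String service, String method, Object[] args);
    }

    // Toy local registry keyed by interface, standing in for the OSGi service registry.
    static final Map<Class<?>, Object> REGISTRY = new ConcurrentHashMap<>();

    // Creates a proxy that forwards every interface call to the transport,
    // then registers it locally so clients cannot tell it is remote.
    static <T> T importRemote(Class<T> type, Transport transport) {
        InvocationHandler handler = (proxy, method, args) -> {
            // Keep hashCode/equals/toString local instead of sending them over the wire.
            if (method.getDeclaringClass() == Object.class) {
                switch (method.getName()) {
                    case "hashCode": return System.identityHashCode(proxy);
                    case "equals": return proxy == args[0];
                    default: return "RemoteProxy(" + type.getSimpleName() + ")";
                }
            }
            return transport.call(type.getSimpleName(), method.getName(), args);
        };
        T stub = type.cast(Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, handler));
        REGISTRY.put(type, stub);
        return stub;
    }

    public static void main(String[] args) {
        // Fake transport that just echoes what it was asked to call.
        Transport fake = (service, method, callArgs) ->
                service + "#" + method + "(" + callArgs[0] + ")";
        ProfileService profiles = importRemote(ProfileService.class, fake);
        System.out.println(profiles.headline("42")); // prints ProfileService#headline(42)
    }
}
```

The point of the design is that the client only ever sees the interface it looked up in its local registry, which is what makes the service location transparent.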
With Distributed OSGi, there's no more N-1 / 1-N problem. Libraries and services can be shared in one container (memory footprint is much smaller). Services can be shared across containers. The location of the services is transparent to the clients. There's no more configuration to change when adding/removing/moving services. This architecture allows the software to be the load balancer instead of using a hardware load balancer.
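To show what "software as the load balancer" could look like, here's a small sketch of a client-side picker that reacts to services appearing and disappearing. The talk didn't say which balancing strategy LinkedIn uses, so the round-robin choice and every name here (RoundRobinSketch, serviceAppeared, serviceDisappeared, pick) are my assumptions.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

class RoundRobinSketch {
    // Live endpoints for one service; updated as containers announce changes.
    private final List<String> endpoints = new CopyOnWriteArrayList<>();
    private final AtomicInteger next = new AtomicInteger();

    // Called when a remote container announces an instance of the service.
    void serviceAppeared(String endpoint) {
        endpoints.add(endpoint);
    }

    // Called when an instance goes away; no load-balancer config to edit.
    void serviceDisappeared(String endpoint) {
        endpoints.remove(endpoint);
    }

    // Picks the next live endpoint in rotation.
    String pick() {
        String[] snapshot = endpoints.toArray(new String[0]);
        if (snapshot.length == 0) {
            throw new IllegalStateException("no live instance of the service");
        }
        // floorMod keeps the index valid even after the counter overflows.
        return snapshot[Math.floorMod(next.getAndIncrement(), snapshot.length)];
    }

    public static void main(String[] args) {
        RoundRobinSketch balancer = new RoundRobinSketch();
        balancer.serviceAppeared("container-1");
        balancer.serviceAppeared("container-2");
        System.out.println(balancer.pick()); // prints container-1
        System.out.println(balancer.pick()); // prints container-2
        balancer.serviceDisappeared("container-1");
        System.out.println(balancer.pick()); // prints container-2
    }
}
```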
Unfortunately, not everything is perfect. OSGi has quite a few problems. OSGi is great, but the tooling is not quite there yet. Not every library is a bundle and many JARs don't have OSGi manifests. OSGi was designed for embedded devices and using it on the server side is very recent (but very active).
OSGi is pretty low-level, but there is some work being done to hide the complexity. Spring DM helps, as do vendor containers. SpringSource has Spring dm Server, Sun has GlassFish, and Paremus has Infiniflow. OSGi is container-centric; the next version will add distributed OSGi, but without support for load-balancing.
Another big OSGi issue is version management. If you specify version=1.0.0, it means [1.0.0, ∞): that version or anything newer. You should at least use version=[1.0.0,2.0.0), which stops short of the next major release. When using OSGi, you have to be careful and follow something similar to Apache's APR Project Versioning Guidelines so that you can easily identify release compatibility.
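The range semantics are easy to get wrong, so here's a small sketch of how a bare version and a bounded range behave. It's a simplified matcher I wrote for illustration (the class and method names are mine, and it ignores version qualifiers), not the OSGi framework's own implementation. The bounded range is written in the half-open form [1.0.0,2.0.0), which is the usual way to exclude the next major release.

```java
class VersionRangeSketch {
    // Parses "major.minor.micro" into three ints; missing parts default to 0.
    // Qualifiers (a fourth segment) are ignored in this simplified sketch.
    static int[] parseVersion(String text) {
        String[] parts = text.trim().split("\\.");
        int[] version = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            version[i] = Integer.parseInt(parts[i]);
        }
        return version;
    }

    // Compares segment by segment: negative, zero or positive like compareTo.
    static int compare(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;
    }

    // Returns true if `candidate` satisfies `range`.
    // A bare version such as "1.0.0" means "1.0.0 or anything newer".
    static boolean matches(String range, String candidate) {
        int[] version = parseVersion(candidate);
        String trimmed = range.trim();
        if (!trimmed.startsWith("[") && !trimmed.startsWith("(")) {
            return compare(version, parseVersion(trimmed)) >= 0;
        }
        boolean lowInclusive = trimmed.startsWith("[");
        boolean highInclusive = trimmed.endsWith("]");
        String[] bounds = trimmed.substring(1, trimmed.length() - 1).split(",");
        int low = compare(version, parseVersion(bounds[0]));
        int high = compare(version, parseVersion(bounds[1]));
        return (lowInclusive ? low >= 0 : low > 0)
                && (highInclusive ? high <= 0 : high < 0);
    }

    public static void main(String[] args) {
        System.out.println(matches("1.0.0", "3.2.0"));         // true: no upper bound
        System.out.println(matches("[1.0.0,2.0.0)", "1.4.2")); // true: inside the range
        System.out.println(matches("[1.0.0,2.0.0)", "2.0.0")); // false: 2.0.0 is excluded
    }
}
```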
At LinkedIn, the OSGi implementation is progressing, but there's still a lot of work to do. First of all, a bundle repository needs to be created. Ivy and bnd are used to generate bundles. Containers are being evaluated and Infiniflow is most likely the one that will be used. LinkedIn Spring (an enhanced version of Spring) and SCA will be used to deploy composites. Work on load-balancing and distribution is in progress, as is work on tooling and build integration (Sigil from Paremus).
In conclusion, LinkedIn will definitely use OSGi but they'll do their best to hide the complexity from the build system and from developers. For more information on OSGi at LinkedIn, stay tuned to the LinkedIn Engineering Blog. Yan has promised to blog about some of the challenges LinkedIn has faced and how he's fixed them.
Update: Yan has posted his presentation on the LinkedIn Engineering Blog.
Sure you know about that, but Newton and the commercial branch, Infiniflow, seem to be what makes OSGi scale
/peter
Posted by Peter Neubauer on October 21, 2008 at 11:14 AM MDT #
Hi,
Is it possible for OSGi to register millions of service objects? I tried doing it but am getting performance issues. I used the Felix and Equinox containers. Can anyone suggest a solution?
Also, look-ups of these services show a huge performance and time difference. Please suggest some ways; I would really appreciate it.
Thanks,
Sagar
Posted by Sagar on July 06, 2012 at 08:57 PM MDT #
Sagar - why would you want to register millions of service objects? One of the features of OSGi is it allows you to modularize your app into smaller components, so registering millions of objects sounds like you'd be going against the grain.
Ross Mason wrote a great post on OSGi back in 2010:
OSGi? No Thanks
I tend to agree with him.
Posted by Matt Raible on July 10, 2012 at 04:22 PM MDT #