Matt Raible is a Web Architecture Consultant specializing in open source frameworks.

Google's Mirror of Maven Central 25% Faster

Last week, Takari announced that Google is Maven Central's New Best Friend. While writing a news article about this for InfoQ, I decided to run a small test to see the speed of the default Maven Central versus the new Google Cloud Storage instance. This micro benchmark didn't seem worthy of including in the article, but I think it's interesting to see the speed improvements I found.

I ran rm -rf ~/.m2/repository, then mvn install with the default repository configured, and then repeated both commands with Google Cloud Storage configured as a mirror. With the default repository, downloading dependencies, compiling, and running unit tests for AppFuse's web projects averaged 4 minutes and 30 seconds. With Google Cloud Storage, the same process averaged 3 minutes and 37 seconds. By my calculations, this means you can speed up artifact resolution for your Maven projects by roughly 25% by switching to Google. To do that, create a ~/.m2/settings.xml file with the following contents.

<settings>
  <mirrors>
    <mirror>
      <id>google-maven-central</id>
      <name>Google Maven Central</name>
      <url>https://maven-central.storage.googleapis.com</url>
      <mirrorOf>central</mirrorOf>
    </mirror>
  </mirrors>
</settings>
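
The measurement procedure described above (wipe the local repository, then time a full build) could be scripted along these lines. This is only a minimal sketch, not the script used for this post: the function names `timed_cold_build` and `benchmark` are made up for illustration, and the repository path and build command are passed in as parameters rather than hard-coded.

```python
import shutil
import subprocess
import time


def timed_cold_build(repo_dir, command, cwd=None):
    """Delete the local repository, run the build, and return elapsed seconds."""
    # Removing the local repository forces every dependency to be downloaded again.
    shutil.rmtree(repo_dir, ignore_errors=True)
    start = time.perf_counter()
    # check=True raises CalledProcessError if the build fails,
    # so a broken run is never recorded as a timing.
    subprocess.run(command, cwd=cwd, check=True)
    return time.perf_counter() - start


def benchmark(runs, repo_dir, command, cwd=None):
    """Repeat the cold build and collect one timing (in seconds) per run."""
    return [timed_cold_build(repo_dir, command, cwd) for _ in range(runs)]
```

For a comparison like the one in this post, one would presumably call something like `benchmark(5, os.path.expanduser("~/.m2/repository"), ["mvn", "install"], cwd=project_dir)` once with the default settings and once with the mirror configured, where `project_dir` is a placeholder for wherever the project is checked out. Note that the function really does delete the directory it is given.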

Benchmark Details
I ran my tests on a Mac Pro (late 2013) with a 3.5 GHz 6-Core Intel Xeon E5 processor and 32 GB of RAM. Bandwidth during the tests averaged 57 Mbps down and 6 Mbps up. Below are the timings (in minutes:seconds) from my tests:

Default: 4:33, 4:36, 4:32, 4:24, 4:09
Google: 5:13, 3:35, 2:15, 3:38, 3:39

Google's results varied widely, from just over two minutes to just over five. Because of this, I dropped the lowest and highest numbers for each service before calculating the average. My math, with the raw numbers in seconds, is below.

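Default (seconds):
All five runs: 273, 276, 272, 264, 213 = average 260, or 4:20
Without high and low: 273, 272, 264 = average 270, or 4:30

Google (seconds):
All five runs: 313, 215, 135, 218, 219 = average 220, or 3:40
Without high and low: 215, 218, 219 = average 217, or 3:37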
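
For anyone who wants to check the arithmetic, here is a small sketch that recomputes the trimmed averages from the second counts listed above. The helper names `trimmed_mean` and `as_min_sec` are my own for illustration. It also shows that the gap can be stated two ways: as time saved relative to the default, or as a speedup relative to Google.

```python
from statistics import mean

# Run times in seconds, copied from the math above.
default_runs = [273, 276, 272, 264, 213]
google_runs = [313, 215, 135, 218, 219]


def trimmed_mean(runs):
    """Average the runs after dropping the single lowest and highest value."""
    if len(runs) < 3:
        raise ValueError("need at least three runs to drop the low and high")
    return mean(sorted(runs)[1:-1])


def as_min_sec(seconds):
    """Format a duration in seconds as m:ss, rounded to the nearest second."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}:{secs:02d}"


default_avg = trimmed_mean(default_runs)  # about 269.7 seconds
google_avg = trimmed_mean(google_runs)    # about 217.3 seconds

# Two ways to express the same gap:
time_saved = (default_avg - google_avg) / default_avg  # share of build time saved
speedup = default_avg / google_avg - 1                  # how much faster Google was

print(f"Default: {as_min_sec(default_avg)}, Google: {as_min_sec(google_avg)}")
print(f"Time saved: {time_saved:.0%}, speedup: {speedup:.0%}")
# Prints:
# Default: 4:30, Google: 3:37
# Time saved: 19%, speedup: 24%
```

By this calculation the Google runs were about 24% faster, which is close to the 25% quoted above, or equivalently took about 19% less wall-clock time.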

Chen Eric commented on the InfoQ article to note that Chinese programmers are blocked from using Google.

Update: Jason Swank of Sonatype has done some more extensive benchmarking and found different results.

We found that average unprimed Google API (first mvn run) caching performed 30% slower than Maven Central. Primed Google API cache performance (second run) was 3% faster than Maven Central (second run). We also ran a number of cloud-based tests with similar results.

Posted in Java at Nov 10 2015, 12:13:51 AM MST

Comments (4):

Thanks for the post Matt. I'm the head of tech ops for Sonatype and try to ensure we are serving the community effectively. More specifically, I wanted to know if our CDN was markedly underperforming Google's edge delivery capabilities.

Using a similar methodology to yours (`mvn dependency:resolve` versus a full build of the appfuse-web project), we ran this from a dozen-ish geographically dispersed locations across the US and Canada (our engineering team members all work remotely which helped here).

We found that average unprimed Google API (first mvn run) caching performed 30% slower than Maven Central. Primed Google API cache performance (second run) was 3% faster than Maven Central (second run). We also ran a number of cloud-based tests with similar results.

Since real-world usage is invariably a mixture of primed and unprimed caches (pick whatever ratio you prefer), the answer based on our testing is somewhere between the numbers noted above. I have no doubt that mileage will vary based on individual network connectivity and proximity to corresponding endpoints, but I am of the opinion that effective local caching is generally much more important to build times than internet download speeds.

Our test results are here:

https://docs.google.com/spreadsheets/d/1e797425FV54cOPPAZ3Y4OC0LE3sO-_qcWsaVQSwoH78

Posted by Jason Swank on November 13, 2015 at 12:28 PM MST #

Jason: thanks for providing a more extensive benchmark than mine. I agree that local network caching is the best way to resolve dependencies quickly.

Posted by Matt Raible on November 17, 2015 at 08:51 PM MST #

Since the artifacts are cached locally, I don't really see the point of these poor "benchmarks".

Posted by Nidget on November 30, 2015 at 09:56 AM MST #

Sorry if this is a duplicate of my first attempt to post. The editor didn't respect carriage returns, so that was just one long, run-on paragraph. Delete it if you wish.

It may be faster, but it's apparently not quite reliable. I just spent an hour or so trying to figure out why a project I pulled from GitHub wouldn't build. One dependency just would not update, either from within Eclipse/STS or from the command line (mvn install), failing with the following error: "The POM for com....:2.2.0 is missing, no dependency information available". (I have elided the artifact info because it's irrelevant.)

Initially I thought that the artifact had been uploaded to Maven Central without a pom.xml, but I'm pretty sure that's impossible. After I created a barebones Gradle-managed project and included this particular dependency, Gradle found the dependency and the project built successfully. How could that be?!

Finally, I remembered that I had put the Google Maven mirror in the .m2/settings.xml file soon after the Google mirror was announced. I commented it out, and the Maven build suddenly "just worked".

I have found at least one other complaint on the Web where it's obvious that the Google mirror is not in sync.

No matter how fast it might or might not be, if it's not reliably mirroring everything on Maven Central, it's questionable how useful it can be. Yes, I have been using it for 6 or 7 months, but wasting so much time because it's not in sync makes me think that it should remain commented out in my settings.xml file, and that I should live with any performance penalty (if there actually is one).

(Note that the artifact I was working with was published back in February 2016 -- 3 months ago.)

Posted by Paul Furbacher on May 09, 2016 at 09:50 AM MDT #
