indra's blurblog

About PayPal's Node vs Java “fight” by noreply@blogger.com (Stephen Connolly)
Monday December 16th, 2013 at 4:34 AM

CloudBees Development Blog

So far I have held back from writing this blog post… but today in my email inbox I saw the following:

Yep, somebody is pimping their book, because of PayPal’s switch from Java to Node.js.

Let’s get one thing clear up front… namely

Which is the “faster” virtual machine...

I like the JVM. I think it is a superb piece of engineering. When it first came out, Java was dog slow… but by 2006 it was quite fast. At the time, as I was using Java for writing web applications, my brother asked for some help playing with different protein folding modelling algorithms. I knocked a couple of them out in Java and started running them while working on hand tuning the algorithms in C. Back in 2006, once the first 10,000 iterations had passed, the JVM’s full steam optimisations kicked in. My best hand tuned C versions of the algorithms were at least 20% slower than the JVM, so my immediate thought was “I must be useless at hand tuning C”. So then I implemented a standard algorithm for protein folding in Java and pitted it against the best of breed native code version that my brother’s supervisor had spent grant money getting optimised… the JVM was still faster in server mode after the compilation threshold had kicked in… by somewhere between 10 and 15%.

Of course the reason is that the JVM can optimise for the exact CPU architecture that it is running on and can benefit from call flow analysis as well as other things. That was Java 6.
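The warm-up effect described above can be observed with a small timing harness. The following is only a rough sketch: the `kernel` method is a hypothetical stand-in for one iteration of a numeric algorithm (it is not the protein folding code), the batch size of 10,000 simply echoes the figure mentioned above, and the actual compilation thresholds vary between JVM versions and settings. A serious measurement would use a benchmarking harness rather than hand-rolled timing.

```java
final class WarmupSketch {
    // Hypothetical numeric kernel standing in for one iteration of an algorithm.
    static double kernel(int n) {
        double sum = 0.0;
        for (int i = 1; i <= n; i++) {
            sum += Math.sqrt(i) / (i + 1.0);
        }
        return sum;
    }

    // Runs the kernel `calls` times and returns the elapsed time in nanoseconds.
    static long timeCalls(int calls, int n) {
        double sink = 0.0;
        long start = System.nanoTime();
        for (int c = 0; c < calls; c++) {
            sink += kernel(n);
        }
        long elapsed = System.nanoTime() - start;
        // Use the accumulated value so the JIT cannot discard the loop entirely.
        if (sink < 0) {
            throw new IllegalStateException("unreachable: sum of positive terms");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        long first = timeCalls(10_000, 200);  // includes interpreted and early-compiled runs
        long second = timeCalls(10_000, 200); // more likely to run fully compiled code
        System.out.println("first 10,000 calls: " + first / 1_000_000.0 + " ms");
        System.out.println("next 10,000 calls:  " + second / 1_000_000.0 + " ms");
    }
}
```

On most JVMs the second batch would be expected to finish faster than the first, but the size of the gap depends on the machine, the JVM version and its flags.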

Node.js runs on the V8 JavaScript virtual machine. That is a very fast virtual machine for JavaScript. When you are dealing with JavaScript you have an advantage: because JavaScript is single threaded, you can make all sorts of optimisations that you cannot achieve with the JVM, because the Java virtual machine has to handle potentially multi-threaded code. The downside of V8 is that it is dealing with JavaScript, a language which can provide far fewer hints to the virtual machine. Type information has to be inferred, and some JavaScript programming patterns make such type inference almost impossible.

So which is the faster virtual machine, V8 or JVM? My belief is that when you are coding in the most basic building blocks of either virtual machine (i.e. JavaScript vs JVM Byte Code) the JVM will win out every time. If you start to compare higher up, you may end up comparing apples with oranges and see false comparisons. For example, consider this comparison of V8 vs JVM performance at mathematical calculations. This blog post tells us that if we relax our specifications we can calculate things faster. Here is the specification that V8 must use for Math.pow and here is the specification that the JVM must use for Math.pow. Notice that the JavaScript specification allows for an “implementation-dependent approximation” (of unspecified accuracy) while the JVM version has the addition that

The computed result must be within 1 ulp of the exact result. Results must be semi-monotonic.

And there are additional restrictions about when numbers can be considered to be integers. V8 has a faster version of Math.pow because the specification that it is implementing allows for a faster version. If we throw off the shackles of the JVM runtime specification we can (and do if you read the blog post) get an equivalently fast result… and if it turns out that we don’t even need the accuracy of V8’s implementation, we can make more trade-offs and get even faster performance.
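To make the trade-off concrete, here is a minimal sketch of what relaxing the specification can look like. It is not the technique from the linked blog post; it is plain exponentiation by squaring, and the name `fastIntPow` is made up for illustration. It only accepts non-negative integer exponents and makes no promise about staying within 1 ulp, which is exactly the kind of guarantee that `Math.pow` has to pay for.

```java
final class PowSketch {
    // Exponentiation by squaring, restricted to non-negative integer exponents.
    // Unlike Math.pow it does no special-case handling and gives no ulp guarantee.
    static double fastIntPow(double base, int exp) {
        if (exp < 0) {
            throw new IllegalArgumentException("exp must be >= 0");
        }
        double result = 1.0;
        double b = base;
        int e = exp;
        while (e > 0) {
            if ((e & 1) == 1) {
                result *= b; // this bit of the exponent is set
            }
            b *= b;          // square the base for the next bit
            e >>= 1;
        }
        return result;
    }

    public static void main(String[] args) {
        double relaxed = fastIntPow(1.1, 20);
        double strict = Math.pow(1.1, 20);
        System.out.println("fastIntPow: " + relaxed);
        System.out.println("Math.pow:   " + strict);
        System.out.println("difference in ulps of the strict result: "
                + Math.abs(relaxed - strict) / Math.ulp(strict));
    }
}
```

Whether such a shortcut is actually faster on a given JVM is something to measure rather than assume, since `Math.pow` is typically heavily optimised by the runtime.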

My point is this:

You will see people throw out micro-benchmarks showing that the JVM is faster than V8 or V8 is faster than the JVM. Unless those benchmarks are comparing like for like, the innate specification differences between the two virtual machines will likely render such comparisons useless. A valid comparison would be between, say, Nashorn or DynJS and V8. At least then we are comparing on top of the same specification…

What are PayPal comparing?

Here is what we know about PayPal’s original Java application:

It uses their internal framework based on Spring
Under minimum load the best page rendering time was 233ms
It doesn’t scale very well reaching a plateau at about 11 requests per second.

Here is what we know about PayPal’s Node.js application:

It uses their internal kraken.js framework
Under minimum load the best page rendering time was 249ms
It scales better than the Java application but still doesn’t scale very well.

So we are comparing two crappy applications in terms of scalability and concluding that because the Node.js one scales slightly better, then Node.js is better than Java.

I can only say one thing...

[Image: Screen Shot 2013 12 13 at 15 26 48]

What we can conclude is that the internal Spring-based framework is overly complex for the task at hand. As Baron Schwartz says:

really? 1.8 pages/sec for a single user in Java, and 3.3 in Node.js? That’s just insanely, absurdly low if that amount of latency is really blamed on the application and the container running it. If the API calls that it depends on aren’t slow, I’d like to see many hundreds of pages per second, if not thousands or even more. Someone needs to explain this much more thoroughly.

Who is to say what performance they would have been able to achieve if they had built their Java application on a more modern framework? Spring brings a lot of functionality to the table. Likely far too much functionality. Most people are moving away from the monolithic application and moving towards smaller more lightweight frameworks… but if you have a corporate mandated framework that you must use when developing Java applications in-house… well you may not have much choice. On the other hand, if you move to a different technology stack there may be no corporate framework that you have to use.

Now we come to the second “benefit”, namely faster development.

We are told from the PayPal blog post that at the comparison point both applications had the same set of functionality...

Are we sure? How much functionality was the in house Spring based framework bringing to the table “for free” (or more correctly for a performance cost)?

I am not defending the in-house Spring framework, but I do find it a stretch to believe that the two applications were delivering the entirety of equivalent functionality. I do believe that the context specific functional tests were passed by both applications. So this tells us that the user will not see a difference between the two applications. But what about logging requests, transactions, etc? What about scalability and load reporting? I don’t want to defend the in-house Spring framework, in part because I find Spring to be an over-baked framework to start with, but potentially that framework is bringing a lot more to the table. If we threw all that extra “goodness” out would the Java developers have been able to develop the application faster? If we asked the Node.js developers to add all that extra “goodness” would they have been able to deliver as fast?

It is likely that we will not know the answers to these questions. What we do know is that the extra “goodness” that the in-house framework adds appears to be a waste of time, as they are happy to go into production without it.

In other words, the in-house framework sounds a bit like one of these (at least from the point of view of somebody writing this specific application):

[Image]

So it would not surprise me to hear that you can develop an application, when released from the shackles of an in-house framework, in 33% fewer lines of code and with 40% fewer files...

Spring as a base framework loves to have many many small files with lots of boilerplate
Even with annotations Spring can be rather XML heavy

If you were using a more modern Java based framework, likely you would not have had the same restrictions. For example I like using Jersey as my base framework. I find that it needs very little boilerplate and helps you to keep clear of the multi-threaded cargo cults that a lot of developers fall into. Node.js also helps you keep clear of the multi-threaded cargo cults… by forcing you to live in a single-threaded world.

OK, so the in-house framework is over-baked and delivers very bad performance, so all we are left with in terms of benefits is that the Node.js version was

Built almost twice as fast with fewer people

Well first off, two people can develop faster than a team of five people when you are working in a close-knit codebase. The application itself has three routes. If you have a team of up to three developers, you give each one a route and let them code that route out. If you have more than three developers you will have more than one developer per route, which means that they will either end up pair-programming or stepping on each other’s toes. Add on top the unclear difference in delivered specification, i.e. the added “goodness” of the in-house framework… which will require hooking up before you even get out the gate… All we can really say is that this is probably at best an unfair comparison and at worst an apples to oranges comparison.

So what can we conclude?

@joemccann ... sometimes you need to ditch your old platform so management will let you do a rewrite from scratch...
— Stephen Connolly (@connolly_s) November 25, 2013

The above was my original thought when I read the PayPal blog post.

I think that, in the scope of this application, the in-house framework was over-engineered on top of the over-baked Spring framework and it probably does not bring much real value to the table and only costs in terms of a significant performance hit.
Any solution built on top of the JVM would have technically been able to be “integrated” with the in-house framework.
The only political route to avoid the in-house framework was to ditch the JVM
Node.js is simultaneously "just cool enough" and “just serious enough” to be a valid non-JVM candidate (you could try Ruby, but that’s been around long enough that there is likely an in-house framework for that too… and anyway you can run Ruby on the JVM… so it may not be the escape you need)

My take-home for you, the persistent reader who has read all my ramblings in this post...

Don’t build your app on top of a pile of crap in-house framework.

PayPal may have ditched one pile of crap framework based on Spring. What is not clear is whether the scalability limit in their Node.js in-house framework (i.e. 28 simultaneous clients with 64 instances) is a limit of their new framework or a limit of Node.js itself when used with the backing APIs that they have to call through to.

Time will tell, but don’t jump from one platform to another just because apples are greener than oranges.

Update

Just to be clear, this post is not intended as a criticism of PayPal, PayPal’s internal frameworks, or their decision to switch from Java to Node.js.

The intention of this post is to criticise anyone who cites a performance gain from 1.8 pages per second to 3.3 pages per second in what cannot be a CPU bound web application as being the primary reason to switch from Java to Node.js.
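A quick back-of-the-envelope conversion shows why those figures are hard to attribute to CPU work. The sketch below assumes a single user issuing requests strictly one after another, so that time per page is simply the inverse of pages per second; that assumption is for illustration only and is not stated in PayPal's post. The class and method names are made up.

```java
final class ThroughputSketch {
    // For strictly serial requests, time per page is the inverse of throughput.
    static double millisPerPage(double pagesPerSecond) {
        return 1000.0 / pagesPerSecond;
    }

    public static void main(String[] args) {
        System.out.println(String.format("Java:    1.8 pages/sec is about %.0f ms per page",
                millisPerPage(1.8)));
        System.out.println(String.format("Node.js: 3.3 pages/sec is about %.0f ms per page",
                millisPerPage(3.3)));
    }
}
```

Under that assumption the two figures correspond to roughly 556 ms and 303 ms per page. Either is a very long time for a CPU to spend rendering one page, which suggests that most of it is spent waiting on something else.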

Similarly anyone citing PayPal’s blog as evidence that Java web development is harder than Node.js is mis-using the evidence. The only evidence on ease of development from PayPal’s blog is that their internal Node.js framework is easier to develop for than their internal Spring-based framework.

My personal opinion is that there were other non-performance related reasons for the switch. For example the reactive programming style enforced by Node.js’s single threaded model may suit the application layer that the switch was made in better than the Spring-based framework that the Java version was written in. Similarly, it may be that the responsible architect analysed the requirements of this tier and came to the conclusion that a lot of what the internal framework brings to the table is just not required for this tier. It is a pity that such detail was not provided in their blog post announcing their switch, as without such detail their blog post is being incorrectly used by others to draw conclusions that are just not supported by the data presented in that blog post. Hopefully PayPal’s development team will provide some of this additional information and analysis that was unfortunately lacking in their first blog post.

Finally, we should always remember that premature optimization is a major root of bugs and performance issues in software engineering. If the application tier they are developing in Node.js is not the bottleneck, in fact until it is proven to be the bottleneck, there is no need to worry about whether it is written in the most performant language or framework. What is most important with those elements that are not the bottleneck is that they be written in the simplest form so that if they do become the bottleneck later on (due to optimization of the current slowest moving part) it will be easy to rework them.

For a tier with just three routes that is the front-end and likely calling through to multiple APIs, my gut tells me that a reactive framework such as Node.js or Vert.x will give you a very simple expression of the required logic without becoming the bottleneck. Perhaps that was the real reason why Node.js was considered as an experiment for this tier.
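As a rough illustration of that style, here is a sketch of a route that fans out to two backing APIs and combines the results without blocking while it waits. It uses the JDK's `CompletableFuture` (which arrived in Java 8, after this post was written) as a stand-in; it is not the Node.js or Vert.x API. The methods `fetchUser`, `fetchBalance` and `accountPage` are hypothetical, and the "backends" just return canned strings where real code would use a non-blocking HTTP client.

```java
import java.util.concurrent.CompletableFuture;

final class FanOutSketch {
    // Hypothetical backing API call; returns a canned value asynchronously.
    static CompletableFuture<String> fetchUser(String id) {
        return CompletableFuture.supplyAsync(() -> "user:" + id);
    }

    // Second hypothetical backing API call.
    static CompletableFuture<String> fetchBalance(String id) {
        return CompletableFuture.supplyAsync(() -> "balance:" + id);
    }

    // Route handler: start both calls, then combine them once both have completed.
    static CompletableFuture<String> accountPage(String id) {
        return fetchUser(id).thenCombine(
                fetchBalance(id),
                (user, balance) -> "<page " + user + " " + balance + ">");
    }

    public static void main(String[] args) {
        // join() is used only to print the result in this demo.
        System.out.println(accountPage("42").join());
    }
}
```

The handler reads as a direct statement of the logic (fetch two things, render one page), which is the kind of simplicity the paragraph above is pointing at.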


1 public comment

indra

4603 days ago

Canberra

NewsBlur Puzzle T-shirt 2013
Thursday July 25th, 2013 at 5:06 PM

Last year I was proud to be able to send a free t-shirt and handwritten note to every single user who requested one. It took a few days of writing, stuffing, and mailing to send out a couple hundred t-shirts.

I’m pleased to announce that this year’s t-shirt is a puzzle with every single letter being part of a 4+ letter word. I’m using Teespring for fulfillment and order processing. While it’s not free, I am making absolutely zilch profit, so I can keep the t-shirt price to the absolute minimum.

Impress your friends with your esoteric yet exquisite taste in t-shirts. But you’d better move quickly, you only have until July 31st, one week from now, to order the t-shirt. Order the 2013 NewsBlur t-shirt on Teespring.


samuel

4747 days ago

Answer key to the new NewsBlur t-shirt: http://instagram.com/p/cKrHHfTa3J/

San Francisco

cinebot

4747 days ago

i want this on my chest

Dadster

4747 days ago

Go ahead, spoiler! Just see if you can get those last two or three orders in so we all get one.

8 public comments

MacDiva

4742 days ago

Nice design. Is it box-fit only?

Worldwide | NYC

samuel

4742 days ago

Box fit?

MacDiva

4741 days ago

That's the roundabout way of asking if there's a women's cut T-shirt. :) (Box fit = the body is cut like a rectangle)

samuel

4741 days ago

Ahh, this is a V-neck, so it ideally fits both men and women. Teespring didn't offer an option for both cuts.

MacDiva

4741 days ago

Ah... I dig the print!

heliostatic

4743 days ago

More people in Minneapolis need to use NewsBlur so we can be tshirt buddies.

Williamstown, MA

heliostatic

4734 days ago

Any thoughts about doing a second run for people who are... slow to order?

jhamill

4746 days ago

Ordered.

California

Dadster

4747 days ago

thankfully, mine hasn't been shrunk beyond my ability to wear in public ... with decorum.

New Hampshire

redknightalex

4747 days ago

This is a pretty fantastic idea. Newsblur is awesome but your little "side projects" are just as amazing.

Northeastern US

chrisamico

4747 days ago

Want.

Boston, MA

tedder

4747 days ago

ordered my @newsblur shirt!

Uranus

josephwebster

4747 days ago

Sweet shirt. Get one. Or more.

Denver, CO, USA

Your Child Will Never Be Safer In a Car Than In the Arms of Batman by Andrew Liszewski
Friday April 26th, 2013 at 9:06 PM

Click here to read Your Child Will Never Be Safer In a Car Than In the Arms of Batman

Despite his gruff mannerisms, deep down Batman is a tender, loving superhero. And had he kids of his own, he would embrace and protect them just like this Batman carseat will do for yours. If he can protect Gotham from evil super-villains, surely he can protect your kids during a fender bender. More »


1 public comment

ryanbrazell

4837 days ago

I am the shadow that ... keeps your kid in their seat?

Richmond, VA

Researcher sets up illegal 420,000 node botnet for IPv4 internet map • The Register
Wednesday March 20th, 2013 at 4:20 AM

www.theregister.com - Articles

An anonymous researcher has taken an unorthodox approach to achieve the dream of mapping out the entire remaining IPv4 internet, and has broken enough laws around the world to make them liable for many thousands of years behind bars in doing so, if current sentencing policy prevails.

Getting the sheer numbers of IPv4 addresses involved would take a huge number of scanners to make billions of pings. While noodling around with an Nmap scripting engine the researcher noticed a lot of virtually unsecured IPv4 devices – only requiring the admin/admin, root/root login, or either admin or root with the password field blank. What if these could be used as a temporary botnet to perform the scan?

"I did not want to ask myself for the rest of my life how much fun it could have been or if the infrastructure I imagined in my head would have worked as expected," the report "Internet Census 2012" states.

"I saw the chance to really work on an Internet scale, command hundred thousands of devices with a click of my mouse, portscan and map the whole Internet in a way nobody had done before, basically have fun with computers and the Internet in a way very few people ever will."

The report states a 46 and 60 kb binary was written in C with two parts: a telnet scanner to try the login connection and propagate, and then control code to assign scan ranges and feed the results back. A reboot of the infected system would wipe the binary completely and the code didn't scan traffic running through the device or any intranet-connected systems.

The code was set to run as lowest possible priority in the infected device to avoid interference and included a watchdog to make sure normal operations of the host weren't overloaded. It also carried a readme file with a description of the project and an email address for the owner, or law enforcement, to get in touch if it was discovered.

After releasing the code overnight the report's writer found 420,000 suitable botnet endpoints, accounting for around a quarter of the total number of suitable IPv4 systems with enough CPU and RAM and which ran Linux. The botnet was able to spread quickly and efficiently just using the four login combinations and was soon reporting back in healthy numbers.

The Carna IPv4 botnet

"While everybody is talking about high class exploits and cyberwar, four simple stupid default telnet passwords can give you access to hundreds of thousands of consumer as well as tens of thousands of industrial devices all over the world," Mark Bower, VP of product management at Voltage Security told El Reg.

"This is a great study which underlines the fact that once again exploitable weak links are abundant and ripe for compromise, even on embedded or industrial systems. While the researchers merely reported on security gaps, any attacker could quickly access these systems - maybe leading to downstream compromise of something much more valuable."

The home spy

The vast majority of infected systems were consumer routers or set-top boxes but they also found Cisco and Juniper hardware, x86 equipment with crypto accelerator cards, industrial control systems, and physical door security systems.

"A lot of devices and services we have seen during our research should never be connected to the public Internet at all. As a rule of thumb, if you believe that 'nobody would connect that to the Internet, really nobody', there are at least 1000 people who did," the report states.

"Whenever you think 'that shouldn't be on the Internet but will probably be found a few times' it's there a few hundred thousand times. Like half a million printers, or a Million Webcams, or devices that have root as a root password."

The resultant botnet was used to build the botnet the report dubs Carna, named after the Roman goddess of physical health or door hinges, depending on which historical source you believe. But it soon found it was getting competition from a malicious botnet dubbed Aidra and the researcher adapted the binary to block this competitor where possible, but estimates it still has around 30,000 endpoints.

In all the project took nearly six months and the full scan was concluded by October last year. The report estimates that the remaining number of active IPv4 addresses is around 1.3 billion, out of a total of around 4.3 billion. The complete scan data, all 9TB of it, is available for download, but not the botnet which created it.

"The actual research itself is noteworthy in that it is the most comprehensive Internet-wide scan. I'd like to see more projects of this kind, conducted legally, and sharing information about the real state of play on the internet," said Mark Schloesser, security researcher at Rapid7 in an emailed statement.

"While the Internet Census 2012 provides interesting data, the way it was collated is highly illegal in most countries. Using insecure configurations and default passwords to gain access to remote devices and run code on them is unethical, and taking precautions to not interfere with any normal operation of the devices being used doesn't make it OK,"

He has a point. Monday's sentence of three years and five months in prison for Andrew Auernheimer, a member of the grey-hat hacking collective Goatse Security, after he used a server vulnerability to expose iPad user accounts is causing great concern to some in the security research industry.

The two situations aren’t exactly the same, but a strict interpretation of the law in both the US and elsewhere would make the Carna botnet used highly illegal and each node could be worth its own charge to an over-zealous prosecutor. No wonder the researcher in question wishes to remain anonymous. ®


1 public comment

brico

4874 days ago

web scale ftw

Brooklyn, NY

March 8, 2013: Tardigrade by ZenGum
Wednesday March 20th, 2013 at 4:15 AM

The Cellar - Image of the Day

Attachment 43141

http://www.popsci.com.au/science/if-...ould-look-like

These crazy little freaks are darn near unkillable.

Quote:

known to be able to go for decades without food or water, to survive temperatures from near absolute zero to well above the boiling point of water, to survive pressures from near zero to well above that on ocean floors, and to survive direct exposure to dangerous radiations.

So, just for kicks, I guess, the European space chaps stuck some of these on the outside of a rocket that went into orbit for 12 days, then reentered. No problems, it seems.


samuel

4874 days ago

I bet this is what aliens really look like.

San Francisco

satadru

4874 days ago

The Russians apparently tried to send these to Mars, so yes, literally aliens.

brico

4874 days ago

whut

Dadster

4874 days ago

Hummmm. Look like Manatee to me....

MacDiva

4873 days ago

Wait. Wha? That's gotta be Fimo.

3 public comments

jbloom

4874 days ago

No words to describe how strange this is.

Columbus, Ohio

grammargirl

4874 days ago

There's something creepily adorable about these things. Maybe it's the feet?

Brooklyn, NY

tyrantlizard

4874 days ago

Our future masters, the mighty tardrigrades.

A Canadian in Boston

Why Technology Won't Solve Virtual Teams Challenges
Monday March 18th, 2013 at 6:34 AM

Chris Lema

Technology Won’t Solve Personal Ownership Issues

On the teams I manage we don’t have “juice box” moments. Those are the moments when I have to tell my staff exactly what to do, and exactly how to do it. We don’t have those moments because I don’t manage children. I manage adults. Adults who know how to make decisions, and whose moms don’t pack their juice boxes.


1 public comment

indra

4876 days ago

"juice box" moments is a very apt description for some interactions i've had to deal with :)

Canberra
