Monitoring and Profiling Tips: Troubleshooting Node.js and NodeFly

The NodeFly team is not shy about recommending that you start monitoring your product the moment you start development. And we won’t hesitate to suggest using the NodeFly agent to do that monitoring. But sometimes hiccups occur: the way a product is coded, the modules it uses, or some other circumstance may mean that it doesn’t play as nicely with the agent as we would like.

For example, we recently connected with Stephan Smith, one of our users. He initially reached out to us about some login issues he was experiencing. But as we corresponded a bit more, the details he shared not only helped us understand what might have been affecting our ability to collect and display his data, but should also be useful to other users.

Stephan started off by describing his product a bit for us, saying: “I have encountered a number of similar issues. I love your product. We have NodeFly in production on our AWS servers.”

We definitely appreciated the kind words about our product, but of course were concerned about the original login issues. Luckily, Stephan expanded on his thoughts. Quite a bit, actually. Here’s a very detailed rundown of his troubleshooting…

“One thing you might want to note for new NodeFly users is the Node.js maxSockets issue. We found that once we enabled NodeFly we had major connection issues. It seems our ulimit and maxSockets were still at the Linux/Node defaults (ulimit at 1024 and Node at 5). When we enabled NodeFly we maxed out our outbound connections and started to see huge latency. Once we realized the issue and solved the problem, we found NodeFly made things amazingly clear. We have since moved on to optimizing the Elasticsearch and MongoDB layers to get our response times as low as possible.

I would recommend you think about sharing best practices for Node.js users. We found that once we implemented three changes, our code got fast. Some of the crazy request stats I have seen for Node.js were just not panning out until I did some serious debugging and tweaking of the stack.

We have three AWS Node instances, a load balancer, two Varnish servers, a three-instance Mongo cluster, and a three-instance Elasticsearch cluster. We started to onboard clients to our API and started to get worried: NodeFly was showing that small loads were resulting in 50% CPU.

I started by jacking up the ulimit and maxSockets values past the 1024 and 5 defaults. This made a huge and immediate difference. I then changed our Mongo driver to allow secondary reads. This took the latency down and the throughput up. Almost all the CPU load dropped off. And we stopped worrying.”
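Stephan didn’t share his exact configuration, but for anyone hitting the same wall, here is a minimal sketch of the kind of change he describes, assuming an older Node.js release where the default http.Agent caps outbound connections at 5 per host. The values below are illustrative, not his actual settings.

```javascript
// Raise the outbound socket cap that older Node.js releases default to 5.
// The value 1024 is illustrative; tune it to your own workload.
var http = require('http');
var https = require('https');

http.globalAgent.maxSockets = 1024;
https.globalAgent.maxSockets = 1024;

// The file-descriptor limit has to keep up with the socket count, e.g.:
//   ulimit -n 65536        (current shell/process)
// or raise it permanently in /etc/security/limits.conf.
```

Newer Node.js releases raised the default maxSockets considerably, so this tweak matters most on the older versions Stephan was running.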
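The Mongo change maps to the driver’s read preference. Stephan didn’t post his code, so the sketch below is only an assumption of what allowing secondary reads might look like with the official MongoDB Node.js driver of that era; the hostnames, replica set name, database, and collection are made up.

```javascript
// Hypothetical example: spread read load across a replica set by allowing
// reads from secondaries instead of sending every query to the primary.
var MongoClient = require('mongodb').MongoClient;

var url = 'mongodb://mongo1,mongo2,mongo3/mydb' +
          '?replicaSet=rs0&readPreference=secondaryPreferred';

MongoClient.connect(url, function (err, db) {
  if (err) throw err;
  // Queries on this connection may now be served by a secondary,
  // taking read pressure off the primary.
  db.collection('items').find({}).toArray(function (err, docs) {
    if (err) throw err;
    console.log('fetched', docs.length, 'documents');
    db.close();
  });
});
```

Here secondaryPreferred keeps reads working even if no secondary is available, while a strict secondary preference does not; either way, remember that secondary reads can return slightly stale data.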

Whew – we’re stoked that Stephan was able to stop worrying, and impressed that he put together such a detail-filled account of how he approached his problem. Amidst our communications, Stephan also added: “I use your product on my pet project formagg.io. And I think I have you listed in the ‘secret sauce’ page on my API.” It is a cool page, especially for cheese lovers!

Our sincere thanks to you, Stephan. And hey, the http://formagg.io/ site is quite fun, too. Check it out, along with its API.

Of course, we know that all of our users have their own unique set-ups and needs. If you have made discoveries about how to make the agent run more smoothly with your set-up, we would love to hear from you and pass that info along. Drop us a line at feedback@nodefly.com. And as always, you can get the NodeFly agent at http://www.nodefly.com
