Every two weeks we profile new, cool and fun Node.js products to keep you “In the Loop”. This time around we’re looking at Nodezoo, a search engine for Node.js modules. We chatted with Richard Rodger to get the low-down on Nodezoo.
Can you tell us about Nodezoo? What inspired you to create this search engine?
Richard Rodger: Nodezoo.com is a search engine for Node.js modules. It uses the idea behind Google’s PageRank algorithm to give you reasonable results from keyword searches.
The idea is that the more modules depend on a given module, the better it is. This gives us a “NodeRank” for each module - valued from 1 to 6 of course, since Node is hexagonal!
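To make the idea concrete, here is a minimal sketch of a PageRank-style score over a module dependency graph, squeezed into a 1 to 6 scale. It is not Nodezoo's actual code: the dependency data, the function names `rank` and `toNodeRank`, the damping factor and the linear mapping onto 1 to 6 are all assumptions made for illustration.

```javascript
// Hypothetical dependency data: module name -> modules it depends on.
const deps = {
  app1: ['express', 'lodash'],
  app2: ['express'],
  express: ['connect'],
  connect: [],
  lodash: [],
};

// Simplified PageRank: on each pass, every module hands its score
// to the modules it depends on. Depended-upon modules accumulate score.
function rank(graph, iterations = 20, damping = 0.85) {
  const names = Object.keys(graph);
  const known = new Set(names);
  const n = names.length;
  let score = Object.fromEntries(names.map((m) => [m, 1 / n]));

  for (let i = 0; i < iterations; i++) {
    const next = Object.fromEntries(names.map((m) => [m, (1 - damping) / n]));
    for (const m of names) {
      const out = graph[m].filter((d) => known.has(d));
      // A module with no dependencies spreads its score over everyone,
      // so the total score stays constant.
      const targets = out.length > 0 ? out : names;
      for (const d of targets) {
        next[d] += (damping * score[m]) / targets.length;
      }
    }
    score = next;
  }
  return score;
}

// Assumed mapping: scale raw scores linearly onto the integers 1..6.
function toNodeRank(score) {
  const values = Object.values(score);
  const min = Math.min(...values);
  const max = Math.max(...values);
  const result = {};
  for (const [name, s] of Object.entries(score)) {
    result[name] = max === min ? 1 : 1 + Math.round((5 * (s - min)) / (max - min));
  }
  return result;
}

const nodeRank = toNodeRank(rank(deps));
console.log(nodeRank);
```

In this toy graph nothing depends on `app1` or `app2`, so they land at the bottom of the scale, while `connect` benefits indirectly from everything that depends on `express`.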
The inspiration really comes from our clients - we do Node.js consulting. So many of them are struggling as Node.js learners. It’s great that there are so many modules, but as an experienced Node.js developer you can forget that it’s hard to find the right modules in the early days.
We all use a bunch of heuristics to choose modules. Experienced Node.js people can eyeball a GitHub page and make a pretty good assessment, but that level of knowledge takes time to develop.
There are other module search engines and directories out there, and things are getting better. This is still a mountain we have to climb.
It’s written in Node.js and it’s open source - mostly because that’s the way we roll!
How was the creation process? Did you run into any challenges?
Richard Rodger: This project started as an intense hackathon - I just got the idea and had to run with it! Sometimes you need to focus and tell the world to f**k off. Put everything else to one side and go with the flow. This feeling is precious, and it’s how great things happen.
Of course, most of the time, what you produce is rubbish, but it’s worth it for the gems. The jury’s still out on nodezoo.com!
I took a pretty brain-dead approach to development on this one - no design or architecture. I wrote the whole thing as a batch script and bolted on the website afterwards. Command-line development is the fastest way to build something, as your code-test cycle is really tight.
This does create huge amounts of technical debt - just take a look at the source code on GitHub. It needs a whole lot of love. But you know what, there’s a working site and that’s what counts.
I love the quote from General Patton: “A good plan violently executed now is better than a perfect plan executed next week.” Or indeed, any plan at all!
What is your coding and creation process? How do you handle QA and just making sure things work?
Richard Rodger: To burn down the technical debt I’ve teamed up with Peter Elger - we need to write some tests for a start! The approach we’ll take is something called “continuous production testing”. Deploy at will, but always be testing the live system for issues. You need to set up a bit of infrastructure for this, and it ranges from pingdom.com, to airbrake.io, to custom independent processes.
The other approach we’re using these days is something called “micro-service architecture” - you’ll see this reflected in the GitHub commits soon. Basically, break everything into lots of small (and I mean small) processes. This means running lots of Node.js instances, and they communicate with each other in various ways, synchronously and asynchronously. This has a big impact on testing - each small part is easy to verify, and easy to change.
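As a rough illustration of what one such small process might look like, here is a sketch using only Node's built-in `http` module. The service name, the `RANK_PORT` environment variable, the `lookupRank` function and the sample data are hypothetical and not taken from the Nodezoo source.

```javascript
const http = require('http');

// Hypothetical data this one tiny service is responsible for.
const RANKS = new Map([
  ['express', 6],
  ['connect', 5],
]);

// The service's single job, kept as a pure function so it can be
// verified without starting a server.
function lookupRank(name) {
  return { name, nodeRank: RANKS.has(name) ? RANKS.get(name) : 1 };
}

// Thin HTTP wrapper: GET /?name=express returns the rank as JSON.
const server = http.createServer((req, res) => {
  const url = new URL(req.url, 'http://localhost');
  const body = JSON.stringify(lookupRank(url.searchParams.get('name') || ''));
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.end(body);
});

// Only open a socket when a port is configured, so the file can be
// loaded by a test without side effects.
if (process.env.RANK_PORT) {
  server.listen(Number(process.env.RANK_PORT));
}
```

Keeping the logic in a plain function and the transport in a thin wrapper is one way to get the "easy to verify, easy to change" property: the function can be tested directly, and the wrapper can be swapped for a message queue or another protocol later.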
Things look pretty simple and straightforward from a user interface perspective. How are things behind the scenes?
Richard Rodger: Dreadful! Like I said, the technical debt is at loan shark interest rates. I’ve been somewhat busy with a new baby in the family this year, which is one of the reasons Peter is coming onboard to whip things into shape.
There is one key challenge we have to resolve. The search engine we use is elasticsearch, which is a really scalable Java-based open source search engine. However, it’s a bit over-the-top for our needs, and we can’t seem to tweak the boosting to get it to work properly. The issue is that search results tend to be dominated by the most popular modules, even for less relevant terms - try http://nodezoo.com/#q=foo - the search results are dominated by modules that use the word “foo” in the documentation. We’re looking at rolling our own minimalist engine using LevelDB and all the good stuff Dominic Tarr and Rod Vagg have been doing.
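The tension between popularity and relevance can be shown with a toy scoring function. This is not how elasticsearch computes scores and not Nodezoo's implementation: the sample modules, the term-frequency measure and the `popularityWeight` parameter are all assumptions chosen to show the effect.

```javascript
// Hypothetical index entries: a popular module that merely mentions "foo",
// and an obscure module that is actually about "foo".
const modules = [
  {
    name: 'express',
    text: 'fast web framework whose docs use foo as a placeholder',
    nodeRank: 6,
  },
  { name: 'foo-utils', text: 'small foo helper', nodeRank: 1 },
];

// Fraction of words in the text that match the search term exactly.
function termFrequency(text, term) {
  const words = text.toLowerCase().split(/\W+/).filter(Boolean);
  if (words.length === 0) return 0;
  return words.filter((w) => w === term).length / words.length;
}

// Relevance multiplied by a popularity boost. A large weight lets
// popularity swamp relevance; a small one keeps it as a tie-breaker.
function score(mod, term, popularityWeight) {
  const relevance = termFrequency(mod.name + ' ' + mod.text, term);
  return relevance * (1 + popularityWeight * mod.nodeRank);
}

function search(index, term, popularityWeight) {
  return index
    .map((m) => ({ name: m.name, score: score(m, term, popularityWeight) }))
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((r) => r.name);
}

console.log(search(modules, 'foo', 5));   // heavy boost
console.log(search(modules, 'foo', 0.5)); // light boost
```

With the heavy boost the popular module comes first even though "foo" is incidental to it; with the light boost the module that is actually about "foo" wins. Tuning that balance is the kind of problem described above.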
We’re always curious about feedback people receive about their projects. Have you heard from any users?
Richard Rodger: "Make it faster" is the main thing we get. Hence the move to a pure Node.js system! And there’s a lot of feedback about relevancy - we have some doozies in there! There are lots of module search engines all trying various approaches, so Node.js developers will end up with a good search engine one way or another.
Have you built all the functionality you wanted into Nodezoo, or is there more to come?
Richard Rodger: The real thing we’re trying to do is give you a selection of modules that you can then eyeball quickly. It’s less about the perfect module coming first, and more about the scan-ability of the results.
A search engine is pretty simple in terms of UI - we want to keep it that way. It’s all about increasing the relevancy of results.
Do you have any other Node.js projects in the pipeline?
Richard Rodger: We’re a consultancy, so we tend to work as a team on projects that help us deliver for clients. The two big ones at the moment are TacoDB (a Node.js-size database) and Seneca (a toolkit for micro-services).
Also I’ve just bought a soldering iron, so who knows! Time to disappear again for a few days and return with some more madness…
In the spirit of blatant interviewee self-promotion, I should also mention our other little project http://nodeconf.eu/ - there will be Vikings!
Sounds fun! Richard, thanks for chatting with us! For anyone interested in Nodezoo, you can go to GitHub and reach out to Richard there as well.
As always, if you have a cool Node.js project or product you think we should profile, reach out to us at firstname.lastname@example.org and we’d be happy to get you In the Loop.