Introducing reactive programming and declarative UIs in CoffeeScript

(Update: see also the Hacker News thread)

(Cross-posted from the Infer Engineering Blog)

At Infer, our engineering team faces no shortage of technical challenges, but it’s not all just about data lifting and machine learning. Everything always comes back to the user, whether it’s helping companies better understand their customers through our dashboards, or providing the visual tooling necessary for our own team to build models more productively. We build these out as web-based user interfaces of varying complexity.

Today we’re announcing the release of a new tool that will hopefully make building these web app frontends a bit more expressive and fun. It is a small, lightweight CoffeeScript library that provides two sets of functionality:

  • a set of primitive data structures for reactive programming
  • an HTML template DSL for building dynamic, reactive web UIs
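
To make the reactive idea concrete, here is a minimal sketch in Python rather than CoffeeScript. It is not the library's API: the names Cell, bind, and render_greeting are hypothetical, chosen only to show a value that notifies its dependents and a toy "template" that re-renders when its input changes.

```python
class Cell:
    """A mutable value that notifies subscribers whenever it is set."""

    def __init__(self, value):
        self._value = value
        self._subscribers = []

    def get(self):
        return self._value

    def set(self, value):
        self._value = value
        # Copy the list so callbacks may subscribe during notification.
        for callback in list(self._subscribers):
            callback(value)

    def subscribe(self, callback):
        self._subscribers.append(callback)


def bind(fn, *sources):
    """Return a Cell holding fn(*source values), recomputed on every change."""
    derived = Cell(fn(*(s.get() for s in sources)))

    def refresh(_):
        derived.set(fn(*(s.get() for s in sources)))

    for s in sources:
        s.subscribe(refresh)
    return derived


def render_greeting(name_cell):
    """Hypothetical 'template': a derived cell containing an HTML string."""
    return bind(lambda name: f"<p>Hello, {name}!</p>", name_cell)


name = Cell("world")
html = render_greeting(name)
name.set("Infer")  # html is recomputed automatically
```

After the last line, html.get() reflects the new name without any explicit re-render call, which is the property a reactive UI layer builds on.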

Why another client-side framework? Surely with Angular, Ember, Knockout, and a long tail of other libraries, there must be something else that we could have run with!

In fact, we have used these other frameworks on various projects in the past. While we have deep respect for these trailblazers, we wanted to capture some of the lessons we learned from our own experiences. Our key goals with this library are:

  • simplicity: minimize magic and have foundations in a small set of primitives/concepts
  • scalability: in terms of both performance and application architecture


True Scala complexity

Update 1: See also the discussion over at Hacker News.

Update 2: Sorry for the downtime. Leave it to the distributed systems guy to make his blog unavailable. Nginx saves the day.

It’s always frustrating reading rants about Scala because they never articulate the actual complexities in the core language.

Understandable: this post is intended to fill that gap, and it wasn’t exactly easy to put together. But there’s been so much resistance to the very thought that the complexity exists at all, even from on up high, that I thought it would be constructive to provide a clearer illustration of why it’s real and how it manifests itself.

So, here goes yet another Scala complexity rant, from someone who labels himself as a Scala advocate. And since this tends to be an insanely touchy subject for some lonely people, please read every sentence as implicitly beginning with “In my opinion…” before you fire off those death threats.


What’s there to like about R?

Update 10/11/2011: There’s a good discussion on Reddit
Update 10/12/2011: Note manipulate package and highlight data.table package

The R statistical computing platform is a rising star that’s been gaining popularity and attention, but it gets no respect in the hood. It’s telling that a popular guide to R is called The R Inferno, and that advocacy pieces are titled “Why R Doesn’t Suck.” Even the creator of R had this to say about the language in a damning article suggesting starting over with R:

I [Ross Ihaka] have been worried for some time that R isn’t going to provide the base that we’re going to need for statistical computation in the future. (It may well be that the future is already upon us.) There are certainly efficiency problems (speed and memory use), but there are more fundamental issues too.

So why do people still use R? Would we lose anything if we just migrated to (say) Python, which many consider to be a major contender/alternative to R?

In this post I’m going to highlight a few things that are nice about R—not just in the platform itself, but in the whole ecosystem. These are things that you won’t necessarily find in alternate universes like Python’s.


Functional reactive programming for the web, or: where’s my Lunascript?!

Update: there’s some good discussion on Reddit.

Even when using the latest and greatest frameworks and disciplines, writing fast, highly interactive web applications involves a lot of accidental complexity:

First you need server code to figure out what data the browser needs. Hopefully you have an ORM layer, but you still need to carefully structure your code to minimize your backend dispatches, and you need to carefully keep that in sync with your front-end code lest you don’t fetch enough data or hurt performance by fetching too much. If it’s a Web 2.0-style app, you re-implement a ton of that server-side code in JavaScript, once for creating the page and then again as controller code for keeping it up to date and consistent. And when the user changes something, you bottle that up — typically in a custom over-the-wire format — and send it as an XHR to the server. The server has to de-serialize it into different data structures in a different language, notify the persistence store, figure out what other clients care about the change, and send them a notification over your Comet pipe, which is handled by yet more JavaScript controller code. Offline support? More code.

Strike a chord? That’s the motivation setting the stage for [Lunascript], a reactive programming framework for web apps. What?


Fast, native-C Protocol Buffers from Python

As of the latest release of Protocol Buffers (2.3), protoc --py_out generates only pure Python code. While PB can generate fast parsing and serialization code for C++, this isn’t made available to Python, and manually wrapping generated code amounts to very tedious maintenance work. This is a recurring feature request from the discussion group, but pure Python code was a higher priority because of certain client requirements—namely (according to the team), AppEngine.

Luckily, native code is slated for PB 2.4, and has been available in the svn trunk, so you’ve been able to use fast PBs for a while now. (We’ve been using r352 for some time and have not seen any problems.) The PB team has been understandably reluctant to specify any release date, but in my thread Kenton Varda does mention early 2011 as a rough estimate.

I haven’t seen this documented anywhere else yet, so hopefully this will be useful to others.
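
As a rough sketch of what opting in looks like from Python: in protobuf builds that ship the C++ backend, the implementation is chosen through the PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION environment variable, which has to be set before google.protobuf is first imported. That variable and the internal api_implementation module come from the protobuf package itself rather than from this excerpt, and the helper name select_protobuf_backend is hypothetical.

```python
import os


def select_protobuf_backend(preferred="cpp"):
    """Request a protobuf backend; return the reported one, or None if the import fails.

    The variable only has an effect if it is set before google.protobuf
    is imported for the first time in this process.
    """
    os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = preferred
    try:
        # Internal module: it reports which implementation was selected.
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    return api_implementation.Type()


backend = select_protobuf_backend("cpp")
print("protobuf backend:", backend)
```

Whether "cpp" is actually available depends on how the package was built, so it is worth checking the reported backend instead of assuming the request succeeded.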


Google’s lesson learned: distributed transactions matter

In retrospect I think that [not supporting distributed transactions] was a mistake. We probably should have added that in because what ended up happening is a lot of people did want distributed transactions, and so they hand-rolled their own protocols, sometimes incorrectly, and it would have been better to build it into the infrastructure. So in Spanner we do have distributed transactions. We don’t have a lot of experience with it yet.

–Jeff Dean, in response to “What was the biggest mistake?” at the end of his Stanford EE380 lecture (note: video has since been replaced with that of another lecture)

Infrequently asked questions on deterministic distributed transaction management

Updated since my original post: added plug; explained why centralization is still hard; noted the issues unaddressed by this work; clarified that Daniel is focusing on ACID.

My homeboy Daniel Abadi baits flames in true Stonebraker style in his latest blog post, proclaiming: “the NoSQL decision to give up on ACID is the lazy solution to these scalability and replication issues.” Then he shows you how he promises to pimp your database by imposing a deterministic serial ordering to all transactions.

Without further ado, it’s time for some Infrequently Asked Questions On Deterministic Distributed Transaction Management!

I’m busy. Gimme the executive summary.

Daniel introduces his new VLDB paper, where he says he’s got the distributed DBMS your distributed DBMS could smell like, if only you used a centralized, deterministic transaction manager instead of lady-scented body wash. Specifically, he proposes avoiding the dreaded two-phase commit by centrally controlling concurrency, and minimizing lock time with optimistic operation.
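
For intuition only, here is a toy Python sketch of the core idea: if every replica applies the same transactions in the same globally agreed order, and each transaction's logic is deterministic, the replicas end up in the same state without negotiating at commit time. This is not the protocol from the paper; Sequencer, Replica, and Transfer are hypothetical names, and the sketch ignores locking, optimistic execution, and failures entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    src: str
    dst: str
    amount: int


class Sequencer:
    """Central point that fixes one global order for all transactions."""

    def __init__(self):
        self.log = []

    def submit(self, txn):
        self.log.append(txn)
        return len(self.log) - 1  # position in the global order


class Replica:
    def __init__(self, balances):
        self.balances = dict(balances)

    def apply(self, txn):
        # Deterministic rule: the outcome depends only on current state and txn,
        # so every replica aborts or commits the same transactions.
        if self.balances[txn.src] >= txn.amount:
            self.balances[txn.src] -= txn.amount
            self.balances[txn.dst] += txn.amount

    def replay(self, log):
        for txn in log:
            self.apply(txn)


initial = {"a": 100, "b": 0, "c": 0}
sequencer = Sequencer()
sequencer.submit(Transfer("a", "b", 80))
sequencer.submit(Transfer("a", "c", 50))  # insufficient funds: skipped everywhere
sequencer.submit(Transfer("b", "c", 30))

replica_1, replica_2 = Replica(initial), Replica(initial)
replica_1.replay(sequencer.log)
replica_2.replay(sequencer.log)
```

Both replicas finish with identical balances because the only input they need is the ordered log, which is the property the centralized approach relies on.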

I react.


gbookmark2delicious 3.0

gbookmark2delicious is a simple command-line tool that will synchronize your Delicious account against your Google Bookmarks account, effectively enabling a public feed for your Google Bookmarks. I recently rewrote it to work again (the new Google Bookmarks broke the old code), introduced bulk export and import for substantially improved performance, and ironed out encoding quirks/differences between the two services.
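
The shape of a bulk sync can be sketched as a pure diff between two exported snapshots. This is an illustration under assumptions, not the tool's actual code: plan_sync is a hypothetical helper, bookmarks are modeled as a dict from URL to a (title, tags) pair, and no real Google or Delicious API call is shown.

```python
def plan_sync(source, target):
    """Compute the changes that would make `target` mirror `source`.

    Both arguments map URL -> (title, tuple_of_tags).
    Returns (to_add, to_update, to_delete).
    """
    to_add = {url: bm for url, bm in source.items() if url not in target}
    to_update = {
        url: bm
        for url, bm in source.items()
        if url in target and target[url] != bm
    }
    to_delete = sorted(url for url in target if url not in source)
    return to_add, to_update, to_delete


google = {
    "http://example.com/a": ("A", ("news",)),
    "http://example.com/b": ("B (renamed)", ("dev",)),
}
delicious = {
    "http://example.com/b": ("B", ("dev",)),
    "http://example.com/c": ("C", ()),
}
to_add, to_update, to_delete = plan_sync(google, delicious)
```

Working from two bulk exports means the comparison happens locally, and only the resulting additions, updates, and deletions need to be sent to the target service.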

Not too long ago, Google Bookmarks introduced Lists, which got me excited because I thought they were a way to publish bookmark feeds, which would allow me to retire gbookmark2delicious. It turned out to be pretty different: Lists generate a feed for any updates to the pages in the List. (Plus, adding a new bookmark requires that it be explicitly added to the List, and the bookmarklet for doing so is IMO less usable than the regular Google Bookmarks bookmarklet.)


Locale-sensitive IO encoding in GHC 6.12

One thing to watch out for in the latest versions of GHC (6.12) is the new locale-sensitive text IO. For instance, when using EasyFilter to render Pandoc documents in WordPress, you must make sure you set the LANG environment variable to en_US.UTF-8 (e.g. in /etc/apache2/envvars), or it will typically default to POSIX and thus cause Pandoc to crash whenever it reads a non-ASCII character:

pandoc: <stdin>: hGetContents: invalid argument (Invalid or incomplete multibyte or wide character)

or writes one:

pandoc: <stdout>: commitAndReleaseBuffer: invalid argument (Invalid or incomplete multibyte or wide character)
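
When Pandoc is launched from your own script rather than from Apache, the same fix can be applied per process. The following Python sketch assumes pandoc is on the PATH; utf8_env and run_pandoc are hypothetical helper names, and the en_US.UTF-8 locale must actually be installed on the machine.

```python
import os
import subprocess


def utf8_env(base=None, locale="en_US.UTF-8"):
    """Return a copy of the environment with LANG set to a UTF-8 locale."""
    env = dict(os.environ if base is None else base)
    env["LANG"] = locale
    # Note: an inherited LC_ALL, if present, takes precedence over LANG.
    return env


def run_pandoc(markdown_text):
    """Pipe UTF-8 Markdown through pandoc with LANG forced to UTF-8."""
    result = subprocess.run(
        ["pandoc", "--from", "markdown", "--to", "html"],
        input=markdown_text.encode("utf-8"),
        stdout=subprocess.PIPE,
        env=utf8_env(),
        check=True,  # raise CalledProcessError if pandoc exits non-zero
    )
    return result.stdout.decode("utf-8")
```

Calling run_pandoc("Caf\u00e9") exercises both the read path and the write path that the two error messages above correspond to.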