Like many revolutionary changes in human history, it started with a flash of frustration. Today, Julia is ranked among the top programming languages, and is deployed by the likes of Amazon, Apple, Facebook, NASA, and Uber. But when its creators started building it nearly a decade ago, their goal was a lot smaller.
“We were really just building something for ourselves,” said Julia co-creator Stefan Karpinski, a Harvard Mathematics alum with a PhD in computer science from the University of California, Santa Barbara. (Karpinski also set a Guinness World Record in 2006 for the fastest single-fare journey across the whole of New York City’s subway system; he’s a determined guy who doesn’t like to waste time.)
The initial drive behind Julia was the desire for a programming language that combined the high-level functionality of MATLAB and R with the speed of C and the dynamism of Ruby—as Karpinski put it, “the best of all worlds.”
That such a language didn’t exist frustrated Karpinski, and he expressed the sentiment to his friend Viral Shah toward the end of his time at UC Santa Barbara. Shah, who had previously worked at Interactive Supercomputing for Alan Edelman—an MIT professor and world-renowned mathematician known in random matrix theory for the Edelman distribution of the smallest singular value of random matrices—and had since moved to Bengaluru, India, to work on a countrywide biometric identification project, agreed. So did Jeff Bezanson, a colleague of Shah’s at Interactive Supercomputing, as well as Edelman himself.
The four “spanned all the different areas of expertise,” Shah noted. Bezanson had a mind for compilers that dovetailed nicely with Shah’s computational science background. Edelman had the experience, and Karpinski the drive. Together, they wanted “a Goldilocks programming language—one that was high level and low level at the same time, depending on how you used it,” said Karpinski. The Goldilocks ideal gave way to the Julia project: an open-source, dynamic programming language that has since spawned a consulting company, Julia Computing, that has raised $4.6 million in seed funding.
In some senses, it was a return to romantic ideals for Karpinski, who had long dreamt of creating new languages but had been dissuaded from doing so by quirks of timing. He had taken a detour from the career path he said he wanted to follow in his 2002 graduate school application—to work in programming language design and implementation—in order to study statistical analysis and algebra.
“It was the Java winter,” Karpinski explained. “People weren’t creating new programming languages when I went to grad school.”
Seven years later, after earning his PhD, Karpinski found himself working as a data scientist and software engineer at Etsy by day, creating algorithms using MATLAB, C, and R that made personalized recommendations to shoppers on the site. At night, he spent hours in front of his home computer, trying to code a system that would end up replacing the programming languages he used at his day job. (Karpinski left Etsy in 2011, not long before Julia was introduced to the world.)
From the initial suggestion to create a new, fast programming language to the first commit, which was made in August 2009, the team moved quickly. “We didn’t spend a lot of time talking about it,” said Karpinski. “We had one thread of emails back and forth, then Jeff, Viral, and I said, ‘Let’s do it.’”
One of the biggest bones of contention at the start of the process was around using GitHub. Bezanson and Shah had an aversion to using the site, simply as a matter of personal preference. But Karpinski created a repository and, by his own admission, “foisted Git on those two.”
The first bits of code committed to Julia’s GitHub repository were recycled. Bezanson, whom Karpinski described as “a programming language hobbyist,” had small scraps of code lying around that made their way into the final version of Julia and helped shape its final form. The I/O runtime libraries that remain in the codebase today came from Bezanson, and Julia’s parser is written in Scheme, running on FemtoLisp, a small but fast Scheme implementation of his. “[We got] parser code and some ASCII art [from Bezanson],” explained Karpinski. “When you start up Julia, there’s a pretty ASCII logo colored with rings that I turned into our vector art logo; he had that from a previous incarnation [of a different language he had created].”
Much of Julia’s original codebase was written in Scheme, with some parts in C. But as the group got closer to developing a usable, stable version of the language (the team expected to beat the oft-cited rule of thumb that a new language takes 10 years to mature, but getting to version 1.0 ended up taking nearly that long), they were able to use more and more of Julia itself to build Julia.
When you decide to create a programming language from scratch, you come across countless key structural questions that have to be answered before you can progress, each with the potential to waylay the language if there’s a single misstep. Nailing down the core principles of Julia provided some of the team’s most difficult moments—including one tough debate over Julia’s use of multiple dispatch, a way of organizing code in which the method that runs is chosen by the runtime types of all of a function’s arguments, not just the first. When Bezanson proposed using multiple dispatch, he got a mixed response. Karpinski had read plenty on multiple dispatch, including Craig Chambers’ papers on the topic, but had never thought about using it in the domain of numerical computing. Shah was open to the idea, “but he’s more of a parallel computing and numerics guy,” said Karpinski. In the end, Bezanson’s idea triumphed.
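The idea can be sketched in a few lines of Python. This is a hedged illustration of the concept only—the `dispatch` and `call` helpers below are hypothetical, not Julia’s machinery—but it shows the core move: the implementation is looked up from the types of every argument, rather than from the first argument alone as in single-dispatch object-oriented languages.

```python
# Minimal sketch of multiple dispatch: a registry maps
# (function name, tuple of argument types) -> implementation.
_registry = {}

def dispatch(*types):
    """Register an implementation for a specific tuple of argument types."""
    def register(fn):
        _registry[(fn.__name__, types)] = fn
        return fn
    return register

def call(name, *args):
    """Select the implementation matching the runtime types of ALL arguments."""
    fn = _registry[(name, tuple(type(a) for a in args))]
    return fn(*args)

@dispatch(int, int)
def add(a, b):
    return a + b

@dispatch(str, str)
def add(a, b):
    return a + " " + b

print(call("add", 1, 2))                      # 3
print(call("add", "multiple", "dispatch"))    # multiple dispatch
```

In Julia itself, this lookup is built into every function call, so defining a new method for a new combination of types is a one-line affair.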
“[It] turned out to be a good way to express a lot of things, but we were struggling for a long time with numerical conversions and promotions,” said Karpinski. While most languages have built-in integer and floating-point types, writing rules that lay out exactly what should happen when dealing with them in the specification of the language doesn’t allow for easy scaling when adding support for complex numbers. “It seems like a little thing, like what’s the big deal?” said Karpinski. “But a huge portion of the specifications for programming languages is all about these little details.”
The team also debated how to express Julia’s promotion and conversion system. The breakthrough was realizing that they could use generic functions coupled with the multiple dispatch system they had decided to promote to solve the problem. After that, Karpinski said, the rest fell into place.
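That breakthrough can be sketched, again in a few hedged lines of Python rather than Julia’s actual promotion machinery: instead of the language spec enumerating every mixed-type rule, one generic function promotes both operands to a common type and then retries the same-type case.

```python
# Sketch of the promote-then-operate pattern (illustration only):
# one promotion rule per type pair, one generic fallback for mixed types.

def promote(a, b):
    """Convert both values to their common type (int -> float -> complex)."""
    order = {int: 0, float: 1, complex: 2}
    target = max(type(a), type(b), key=lambda t: order[t])
    return target(a), target(b)

def add(a, b):
    if type(a) is type(b):
        return a + b              # same-type case: ordinary addition
    return add(*promote(a, b))    # mixed types: promote, then retry

print(add(1, 2.5))       # 3.5
print(add(2, 1 + 1j))    # (3+1j)
```

Adding a new numeric type then means writing a handful of promotion rules, not rewriting every arithmetic operation—which is exactly why the little details stopped being a huge portion of the specification.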
Their twinned approaches to building Julia—believing that their little programming language would likely never amount to much, while also maintaining a level of professionalism born from personal pride in the project—meant that when it was released to the world, Julia subverted expectations. The team’s goal of creating a hobbyist language that could combine speed with depth had initially been met with skepticism; Shah recalled people saying, “It’s pretty cool—but you know you’re doomed, right?” In reality, the language garnered a wholly different response. “People took a look at it and said, ‘Oh, hey, this is real, and these guys have great performance numbers,’” Karpinski recalled.
It was those performance numbers that first drew programmers’ attention to Julia. Programmers who were previously core R and MATLAB users began to dabble with Julia, translating simple code with loops into the new programming language and finding that they could get enormous speed gains—in some cases, up to 200 times faster than in the original languages. “Those people were our early converts—people who came for performance,” said Karpinski.
But no matter how fast a programming language is, users won’t stick around unless there’s a fully formed feature set. And it was Julia’s multiple dispatch system—one of the many forks in the road that the development team had to decide upon when first sketching their plans for the language—that proved to be its most popular feature. They came for the performance, but “ended up staying for the features, and one of the biggest innovations was multiple dispatch,” Karpinski said.
“Julia isn’t by any means the first language to have [multiple dispatch],” Karpinski added, “but the closest language to the mainstream to have this is Dylan.” Dylan is the programming language developed by Apple in the early 1990s for its abortive Newton PDA. That tie-in didn’t bode well for Dylan, whose fate was intrinsically linked with the hardware; the Newton was considered such a mainstream failure that it was even the butt of a joke in a 1994 episode of The Simpsons. (In the episode, Kearney, the school bully, asks his friend Dolph to take down a memo on the Newton using the PDA’s notoriously janky handwriting recognition software—and the note to “Beat up Martin” is transformed into “Eat up Martha.”)
While Dylan flopped, largely due to the failure of the device for which it was developed, something clicked for Julia when the initial version was released to the public on Valentine’s Day in 2012. “It’s not complete, but it’s time for a release,” the team said in a blog post announcing the launch. “If you are also a greedy, unreasonable, demanding programmer, we want you to give it a try.”
And try it programmers did. Around 200,000 people read the blog post in the first couple of days, and programming communities began linking to the mission statement and encouraging others to download it and test it out. The blog post hit the top of Reddit’s programming subreddit and the front page of Hacker News. The reaction was a vindication of the team’s hard work—and a pleasant surprise.
“That was huge. Before that, it was still just a hobby project, [with us] thinking no one would care about it. After that, everything changed: We started getting invites to various places, and there was a while when I was doing a lot of programming language conference talks,” Karpinski said. By January 2018, Julia had been downloaded more than 1.8 million times.
The distribution of that first iteration of Julia had another benefit: By pushing out the language to the wider world, the open-source project that had for months been the preserve of only a handful of people was suddenly being critiqued by thousands of people at once. Some of those users became contributors, adding their expertise to the development of the project—and Julia’s contributor base rapidly expanded. “We had close to 100 contributors by the end of 2012,” said Shah. (They’re now up to 680.) “Without those contributions, I think it would be very hard for Julia to be what it is.”
With the benefit of hindsight, the team has a better understanding of why the language managed to do so well. “It hit the right niche,” Karpinski said. “We really hit the timing for numerical programming becoming a huge deal.”
When the group started their work on Julia, GUI-based programming still dominated textbooks and classrooms across the world. Now, machine learning, AI, and big data are front and center, and they require programming languages with enough heft to be able to handle them nimbly. It’s why the UK insurance company Aviva uses Julia for calculating risks on policies, and why investment firm BlackRock analyzes data with Julia. It’s why Julia runs on the world’s sixth-largest supercomputer alongside one of the planet’s largest astronomy applications at the National Energy Research Scientific Computing Center. “That’s all in our wheelhouse,” Karpinski said. “Ten years earlier, people wouldn’t have cared as much.”
Originally designed and developed to suit the particular needs of a small group of programmers, little more than a personal passion project, Julia has since been taken up—and expanded upon—by any number of individuals with a broad range of interests. “We have people who are scientists, chemists, physicists, all of whom start tinkering with the insides, changing the way algorithms work, and writing incredibly clever things,” said Karpinski. “We’ve got more than a few physicists and MRI researchers working on compiler internals, which is sort of shocking. Piercing that veil between the user and the developer has been really eye-opening.”
But now that that initial wave of success has subsided, the team has had time to think about the longer-term impact of the language. “Now we’re in the transitional period from being the hot new language that’s trending with people who like trendy new programming languages to [being] in the mainstream,” said Karpinski.
The success has been such that the foursome joined forces with two others to create Julia Computing, the commercial advisory arm for the language. Setting up the commercial arm was a positive but nerve-racking moment, Shah said. Indeed, he pegged it as the most difficult time for Julia: “It was less to do with Julia and more [to do with] how we can make it self-sustaining. We knew it was the right time, but at the same time, it was scary to imagine doing that.”
Shah and Karpinski agree that the work is never finished. Julia is constantly evolving, buoyed by its open-source ethos and the broad range of voices in its contributor base. “They enrich Julia in ways we could never have imagined ourselves,” said Shah.
“What holds us together is the goal of building the best possible numerical, mathematical software out there, far better than anything that exists today,” Shah added. “The goal is so big that nothing else matters in comparison.”
That raises the obvious question: whether they’ve reached that goal yet. Not a chance, said Shah. “It’s a work in progress, right? It’s forever a work in progress. By definition, I don’t think we’ll ever reach it, but I think we are further along than almost anyone else.”