Almost every large software project ends up using a number of domain-specific languages (DSLs). Initially, these are probably DSLs associated with tools others have written, such as Skylark rules or SQL queries, but as a project grows it can become useful to write your own DSL that’s suited to your specific needs. A good homegrown DSL can save you an enormous amount of time, but anyone who has had to interact with a bad one knows that it becomes an exercise in frustration.

While there are many factors that go into writing a good DSL, one that’s rarely discussed is how capable it needs to be. In my experience, most DSLs end up being Turing complete by default. Users often don’t need Turing completeness in their DSL, since by definition they’re already intended to be limited in scope, and can get some real benefits by avoiding it, from more predictable resource usage to curtailing misuse. Here, we’ll discuss what Turing completeness means, how to build a language that isn’t Turing complete, and what benefits you can gain by not having a Turing-complete language. We’ll also consider some example domains where using these techniques in a DSL can be particularly useful.

Note that I’m going to be covering this topic with a practical rather than a theoretical approach. Not all of the benefits discussed necessarily follow rigorously from a lack of Turing completeness, but the discussion almost always applies to the real-world DSLs people actually create.

Turing completeness

We’re going to discuss Turing completeness from a practical point of view, not a theoretical one, so we won’t worry about the formal definition of Turing completeness. Practically, what you need to know is that a Turing-complete language (also called a universal language) is one where you can compute anything that any other computational method can compute. In other words, a language that’s non-universalor Turing incompletehas some limits on the set of things that it can compute.

Arbitrarily removing some operations isn’t enough to render a language Turing incomplete, because in many cases the omitted operations can be replicated by some combination of the remaining ones. In the extreme, there are languages, from the Lambda calculus to Whitespace, that are so limited that they’re quite impractical to use, but because their operations are chosen carefully, they’re nonetheless universal. The key to making a language Turing incomplete is to omit some operations that are vital for universal computation. There are a number of ways that you can do this, but in practice most languages employ one of two options:

Use a tool that only allows Turing-incomplete languages

The less common method to prevent a language from being Turing complete is to implement it using a code generation tool that only implements a less powerful computational model. The main gotcha here is that most such tools allow the insertion of arbitrary code into the generated code, which gives you an opportunity to escape the confines of the restricted computational model. If those features are only used to do things that don’t affect the computation being carried out, such as set bookkeeping variables or print output, the result will still be Turing incomplete. The two most common classes of tools for this use are state machine generators and parser generators.

State machine generators do just what they say on the tin: They generate code that implements a finite state machine. Finite state machines are simply lists of states and rules for each state that say what state to switch to based on an input character. The languages that can fit into this structure are quite limited. An example of a commonly used state machine generator is Ragel.

A more advanced option is a parser generator, like Yacc or ANTLR. These tools usually generate code that implements a deterministic context-free grammar, which is equivalent to a finite state machine that can call itself recursively. Parser generators are predominantly used to produce abstract syntax trees for programming languages, so they are naturally used for parsing DSLs. If a DSL’s behavior can be implemented by taking simple bookkeeping or output actions as the language is being parsed, it will be Turing incomplete.

Restrict the availability of unbounded loops

Most commonly, non-universal languages restrict the ability of the program to execute unbounded loops.Unbounded loops” here means any method of repeating the same statements without knowing how many iterations will occur before the loop begins. This includes straightforward constructions, like while loops with a condition, but also less obvious constructions, like recursion. This doesn’t exclude all forms of looping; in particular, a non-universal language can still loop a fixed number of times, such as in a Python list comprehension.

This is usually the more fruitful approach. Unbounded loops are most frequently used for event loops, approximation functions, manipulating recursive data structures, and other constructions that are usually outside the scope of DSLs. Users of such languages can also understand and work with this restriction easily, since its absence tends to be conspicuous.

Arbitrarily removing some operations isn’t enough to render a language Turing incomplete, because in many cases the omitted operations can be replicated by some combination of the remaining ones.

Benefits of non-universal languages

Using a non-universal language has a number of benefits, although, as noted above, the benefits we’re interested in don’t all strictly follow from the definition of Turing completeness. Rather, we’re likely to end up with them based on the ways that Turing-incomplete languages are typically implemented.

More predictable resource usage

The main theoretical benefit that comes from using a non-universal language is that you can craft a language in which all programs terminate in finite time on all possible inputs. Note that this does not mean that every program in a non-universal language halts (as a trivial example, a language with no instructions except loop-forever isn’t Turing complete). It means that it is possible to eliminate non-terminating programs if you craft your language properly.

Making a language where you can’t write a program that doesn’t terminate is comforting, but finite isn’t a very restrictive limit. With some effort, though, we can improve on that and get effective bounds on resource usage, chiefly execution time and memory usage.

If we employ the strategy of using a code generation tool, resource usage becomes significantly bounded. Finite-state machines always execute in linear time with constant additional memory usage. Most parser generators also generate programs that execute in no more than O(n3) time, and their space requirements are also no more than O(n3). Again, we need to be careful not to break those bounds by inserting additional complexity into the generated code.

If we employ the strategy of eliminating unbounded loops from our language, guaranteeing bounds is much more difficult, especially if the language resembles a typical imperative language. It can be done, however, if that aim is kept in mind during the language’s design. A language that resembles a Unix-style command line pipeline with a predetermined list of operators, for instance, can guarantee the resource usage of the individual operations it suppliesand thus of the entire program.

Even in cases where the resource bounds can’t be fixed, eliminating unbounded loops tends to make it harder to write programs that require deceptively large amounts of memory or CPU time. Recursion, in particular, is a common source of unexpectedly long-running programs that can be avoided with this strategy.

Improved analysis

Non-universal languages also lend themselves to analysis, both manual and automated, much better than their universal counterparts.

At the most basic level, constraining the available operations makes it easier for humans to wrap their heads around what programs do. This is not a panacea, as anyone who has had to struggle with an overly complex regular expression can attest, but on the whole it tends to improve the situation. The simpler the computational model, the more easily a person can simulate it in their head.

Also, because the halting problem and other such issues don’t necessarily apply any longer, a non-universal language allows for much more thorough automated analysis of programs than universal languages do. In the extreme, this manifests in languages like Coq, in which programs contain proofs of their own properties, including proofs of their termination on all inputs. Analysis at that level of sophistication requires a lot of effort at the language design stage, but we don’t have to go that far to reap the rewards; for example, you can generate valid and invalid input strings when given a regular expression or some other kinds of simple languages, and this type of information can be used to implement QuickCheck-style testing tools.

Focus on the domain

One final benefit that is relevant to all DSLs but is of particular significance here is that it can encourage users to only use the DSL for the domain for which it was intended.

DSLs are intended for use in a single domain, but if they’re convenient enough, users will start to use them for neighboring or even completely unrelated domains. That usually causes problems, because if usage in new areas becomes significant, there will be attempts to adjust the language to improve its suitability in those areas, often to the detriment of its original purpose. Going far enough down that road results in an ad hoc, poorly designed general-purpose language with entrenched usage, which isn’t good for anyone.

Turing-incomplete languages benefit in this area by providing a strong signal that using them outside their domain is discouraged. For many programs, it can be literally impossible to express them in a non-universal languageand even when it is possible, the restrictions will usually make a potential misuser pause. This helps keep the language focused on its original purpose.

Example domains

Now that we’ve looked at how we can produce a Turing-incomplete language and surveyed the chief benefits, let’s consider some domains where these tools can be applied.

Configuration languages

One area that’s particularly well suited for non-universal DSLs is configuration. When configuring a system, you often just need some combination of dictionaries and lists, which allows you to use a format like YAML or JSON quite straightforwardly. If this satisfies your needs, then you not only get the benefits of being non-universal, you also reap the benefits of having no computation at all.

However, it can be useful to have your configuration system include a bit more expressive power. For example, suppose you have a job-scheduling system for running Java servers. You need to be able to set both the overall memory limit of the container and the Java heap memory limit via a command line flag. Rather than having to change both values whenever you grow or shrink your container, it’s nicer to measure the non-heap memory usage and then calculate the Java heap limit from that. The general situation where one configuration value depends on some others is very common, and frequently motivates the change to configuring systems with a programming language.

A common solution to this type of issue would be to generate the final configuration using a scripting language like Python, but an alternateand sometimes bettersolution is to use a DSL and limit the computational power of that language. Eliminating unbounded loops is a good strategy here. This is especially true if you support some limited loop constructs like list comprehensions, which cover most of the common looping operations in configurations (e.g., updating a property for an entire list of objects or summing up a property across a collection). In my experience, there are few or no places in most configurations that require unbounded looping, and they tend to be among the more troublesome.

The major benefit you achieve in this area is improved analysis. Many SREs or DevOps people have horror stories of configurations produced by general-purpose languages that were so complicated that it required a major effort to figure out why a given command line flag had been set to a given value. In most cases, a non-universal language can save you from your worst impulses by making the connections between items much clearer, sometimes clear enough that you can make the language implementation (or a separate debugging tool) track the source of values and report on what inputs went into them, or analyze a program and report which portions of the output could be affected if an input value were to change. It also serves to discourage you from putting business logic into the configuration when it should be in the software being configured.

Query languages

Another area ripe for non-universal DSLs is query languages. Even more than with configuration languages, this is an area where it’s important to be careful, since adding relatively complicated computational features can seem quite attractive. However, you can easily stray into universal territory without knowing it. SQL:1999, for instance, is Turing complete due to the fact that common table expressions can recursively call themselves, thus allowing unbounded looping, even though the rest of the language is not.

Again, limiting looping is an important strategy, but the most common way to go about it in this domain is to carefully curate and structure queries in such a way that the only loops available are implicit. Basic SQL SELECT statements without subqueries, proprietary extensions, and so forth, are a good example of this: Although they can loop over the data in various ways, those loops are all contained inside specific operations like GROUP BY and joins, rather than a general-purpose loop construct.

As an example, suppose we’re writing a query system for a logging storage system. We might decide to create a language that consists of a pipeline of processing stages in the style of the Unix command line. If the operations available for a stage are limited in how many passes over the data they can make (and most typical operations can be done in a single pass), then we can show that no program can loop forever. Most operations that one would want to do on log data, such as filtering or aggregating across time, can be easily phrased in this pipeline style, so despite this restriction, we can still query the logging system effectively.

Here, one of the largest benefits is predictable execution time. If your query language doesn’t allow the programmer to express any computation that can’t be done in a single pass over the data, then execution time is obviously bounded. Even in cases where you allow joins or other operations that are nonlinear in nature, their execution time can be bounded if chosen properly.

Another benefit that’s not as immediately obvious is the ability to ensure that programs in the language can be sharded across tasks or machines. Using our logging query language example, if each pipeline stage produces sequences of records that are processed by the following stages, then it’s easy to envision a distributed system in which each pipeline stage is performed by a single task and its results are then streamed to the next stage. If the individual pipeline operations are also distributable, those stages can again be distributed across multiple tasks if necessary. This is far more difficult if the language allows arbitrary combinations of operations.


As we’ve seen, DSLs that aren’t Turing complete come with a number of advantages. There’s certainly a downside, as by definition there are things that these languages simply cannot compute. But the advantage of applying these techniques to DSLs is that DSLs are also designed to be limited in their scopeand, as I hope I’ve illustrated in my examples, there are domains where non-universal languages can be useful. So the next time you’re embarking on writing a DSL, I hope you’ll consider limiting its power and taking advantage of the benefits that doing so can bring.