A license to share – Increment: Open Source

Do you care to know the location of all the streetlamps in Minneapolis? The mission number and details of each Soyuz capsule launch since 1967? Or what kinds of pictures of coffee beans you can use in a blog entry?

The answers are available thanks to copyright licensing that allows individuals, institutions, companies, governments, and community-driven projects to share work of all kinds within a standardized legal framework, often with few conditions attached other than providing credit to the source.

Licenses make possible the idea of a digital commons: freely available digital resources like civic maps, encyclopedia articles, and photo libraries. Like the historic commons of medieval England (which referred to shared pastures, forests, and hay meadows), these online resources are a shared public good. Unlike their real-world forebears, however, digital commons can’t be depleted. (Ecologist Garrett Hardin’s 1968 article “The Tragedy of the Commons” made the dilemma of shared but limited public goods famous, though it’s since been debunked as historically inaccurate, and Hardin himself exposed as a white supremacist.) The most popular licensing organization, the nonprofit Creative Commons, picked the name to illustrate the point: No matter how much water we draw from this collective well, there’s always more.

Everything we create is automatically protected by copyright, which by default reserves use to the creator alone. That means that adopting a license for sharing is the only reliable way to ensure that others can use what we’d like to give them. Open-source software paved the way for these licenses by providing behavior to model, language to use, and a history of hard, early lessons from which to learn. OSS and the digital commons are joined at the hip, but the differences between code and other forms of creative expression required a different approach, one that, decades later, remains consonant with its open-source origins.

The joy of the commons

The broadly defined community around freely licensed work organized in earnest in the 1990s alongside the open-source software movement and picked up steam at the turn of the millennium. It drew inspiration from licenses used by GNU and open-source projects (in some cases, using identical or derivative versions), and its driving force was the same: to turn knowledge into something freely shared that could benefit all, instead of keeping everything inside walled gardens with complicated rules of admission.

Wikipedia, for example, emerged directly from the open-source space. “People coming out of that world and that philosophy started Wikipedia with those principles built into place,” says Stephen LaPorte, the legal director of the Wikimedia Foundation, the nonprofit that oversees Wikipedia. “What you contribute belongs to the public, available for anyone to use.”

To appear alongside and within GNU and open-source projects, freely licensed creative work had to have the same sort of copyright licensing. That could include versions of so-called “copyleft” licenses, which permit the free use of copyrighted material so long as any derivative works are shared under the same terms. The kinds of creative work covered by these licenses could include a photo captured on a smartphone and posted to Flickr with a license that lets anyone use it, even if someone were to sell a poster of the image in question with an inspirational message overlaid. They could also encompass the entire corpus of OpenStreetMap, a massively detailed global geographical database with over 40,000 monthly active contributors, some of whom are dedicated volunteers and others of whom work for corporations that benefit from its growth.

It could include academic papers, too, as the cost of accessing the online and print journals that publish them has increased at a staggering rate. This is especially galling since so much academic work is underwritten by government funding and foundations, yet remains hidden behind paywalls. The University of California system recently canceled its subscriptions to the massive Elsevier journal network in part because it couldn’t get Elsevier to agree to publish all UC papers freely. Meanwhile, the Public Library of Science (PLOS) has over 215,000 peer-reviewed articles freely available for reading and reuse.

But in order for digital commons such as Flickr, OpenStreetMap, or the Directory of Open Access Journals to exist, people have to make explicit, legally defined contributions to them. This might seem trivial, but it requires clearing two separate legal hurdles. First, almost no individual or small business can afford the legal expense of creating a content license from scratch. And while big corporations can, they have little motivation to spend money to share content as opposed to software code. After all, in the world of OSS, corporations can utilize the output of the ecosystem they’re contributing to. There’s rarely as distinct a benefit to participating in free digital works.

Second, many companies and organizations can neither contribute to nor make use of creative content without a vetted license, and custom licenses that require individual review incur greater legal expense. As such, organizations may reject them out of hand. Customized or not, content licenses are particularly complicated documents, even compared to GNU/OSS ones, because of the myriad ways in which creative work—especially visual work—may be put to use. But with the spread of standardized, free-to-use content licenses, anyone—from an individual to a global corporation (think Apple and Amazon)—can make use of them in the same way one would use OSS code. Billions of pieces of work already exist under such broad usage agreements. But it all starts with consent.

You need a license for that

Information wants to be free, to quote half of Stewart Brand’s mid-1980s formulation, and that assertion has only become more accurate since then. People freely exchange information on a vast scale, while the cost of disseminating information in a cloud-storage world has become incidental. The other, less-quoted part of Brand’s maxim is that information wants to be expensive: It has a value attached, though not necessarily a fee.

Some data might always be proprietary and strictly licensed, because its cost of acquisition or its commercial value—or both—is so high. Scarcity is a drug, and many companies base their business models on keeping information they gather or create as dear as possible.

But only a relatively small subset of all new work both has a reason to be scarce and possesses a cash value. (Beyoncé’s unreleased new album is worth significantly more than most buskers’ recordings.) Even when information is free as in free beer, that creative work faces the same structural issue that open-source and free software projects did in their early days: A license has to provide clarity to those who want to make use of what someone has created on their own or within collaborative projects like Wikipedia.

Most countries have adopted a framework under which copyright is granted upon a work’s creation—whether you write a poem, snap a picture, code a program, write an email, or compose a song. A copyright notice serves to inform other people of your right, but it isn’t required. (In the United States, you have to register a work with the U.S. Copyright Office to sue someone for misuse or stop them from continuing to use your work, but you can file at any later date.) The original design of copyright, however, didn’t foresee a time when individuals would be able to create hundreds of newly copyrighted works every day with a smartphone or laptop.

“No one can get rid of [copyright] without a lawyer,” says Ryan Merkley, CEO of Creative Commons, a global nonprofit that creates legal frameworks and offers technical advice to platforms for broadly licensing copyright. Even if you tell people it’s okay to use a given photo (or every picture you’ve ever taken), your work can exist in a gray zone without some legalese. Some individuals might take you at your word, but organizations like the Wikimedia Foundation or companies that want to use or distribute your photos—even with full credit—have to mitigate the risk of a lawsuit by having the terms and conditions spelled out in a standard way.

Take the informative case of photographer Carol Highsmith. She has dedicated a large part of her work to the public domain but sued Getty Images for reselling her photos without notifying buyers that they were freely available. The billion-dollar suit was ultimately dismissed at the federal level. Had she instead applied a permissive license, rather than relinquishing her copyright claim in full, she could have retained the right to prevent Getty’s commercial use of her photos while still making her work broadly available to the public.

A properly constructed copyright license allows you to pick and choose which rights you retain and which you give away. Such licenses can also grant projects to which you contribute work, almost always on a volunteer basis, the right to freely share what you’ve added to the collaboration. The point of creating such licenses? “You don’t have to talk to me to use my thing, ever,” says Merkley.

But while individuals can apply licenses to their work piecemeal, platforms and projects like WordPress or Medium can bake in support for a variety of licenses as a default or to automate the application of rights. This works for platforms that accept user contributions or that produce a collaborative result: Wikipedia and most wikis work this way. It also works for platforms that host user content, which allows users to choose a broad content license or to batch apply one to a number of works, shifting the default “all rights reserved” at the moment of creation or upload to “rights freely shared.” Blogging platforms, publishing content management systems, and photo-hosting sites, to name a few, can provide this kind of option.

While many sites rely on these sorts of principles, nearly every major and minor site that offers freely shared content has coalesced around one set of licenses: those developed by Creative Commons, which was in turn inspired by the OSS movement. One led to the other, and Creative Commons, along with one GNU license, now powers most of the creative work on the internet.

An open source of licenses

Since its founding in 2001, Creative Commons has made available what is now the bedrock of the digital commons: free, well-designed, and easy-to-understand licenses. The timeline is illuminating, explains CEO Merkley. Before the year 2000, no one had smartphones, web pages were largely static, and social media was still nascent. Yet, Merkley says, such licenses already existed: in the open-source and free-software movements.

Katie Lane, an attorney who specializes in creator-owned copyright law and freelancer contracts, says this sort of content is of a distinctly different nature than software or collaborative content. “Open-source software assumed you had lots of collaboration,” she says. Creative Commons took “some of those concepts and applied them to content where there was the potential for derivatives or the potential for people adding to it, but it wasn’t necessarily inherent in the content itself—like photographs.”

But, in something close to irony, a multiplicity of licenses covering copyright could have reduced sharing by creating confusion about how each of them worked. Even Creative Commons once fell victim to this: At the outset it offered several niche licenses, but Merkley explains that the organization stopped updating them as it released new versions of its core work. They remain valid.

“From the very beginning, Creative Commons contemplated interoperability, not just within the license suite but among other licenses,” Merkley says. This makes it far simpler for different organizations to use content in the same fashion, and enables algorithmic use of copyrighted material that’s properly tagged.

Since the advent of Creative Commons, alternative content licenses, like the Open Data Commons, have fallen away. Surveying hundreds of major and minor projects, I found that Creative Commons is nearly the only license in use. (In a few cases, software documentation falls under the same rubric as the open-source or free software license that covers the code, as with the Apache Software Foundation, which almost recursively licenses the contents of its website and documentation under the Apache License.) The organization’s success is due in part to the legibility of its licenses. The current revision of Creative Commons, version 4.0, offers options organized around four key choices, each of which it abbreviates to two letters (plus CC, for its name):

BY (Attribution)
Does the creator want to be credited?

SA (ShareAlike)
Do works that incorporate the material need to be shared under the same license?

ND (NoDerivs)
Or, in contrast, can the work not be modified in any form?

NC (NonCommercial)
Can the work, in its original form or a modified form, be sold or used for a commercial purpose?

These ideas are combined to form six standardized licenses. Creative Commons also released CC0 in 2009, which solves a problem with the “public domain” as it’s defined in some countries—the state of a work having no legal copyright protection. Another common approach uses a definition that Creative Commons calls Free Cultural Works, which encompasses different licenses that meet a four-point test: the right of use, of performance, of distribution, and of derivative works.
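How the four choices collapse into six licenses can be sketched in a few lines of Python. This is only an illustration, not anything published by Creative Commons: the function name `cc_license_names` is hypothetical, and the sketch assumes that attribution (BY) is part of all six licenses and that ShareAlike and NoDerivs never appear together, since one governs derivative works and the other forbids them.

```python
from itertools import product


def cc_license_names():
    """Enumerate the six license names under the assumptions stated above.

    Assumptions (labels for illustration only):
    - BY is included in every license in the suite.
    - NC is an independent yes/no choice.
    - At most one of SA or ND applies, because they are mutually exclusive.
    """
    names = []
    for noncommercial, derivative_rule in product((False, True), (None, "SA", "ND")):
        elements = ["BY"]
        if noncommercial:
            elements.append("NC")
        if derivative_rule is not None:
            elements.append(derivative_rule)
        names.append("CC " + "-".join(elements))
    return names


if __name__ == "__main__":
    for name in cc_license_names():
        print(name)
```

Two options for commercial use multiplied by three options for derivative works (no condition, ShareAlike, or NoDerivs) gives the six combinations; CC0 sits outside this scheme because it waives rights rather than attaching conditions.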

“If you are someone who wants to encourage sharing or wants to let your users choose something that encourages sharing,” Wikimedia’s LaPorte says, Creative Commons “is a great tool.”

The GNU Free Documentation License also gets used outside of software projects, because Wikipedia requires contributors to cross-license with the GNU license and CC BY-SA. This compatible cross-licensing allows for the inclusion of material with a single license that fits the philosophy or approach of a descendant or derivative work, which reduces complexity, and also allows some backward compatibility for older creative projects that relied on the GNU license.

Regardless of the specific license, any that allows other parties to share work without first seeking permission helps grow the digital commons.

Collaboration beyond open source

Open source didn’t just provide a licensing model for the emerging digital commons. It also provided a model for collaboration. Collaborative creative projects involve many people participating jointly in producing an aggregated, edited, and constantly evolving piece of work. The biggest examples are Wikipedia and OpenStreetMap.

As with open-source software, these sorts of projects have symmetrical licenses, in that contributors have to agree to provide the kinds of rights necessary for both individual portions and the entire dataset to be shared under a single, consistent license.

These efforts benefit greatly from people bringing their expertise or willingness to research to bear on focused areas. Ian Dees, a board member of the U.S. chapter of OpenStreetMap, notes that this allows areas of interest outside of the most heavily populated or most commercial to receive attention.

Some companies don’t “necessarily care about spending a lot of time or getting data perfect in the center of Africa,” Dees says. He notes that businesses often rightly focus on strategies that bring them the greatest return, and less populated and unpopulated areas don’t offer that. But OpenStreetMap members can jump in and make a big difference. For instance, some advocate for the addition of curb lines to the OpenStreetMap database (crucial information for accessibility); others have charted all the buoys in the sea. “No matter what you’re mapping, you’re not preventing someone else from mapping what they’re excited about,” Dees says.

Such contributions from volunteers wouldn’t be forthcoming if they were entered into a proprietary database, but members are happy to see their work incorporated into all the maps that make use of OpenStreetMap as a component.

Wikipedia has a somewhat different philosophy, since it can’t place information on different layers and at different scales of detail as OpenStreetMap can. The goal of its nonprofit operator, the Wikimedia Foundation, “is to collect the sum of all human knowledge,” says legal director LaPorte. The group pioneered the use of freely licensed content models.

However, Wikipedia has a notability test, which is used to make a determination about whether a topic or person merits an article. That test can be hard to pass for niche topics, a process complicated by conscious and unconscious bias, concerns which are frequently raised. But persistence and interest allow volunteer editors to make the case to the community. Sometimes the only researched, edited, and compiled information about a given subject can be found in Wikipedia.

“By using this license that allows other people to make improvements, we’re able to have a massively distributed collaborative project like Wikipedia constantly growing and constantly improving,” LaPorte says.

Collaborative projects benefit from the same mix of laser focus and 10,000-foot view that software projects do: Someone has to work on consistency across an entire system, like an architect designing an apartment building, while others can obsess over the fiddly bits, like the fixtures on the sink in apartment 103B.

GitHub, npm, and other code repositories and shared libraries share a philosophy with the hosted or platform model, where participants upload words, images, 3D models, and other material for broad distribution. In choosing to upload their work to a particular platform, people offering material in this way—say, posting pictures of a boat trip around the Scottish Hebrides to share with friends or random visitors—also allow anyone to reuse the images with attribution.

Some sites, like Wikimedia Commons, also run by the Wikimedia Foundation, act more like public repositories, and require contributors to choose one of several acceptable licenses, all of which allow free use for any purpose, and some of which require attribution. Many others allow users to choose from a variety of licenses, which can range from forswearing all rights to retaining all of them, with many varieties in between. Flickr, Medium, YouTube, Vimeo, and SoundCloud are some of the largest sites to support this sort of license choice.

Any contribution increases a commons, though not always the same commons: Each license can essentially create its own commons with its own rules, though some licenses are compatible with one another, creating a bigger pool of resources.

A rising tide lifts all boat photos

As every separate collection of freely licensed work grows, so too do the many projects that draw from one another. As individual museums, libraries, and other institutions have opened up their collections under a variety of sharing-friendly licenses, some have wound up being compatible with and then uploaded to Wikimedia Commons. From Wikimedia, they can appear in Wikipedia articles and far beyond: The Metropolitan Museum of Art’s CC0-licensed image of Van Gogh’s “Wheat Field with Cypresses” appears in its Wikipedia entry and was the most popular museum-item search on Creative Commons in mid-2017.

This is akin to shared libraries in the software world. Now, as physical libraries upload their contents to the digital commons, everyone can rely on the same “dependency” in their creative work.

To better connect the pools of the digital commons, Creative Commons launched its first software project, CC Search, in 2017 to index material across what Merkley says are nine million sites with CC-licensed content, and about 1.4 billion individual items. As of winter 2019, the index covered over 270 million works across 21 major collections.

So far, there’s no specific effort to create a repository of all work licensed for sharing. Wikimedia’s specific license requirements prevent it from including items that can’t be used commercially; the same is true for OpenStreetMap and the Internet Archive in their particular areas of interest. The Creative Commons search engine is for discovery, not storage.

That lack of a cloud-based central commons for material can lead to confusion about what (and where) the digital commons is. “A good amount of the commons [resides] on for-profit platforms that host it benevolently, for lack of a better term,” says Merkley.

This issue came up with the over 500 million CC-licensed photos on Flickr, which was acquired by SmugMug from Yahoo in early 2018. SmugMug reduced the amount of storage available to free accounts starting in February 2019, but, after discussions with Creative Commons, agreed to keep all CC-licensed images available, even on free accounts that otherwise exceeded the new limits—including newly uploaded images with those licenses.

The Flickr situation, though it was resolved, highlights a significant issue with “information wants to be free.” Costs may have become marginal for any given piece of data, but the collective price can still be high.

Software has utility, which makes it an easier sell for participation, generosity, and continued expansion. OSS and free software serve proven purposes that date back decades, including the pecuniary one of saving money on software licensing. The shared digital commons of media and creative work is still establishing itself, in part because creative endeavors can lack that critical bit of utility. Even with a trillion new pieces of writing and media created each year, people still have to take the leap and choose to freely license them. It’s also hard to measure whether 10 million high-resolution photos of visual art in galleries across the world produce a greater effect than 1 million such pictures.

The future of the digital commons, however, doesn’t depend on outcomes so much as goodwill. With no preset goal or expectation that creative works require a particular purpose and end, the commons will continue to grow, building on the generosity and effort of those who think the best works are those shared most broadly.
