How to practice privacy without slowing down

Privacy has never been so relevant. Just as the Industrial Revolution led to labor laws and safety regulations, so has the Information Age exposed opportunities for digital exploitation and prompted calls to better protect people and their data.

We’ve become increasingly aware of attacks on companies’ and individuals’ property, breaches of data, and poor privacy judgment calls that have made both users and employers squirm. Privacy has always been key to users’ faith in your service. This year, however, the right to privacy has gotten enough attention to warrant the passage of new laws and regulations around the world.

Shining a spotlight on the importance of privacy is a good thing, no question about it. But for small companies with resources stretched thin, adhering to new regulations can appear daunting. More practically speaking, these privacy laws can seem like unnecessary red tape. If startups are known for their open floor plans and disruptive energy, privacy regulations threaten a throwback to the days of cubicles and filing for permission—a rigid buzzkill to the transparent, sometimes unregulated work environment common among startups. However, by implementing privacy practices from the start, by design, you’re doing the right thing for your company, your product, your employees, and your users—and you’ll most likely save both time and money fixing problems down the line. This article explains how to enact privacy principles without sacrificing your startup culture and speed.

If you have the luxury of a security team with dedicated privacy experts, awesome. If not, that’s OK—we’ll go through how you can cover your bases, no matter how lean your company.

Is security the same thing as privacy?

Working with a security team or outside firm to keep your company assets safe? Make sure you prioritize privacy in its own right, too. Although they’re closely related, security and privacy are not interchangeable terms. Let’s go over the broad definitions.

Security

An information security program protects all of the informational assets that an organization collects and maintains. The security team usually focuses on protecting the company, primarily by preventing threats from gaining access to your systems. At a basic level, that means hardening your systems, evangelizing best practices, reviewing code for insecurities, and scanning for abuse or threat signals that need to be blocked.

Privacy

A privacy program focuses on the personal information an organization collects and maintains about its users. A privacy team is usually made up of privacy experts with backgrounds in engineering, law, or policy. Security is mostly about keeping threats out. Privacy, on the other hand, is about how we handle what’s on the inside: What do we do with people’s data, and how do we prevent unwelcome surprises for our users?

In practice, privacy requires a bit of engineering, some policy work, and program management. We won’t prescribe specific privacy guidelines in this article—that depends on your business and the types of data you handle—but we will offer some advice to guide you toward solutions.

Locate your sensitive data

To adequately assess your privacy needs, start by asking yourself some questions:

What types of personal data do you use, and for what purposes?
How do you store user data?
How do employees access user data?

These questions will help you understand the data flow at your company. Document your features, especially when there’s user data involved. Understanding your data flow—where each kind of data is stored, why it’s stored in a particular place, what it’s used for, and which downstream processes read it—is a great start.

If you’re a young company, you have a huge advantage here: You can understand where data lives and how it moves before the system gets complicated and potentially leaky. Especially if you have a monolithic system or a relatively small engineering team, you still know the ins and outs of the system. Before things get too complicated to draw, before your staff begins taking tribal knowledge with them when they transfer teams or companies, you have an opportunity to shield yourself, your users, and your employees from potential headaches and heartache. Study and document where and how data is collected, where and how it’s used, and where and how it can be accessed or halted en route.
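One lightweight way to document data flow is to keep the inventory as structured data rather than a diagram, so you can query it later. The following is a minimal sketch, not a prescribed format: the `DataFlow` type, the helper functions, and every system and field name in it are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFlow:
    """One documented kind of user data and where it travels."""
    name: str            # the kind of data, e.g. "email"
    stored_in: tuple     # systems that hold a copy
    purpose: str         # why it is collected
    readers: tuple = ()  # downstream processes that read it


# Hypothetical inventory; replace with your own systems and fields.
INVENTORY = [
    DataFlow("email", ("users_db", "mailing_service"),
             "login and receipts", ("billing_job",)),
    DataFlow("precise_location", ("events_queue",),
             "nearby search", ("aggregation_job",)),
]


def systems_holding(inventory, data_name):
    """Return every system that stores the given kind of data, sorted."""
    found = set()
    for flow in inventory:
        if flow.name == data_name:
            found.update(flow.stored_in)
    return sorted(found)


def data_read_by(inventory, process):
    """Return the kinds of data a downstream process reads, sorted."""
    return sorted(flow.name for flow in inventory if process in flow.readers)
```

With an inventory like this, "where does email live?" becomes a call to `systems_holding(INVENTORY, "email")` instead of a hunt through tribal knowledge.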

New regulations like the EU’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act empower users to request a readily accessible, easily readable copy of all the data you collect from them. These users can also demand that you delete it all. If you don’t know exactly where that data is, not only will you have to scramble to adhere to these rules on demand—which could take weeks or months of effort each time—but you’ll also face potential legal trouble if you take too long or aren’t adequately prepared to act.

Classify the data you collect

Once you document where user data enters and how it flows through your systems, the next step is to classify it. Few types of data are defined in the law at this point in time, so it’s mostly up to your company to establish boundaries around each type of data you collect. Consider the following when bucketing data.

Personal information or user data

User data matters because it reflects personal information: any information about a user falls into this broad category, and GDPR is centered on protecting it. The subclassifications below sort personal information by sensitivity, which makes it easier to manage user information and protect users’ privacy.

Personally identifiable information

Within the broad definition of personal information, there’s personally identifiable information, or PII. This includes names, addresses, and email addresses—anything that can call you out as a person.

Sensitive information

Another subclass of user data is sensitive information. Sensitive information is less strictly defined than PII. Instead of personally identifying a user, sensitive information is anything that could be used to enable fraud or cause harm to a user, like geolocation, passwords, and biometric data. While you might give out your name or phone number (both PII) to a stranger you’ve just met, you probably wouldn’t give them your password or fingerprint (both sensitive information). Note the distinction: PII reveals your identity, which is worth protecting, but sensitive information is often more top of mind for users.

Regulated data

Finally, there’s regulated data, which is governed by its own rules but is not necessarily enough to identify you personally or enable fraud. Health data falls into this category, as does financial data covered by the Sarbanes-Oxley Act (SOX). You might need to establish access and handling rules in this arena that are different from those in your other personal data buckets.

Chances are, you have some information about people floating around in your systems. Locate it, decide what’s personal information and what’s not, and decide how you’ll handle it. Knowing your own privacy landscape is the first step to building a strong privacy program.
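The buckets above can be written down as a small lookup table that the rest of your tooling consults. This is only a sketch under assumptions: the field names are hypothetical, and treating unknown fields as sensitive until someone reviews them is one possible default, not a rule from any regulation.

```python
from enum import Enum


class DataClass(Enum):
    PERSONAL = "personal"          # any information about a user
    PII = "pii"                    # identifies a person
    SENSITIVE = "sensitive"        # could enable fraud or cause harm
    REGULATED = "regulated"        # governed by its own set of rules
    NOT_PERSONAL = "not_personal"  # no link to a user


# Hypothetical field names; your own schema will differ.
FIELD_CLASSES = {
    "name": DataClass.PII,
    "email": DataClass.PII,
    "password_hash": DataClass.SENSITIVE,
    "geolocation": DataClass.SENSITIVE,
    "health_record": DataClass.REGULATED,
    "favorite_color": DataClass.PERSONAL,
    "app_version": DataClass.NOT_PERSONAL,
}


def classify(field_name):
    """Look up a field's class; unreviewed fields default to SENSITIVE."""
    return FIELD_CLASSES.get(field_name, DataClass.SENSITIVE)


def group_by_class(record):
    """Bucket a record's field names by their data class."""
    buckets = {}
    for key in record:
        buckets.setdefault(classify(key), []).append(key)
    return buckets
```

Defaulting to the stricter bucket means a newly added field is locked down until a person decides otherwise.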

Set retention rules

One of the easiest ways to level up your privacy program is to set default retention periods for each type of data you collect. Once set, these retention rules apply to all personal data you collect, no matter where you store it. If you’re not doing this already, make it a priority.

A good privacy program needn’t slow you down or block your dev team. Determine what your engineers or others need to be able to do with data and how long it will take, then set predetermined retention periods. Retain data for as little time as possible. That might mean days or minutes, depending on the data. For example, credit card information is helpful to retain in case of return or refund, while precise user location (especially that of minors) must be kept for no longer than the minutes it takes to aggregate the data.

The only exceptions are certain types of production data, which require longer retention periods, and backups kept for legal reasons or litigation purposes. Examples of the first case include user account information, which needs to be saved for the life of the user’s account, and financial or billing information, which typically needs to be saved for about seven years, depending on your jurisdiction. In the case of litigation or another legal requirement, you may have to keep certain data for longer periods of time as well. Back up only the data that’s needed, and create a different set of stricter access rules and policies for this data.

Establish data retention and deletion policies. Setting a time to live (TTL) for each type of data decreases the risk of mishandling. If the policy is to delete data after one week, you shouldn’t be keeping it anywhere for longer than one week. Consider what your engineers truly need in order to experiment and build, and erect bulwarks against data leaks, compliance breaches, and other slip-ups that can harm not only your users but your company’s business, trust, and reputation for years to come.
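A TTL policy like the one described above can start as a table of durations and a purge function that runs on a schedule. The sketch below assumes records are plain dictionaries with hypothetical `type` and `created_at` keys; the durations are illustrative, and seven years is approximated as 7 × 365 days.

```python
from datetime import datetime, timedelta

# Hypothetical retention periods per data type; set your own.
RETENTION = {
    "precise_location": timedelta(minutes=10),
    "debug_log": timedelta(days=7),
    "billing_record": timedelta(days=7 * 365),  # roughly seven years
}


def is_expired(data_type: str, created_at: datetime, now: datetime) -> bool:
    """True once a record has reached the TTL for its data type."""
    return now - created_at >= RETENTION[data_type]


def purge(records: list, now: datetime) -> list:
    """Return only the records still within their retention period."""
    return [
        r for r in records
        if not is_expired(r["type"], r["created_at"], now)
    ]
```

A data type that has no entry in `RETENTION` raises a `KeyError` here. That is deliberate in this sketch: it forces someone to pick a retention period before a new kind of data can be stored.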

Prioritize access control

Security is about protecting your kingdom; privacy is more about establishing governance within it. You need to figure out how to handle and store data safely, make sure it can’t be leaked, and let users know exactly what they can expect from you. As soon as you’re operating as a company, you need to be operating with privacy in mind. Given the global privacy climate and, you know, ethics, you can’t afford to skimp on safeguards if you handle any PII.

It’s OK if you don’t have a team of privacy experts. Maybe you don’t even have anyone dedicated to security. The important things are to make privacy a priority and to design for security—and to be mindful and explicit about the people and systems you will have to implicitly trust.

To establish a culture of privacy, build upon your data classification exercise results. People’s payments, locations, and other sensitive information should be private; unless an employee needs that data to do their job, they should not be able to access it. Document an access policy, lock down the data, and create rules about when and how that data can be accessed. Make it mandatory for employees to include a legitimate business reason for any data access requests, train your managers to review these requests carefully, and then set time bounds on the access—even when people do need to work with sensitive data, they shouldn’t have access to it indefinitely.

In short: Keep data somewhere safe and out of reach; don’t let just anybody come in and take it; keep track of who does have access to it; and set very strict expiration dates to decrease the risk of sharing, abuse, or contamination.

Test data vs. production data

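Don’t test on production data sets. Create a separate test data set, ideally synthetic or anonymized, so that real user information never reaches your development and test environments.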
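If you don’t have a test data set yet, generating a synthetic one is often the quickest route. Here is a minimal sketch, assuming a hypothetical user record shape with a name, an email, and a location; adapt the fields to your own schema.

```python
import random
import string


def fake_user(rng, user_id):
    """Build one synthetic user record containing no real personal data."""
    handle = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
    return {
        "id": user_id,
        "name": f"Test User {user_id}",
        # example.com is reserved for documentation, so no real inbox is hit.
        "email": f"{handle}@example.com",
        "latitude": round(rng.uniform(-90, 90), 4),
        "longitude": round(rng.uniform(-180, 180), 4),
    }


def make_test_dataset(size, seed=0):
    """Generate a reproducible data set; the same seed gives the same rows."""
    rng = random.Random(seed)
    return [fake_user(rng, i) for i in range(1, size + 1)]
```

Seeding the generator keeps test runs repeatable, which makes failures easier to reproduce without ever touching production data.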

Auditing and logging

Let people have access to data when they need it—but make sure you have a way to monitor all access.

There are two common ways to allow employees to access user data. The first is to require them to request permission every single time they want to touch it. In this case, you’re practicing the principle of least privilege: No one has the right to see or use any personal data by default. You’ll probably prevent some accidental data mishandling, and you’ll ensure that every single time personal data is being used, there’s a written reason attached. The drawbacks, however, are that you bottleneck your employees and put a strain on whoever stands at your privacy gate.

The other extreme is self-service: You let employees access whatever they want. In this case, you’ll need to rigorously track who’s accessing what. Keep logs of access to various types of data. Assign auditors to review these logs on a set schedule and verify legitimate business purposes. For any PII or sensitive information, make sure there are checks in place, like a break-glass option, a field for noting the business reason for accessing the data, and warnings that this access is audited. Add instructions about what the employee must do once they’ve accessed the data to make sure it’s not saved or copied.
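For the self-service model, the checks described above can be sketched as a wrapper that refuses to proceed without a business reason and records every allowed access. Everything here is an assumption for illustration: the class labels, the `read_with_audit` helper, and the in-memory list standing in for a real, tamper-resistant log store.

```python
from datetime import datetime, timezone

# Hypothetical labels for data classes that require a written reason.
AUDITED_CLASSES = {"pii", "sensitive"}


class AccessDenied(Exception):
    """Raised when an access request is missing a business reason."""


def read_with_audit(audit_log, employee, data_class, record_id, reason, now=None):
    """Append an audit entry for an allowed access.

    Audited data classes are refused unless a non-empty reason is given.
    """
    reason = (reason or "").strip()
    if data_class in AUDITED_CLASSES and not reason:
        raise AccessDenied("a business reason is required for this data")
    entry = {
        "employee": employee,
        "data_class": data_class,
        "record_id": record_id,
        "reason": reason,
        "at": now or datetime.now(timezone.utc),
    }
    audit_log.append(entry)
    return entry


def accesses_by(audit_log, employee):
    """Return every logged access for one employee, for auditor review."""
    return [e for e in audit_log if e["employee"] == employee]
```

In a real system you would probably also log the denied attempts, since repeated refusals are themselves a signal worth reviewing.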

Build privacy into your product by design

Privacy by design is a concept that’s been around since the 1990s, but current events have brought it front and center in tech. If you haven’t heard of it yet, you will—it’s required by GDPR. Rather than hoping for the best, or implementing privacy checks only after a privacy incident or PR nightmare or public shaming on Twitter or in the media, privacy by design means implementing privacy considerations all throughout the engineering process and the product. It’s not particularly hard to grasp, but it has been difficult to put into practice. Now, however, there are strict sets of guidelines helping us get privacy right.

It’s been a big year in privacy news. From GDPR to Cambridge Analytica, there’s been a lot to keep up on. What everyone should know by now is that user data is valuable, and governments are beginning to hold companies accountable for handling it ethically.

GDPR

In May 2018, the European Union implemented the General Data Protection Regulation. What does this mean for you? If your company collects data on people in the EU, whatever their citizenship, you’ll need to comply with strict data protection rules or face hefty fines: up to €20,000,000 or 4 percent of your company’s total worldwide annual revenue, whichever is higher.

Under GDPR, you’re required to handle data carefully. The regulation imposes limits on what you can collect, how you store and process it, and what you do with it. Any processing of personal data must be lawful, fair, and transparent to the user. You can only collect data for explicit, legitimate purposes—and even when you have a purpose, you must collect the minimum amount of data necessary, and no more. You must keep that personal data accurate and up to date, treating it like the valuable asset it is. You can’t keep personal data for any longer than is necessary for the legitimate purpose for which you’re collecting it. You need to take security measures to protect against unauthorized or unlawful processing and against accidental loss, destruction, or damage of the data. Lastly, you need to be able to prove your compliance with GDPR.

For users, GDPR means that companies handling their data must operate more like banks handling their money. Users can access a copy of their data, request rectification of inaccurate personal data, request deletion of their data, restrict the processing of their data if it’s inaccurate or unlawful, receive their data in a portable format, and object to automated decision making (like profiling). The profiling rules in GDPR are nuanced. Any time you automate the processing of personal data to evaluate something about a person, you’re profiling. For example, if you’re collecting data and influencing people’s shopping habits based on your evaluations, it qualifies as profiling. However, if your profiling also produces a legal effect on the individual, that feature must be explained to the user and include an ability to opt out. For instance, if a feature can disqualify a person from something significant, like a job or a credit application, you need a few more safeguards to ensure a fair evaluation.
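The access, portability, and deletion rights listed above are much easier to honor if you can gather and remove one user’s data with a single call. The following is a minimal sketch, assuming each data store can be modeled as a dictionary keyed by user ID. Real stores, backups, and third-party processors need their own handling, and the function names are hypothetical.

```python
import json


def export_user_data(stores, user_id):
    """Collect a user's records from every store into one JSON document."""
    bundle = {
        name: rows[user_id]
        for name, rows in stores.items()
        if user_id in rows
    }
    return json.dumps({"user_id": user_id, "data": bundle},
                      indent=2, sort_keys=True)


def delete_user_data(stores, user_id):
    """Remove the user from every store; return the stores that held data."""
    removed = []
    for name, rows in stores.items():
        if user_id in rows:
            del rows[user_id]
            removed.append(name)
    return sorted(removed)
```

JSON is used here only as one example of a portable, readable format. The list returned by `delete_user_data` gives you a record of where the data was removed, which helps when you need to show that a request was fulfilled.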

California Consumer Privacy Act

Soon after GDPR, the state of California passed its own privacy law, which will take effect in 2020. It grants California residents certain rights over their data, including the ability to opt out of having their data sold, access their personal information in a readily useable format, and request that companies delete their personal data.

Essentially, this means that if you have any users in California, you’ll need to give them protections similar to those GDPR gives people in the EU. This leaves you with two basic options: either reform your data protection policies for everyone so that they satisfy both laws, or design a segmented policy that effectively puts Californians and Europeans into one group, while all of your other users remain second-class citizens when it comes to their data. It’s probably less work to whip your privacy program into shape once and implement changes globally. And it’s undoubtedly the right thing to do for your users.

Make sure you know what’s required and what’s coming. At many young tech companies, the goal is to make something useful, ship its worst working version, and then scale globally as you iterate. When it comes to privacy, however, this do-now, redo-later approach is inappropriate. You must follow regulations if you want to enter certain countries’ markets, and doing so from the start will make it much easier to scale up quickly. Plus, choosing to monetize in creepy ways now, while no one’s looking closely, opens the door to lawsuits and hearings later on—not a good look, nor an ethical business decision.

Use the principle of least privilege

As a final step in making your company privacy-savvy, follow the principle of least privilege to make information discovery and autonomy less dangerous. Although it’s primarily a computing principle, you can apply it to human access. The idea is that everything (or every person) must be able to access only the information and resources it needs to carry out its (or their) legitimate purpose. That means securing your APIs to only interact with the necessary services. It also means locking sensitive information away from prying eyes by default.

There’s an assumption that placing limits on access also limits our ability to move fast and break things. People think that granting employees access to all information means fewer bottlenecks, less red tape, and more opportunities for autonomous ownership. In reality, limiting access to sensitive information lets you move without breaking things, which means you no longer have to slow down to sweep up broken glass and issue apologies. The principle of least privilege allows employees to explore and take risks with confidence. Making a proactive investment in organizing and safeguarding user information is well worth the time and effort.

You can build a simple web app for requesting and granting access to user data: Snap Inc., for example, uses a leasing concept for both employee and programmatic data access. As the name implies, the access is short-lived. Once the lease expires, your access is cut off.

If you’d like to do something similar, build a UI that sends email notifications to service owners. Require employees to request access every time they deal with customer data. Not only does this uphold the principle of least privilege, it also minimizes the risk of data getting copied or lost while establishing a record of who accessed what (which is important to have in case of an audit). When an engineer wants to whitelist their service to read or write some data, they’ll have to request a longer-term lease for that as well.
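The core of a leasing scheme is small enough to sketch. To be clear, this is not Snap’s implementation, whose details aren’t described here; it’s an illustration of the idea under simple assumptions, with hypothetical names throughout and no UI or email notification step.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Lease:
    """A time-bound grant of access to one resource."""
    holder: str        # employee or service name
    resource: str      # what the lease unlocks
    reason: str        # business reason, kept for audits
    expires_at: datetime

    def is_active(self, now: datetime) -> bool:
        return now < self.expires_at


def grant_lease(holder, resource, reason, duration, now):
    """Create a lease, refusing requests with no reason or no duration."""
    if not reason.strip():
        raise ValueError("a business reason is required")
    if duration <= timedelta(0):
        raise ValueError("lease duration must be positive")
    return Lease(holder, resource, reason.strip(), now + duration)


def can_access(leases, holder, resource, now):
    """Deny by default; allow only while a matching lease is active."""
    return any(
        lease.holder == holder
        and lease.resource == resource
        and lease.is_active(now)
        for lease in leases
    )
```

Because `can_access` checks the expiry on every call, nothing has to remember to revoke access: once the lease runs out, the next check simply fails.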

Privacy for the long haul

If there’s one takeaway from the recent explosion of front-page privacy scandals, it’s this: Your approach to privacy can make or break your company’s growth.

Don’t leave privacy for later or dismiss it as a set of stuffy processes meant for multinational conglomerates. Consider privacy from day one. If you haven’t yet, establish boundaries and policies for your customer data today. Privacy is not the opposite of moving fast. It’s the very thing that can enable you to scale your business without a major setback, retaining user trust every step of the way. When GDPR and similar regulations begin enabling users to take their data out of your systems and transfer it to competing service providers, you’ll have shown that you’re worth sticking with.
