Nick Woods is the former lead architect of Apple’s iCloud Photo Library. Starting at Apple in 2008, he worked on Mail for desktop and iPad until leaving for the startup Color Labs in 2010. After Apple acquired Color Labs in 2012, Woods rejoined Apple as a member of the Photos team. Between 2012 and 2015, he led several teams of engineers and managers working on the iOS Photos and Camera apps as well as on desktop and iOS Photos infrastructure. Today, Woods is cofounder and chief technology officer at Hazel Health, a San Francisco-based startup that provides virtual health-care clinics to students at K–12 schools.
Content has been edited and condensed for clarity.
Increment: How did you start working on the iCloud Photo Library?
Nick Woods: I joined the Photos team [in 2012]. Before iCloud Photo Library, I worked on Photo Search, Maps, and the way photos were grouped in Moments. The iCloud Photo Library project was something Apple had been attempting for a long time, but it had never really made significant progress, so I was there right from the beginning.
What did the architecture team set out to do?
Apple often works on projects for a long time and already had several cloud photo solutions, but it had not yet built something that could keep your entire photo library in the cloud. The most precious things that most of us have on our phones today are our photos and videos—moments in time that we can never recreate if lost. Safeguarding those precious memories while putting user privacy first—and maintaining compatibility across various platforms, legacy content, applications, and APIs—was a massive undertaking across many teams at Apple. [As architects,] we basically tracked everything that had been done in the past [in Apple’s cloud photo solutions] and [redesigned] it from the ground up.
What’s an example of infrastructure that you had to redesign?
We had done a lot of planning and prototyping [for the system] to be self-contained as iCloud Photo Library. In parallel, Apple was building a whole new cloud mechanism for third parties as well as internal users called CloudKit. You have teams at Apple that are building things in silos: A lot of different teams are pretty independent and don’t know what other teams are doing until you start building. In one of the overall design [meetings] for iCloud Photo Library, we came to know more about CloudKit, and it made a lot of sense to consolidate infrastructure. We changed direction a little bit to maximize our use of shared technology.
There were many challenges working that early with [the CloudKit] framework as one of the first large-scale consumers of its infrastructure. But this also provided [Apple] an opportunity to dramatically increase the performance and scaling abilities of CloudKit while adding important features whose necessity only became clear once we pushed it to its limits. Live Photos [was] also being developed at the same time as iCloud Photo Library, and we had to make sure these worked and synced well across all platforms.
Of anywhere I’ve ever worked, Apple is one of the strongest proponents of dogfooding. Everyone in the organization, from entry-level to top executive positions, is very involved in testing its products and services. iCloud Photo Library was a major dogfooder of CloudKit, and so, in turn, was every user of Photos at Apple.
What was the biggest challenge you faced during the initial planning?
How we integrated the iCloud Photo Library with the existing photo apps on iOS and on Macs in a system that had been around for many, many years was really challenging, and we went into it thinking that it was going to be simpler than it was. We had to make sure it would work right out of the gate and work in a backward-compatible way.
Photo and video content [gets] used in so many different applications and in so many different places. We had applications that existed with local APIs that already could access your Camera Roll, for example. These apps needed the ability to suddenly access a photo that now might not be present on the device; you have to download it from the cloud, [and] you might not have network connectivity.
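A minimal sketch of the access pattern described above, where a photo may not be on the device and the network may be unavailable. Everything here is hypothetical and for illustration only: `PhotoStore`, `PhotoResult`, and the `download` callable are invented names, not Apple APIs. It assumes the download raises `ConnectionError` when offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class PhotoResult:
    data: bytes
    is_full_resolution: bool


class PhotoStore:
    """Hypothetical device-side store; not Apple's API."""

    def __init__(self, download: Callable[[str], bytes]) -> None:
        # `download` is an assumed network call that raises
        # ConnectionError when the device is offline.
        self._download = download
        self._originals: Dict[str, bytes] = {}
        self._thumbnails: Dict[str, bytes] = {}

    def add_thumbnail(self, photo_id: str, data: bytes) -> None:
        self._thumbnails[photo_id] = data

    def request(self, photo_id: str) -> Optional[PhotoResult]:
        # 1. Serve the full-resolution copy if it is already local.
        if photo_id in self._originals:
            return PhotoResult(self._originals[photo_id], True)
        # 2. Otherwise try the cloud and cache the result.
        try:
            data = self._download(photo_id)
        except ConnectionError:
            # 3. Offline: fall back to the local thumbnail, if any.
            thumb = self._thumbnails.get(photo_id)
            if thumb is None:
                return None
            return PhotoResult(thumb, False)
        self._originals[photo_id] = data
        return PhotoResult(data, True)
```

A caller written against an older, local-only API would have assumed step 1 always succeeds; the sketch shows why such callers need a result that can say "this is only a thumbnail" or "nothing is available right now."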
Another goal we had with iCloud Photo Library was to share as much code between internal native Apple platforms as possible—like iOS, macOS, watchOS, tvOS—including APIs as well as daemon processes and other photo image and metadata-processing algorithms. We had to keep these in mind, along with the different resources and capabilities of each platform, while creating the layers of abstraction needed to support them and maximize code sharing and reuse. [It’s an example of] cross-platform support, a critical effort you’ll find at almost any consumer-focused company today.
Working with the iCloud Photo Library from the start allowed me to always keep the 10,000-foot view in mind while still being able to get into the details of a block of code. The 10,000-foot view was really about stepping back and evaluating the entire ecosystem of applications that used photos, and really making sure that we [were working] across everything pretty seamlessly.
What were some of the ways you designed iCloud Photo Library to protect user privacy?
Apple has taken a strong customer privacy–first stance across its products. The result of that policy is that the servers storing your Photo Library content don’t have access to decrypt or analyze your photos and videos, which means more data processing needs to happen on end-user devices. Apple’s servers really can’t access the image data at all, so any processing—even making thumbnails, doing image analysis, detecting faces, or dogs, or looking at the geographic data as part of the photo, and the metadata—all of that is done on the devices rather than on the server.
For reference, iCloud Photo Library is a partial sync–based solution: It tries its best to keep metadata and photo thumbnails for content locally on each device, when possible, to make the browsing and search experience optimal no matter the current state of the network. The photo apps also support nondestructive edits by default, which means that sometimes only edits need to be synced, or other metadata changes such as favoriting a photo or adding one to an album.
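To make the point about nondestructive edits concrete, here is a small sketch in which the original image bytes are never modified and only metadata changes are queued for sync. The `Asset` class, the edit strings, and `pending_changes` are hypothetical illustrations, not the actual iCloud Photo Library data model. It assumes the original has already been uploaded once.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Asset:
    asset_id: str
    original: bytes                      # never modified after import
    edits: List[str] = field(default_factory=list)
    favorite: bool = False
    version: int = 0                     # bumped on every metadata change

    def apply_edit(self, edit: str) -> None:
        # An edit is recorded as an instruction, not baked into the pixels.
        self.edits.append(edit)
        self.version += 1

    def set_favorite(self, value: bool) -> None:
        self.favorite = value
        self.version += 1

    def revert(self) -> None:
        # Reverting only discards the recorded instructions.
        self.edits.clear()
        self.version += 1


def pending_changes(assets: List[Asset], synced: Dict[str, int]) -> List[dict]:
    """Return metadata-only records for assets changed since the last sync.

    `synced` maps asset_id to the last version the server has seen
    (assumed to be 0 right after the original was uploaded).
    """
    changes = []
    for asset in assets:
        if asset.version > synced.get(asset.asset_id, 0):
            changes.append({
                "asset_id": asset.asset_id,
                "version": asset.version,
                "edits": list(asset.edits),
                "favorite": asset.favorite,
            })
    return changes
```

Because the change record carries only the edit list and flags, cropping or favoriting a large photo results in a small sync payload, and the original can always be restored.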
A lot of privacy and security protections at the data level really come down to who has the keys to decrypt data and where the keys are. All of the data, both photo image data and metadata, is broken up into chunks that are securely encrypted and stored on servers. When they’re sent [to the device], they’re encrypted again, so they are double encrypted in transit. But where the keys to decrypt the data live, and how anything can access that data, is restricted. Only once it gets back to the user’s device does the data get decrypted, [and] the device can access all of the metadata about the photo.
This can have some limitations: There are some things you can’t do as well, or maybe there is more work distributed across devices. You might have to redo taking a geographic point and turning it into a named location, like Golden Gate Park. You might have to do that on all of your devices instead of doing it on the server once, but it keeps the server from having as much access to data, which, if something gets compromised, is also a great place to be.
How is this different from the way other companies might store or access users’ photo data?
At an architectural level, Apple is making sure that the data, who has access to it, and what machines access it are all very controlled. The simple example [of] turning a geographical point into a named location is called reverse geocoding. Every photo you take, on most cameras these days, has a GPS coordinate (a latitude and longitude record inside of it), so you can send that point to a server. You give them a point, and you get back a whole list of data about that point. Given just that point, you can pin down quite a lot of data about a photo.
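As an illustration of reverse geocoding, the sketch below maps a latitude and longitude to the nearest named place using the haversine great-circle distance. The `PLACES` table, its approximate coordinates, the 2 km cutoff, and the function names are all assumptions made for this example; a real service would draw on a far larger dataset.

```python
import math
from typing import Optional

# Hypothetical lookup table: (name, latitude, longitude), coordinates approximate.
PLACES = [
    ("Golden Gate Park", 37.7694, -122.4862),
    ("Alcatraz Island", 37.8270, -122.4230),
    ("Dolores Park", 37.7596, -122.4269),
]

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def reverse_geocode(lat: float, lon: float, max_km: float = 2.0) -> Optional[str]:
    """Return the nearest known place name, or None if nothing is within max_km."""
    name, place_lat, place_lon = min(
        PLACES, key=lambda p: haversine_km(lat, lon, p[1], p[2])
    )
    if haversine_km(lat, lon, place_lat, place_lon) <= max_km:
        return name
    return None
```

Calling `reverse_geocode(37.7700, -122.4800)` with this table resolves to the Golden Gate Park entry, while a point far from every entry returns `None`.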
Apple, in this latest release, added the ability to restrict the geographical data when sharing. You can share a photo with somebody, [and] they can take that photo [and] see where a location is and also look up data about it. [Most companies] would have access to that metadata on the server. Apple—if you had two different devices—would do that lookup for each device and do its best to obscure the way it is looked up, so you can’t trace it back.
It sounds like Apple built in user-privacy protections while designing iCloud Photo Library, rather than responding to concerns reactively.
I still think we’re only at the beginning of fully understanding the impact of social media and photo sharing. [At Apple,] as both a small team and as an organization up the chain, people recognized how sensitive photo data can be. There were plenty of headlines where it sounded like data security was breached and a celebrity’s photo got leaked. Knowing the sensitivity of photos, we restricted access as much as possible from the get-go. I do believe privacy is a core fundamental belief at Apple, and it has an approach for privacy that also happens to work well with the business model. Most of the company’s money comes from the products and services, so it can afford, at a business level, to be much more privacy-focused. We didn’t need to sell user data, so we didn’t have a need for the user data.
How do you see the architectural design for similar systems evolving today?
Almost always in security systems these days, passwords and a lack of passwords are the weakest link. One of the things Apple has done is slowly make [passwords] almost invisible. Before Touch ID existed, most people I know didn’t even have a PIN code on their device because it was a pain. Forcing things like PIN codes on devices and having users unaffected by that bit of security [through features like Touch ID and Face ID] is really amazing.
I have a feeling that some of the principles employed by iCloud Photo Library to safeguard privacy will become more commonplace as we move forward. At a lot of companies, data is not stored in [such] a way that it is [adequately] protected. As much as you trust everybody, you almost have to operate under the assumption that anyone could at any point act maliciously. I do think more apps and more places are looking at ways they can encrypt data: Where are the keys? Who can decrypt data, and who should decrypt the data? Secure messaging apps are a really good example, where nobody has access to a key except the people involved in a message thread.
[As of now,] a lot of people collect as much data as they can in the hope that they will make use of it later. But I think more people are [now saying], “Hey, maybe we don’t have to have access to all of the data. Maybe we have access to a sampling of the data, and we do more encryption.” Because the more data you have, the more data can be breached, leaked, compromised.
In regard to systems design, what are some best practices for storing and protecting data?
Some things are a pretty obvious standard of practice, like you never store a password in cleartext in the database. You either hash passwords or encrypt them so that you don’t have access to the data, and even an internal person can’t compromise it. But [a security issue] that’s maybe less obvious is when passwords are stored in a logging system. If [developers] request to have a username and password to debug, but then log that information without stripping the password, even though the data’s encrypted over the internet, they’ve now put everybody’s password in their log.
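One way to guard against the logging problem described above is to scrub sensitive fields before a record is written. The sketch below uses Python's standard `logging.Filter`; the `RedactingFilter` class, the list of field names, and the `key=value` message format are assumptions chosen for illustration, not a complete solution.

```python
import logging
import re

# Assumed convention: sensitive values appear in messages as key=value pairs.
_SENSITIVE = re.compile(r"\b(password|passwd|token)=(\S+)", re.IGNORECASE)


class RedactingFilter(logging.Filter):
    """Replace the values of sensitive key=value pairs before output."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the final message first so %-style arguments are covered too.
        message = record.getMessage()
        record.msg = _SENSITIVE.sub(
            lambda m: f"{m.group(1)}=[REDACTED]", message
        )
        record.args = ()  # already merged into msg above
        return True       # keep the record, now scrubbed
```

Attaching the filter to a handler (`handler.addFilter(RedactingFilter())`) means every record that handler emits is scrubbed, so a debug call such as `log.info("login user=%s password=%s", user, pw)` no longer writes the password. A pattern-based filter only catches the formats it knows about, so not logging the value in the first place remains the safer habit.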
When it comes to privacy and security, I would always ask, “Do I need this data?” If you don’t, try not to send, store, or expose it. I would be very thoughtful around that, and also where you store and process data. [I would] think a lot about weak links in the system, like user passwords. Is it an employee that has administrative access to everything? How do we restrict access? How do you limit data when there’s a leak? [How do] you minimize the way [it gets] compromised? That includes everything from not using one key for all your encryption [to] encrypting passwords or hashing passwords, or never having access to them.
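For the password-hashing point, here is a minimal sketch using PBKDF2 from Python's standard library, with a random per-user salt and a constant-time comparison. The `PasswordRecord` type, the function names, and the default iteration count are assumptions for this example; a production system should follow current guidance for its chosen algorithm.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class PasswordRecord:
    """What gets stored: never the password itself."""
    salt: bytes
    iterations: int
    digest: bytes


def hash_password(password: str, iterations: int = 600_000) -> PasswordRecord:
    # A fresh random salt per user means identical passwords hash differently.
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return PasswordRecord(salt, iterations, digest)


def verify_password(password: str, record: PasswordRecord) -> bool:
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), record.salt, record.iterations
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, record.digest)
```

Because only the salt, iteration count, and digest are stored, someone with database access, including an employee, cannot read a user's password directly from the record.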
[At Hazel Health,] we’re storing medical data, which is incredibly sensitive. There are photos, videos, and things that are very personal as part of medical exams. We encrypt and store data a little bit differently than we did at iCloud Photo Library, but [we] apply some of the lessons I learned there about being very cautious of security, authentication, and data encryption.
How has your experience at Apple informed the way you design systems?
When you own the entire platform, and when you own the entire delivery mechanism, you can build things that are very tightly controlled, that you can really tailor to the needs of [a client] like a school.
In software, there are often times you have a certain task to complete, and you build the software just for that. But [if you] make something a little more general-purpose, it [can] live for a lot longer. Considerations I keep in mind often are: How generic should we make this? Is this piece of software for future growth? How much are we going to change it if we ever need to?