Community Data Privacy Toolkit (CDPT)
The Community Data Privacy Toolkit (CDPT) is a set of web-based data management tools for privacy-first, user-friendly data input, archiving, and sharing among social sector organizations. It includes tools to produce and manage semi-anonymous personal data and geodata, and to view and display such data. Many of these capabilities are key to advocacy work: for example, recording the time and location of pollution events or human rights incidents without revealing a location specific enough to put individuals at risk. The project team included Public Lab’s Jeffrey Yoo Warren (Research Director) and Sagarpreet Chadha (Coding Fellow).
With increased focus on location privacy in the wake of last year’s New York Times report (“Your Apps Know Where You Were Last Night, and They’re Not Keeping It Secret”), it’s clear that location data is open to abuse and misuse in many ways. Corporate and government data use must be constrained and responsible. We have explored how location data, so useful in coordinating peer-based community strategies, may be used in systems that enable a structural approach to location privacy.
Social sector organizations developing advocacy campaigns often face a difficult task: to highlight and present rigorous data about issues of concern while preserving the privacy and anonymity of participating community observers. From human rights abuses to environmental issues and personal health data, many kinds of data can be powerful and persuasive in building accountability, but many reports involve risk to the observer who comes forward.
To address these problems, we’ve prototyped a privacy-oriented location sharing system called Leaflet Blurred Location, for entering geographic locations at different levels of precision, or “blur”. This project will be expanded to provide means for browsing and analysis using “blurred” data, all built on standard open-source geodata libraries such as Leaflet. Additionally, we developed a new paradigm for location privacy, which we’re calling blurred location, and a model for variable location privacy. We use both terms to refer to systems that share or store locations at different degrees of precision.
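As a rough illustration of what a level of “blur” can mean, here is a minimal sketch that assumes precision is expressed as the number of decimal places kept in each coordinate. The names `blurCoordinate` and `blurLocation` are hypothetical and are not the Leaflet Blurred Location API; rounding down to the cell's lower edge is likewise an assumption made for this example.

```typescript
// Hypothetical helpers for illustration; not the Leaflet Blurred Location API.
interface BlurredLocation {
  lat: number;
  lon: number;
  precision: number; // decimal places kept (assumed meaning of "blur")
}

// Snap one coordinate down to the lower edge of its grid cell.
function blurCoordinate(value: number, precision: number): number {
  if (!Number.isInteger(precision) || precision < 0) {
    throw new RangeError("precision must be a non-negative integer");
  }
  const factor = Math.pow(10, precision);
  // Small tolerance: a product that should be a whole number can land
  // slightly below it in binary floating point.
  return Math.floor(value * factor + 1e-9) / factor;
}

function blurLocation(lat: number, lon: number, precision: number): BlurredLocation {
  return {
    lat: blurCoordinate(lat, precision),
    lon: blurCoordinate(lon, precision),
    precision,
  };
}

// The same observation shared at two different levels of blur.
const exact = { lat: 41.8403, lon: -71.4128 };
const neighborhood = blurLocation(exact.lat, exact.lon, 2); // cells of 0.01 degree
const region = blurLocation(exact.lat, exact.lon, 0); // cells of 1 degree
console.log(neighborhood, region);
```

Since one degree of latitude is roughly 111 km, two decimal places correspond to a cell about 1.1 km tall, while zero decimal places only identify a cell about 111 km tall.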
This prototype system allows some location sharing so that community scientists can coordinate regionally, without requiring them to share high-precision locations that might expose them to risk. The keys here are:
Together, these articulate a model that is simple to use and understand, as well as universal enough—and powerful enough—to be implemented in real-world web applications.
Beyond the challenges of hosting or exchanging such data, we see an opportunity to more substantively engage those reporting data in the curation of their own level of privacy at the time of data entry, prioritizing participant agency rather than collecting at high precision and filtering out private data after the fact. This offers a clearer framework for consent while reducing the risks involved when organizations must store, manage, and transmit private data.
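The idea of choosing privacy at the time of data entry can be sketched as follows. This is an illustrative example only: the `RawObservation` and `StoredReport` types and the `createReport` function are hypothetical, and it assumes precision means the number of decimal places kept. The point is that the exact coordinates are reduced before anything is stored or transmitted, so there is nothing precise left to filter out later.

```typescript
// Hypothetical types and function names, for illustration only.
interface RawObservation {
  description: string;
  lat: number; // exact value, known only on the participant's device
  lon: number;
}

interface StoredReport {
  description: string;
  lat: number; // already reduced to the chosen precision
  lon: number;
  precision: number; // decimal places the participant agreed to share
}

function blurCoordinate(value: number, precision: number): number {
  const factor = Math.pow(10, precision);
  // Small tolerance guards against floating-point products that fall
  // just below a whole number.
  return Math.floor(value * factor + 1e-9) / factor;
}

// Build the record that leaves the device; exact coordinates are not copied.
function createReport(raw: RawObservation, chosenPrecision: number): StoredReport {
  if (!Number.isInteger(chosenPrecision) || chosenPrecision < 0) {
    throw new RangeError("chosenPrecision must be a non-negative integer");
  }
  return {
    description: raw.description,
    lat: blurCoordinate(raw.lat, chosenPrecision),
    lon: blurCoordinate(raw.lon, chosenPrecision),
    precision: chosenPrecision,
  };
}

const stored = createReport(
  { description: "Oily sheen on river", lat: 41.8403, lon: -71.4128 },
  1 // the participant picks one decimal place
);
const payload = JSON.stringify(stored); // what would be stored or transmitted
console.log(payload);
```

Keeping the chosen precision alongside the coordinates lets anyone reading the record see how coarse the location is, rather than mistaking it for an exact point.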
In sum, this “variable resolution geodata” can enable rigorous data collection and coordination without unduly sacrificing individual privacy, all without having to reinvent the wheel whenever an organization needs to work with such data. With an eye to such broader adoption, we have focused on developing a reusable, modular tool set which can be easily integrated into existing platforms rather than a single platform users would have to join.
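To show how data at mixed precisions could still support regional coordination, the sketch below groups reports into shared grid cells. It is an assumption-laden example: `Report`, `cellKey`, and `groupByCell` are hypothetical names, and precision is again taken to mean decimal places kept. Reports can be coarsened to a common precision but never refined, so any report coarser than the target is set aside rather than guessed at.

```typescript
// Hypothetical types and helpers, for illustration only.
interface Report {
  id: string;
  lat: number;
  lon: number;
  precision: number; // decimal places at which the report was shared
}

function blurCoordinate(value: number, precision: number): number {
  const factor = Math.pow(10, precision);
  // Small tolerance keeps values that sit on a cell edge in their own cell.
  return Math.floor(value * factor + 1e-9) / factor;
}

// Text key naming the grid cell that contains a location.
function cellKey(lat: number, lon: number, precision: number): string {
  const a = blurCoordinate(lat, precision).toFixed(precision);
  const b = blurCoordinate(lon, precision).toFixed(precision);
  return a + "," + b;
}

function groupByCell(reports: Report[], target: number) {
  const cells = new Map<string, Report[]>();
  const tooCoarse: Report[] = [];
  for (const r of reports) {
    if (r.precision < target) {
      tooCoarse.push(r); // cannot be placed in a finer cell
      continue;
    }
    const key = cellKey(r.lat, r.lon, target);
    const bucket = cells.get(key) ?? [];
    bucket.push(r);
    cells.set(key, bucket);
  }
  return { cells, tooCoarse };
}

const grouped = groupByCell(
  [
    { id: "a", lat: 41.84, lon: -71.42, precision: 2 },
    { id: "b", lat: 41.87, lon: -71.49, precision: 2 },
    { id: "c", lat: 42.3, lon: -71.1, precision: 1 },
    { id: "d", lat: 41, lon: -72, precision: 0 },
  ],
  1
);
console.log(grouped.cells, grouped.tooCoarse);
```

Here reports "a" and "b" fall in the same one-decimal cell and could be coordinated regionally, while "d" was shared too coarsely to be assigned to any such cell.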
In addition to a well-documented set of GitHub code repositories, we have released the different software components as standardized Node modules on the npm registry, along with example code and live demos. We will also provide examples and guidance for the integration of these libraries into other platforms, and cooperate with and support other project teams seeking to make use of these tools.
The key components of the toolkit will include:
Links to software libraries developed:
Public Lab has developed a strong community model for project development which emphasizes long-term sustainability through open source code and community engagement. The result is a workflow in which projects are developed in public and integrate outreach and documentation from the earliest stages, to ensure a healthy and growing community of contributors around each of our projects. Following this approach, we worked with long-time community contributors to develop core components while continuing our software outreach strategies to ensure long-term project maintenance. This approach to open source software authorship worked well for this project.
The greatest challenge we identified is in the integration phase. While we set out to provide a robust and modern UI/UX for producing blurred locations, we found that installing the library in platforms that use different UI libraries can be challenging. Additionally, we found it difficult to arrive at a detailed, concrete spec, because many of the constraints it must build upon will come from adoption by third-party developers seeking to use blurred location, and a community of such developers will take time to grow. For the moment, we have collected parameters and drafted a concept for a specification that highlights the relevant factors to consider, so that we can develop the specification iteratively as input becomes available.
At this point, a series of presentations at privacy-oriented developer events would do a great deal both to spread the word about this toolkit and to solicit input and feedback to refine it. The domain of the problem is very wide in cultural, legal, and ethical terms, but very narrow in terms of technical implementation. As a result, while people are receptive to the ideas and tools we’ve developed, finding the right people to lead implementations in new privacy-requiring applications is a challenge, no matter how easy we’ve made the tools to use.
Blog post with technical details: https://publiclab.org/notes/warren/09-27-2019/blurred-location-and-variable-location-privacy