“Active” Data Collection

Obtaining information that people contribute with consent and through a dedicated effort; for example, people filling out a survey or entering information into a database.

“Passive” Data Collection

Obtaining information without people needing to do anything to provide it; for example, inferring a person’s mobility from the GPS data that their phone automatically records.

Big Data

(noun) numerous pieces of machine-readable information.

(verb) the process of mining structured information (spreadsheets, timestamps, geotags) and unstructured information (paragraphs of text) and applying quantitative methods to massive datasets to identify patterns.


Cache

Information captured at a specific point in time and stored temporarily to speed up processes; for example, web crawlers, which supply search engines with information, take frequent snapshots of websites so that search results can be returned more quickly.
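
The snapshot-and-reuse idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `SnapshotCache`, the `fetch` function, and the `max_age_seconds` setting are hypothetical names chosen for the example, not part of any particular search engine or library.

```python
import time


class SnapshotCache:
    """Keeps a temporary, timestamped copy of each value it fetches."""

    def __init__(self, fetch, max_age_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch            # the slow operation we want to avoid repeating
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so the example can be tested
        self._snapshots = {}           # key -> (time captured, value)

    def get(self, key):
        now = self._clock()
        snapshot = self._snapshots.get(key)
        if snapshot is not None and now - snapshot[0] < self._max_age:
            return snapshot[1]         # snapshot is recent enough: reuse it
        value = self._fetch(key)       # otherwise take a new snapshot
        self._snapshots[key] = (now, value)
        return value
```

The first request for a key runs the slow `fetch`; later requests reuse the stored snapshot until it is older than `max_age_seconds`, at which point a new snapshot is taken.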

Digital Data

Pieces of information in machine-readable form; digital information differs from analog information in that it is more generative, replicable, mixable, scalable, storable, accessible, and perpetual, and its ownership is harder to determine.


Interoperable

Able to be used and incorporated across different information-processing platforms (for example, interoperable systems are able to exchange information because they share definitions or syntaxes or are otherwise comparable).

Machine Readable

Information that is able to be categorized, recognized, edited, and used by a computer; datasets that are structured or formatted in a computer language are easier for machines to import.
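
As a small illustration of why structured data is easier for a machine to import, the Python sketch below reads a comma-separated table using the standard `csv` module. The names and ages are made-up sample values, and `STRUCTURED` is a hypothetical variable name used only for this example.

```python
import csv
import io

# Made-up sample data in a structured, comma-separated layout.
STRUCTURED = "name,age\nAmina,34\nLuis,29\n"

# Because the layout is predictable, each row can be read as labeled fields.
rows = list(csv.DictReader(io.StringIO(STRUCTURED)))
ages = [int(row["age"]) for row in rows]
```

The same facts written as a paragraph ("Amina is thirty-four and Luis is twenty-nine") would need far more work before a computer could use them.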


Metadata

Information about data; for example, the data type, the creator or source, and the time and date.
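
A minimal Python sketch of keeping this kind of descriptive information alongside a dataset follows. The function name `describe`, the field names, and the sample readings are all assumptions made for illustration, not a standard schema.

```python
from datetime import datetime, timezone


def describe(data, source, collected_at):
    """Return descriptive information about `data`: type, source, and time/date."""
    return {
        "data_type": type(data).__name__,
        "source": source,
        "collected_at": collected_at.isoformat(),
    }


readings = [21.5, 22.0, 21.8]  # the data itself (made-up values)
metadata = describe(
    readings,
    source="example sensor",
    collected_at=datetime(2020, 1, 15, 9, 30, tzinfo=timezone.utc),
)
```

Here `readings` holds the data, while `metadata` says nothing about the measured values themselves, only about what kind of data it is, where it came from, and when it was collected.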

Open Data

Information shared in a format allowing it to be manipulated.

Personally Identifiable Information (PII)

Information that can be used on its own to contact, locate, or identify a specific person.


Scraping

To use software or human copying/pasting to retrieve unstructured information from websites; with the computerized method of scraping, the software usually identifies the underlying structure of a page in order to extract its data.
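
To show what identifying the underlying structure can look like in practice, here is a minimal Python sketch that pulls the rows out of an HTML table using the standard library's `html.parser` module. The page content, the `TableScraper` class, and the district figures are invented for the example; a real scraper would also need to download the page and handle messier markup.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None        # cells of the row currently being read
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append([cell.strip() for cell in self._row])
            self._row = None
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data   # text inside the current cell


# Made-up page content standing in for a downloaded web page.
PAGE = """<table>
<tr><th>District</th><th>Households</th></tr>
<tr><td>North</td><td>120</td></tr>
<tr><td>South</td><td>95</td></tr>
</table>"""

scraper = TableScraper()
scraper.feed(PAGE)
rows = scraper.rows
```

The parser relies on the page's row and cell tags, that is, its underlying structure, to turn display-oriented markup into a list of rows that other software can use.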