7 Data Dissemination and Discovery

7.1 DEPLOY RELEASE INFRASTRUCTURE

Data dissemination may be handled through a repository or an archive. An ongoing project may also take a more hands-on approach to dissemination, with staff and/or a Web site serving as the contact point. This hands-on approach requires developing a supporting infrastructure.

7.2 PREPARE DISSEMINATION PRODUCTS

A variety of dissemination products may be generated at multiple points in a project. These may include raw or processed data, summarized data, tables, graphics, and datasets and scripts for various statistical packages. Producing package-specific datasets and scripts may involve restructuring both data and metadata to fit the underlying data models of the target packages. A variety of dynamic applications may also be produced, including online or standalone visualization or analysis tools. Application programming interfaces (APIs) may be developed to allow external software to access the data directly. Access control and licensing policies may apply when preparing these products.
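As an illustration of restructuring metadata for a target package, the sketch below turns package-neutral variable descriptions into Stata labelling commands (`label variable`, `label define`, `label values`). The metadata layout, the variable names, and the `_lbl` naming rule are assumptions made for this example; a real project would start from its own metadata format.

```python
def stata_label_script(variables):
    """Build Stata labelling commands from package-neutral variable metadata.

    `variables` maps a variable name to a dict with a "label" string and,
    optionally, a "categories" dict of {numeric code: category text}.
    """
    lines = []
    for name, meta in variables.items():
        # One descriptive label per variable.
        lines.append(f'label variable {name} "{meta["label"]}"')
        categories = meta.get("categories")
        if categories:
            # Stata stores value labels in a named set that is then
            # attached to the variable in a separate command.
            label_set = f"{name}_lbl"  # hypothetical naming rule
            pairs = " ".join(
                f'{code} "{text}"' for code, text in sorted(categories.items())
            )
            lines.append(f"label define {label_set} {pairs}")
            lines.append(f"label values {name} {label_set}")
    return "\n".join(lines)


# Hypothetical metadata for two variables.
VARIABLES = {
    "age": {"label": "Age of respondent in years"},
    "sex": {
        "label": "Sex of respondent",
        "categories": {1: "Male", 2: "Female"},
    },
}

if __name__ == "__main__":
    print(stata_label_script(VARIABLES))
```

The same metadata could feed a second generator for another package, which is why the metadata is kept in a neutral form rather than written directly as package syntax.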

7.3 DEPLOY ACCESS CONTROL SYSTEM & POLICIES

Each of the dissemination products may need different access control policies, and systems to apply them. Raw data might need to be accessible only under strict confidentiality terms. Summary data or graphics might have more lenient terms. Applications might need to have the policies built in and thoroughly tested.
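One way to picture product-specific policies is a small lookup table that an application consults before serving a product. The sketch below is only an illustration: the three access levels, the product names, and the rule that unknown products default to the strictest level are assumptions, not part of the model.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    """Hypothetical access tiers, ordered from least to most privileged."""
    PUBLIC = 0       # anyone
    REGISTERED = 1   # users who accepted the terms of use
    RESTRICTED = 2   # users under a signed confidentiality agreement


# Hypothetical policy table: the minimum level required for each product.
REQUIRED_LEVEL = {
    "graphics": AccessLevel.PUBLIC,
    "summary_data": AccessLevel.REGISTERED,
    "raw_data": AccessLevel.RESTRICTED,
}


def may_access(user_level, product):
    """Return True if a user at `user_level` may obtain `product`."""
    # Products missing from the table fall back to the strictest level,
    # so a forgotten entry fails closed rather than open.
    required = REQUIRED_LEVEL.get(product, AccessLevel.RESTRICTED)
    return user_level >= required
```

Keeping the policy in one table makes it easier to test thoroughly, since every product and level combination can be checked without touching the application code.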

7.4 PROMOTE DISSEMINATION PRODUCTS

Use of the data depends on people finding them. Citation of the data in publications is one traditional method of promotion. Ensuring that detailed, well-structured metadata are available through search mechanisms is another. Creation of persistent Digital Object Identifiers (DOIs) will enhance the ability to locate the data.

7.5 PROVIDE DATA CITATION SUPPORT

Persistent identifiers linked to the current source of the data will ensure that the data can be cited, that statistics about citation can be computed, and that the data can be found from the citation. The DataCite organization (http://datacite.org) provides one such mechanism.
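A persistent identifier becomes useful for citation once it is combined with basic descriptive metadata. The following sketch assembles a citation string in a creator, year, title, publisher, identifier pattern and expresses the DOI as a `https://doi.org/` link. The record fields, the study name, and the DOI are hypothetical, and the exact citation style is an assumption; archives and publishers may prescribe their own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Minimal descriptive metadata needed to cite a dataset (illustrative)."""
    creator: str
    year: int
    title: str
    publisher: str
    doi: str  # bare DOI without a resolver prefix


def format_citation(record):
    """Return a one-line citation ending in a resolvable DOI link."""
    return (
        f"{record.creator} ({record.year}). {record.title}. "
        f"{record.publisher}. https://doi.org/{record.doi}"
    )


# Hypothetical record; the DOI below is not a real identifier.
EXAMPLE = DatasetRecord(
    creator="Example Longitudinal Study Team",
    year=2013,
    title="Example Longitudinal Study, Wave 3",
    publisher="Example Data Archive",
    doi="10.1234/example-wave3",
)

if __name__ == "__main__":
    print(format_citation(EXAMPLE))
```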

7.6 ENHANCE DATA DISCOVERY

In order to make variables and questions more discoverable, they may be tagged with metadata. This tagging may occur prior to a wave. Retrospective analysis may also reveal the need to refactor, leading to changes in the way variables and questions are grouped. The organization curating the data may undertake some of these activities. Archives may create metadata such as catalog records for searching, index those records with subject terms, and prepare metadata for variable level search. As datasets grow larger, it may not be possible to transfer them easily – or even at all. Online tools to extract, summarize, analyze, and visualize data may be required.
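The variable tagging and variable-level search described above can be sketched as a small inverted index that maps each tag to the variables carrying it. The variable names and tags are invented for illustration, and a production search service would normally rely on a dedicated search engine rather than an in-memory dictionary.

```python
from collections import defaultdict


def build_tag_index(variables):
    """Map each lower-cased tag to the set of variable names carrying it."""
    index = defaultdict(set)
    for name, tags in variables.items():
        for tag in tags:
            index[tag.lower()].add(name)
    return index


def find_variables(index, *tags):
    """Return the sorted names of variables that carry all of the given tags."""
    matches = [index.get(tag.lower(), set()) for tag in tags]
    if not matches:
        return []
    return sorted(set.intersection(*matches))


# Hypothetical variables tagged by topic and wave.
TAGGED_VARIABLES = {
    "emp_status_w1": ["employment", "wave 1"],
    "emp_status_w2": ["employment", "wave 2"],
    "hh_income_w1": ["income", "wave 1"],
}

if __name__ == "__main__":
    index = build_tag_index(TAGGED_VARIABLES)
    print(find_variables(index, "employment", "wave 2"))
```

Tagging by wave as well as by topic lets a user follow the same concept across waves, which is the kind of regrouping that retrospective analysis may call for.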

7.7 MANAGE USER SUPPORT

Complex data, or simple data about complex topics, may require providing support to those trying to reuse the data. In a large study those users may be part of the project. A support infrastructure may be needed. Sophisticated analysis methods employed in a study may require specialized expertise.

