Administrative records have supported freight analysis for years. For example, trade statistics derived from the records of United States (U.S.) Customs and Border Protection (CBP), part of the Department of Homeland Security (DHS), support a number of uses. CBP captures information in paper or electronic form at U.S. ports of entry, exit, or clearance. Electronic information is captured through the Automated Broker Interface for imports and through the Automated Export System for exports.
Together, these systems are known as the Automated Commercial System, which was recently replaced by the Automated Commercial Environment (ACE) to serve as the primary system for reporting imports and exports (Walton, et al., 2015).
The U.S. Census Bureau’s Foreign Trade Division verifies, processes, and distributes the data after CBP collects it, and other federal agencies use this information for their own needs (Walton, et al., 2015). For transportation, this information is used to produce:
- Foreign trade statistics
- The North American Transborder Freight Data (Transborder)
- U.S. international maritime trade
- U.S. transportation-related goods and overall trade data
- Carrier-Based sources, published by the Research and Innovative Technology Administration (RITA) as the Air Carrier Statistics
- Maritime data from the Journal of Commerce’s Port Import/Export Reporting Service (PIERS)
- Special periodic surveys (e.g. Canada’s National Roadside Survey)
- Shipper-Based Sources such as the Commodity Flow Survey (Walton, et al., 2015)
Much of this information provides data inputs for the Freight Analysis Framework (FAF), published by the Federal Highway Administration (FHWA) and the Bureau of Transportation Statistics (BTS). The FAF relies on administrative records to help round out the understanding of commodities, trading partners, value, tonnage, and mode (FHWA, 2017). Administrative records such as Surface Transportation Board Waybill samples have also been used to understand railroad business and to identify both trade zones and tonnage by rail.
A new source of administrative records is the American Community Survey (ACS) Public Use Microdata Sample (PUMS) files. These are a set of untabulated records about people or housing units that users can turn into custom tables not currently provided by the Census Bureau. The confidentiality of respondents is protected, but the files still provide a rich source of information about households and the people in them (Census Bureau, 2017). Analysts can pull information about households and occupants to understand supply and demand within a particular region and to make some assumptions about freight flows, needs, and forecasts.
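As a minimal sketch of what a custom table from microdata could look like, the Python example below sums household weights by geography and by number of vehicles available. The column names `PUMA`, `VEH`, and `WGTP` are assumed here to mirror PUMS naming conventions, and every record value is invented for illustration; check the PUMS data dictionary before applying this to real files.

```python
from collections import defaultdict

# Hypothetical household records. Column names are assumed to follow
# PUMS conventions (PUMA = geography, VEH = vehicles available,
# WGTP = household weight); all values are invented.
households = [
    {"PUMA": "00101", "VEH": 0, "WGTP": 120},
    {"PUMA": "00101", "VEH": 2, "WGTP": 95},
    {"PUMA": "00101", "VEH": 0, "WGTP": 80},
    {"PUMA": "00102", "VEH": 1, "WGTP": 150},
    {"PUMA": "00102", "VEH": 2, "WGTP": 60},
]


def weighted_table(records, row_key, col_key, weight_key):
    """Cross-tabulate records, summing the weight in each cell.

    Each sample record stands for `weight_key` households, so the
    cell totals estimate household counts rather than sample counts.
    """
    table = defaultdict(lambda: defaultdict(int))
    for rec in records:
        table[rec[row_key]][rec[col_key]] += rec[weight_key]
    # Convert nested defaultdicts to plain dicts for the caller.
    return {row: dict(cols) for row, cols in table.items()}


table = weighted_table(households, "PUMA", "VEH", "WGTP")
for puma, cells in sorted(table.items()):
    print(puma, cells)
```

Summing weights rather than counting rows is the key design choice: each microdata record represents many households, so unweighted counts would describe the sample instead of the region.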
With the emergence of big data and of new analysis tools that help link and visualize data, such as Geographic Information Systems (GIS), administrative records can be used in new ways to illuminate freight movements. Data can be extracted from records using computer-based algorithms and linked to other data sources to provide a stronger understanding of origins and destinations, as well as more detail on shipping characteristics. For urban planners, this detail can inform the understanding of how freight moves in the urban area and how it depends on urban infrastructure and facilities.
- Regulatory Environment: May hinder data access and use depending on the source.
- Ownership: May be tightly controlled depending on the source.
- Privacy: Often contains personal information that must be protected.
The information that administrative records provide can feed numerous policy needs, from transportation investment to economic policy. Depending on the source, there could be specific regulatory or privacy issues to address. These records may not be public documents, and they may be difficult to obtain depending on the institutional arrangements needed to obtain and review them. For example, another agency may hold the records, but because they were not intended for freight analysis, it may be difficult to make the case that providing them is of value. This may be the case with economic development records or property records.
Privacy and confidentiality are concerns when proprietary data is involved and cannot be protected by the public sector (Schmitt & Tang, 2011). Third-party arrangements or the use of anonymized data can help to alleviate these issues. Additionally, if the records are public, such as CBP records, it is necessary to understand how the data was intended to be used and to protect any information that is proprietary or not for public release. For example, the reason for collection or the regulation behind the collection should be understood.
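One common way to anonymize data, sketched here as an illustration rather than a prescribed method, is to release only aggregates and to withhold any cell built from too few contributors. The field names (`zone`, `shipper_id`, `tons`), the sample shipments, and the threshold of three shippers are all hypothetical; the appropriate rule depends on the source and its governing regulation.

```python
from collections import defaultdict

# Hypothetical shipment records; identifiers and tonnages are invented.
shipments = [
    {"zone": "A", "shipper_id": "s1", "tons": 10.0},
    {"zone": "A", "shipper_id": "s2", "tons": 20.0},
    {"zone": "A", "shipper_id": "s3", "tons": 5.0},
    {"zone": "B", "shipper_id": "s4", "tons": 40.0},
    {"zone": "B", "shipper_id": "s4", "tons": 15.0},
]


def aggregate_with_suppression(records, min_shippers=3):
    """Total tonnage by zone, withholding zones with few shippers.

    A zone total is replaced by None when fewer than `min_shippers`
    distinct shippers contribute to it, because such a total could
    reveal an individual firm's activity.
    """
    tons = defaultdict(float)
    shippers = defaultdict(set)
    for rec in records:
        tons[rec["zone"]] += rec["tons"]
        shippers[rec["zone"]].add(rec["shipper_id"])
    return {
        zone: (tons[zone] if len(shippers[zone]) >= min_shippers else None)
        for zone in tons
    }


released = aggregate_with_suppression(shipments)
print(released)
```

Counting distinct shippers rather than rows matters here: zone B has two shipments but only one shipper, so its total is withheld.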
- Capacity: Fairly easy to work with data, but data need to be processed and organized for use.
- Stewardship: Straightforward.
- Equity: Data typically representative of population of interest.
Key considerations include agency capacity to collect records and analyze the data. This type of work is not always automated or done in traditional data analysis or visualization tools, and it may take significant time for data conflation, cleaning, and organizing. The capacity to do this work, as well as the financial and analytical resources it requires, should be considered.
- Completeness: Data can be a narrow slice of information needed for analysis.
- Accuracy: Raw data is typically accurate.
- Verifiability: Raw data may not be available.
- Dynamism: Time from capture to analysis can be quite long due to requirements for data cleaning and verification.
- Durability: Data are subject to funding uncertainties.
There are technical challenges that must be faced when using administrative records. Often, the data is large and complex qualitative information generated for other specific purposes (Connelly, Playford, Gayle, & Dibben, 2016). Because the records are the product of another purpose, analysts usually have no input into their design, structure, or content, and the data can be large, cumbersome, or messy to evaluate and organize (Connelly, Playford, Gayle, & Dibben, 2016).
Another consideration is that these data can be a narrow slice of information that needs to be fused with other data to provide a comprehensive picture of freight. It is often necessary to link the data, but the lack of common variables among data sets can hinder the fusion process. This exercise introduces the potential for error and bias and also presents challenges in how to clean, smooth, or organize the data (Schmitt & Tang, 2011). Ensuring data integrity while navigating fragmented data is challenging.
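The linking problem can be illustrated with a small Python sketch. It assumes two hypothetical data sets that identify ports with differently formatted names, normalizes those names into a shared key, and reports the records that fail to link so the analyst can judge potential bias. The field names and tonnage values are invented, and real linkage usually needs more robust matching than this simple normalization.

```python
import re

# Hypothetical inputs: trade records and a facility table that share
# no clean common variable, only inconsistently formatted port names.
trade = [
    {"port": "Laredo, TX", "tons": 1200},
    {"port": "EL PASO, TX", "tons": 800},
    {"port": "Port Huron, MI", "tons": 500},
]
facilities = [
    {"port_name": "laredo tx", "county": "Webb"},
    {"port_name": "El Paso TX", "county": "El Paso"},
]


def normalize_key(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(cleaned.split())


def link_records(left, right, left_field, right_field):
    """Join left to right on normalized keys; return matches and misses."""
    index = {}
    for rec in right:
        index.setdefault(normalize_key(rec[right_field]), []).append(rec)
    matched, unmatched = [], []
    for rec in left:
        hits = index.get(normalize_key(rec[left_field]), [])
        if hits:
            matched.extend({**rec, **hit} for hit in hits)
        else:
            unmatched.append(rec)
    return matched, unmatched


matched, unmatched = link_records(trade, facilities, "port", "port_name")
match_rate = len(matched) / len(trade)
print(f"linked {len(matched)} of {len(trade)} records ({match_rate:.0%})")
print("unlinked:", [rec["port"] for rec in unmatched])
```

Returning the unmatched records alongside the matches is deliberate: dropped records are where linkage error and bias enter, so they should be inspected rather than silently discarded.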