Value Proposition – Computer Vision Video Analytics
Automatic license plate recognition (ALPR) was one of the pioneering applications of computer vision for ITS, and is used in highway tolling as an enforcement aid to complement tolling transponders. Computer vision technology also plays many other roles in improving productivity and safety in traffic management and other transportation operations. For example, surveillance cameras with machine vision functionality have been mounted along freeways and main intersections for public safety, traffic incident detection, ramp metering, and traffic signal timing.
Computer vision also has a significant role in advanced driver assistance systems (ADAS), particularly for heavy vehicles such as semitrailers and other trucks (Guan et al., 2012). Because commercial carriers are generally more cognizant of the costs of crashes and of their impact on business reputation and the bottom line, they have been the most aggressive adopters to date of lane departure warning and forward collision warning systems, many of which include computer vision technologies.
Once visual data is acquired by the image sensors, it is processed and analyzed by software to enhance image quality and extract key features for either object detection or identification as required by the application. Visual data can be in the form of a single still image, multiple images, or consecutive image sequences, also known as video.
To minimize computational complexity, many applications of machine vision segment video data into a background image and a moving object image. Background images tend to be motionless over a long period of time, and moving object images only contain foreground objects. Change detection (Kim et al., 2001; Foresti et al., 1999) is the simplest method for video segmentation. The figure below shows detection and tracking of moving vehicles by using computer vision.
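As a rough illustration of change detection, the sketch below subtracts a stored background image from the current frame and flags pixels whose brightness differs by more than a threshold. It is a minimal example on synthetic data, not the method of the cited authors; the function name `detect_changes`, the threshold of 30 gray levels, and the tiny 6 x 8 test scene are all assumptions made for illustration.

```python
import numpy as np

def detect_changes(background, frame, threshold=30):
    """Return a boolean mask that is True where the frame differs from the background."""
    # Cast to a signed type so the subtraction cannot wrap around in uint8.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > threshold

# Synthetic 6 x 8 grayscale scene: a uniform road surface as the background.
background = np.full((6, 8), 100, dtype=np.uint8)

# Current frame: the same scene plus a brighter 2 x 3 "vehicle" patch.
frame = background.copy()
frame[2:4, 3:6] = 200

mask = detect_changes(background, frame)

# Bounding box of the changed region: (first row, last row, first col, last col).
rows, cols = np.nonzero(mask)
bounding_box = tuple(int(v) for v in (rows.min(), rows.max(), cols.min(), cols.max()))

print(int(mask.sum()), bounding_box)
```

A fixed threshold keeps the example short, but it is also why plain change detection is sensitive to lighting: a global brightness shift larger than the threshold would mark the whole image as foreground.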
When two or more cameras view the same scene, the technique is known as stereo vision, and the distance between each camera and the object is used to determine three-dimensional features. Because a full spectrum of information can be retrieved and analyzed from an image (e.g., colors, shapes, patterns, and depths), computer vision systems have long been viewed by technologists as a promising multi-purpose approach to data acquisition and sensing.
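One common way to recover distance from a stereo pair can be sketched as follows. The example assumes the textbook case of two identical, parallel, rectified cameras, where depth follows from Z = f * B / d (focal length f in pixels, baseline B between the cameras, disparity d between the object's positions in the two images). The source does not specify this model, and the function name and all numeric values are hypothetical.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth in meters of a point seen by two rectified, parallel cameras."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    # Similar triangles: Z = f * B / d
    return focal_length_px * baseline_m / disparity_px

# Hypothetical setup: 1000 px focal length, cameras mounted 0.5 m apart,
# and the same vehicle corner appearing 25 px apart in the two images.
depth_m = depth_from_disparity(1000.0, 0.5, 25.0)
print(depth_m)  # 20.0
```

Because depth is inversely proportional to disparity, distant objects produce small disparities, so a one-pixel matching error matters far more at long range than at short range.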
- Regulatory Environment: Facilitates data access and use.
- Ownership: Not tightly controlled.
- Privacy: Current applications typically protect private information.
No law or regulation specifically addresses computer vision systems. When cameras focus on particular vehicles for applications such as classification, image resolution tends to be the minimum the application requires, which does not raise privacy concerns because vehicle signage and lettering are not legible.
Camera imagery is often supplemented by license plate recognition, in which the image of a license plate is converted to text and compared against a database of plate numbers associated with particular vehicles and their owners. Most of the decisional law on privacy and the recording of license plates has not found a license plate to be private information. The argument is that a license plate cannot be private because it is affixed to the exterior of the vehicle, where it can be seen by anyone (Glancy, 2004). If a camera were to capture an image of the face of a driver or passenger, however, the privacy of the individual photographed could come into question. Because computer vision technologies are usually tailored to specific types of applications (e.g., classification), agencies have ways to mitigate such concerns.
- Capacity: Special skills and significant computing resources required to work with the data.
- Stewardship: The volume and nature of data require significant storage and management capabilities.
- Equity: Data and analyses are representative of those vicinities, road segments, and types of vehicles that have been sampled; however, it is easy to develop samples across the entire transportation network.
The accurate interpretation of images from a computer vision-based system requires a sophisticated understanding of application requirements and the ability to accommodate variation in a number of environmental variables, such as light levels. The development of complex algorithms for different applications often requires new analytical approaches, as well as extensive software development skills specifically related to machine vision. These skill sets are specialized and usually fall outside the skills available at a transportation agency. As a result, agencies may need to partner with private sector vendors of machine vision software, or with academic institutions whose staff or students have the experience needed to create or modify machine vision software.
Computer vision technology requires significant computing resources (BITRE, 2014). Data processing and analytics in computer vision systems are usually intensive and require large amounts of computational resources and memory. For example, a simple camera with 800 x 600 pixel resolution can capture more than one megabyte per second without image compression (and image compression algorithms themselves require additional computational resources). For many ITS applications, this massive amount of data needs to be processed and analyzed in a timely manner. For applications that are not time sensitive, all of this data needs to be stored for post-processing. It is therefore no surprise that many vision-based systems are equipped with significant processing memory and data storage.
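The scale of the uncompressed data can be checked with simple arithmetic. The sketch below uses the 800 x 600 resolution from the text; the color depth (3 bytes per pixel for 24-bit color, 1 byte for grayscale) and the frame rate of 30 frames per second are assumptions chosen for illustration, as the source does not state them.

```python
def raw_data_rate(width, height, bytes_per_pixel, frames_per_second):
    """Return (bytes per frame, bytes per second) for uncompressed video."""
    bytes_per_frame = width * height * bytes_per_pixel
    return bytes_per_frame, bytes_per_frame * frames_per_second

# 800 x 600 comes from the text; color depth and frame rate are assumed values.
color_frame, color_rate = raw_data_rate(800, 600, bytes_per_pixel=3, frames_per_second=30)
gray_frame, gray_rate = raw_data_rate(800, 600, bytes_per_pixel=1, frames_per_second=30)

print(f"24-bit color: {color_frame / 1e6:.2f} MB per frame, {color_rate / 1e6:.1f} MB per second")
print(f"8-bit gray:   {gray_frame / 1e6:.2f} MB per frame, {gray_rate / 1e6:.1f} MB per second")

# Storage needed to keep one hour of uncompressed color video for post-processing.
color_hour_gb = color_rate * 3600 / 1e9
print(f"One hour of color video: {color_hour_gb:.2f} GB")
```

Under these assumptions a single color frame already exceeds one megabyte, and one camera produces roughly 155 GB per hour before compression, which is consistent with the storage and processing demands described above.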
- Completeness: Data gaps exist due to functionality issues.
- Accuracy: Limitations due to functionality issues.
- Verifiability: Information extraction algorithms hinder verifiability.
- Dynamism: Time from capture to analysis is lengthened due to enormity of data processing requirements.
- Durability: The many promising applications ensure its future stability as a source of data.
Computer vision systems suffer from limitations in completeness and accuracy. The systems may fail to function, or may function poorly, under some lighting conditions (e.g., dusk, dawn, darkness) and in inclement weather such as rain, snow, or fog (Guan et al., 2012). Another technical challenge in urban environments is partially or fully hidden vehicles in dense traffic; for example, a car partially blocked from view on the far side of a semi-truck may not be recognized (Bush, 2011). Since many transportation applications of machine vision operate outdoors, they can be very susceptible to illumination variation such as shadows and low light. For example, in segmenting foreground objects from the background, color and shape are often the main attributes compared against an existing stored image, but both can be strongly affected by illumination conditions and viewing angles. Illumination variation further complicates the design of robust algorithms because the shadows being cast change over time. For example, an otherwise functional vehicle tracking algorithm may fail because of the frequent alternation between direct light and the shadows of high-rise buildings in urban downtowns. Illumination variation remains the main obstacle that robust computer vision-based ITS applications must overcome. Pairing a traditional camera with other sensors, such as radar or especially an infrared-enabled camera, can improve object and pedestrian detection accuracy in a variety of lighting conditions (Iwasaki et al., 2013).
Verifiability of the source data is an issue, particularly when sophisticated machine learning algorithms have been applied to extract the required information, because such algorithms identify vehicles correctly only at a certain “success rate” rather than in every case. Historically, most validation and verification has been performed manually, which can be prohibitively labor-intensive.
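A "success rate" of this kind can be estimated by comparing algorithm output against a manually verified sample. The sketch below is one simple way to do so, assuming exact-match scoring of license plate reads; the function name `success_rate` and the plate strings are hypothetical, and real evaluations may use other measures.

```python
def success_rate(predicted, verified):
    """Fraction of algorithm outputs that exactly match manually verified values."""
    if len(predicted) != len(verified):
        raise ValueError("predicted and verified must have the same length")
    if not verified:
        raise ValueError("at least one verified sample is required")
    correct = sum(p == v for p, v in zip(predicted, verified))
    return correct / len(verified)

# Hypothetical sample: four plate reads checked by a human reviewer.
# The last read confuses the letter O with the digit 0.
algorithm_reads = ["ABC123", "XYZ789", "LMN456", "QRS0O1"]
manual_review = ["ABC123", "XYZ789", "LMN456", "QRS001"]

rate = success_rate(algorithm_reads, manual_review)
print(f"Success rate: {rate:.0%}")  # Success rate: 75%
```

The estimate is only as good as the manually reviewed sample, which is why the cost of manual verification limits how thoroughly such systems can be validated.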
In terms of dynamism, the time from capture to processing is dominated by the enormity of data processing requirements. Machine vision as a transportation data source will have excellent durability because of the many applications of computer vision systems for traffic management, weigh-in-motion commercial vehicle inspections, security, parking, border control and other transportation purposes.