Toshiba Corporation (TOKYO: 6502) has developed an ultra-fast data search and matching technology that outperforms similar systems by a factor of 50. It can be applied to any data that can be represented as a high-dimensional vector, and its wide-ranging applications include big data analytics of large-scale media databases*1 and facial recognition. In experiments, the system recognized a single individual among 5,800 people in a photo database of 10 million images in only 8.31 milliseconds*2.
Advances in big data analysis continue to deliver dramatic improvements in areas such as machine learning and failure prediction, bringing increasing benefits to daily life. However, data volumes continue to grow exponentially, and to keep pace, analysis and recognition capabilities must also accelerate.
Toshiba’s technology builds indexes of high-dimensional feature data*3 extracted from objects, including complex, multi-faceted objects such as the human face or representations of product sale patterns and stock prices over time. The database can be searched for pattern matches, and produces results at an unmatched, ultra-fast rate. This performance acceleration rests on three components, shown below.
- Vector Coding Technology: encodes the features of objects as short vectors, and maintains the shortest possible difference between the vectors.
- Vector Indexing Technology: recognizes similar vectors without any need to compute the distance between them.
- Pipeline Lookup Technology: combines coarse and fine lookup.
Vector Indexing Technology is an original technology developed by Toshiba. It builds groups of similar vectors, which enables rapid identification of the group close to the vector in a query. Because it does not need to compute the distance between individual vectors and the query, vector lookup is ultra-fast.
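The general idea of grouping similar vectors and then narrowing the search can be illustrated with a minimal sketch. This is not Toshiba's algorithm, whose details are not given here. It assumes a well-known stand-in technique, sign-random-projection hashing, and every name in it (`make_planes`, `encode`, `build_index`, `lookup`) is hypothetical. The coarse stage finds a group by hash lookup without computing any distance to stored vectors; the fine stage compares the query only with the members of that group.

```python
import random

def make_planes(dim, bits, seed=0):
    """Random hyperplanes used to turn a vector into a short binary code (assumed scheme)."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]

def encode(vec, planes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(int(sum(p * x for p, x in zip(plane, vec)) >= 0) for plane in planes)

def build_index(vectors, planes):
    """Group vector ids by code, so similar vectors tend to share a bucket."""
    index = {}
    for i, v in enumerate(vectors):
        index.setdefault(encode(v, planes), []).append(i)
    return index

def lookup(query, vectors, index, planes):
    """Coarse stage: hash lookup of the group. Fine stage: exact comparison inside it."""
    candidates = index.get(encode(query, planes), [])
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vectors[i], query)),
    )

# Illustrative data only: 1,000 random 16-dimensional vectors.
rng = random.Random(1)
vectors = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(1000)]
planes = make_planes(dim=16, bits=8)
index = build_index(vectors, planes)

# Querying with a stored vector finds that vector, because it sits in its own bucket.
best = lookup(vectors[42], vectors, index, planes)
```

In this sketch the fine stage touches only one bucket rather than all 1,000 vectors, which is where the speed-up of a coarse-then-fine design comes from. A query that falls near a bucket boundary can miss its true nearest neighbour, so practical systems typically probe several buckets or use several tables.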
Toshiba initially intends to apply the technology in three areas: pattern mining, media recognition and big data analysis. For example, pattern mining would allow a particular person to be identified almost instantly among a large set of images taken by surveillance cameras, while media recognition could be used to protect soft targets, such as airports and railway stations*4, by automatically identifying persons wanted by the authorities.
Toshiba also hopes to support its clients and contribute to society by deploying this technology in new fields such as deep learning.
In fiscal year 2016, the company plans to release a new database product based on the new recognition technology and GridDB, its scalable database, that will enable ultra-fast processing of big data and large-scale media databases.