(Wargaming the Asymmetric Environment), TIDES (Translingual
Information Detection, Extraction and Summarization), HID (Human
Identification at Distance), Bio-Surveillance; as well as programs resulting
from the first two areas of this BAA and other programs.
Repository Issues: The National Security Community has a need for very large
scale databases covering comprehensive information about all potential terrorist
threats; those who are planning, supporting or preparing to carry out such
events; potential plans; and potential targets. In the context of this BAA, the term
“database” is intended to convey a new kind of extremely large, omni-media,
virtually-centralized, and semantically-rich information repository that is not
constrained by today’s limited commercial database products -- we use
“database” for lack of a more descriptive term. DARPA seeks innovative
technologies needed to architect, populate, and exploit such a database for
combating terrorism. Key metrics include the total amount of information
potentially covered; the utility of the database's data structures for data
entry, for searching and browsing by humans and machines, for data integration,
and for automated population; and the completeness, correctness, and timeliness
of the information when these repositories are exploited for predictive analysis
and modeling. It is anticipated that this will require revolutionary new
technology.
The database envisioned is of an unprecedented scale, will most likely be
distributed, must be capable of being continuously updated, and must support
both autonomous and semi-automated analysis. The latter requirement implies
that the representation used must, to the greatest extent possible, be
interpretable by both algorithms and human analysts. The database must support
change detection and be able to execute automated procedures implied by new
information. Because of expected growth and adaptation needs, the effective
schema must be adaptable by the user so that as new sources of information,
analytical methods, or representations arise, the representation of data may be
re-structured without great cost. If distributed, the database may require new
search methods to answer complex, less-than-specific queries across physical
implementations and new automated methods for maintaining consistency. The
reduced signature and misinformation introduced by terrorists who are attempting
to hide and deceive imply that uncertainty must be represented in some way. To
protect the privacy of individuals not affiliated with terrorism, DARPA seeks
technologies for controlling automated search and exploitation algorithms and for
purging data structures appropriately. Business rules are required to enforce
security policy and views appropriate for the viewer's role.
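
As an illustration only, several of the requirements above (explicit
representation of uncertainty, a schema that can grow without restructuring,
change detection that triggers automated procedures, role-appropriate views, and
purging) can be sketched in a few lines of Python. This is a toy, in-memory
sketch under stated assumptions, not a design proposed by this BAA. Every name
in it (Assertion, Repository, ROLE_RANK, the role labels, and the "entity-001"
style placeholders in the usage note) is hypothetical, and a single confidence
number is only one of many possible ways to represent uncertainty.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

# Hypothetical role ordering, assumed for illustration only.
ROLE_RANK = {"auditor": 0, "analyst": 1, "supervisor": 2}


@dataclass
class Assertion:
    """One piece of information with an explicit confidence in [0, 1]."""
    subject: str
    attributes: Dict[str, Any]   # open-ended, so new fields need no migration
    confidence: float            # assumed uncertainty model: a single number
    source: str
    min_role: str = "analyst"    # lowest role allowed to see this item


# Called as handler(key, previous_item_or_None, new_item) on every change.
ChangeHandler = Callable[[str, Optional[Assertion], Assertion], None]


class Repository:
    """Toy in-memory store; a real system would be distributed and persistent."""

    def __init__(self) -> None:
        self._items: Dict[str, Assertion] = {}
        self._handlers: List[ChangeHandler] = []

    def on_change(self, handler: ChangeHandler) -> None:
        """Register an automated procedure to run when information changes."""
        self._handlers.append(handler)

    def upsert(self, key: str, item: Assertion) -> bool:
        """Insert or update an item; return True if anything actually changed."""
        if not 0.0 <= item.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        if item.min_role not in ROLE_RANK:
            raise ValueError("unknown role: " + item.min_role)
        previous = self._items.get(key)
        if previous == item:     # change detection: identical data is a no-op
            return False
        self._items[key] = item
        for handler in self._handlers:
            handler(key, previous, item)
        return True

    def view(self, role: str) -> Dict[str, Assertion]:
        """Return only the items that the given role is permitted to see."""
        rank = ROLE_RANK[role]
        return {k: a for k, a in self._items.items()
                if ROLE_RANK[a.min_role] <= rank}

    def purge_subject(self, subject: str) -> int:
        """Delete every item about a subject; return how many were removed."""
        doomed = [k for k, a in self._items.items() if a.subject == subject]
        for key in doomed:
            del self._items[key]
        return len(doomed)
```

A caller might register a handler with on_change, store an item such as
Assertion("entity-001", {"kind": "placeholder"}, 0.4, "source-A") under some
key, and then compare view("analyst") with view("supervisor"). Keeping the
attributes as an open mapping is what lets new kinds of information be added
without restructuring; the trade-off is that nothing validates those fields.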
The potential sources of information about possible terrorist activities will include
extensive existing databases. Innovative technologies are sought for treating
these databases as a virtual, centralized, grand database. This will require
technologies for automatically determining schemas, access methods and