LSI does not only scan a document for keywords and add it to the database; it also examines a set of relevant documents and picks out the words and phrases that are common to them or similar in meaning. It then scans other documents and indexes them according to their semantic closeness to the relevant text or subject. Unlike a regular keyword search, LSI can determine how similar one document is to another, or how relevant it is to the search topic. LSI thus estimates this interdependence and rates the relevance of a document on a scale of 0 to 1.
As a result, the data retrieved by a search engine that uses LSI includes clusters of related documents, and it will often surface relevant documents that do not contain the keyword at all.
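The behaviour described above can be illustrated with a minimal sketch. It assumes that LSI is carried out as a truncated singular value decomposition (SVD) of a term-document matrix, with cosine similarity used as the relevance score. The toy corpus, the function name `lsi_scores`, and the choice of two latent dimensions are hypothetical and chosen only for illustration. Cosine similarity can be negative in general, so the sketch clips it to the 0 to 1 scale mentioned above.

```python
import numpy as np

# Hypothetical toy corpus, invented for illustration only.
DOCS = [
    "car engine repair",
    "automobile engine repair",  # never mentions "car"
    "banana fruit smoothie",
    "banana fruit",
]


def lsi_scores(docs, query, k=2):
    """Score each document against the query in a k-dimensional latent space."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}

    # Term-document matrix of raw counts: one row per term, one column per document.
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(docs):
        for word in doc.split():
            A[index[word], j] += 1.0

    # Truncated SVD: keep only the k strongest latent dimensions.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]
    doc_vecs = (s[:k, None] * Vt[:k, :]).T  # shape (n_docs, k)

    # Fold the query into the same latent space; unknown words are ignored.
    q = np.zeros(len(vocab))
    for word in query.split():
        if word in index:
            q[index[word]] += 1.0
    q_vec = U_k.T @ q

    # Cosine similarity, guarded against zero-length vectors and
    # clipped to the 0..1 range (an assumption made for this sketch).
    denom = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec)
    cos = (doc_vecs @ q_vec) / np.maximum(denom, 1e-12)
    return np.clip(cos, 0.0, 1.0)


scores = lsi_scores(DOCS, "car")
for doc, score in zip(DOCS, scores):
    print(f"{score:.2f}  {doc}")
```

In this sketch the second document should score close to 1 for the query "car" even though it never contains that word, because it shares "engine" and "repair" with the document that does. The two fruit documents should score close to 0. A plain keyword match would have returned only the first document.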