Many hadiths are scattered across holy books and are difficult to categorize, so a medium is needed to collect them. Many encyclopedias take the form of books that contain collections of hadiths, sometimes numbering in the thousands. The thickness of such an encyclopedia makes it difficult to access. With the development of internet technology, a web-based hadith encyclopedia has been built, and this application includes a search engine.

The development of web technology makes it convenient to access and search content related to the encyclopedia, and it is expected to help users find information in a suitable order. This research uses a text mining technique with the vector space model algorithm to measure the similarity between search results and documents. Text mining has three important phases: text preprocessing, text transformation, and pattern discovery. Text preprocessing consists of text cleaning and splitting sentences into words (tokenizing) so that they can be reduced to stems. The stems are then analyzed by counting word weights and matching them against the keywords. Text transformation consists of the filtering phase. Pattern discovery is the scoring phase. From the values calculated with TF-IDF, the document similarity is then computed. The similarity between keywords and documents is measured using the vector space model algorithm with the cosine similarity method.
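
As an illustration of the vector space model with TF-IDF weighting and cosine similarity, the following is a minimal sketch and not the implementation used in this research. It assumes the common weighting tf × log(N / df), uses a small hypothetical English stop-word list for the filtering phase, and omits stemming for brevity. The function names (`tokenize`, `tf_idf_vectors`, `cosine`, `rank`) and the sample documents are invented for this example.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "of", "and", "in", "a", "to", "is", "at"}


def tokenize(text):
    """Preprocessing and filtering: lowercase, split into words, drop stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def tf_idf_vectors(docs_tokens):
    """Build one {term: weight} vector per document with weight = tf * log(N / df)."""
    n_docs = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in docs_tokens:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors, df


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero vector has no direction
    return dot / (norm_u * norm_v)


def rank(query, documents):
    """Return (document index, similarity) pairs, most similar first."""
    docs_tokens = [tokenize(d) for d in documents]
    vectors, df = tf_idf_vectors(docs_tokens)
    n_docs = len(documents)
    q_tf = Counter(tokenize(query))
    # Query terms that occur in no document cannot match and are skipped.
    q_vec = {t: q_tf[t] * math.log(n_docs / df[t]) for t in q_tf if t in df}
    scores = [(i, cosine(q_vec, vec)) for i, vec in enumerate(vectors)]
    return sorted(scores, key=lambda s: s[1], reverse=True)


if __name__ == "__main__":
    # Invented sample documents standing in for encyclopedia entries.
    documents = [
        "prayer at night is rewarded",
        "charity erases sins",
        "fasting and prayer in ramadan",
    ]
    for index, score in rank("charity", documents):
        print(index, round(score, 3))
```

Under this weighting, a term that appears in every document receives weight zero (log 1 = 0), so it does not influence the ranking. Other TF-IDF variants smooth this differently.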