To answer the research question of this thesis, several steps are taken. A literature review is performed to find algorithms and methods for the problems that arise in the experiment. A detailed problem description is also drawn up as a basis for the experiment that follows. Then the actual experiment starts. The news article database is retrieved from the MD Info site. The data is then prepared in order to find the most representative words for each subject: punctuation marks and stop words, which are words that are filtered out before or after the processing of natural language data, are removed. With these representative words, the optimal text query inside the Sub Heading and the optimal free text query inside the complete article database are determined for each subject. Once the optimal queries are found, the results are evaluated.
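The data preparation step can be illustrated with a minimal Python sketch. It is not the implementation used in the thesis: the stop-word list, the sample articles, the function names `tokenize` and `representative_words`, and the use of raw word frequency as the measure of how representative a word is are all assumptions made for illustration.

```python
import string
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "on", "for"}


def tokenize(text):
    """Lowercase the text, replace punctuation with spaces, drop stop words."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    words = text.lower().translate(table).split()
    return [word for word in words if word not in STOP_WORDS]


def representative_words(articles, top_n=3):
    """Return the top_n most frequent remaining words across the articles.

    Raw frequency is an assumed stand-in for whatever measure of
    representativeness the thesis actually uses.
    """
    counts = Counter()
    for article in articles:
        counts.update(tokenize(article))
    return [word for word, _ in counts.most_common(top_n)]


# Made-up sample data: one subject with two short articles.
articles_by_subject = {
    "energy": ["Oil prices rise on the market.", "The oil market and gas prices."],
}

for subject, articles in articles_by_subject.items():
    print(subject, representative_words(articles))
```

The resulting words per subject could then serve as candidate terms when building the text queries.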

Kaymak, U.
Economie & Informatica
Erasmus School of Economics

Ratha, J.G. (Joshua). (2010, August 26). Finding optimal text queries to cover Subjects in a taxonomy of a news article database. Economie & Informatica. Retrieved from