It’s easy to run a full-text search over documents in common formats (such as .doc and .pdf) stored in an Alfresco repository. What’s more, you can also get a relevance score for each document found against the desired keywords.
A short example:
POST <host>/alfresco/service/cmis/queries
Content-Type: application/cmisquery+xml
Http basic auth required!

<cmis:query xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/">
    <cmis:statement>SELECT cmis:name, SCORE() FROM cmis:document WHERE CONTAINS('niceWord')</cmis:statement> <!-- 1. -->
    <cmis:searchAllVersions>true</cmis:searchAllVersions>
    <cmis:includeAllowableActions>false</cmis:includeAllowableActions>
    <cmis:includeRelationships>none</cmis:includeRelationships>
    <cmis:renditionFilter>*</cmis:renditionFilter>
    <cmis:maxItems>50</cmis:maxItems> <!-- 2. -->
    <cmis:skipCount>0</cmis:skipCount> <!-- 3. -->
</cmis:query>
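For scripting, the same request can be assembled with a few lines of Python. This is only a minimal sketch using the standard library: the helper names build_query_xml and build_request are made up for illustration, and the host, user and password are placeholders you would replace with your own.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET

# Namespace taken from the example request above.
CMIS_NS = "http://docs.oasis-open.org/ns/cmis/core/200908/"


def build_query_xml(statement, max_items=50, skip_count=0):
    """Return the cmis:query document as bytes (hypothetical helper)."""
    ET.register_namespace("cmis", CMIS_NS)
    root = ET.Element(f"{{{CMIS_NS}}}query")
    fields = [
        ("statement", statement),
        ("searchAllVersions", "true"),
        ("includeAllowableActions", "false"),
        ("includeRelationships", "none"),
        ("renditionFilter", "*"),
        ("maxItems", str(max_items)),
        ("skipCount", str(skip_count)),
    ]
    for name, value in fields:
        ET.SubElement(root, f"{{{CMIS_NS}}}{name}").text = value
    return ET.tostring(root, encoding="utf-8")


def build_request(host, user, password, statement, max_items=50, skip_count=0):
    """Build (but do not send) the POST request with HTTP basic auth."""
    body = build_query_xml(statement, max_items, skip_count)
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{host}/alfresco/service/cmis/queries",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/cmisquery+xml",
            "Authorization": f"Basic {token}",
        },
    )
```

Passing the returned object to urllib.request.urlopen() sends the query. ElementTree escapes characters such as < and & in the statement for you, which is easy to get wrong when concatenating the XML by hand.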
A few important details:
1. SCORE() returns the relevance of each document found against the expression passed to the CONTAINS() function. In Alfresco you can use the powerful FTS language as the CONTAINS() parameter, but doing so means losing compliance with the CMIS-SQL standard.
- The FTS language is documented here: http://wiki.alfresco.com/wiki/Full_Text_Search_Query_Syntax,
- and the standard CMIS-SQL CONTAINS() syntax is described here: http://wiki.alfresco.com/wiki/CMIS_Query_Language#text_search_predicate
2., 3. Paging. It’s recommended to always page the results, because Alfresco doesn’t guarantee that it will return all entries when there are a lot of them.
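The paging advice can be sketched as a small loop that raises skipCount by maxItems until a short page comes back. The fetch_page callable here is hypothetical: it stands for whatever code of yours sends the query with the given skipCount and maxItems and returns the entries of that page as a list.

```python
def iter_all_entries(fetch_page, page_size=50):
    """Yield every entry by walking the result set page by page.

    fetch_page(skip_count, max_items) is a hypothetical callable that
    runs the query and returns one page of entries as a list.
    """
    skip_count = 0
    while True:
        entries = fetch_page(skip_count, page_size)
        yield from entries
        # A page shorter than page_size means there is nothing left.
        if len(entries) < page_size:
            break
        skip_count += page_size
```

When the total number of hits is an exact multiple of the page size, the loop makes one extra request that returns an empty page and then stops.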
And that’s all: you will receive a neat list of entries meeting your requirements.