Confluence Docs 3.3 : How do I disable indexing of attachments
This page last changed on Dec 21, 2009 by ggaskell.
Users can sometimes experience problems when Confluence indexes large MS Excel or MS PowerPoint documents, and reindexing may produce "Unknown Ptg" warning messages. These warnings are harmless, and there is already a request to suppress the warnings that the POI library raises when re-indexing unreadable documents. The error is usually not serious, but it can sometimes cause problems when large attachments are used, so you may want to disable indexing of a particular type of document. To do this, use one of the methods described below.

Method 1: Using the Administration Console

You can disable the relevant modules of the Attachment Extractors or Office Connector plugins by going to Administration -> Configuration -> Plugins and disabling the relevant plugin modules.
Method 2: Editing the atlassian-plugin.xml files of plugins

You need to modify the content of the atlassian-plugin.xml file in the following JAR files and comment out the relevant file type extractor:
Both of these JAR files are located in the confluence\WEB-INF\classes\com\atlassian\confluence\setup\atlassian-bundled-plugins.zip file. If you are unfamiliar with modifying JAR files, please refer to the Editing Files within JAR Archives document for further information. You can identify file type extractors in atlassian-plugin.xml files by the occurrence of ContentExtractor in their key attribute.
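As a rough illustration of the identification rule, the following Python sketch lists the extractors that are still active in a plugin descriptor by looking for ContentExtractor in each key attribute. The function names are hypothetical, and the sketch assumes that atlassian-plugin.xml sits at the root of the JAR and that you have already extracted the JAR from the bundled plugins zip; treat it as an aid for checking your edits, not as part of Confluence.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def extractor_keys(plugin_xml):
    """Return the keys of active <extractor> elements containing 'ContentExtractor'.

    Extractors that are commented out are skipped, because the default
    ElementTree parser discards XML comments.
    """
    root = ET.fromstring(plugin_xml)
    return [
        element.get("key")
        for element in root.iter("extractor")
        if "ContentExtractor" in element.get("key", "")
    ]


def extractor_keys_in_jar(jar_bytes):
    """Read atlassian-plugin.xml from a JAR (a zip archive) held in memory.

    Assumption: the descriptor is stored at the root of the JAR.
    """
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        descriptor = jar.read("atlassian-plugin.xml").decode("utf-8")
    return extractor_keys(descriptor)
```

Running extractor_keys_in_jar before and after editing a JAR is one way to confirm that the extractor you commented out no longer appears in the list.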
The example below shows the pdfContentExtractor commented out, which prevents PDF attachments from being indexed.

<atlassian-plugin key="com.atlassian.confluence.plugins.attachmentExtractors" name="Attachment Extractors">
    <plugin-info>
        <description>This plugin extracts searchable text from various attachment types.</description>
        <version>1.1</version>
        <vendor name="Atlassian Pty Ltd" url="http://www.atlassian.com/"/>
    </plugin-info>
    <!--
    <extractor name="PDF Content Extractor" key="pdfContentExtractor" class="com.atlassian.bonnie.search.extractor.PdfContentExtractor" priority="1100">
        <description>Indexes contents of PDF files</description>
    </extractor>
    -->
</atlassian-plugin>

The following table shows the file type extractors in the atlassian-plugin.xml of the OfficeConnector-x.x.jar file, which require commenting out to prevent indexing:
Document generated by Confluence on Jul 09, 2010 01:11