Confluence 4.0 : How do I disable indexing of attachments
This page last changed on Jul 05, 2011 by halatas@atlassian.com.
Users sometimes experience problems when indexing large Microsoft Excel or PowerPoint documents, and reindexing may trigger errors. The error is usually not serious, but it can cause problems when large attachments are used. In that case, you may want to disable indexing of a particular type of document, using one of the methods described below.
Method 1: Using the Administration Console
You can disable the relevant modules of the Attachment Extractors or Office Connector plugins by going to Browse -> Confluence Admin -> Plugins -> Manage Existing and disabling the relevant plugin modules.
Method 2: Editing the atlassian-plugin.xml file
Once the |
The example below shows the pdfContentExtractor module commented out, which prevents PDF attachments from being indexed.
<atlassian-plugin key="com.atlassian.confluence.plugins.attachmentExtractors" name="Attachment Extractors">
    <plugin-info>
        <description>This plugin extracts searchable text from various attachment types.</description>
        <version>1.1</version>
        <vendor name="Atlassian Pty Ltd" url="http://www.atlassian.com/"/>
    </plugin-info>
    <!--
    <extractor name="PDF Content Extractor" key="pdfContentExtractor" class="com.atlassian.bonnie.search.extractor.PdfContentExtractor" priority="1100">
        <description>Indexes contents of PDF files</description>
    </extractor>
    -->
</atlassian-plugin>
The following table shows the file type extractors in the atlassian-plugin.xml of the OfficeConnector-x.x.jar file that must be commented out to prevent indexing:
Type of attachment | File Type Extractor |
---|---|
Word 97/2007 | <extractor name="Word Content Extractor" key="wordContentExtractor" class="com.atlassian.confluence.extra.officeconnector.index.word.WordTextExtractor" priority="1099"> <description>Indexes contents of Word 97/2007 files</description> </extractor> |
PowerPoint 97 | <extractor name="PowerPoint 97 Content Extractor" key="ppt97ContentExtractor" class="com.atlassian.confluence.extra.officeconnector.index.powerpoint.PowerPointTextExtractor" priority="1099"> <description>Indexes contents of PowerPoint 97 files</description> </extractor> |
PowerPoint 2007 | <extractor name="PowerPoint 2007 Content Extractor" key="ppt2k7ContentExtractor" class="com.atlassian.confluence.extra.officeconnector.index.powerpoint.PowerPointXMLTextExtractor" priority="1099"> <description>Indexes contents of PowerPoint 2007 files</description> </extractor> |
Excel 97 | <extractor name="Excel 97 Content Extractor" key="excel97ContentExtractor" class="com.atlassian.confluence.extra.officeconnector.index.excel.ExcelTextExtractor" priority="1099"> <description>Indexes contents of Excel 97 files</description> </extractor> |
Excel 2007 | <extractor name="Excel 2007 Content Extractor" key="excel2k7ContentExtractor" class="com.atlassian.confluence.extra.officeconnector.index.excel.ExcelXMLTextExtractor" priority="1099"> <description>Indexes contents of Excel 2007 files</description> </extractor> |
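
As an illustration, assuming the Excel extractors are the ones you want to turn off, the corresponding entries in the Office Connector atlassian-plugin.xml might be commented out roughly as sketched here. The extractor definitions are copied unchanged from the table; the explanatory comments are added for this example only, and the rest of the file is omitted and assumed to stay as it is.

```xml
<!-- Excel 97 extractor commented out so that these attachments are no longer indexed -->
<!--
<extractor name="Excel 97 Content Extractor" key="excel97ContentExtractor" class="com.atlassian.confluence.extra.officeconnector.index.excel.ExcelTextExtractor" priority="1099">
    <description>Indexes contents of Excel 97 files</description>
</extractor>
-->

<!-- Excel 2007 extractor commented out in the same way -->
<!--
<extractor name="Excel 2007 Content Extractor" key="excel2k7ContentExtractor" class="com.atlassian.confluence.extra.officeconnector.index.excel.ExcelXMLTextExtractor" priority="1099">
    <description>Indexes contents of Excel 2007 files</description>
</extractor>
-->
```

Each extractor sits in its own comment because XML comments cannot be nested. Keeping one comment per extractor also makes it easy to re-enable a single file type later by removing only that pair of comment markers.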
Document generated by Confluence on Sep 19, 2011 02:50