Writing IFilters in .NET

IFilters are used by search engines and other applications, such as SharePoint Server, to read the content of files and documents. For each document type you need a matching IFilter installed on your system.

For the common document types, IFilters are already available; see IFilter.org, IFilterShop, Citeknet or Channel9.

The IFilter Explorer shows you all IFilters installed on your system.

There are many code samples showing how to use IFilters in .NET, see:
First article about using IFilters, the start of everything ...,
... and the second one,
IFilter sample, not finished, but it saves you a lot of typing,
IFilter at work, a very good explanation,
dotLucene and IFilter usage,
IFilter for experts: bypassing the COM stack and solving the Adobe IFilter problem.

The missing part is how to write IFilters easily. Of course there is some documentation from Microsoft, but it is heavy C stuff.

By 'accident' I found the solution in a huge project by Stephen Toub; he had already implemented everything that is needed :-)

Based on his code I implemented an IFilter Template. You just have to implement two functions that report the content of your document, that's all. Look at the sample code for an IFilter that reads TXT files to see how simple it is.

The IFilters based on the IFilter Template work well: they pass all the tests provided by Microsoft and can be used with SharePoint, MS Desktop Search and even the new Windows Vista Search.

But there is one big issue with IFilters based on the IFilter Template: they cannot be used by .NET applications. Oops.

The case is tricky.

In .NET you can call real (unmanaged) COM objects; the framework automatically creates an RCW (Runtime Callable Wrapper) for them. That's how reading the content of files is done in the code samples listed above.

With .NET you can also create COM objects; for these the framework automatically creates a CCW (COM Callable Wrapper) that handles the transition from unmanaged code to managed code. Let's call these COM objects managed COM objects.

What happens if a .NET application calls a managed COM object?
In this case both wrappers, the RCW and the CCW, should be created and handle the calls. But reality shows that the framework takes a shortcut and simply makes a call from managed code to managed code. By skipping the COM interface, the .NET framework enforces an exact match between the managed types. This fails with an invalid cast error, because you cannot guarantee that the writer and the consumer of an IFilter use exactly the same interface definition.
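
The type mismatch can be illustrated outside of .NET. What follows is only a minimal Python sketch of the idea and not the CLR's actual mechanism: two interface declarations with identical members are still two different types, so an object written against one of them does not pass a type check against the other. All names in it (WriterIFilter, ConsumerIFilter, TxtFilter, cast, get_text) are made up for this illustration and are not part of the IFilter Template.

```python
from abc import ABC, abstractmethod


class WriterIFilter(ABC):
    """Interface as declared by the (hypothetical) IFilter author."""

    @abstractmethod
    def get_text(self) -> str: ...


class ConsumerIFilter(ABC):
    """Same members, but declared independently by the (hypothetical) consumer."""

    @abstractmethod
    def get_text(self) -> str: ...


class TxtFilter(WriterIFilter):
    """A filter implemented against the author's declaration only."""

    def get_text(self) -> str:
        return "hello"


def cast(obj, interface):
    """Mimic a checked cast: succeed only if obj is declared to implement interface."""
    if not isinstance(obj, interface):
        raise TypeError(
            f"cannot cast {type(obj).__name__} to {interface.__name__}"
        )
    return obj


flt = TxtFilter()
cast(flt, WriterIFilter)        # succeeds: the declaration the filter was built on

try:
    cast(flt, ConsumerIFilter)  # identical shape, but a different type
    cast_failed = False
except TypeError:
    cast_failed = True
```

The object itself is fine, and calling flt.get_text() directly still works. Only the check against the consumer's own copy of the interface fails, which is the situation described above when the COM layer is skipped.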

I worked hard to find a solution, but have not been successful so far. I guess that in the future many developers will face the same problem in different areas and will hopefully find a solution :-))

The answer from Microsoft helpdesk wasn't very helpful either:
"It looks like our product group does not support calling the IFilter interfaces via .Net due to performance and dependency reasons. Given they do not support this for the current product, it's safe to say it's not supported for the currently shipping / legacy products either.
In general managed IFilter is not a supported scenario. Loading CLR is very costly to indexing performance and different IFilters could take dependency on different versions of .Net Framework. We currently have no plan to support it."
 
Sample IFilter
Shows how easily you can create IFilters with .NET

 
Download the IFilter Template
Get the template and write your own IFilter