Jerry_D

I have a large document archive consisting of scanned files in TIF format,
where all text has been OCR-ed. With Windows XP I was using the Windows
Desktop Search ver. 2.6 that was neatly handling the OCR-ed text.

After updating to Vista, that is no longer the case, in spite of my having
selected the "index properties and file contents" option for TIF files. Is there
a way I can get this to work?


Re: Search, Organize and Visualize in Windows Vista TIF files text not indexed in Vista?

Eric Wolz - MSFT

Vista (WDS 3.0) introduced a tighter security model. Documents are now indexed under a process with restricted rights. Some older filters used to create temporary files, which is no longer allowed. Because of the new security model, some of these legacy filters no longer work correctly. My recommendation is to see whether an updated version is available that runs under Vista.




Re: Search, Organize and Visualize in Windows Vista TIF files text not indexed in Vista?

Jerry_D

The filter used is (and was) the one provided by Microsoft: MODI. I am now using Vista Ultimate and Office 2007 (which installed version 12 of the MODI filter). One would think that Microsoft would take care of the pertinent security issues. If not, my archive of several years, created using a Microsoft product (Microsoft Document Imaging), suddenly becomes unsearchable due to an upgrade of the very same Microsoft product. Surely that can't be allowed to happen.



Re: Search, Organize and Visualize in Windows Vista TIF files text not indexed in Vista?

Eric Wolz - MSFT

We are working with the imaging team to get an updated version available. Thanks.




Re: Search, Organize and Visualize in Windows Vista TIF files text not indexed in Vista?

Jerry_D

Many thanks for your answer and the hope it gives.

How soon is it reasonable to expect the updated version? Should I look for alternatives in the meantime, or is it going to be so soon that I should wait? The reason I ask is that I really have to use my archive daily, and not being able to search is a major problem. The alternatives that I see are somewhere between writing the code myself (which I wouldn't know how to do) and converting gazillions of files to another format like PDF. Not very easy. So, if it's a question of being patient for a couple of weeks, I'd rather not embark on any adventure...

Thanks in advance for your kind consideration.





Re: Search, Organize and Visualize in Windows Vista TIF files text not indexed in Vista?

Eric Wolz - MSFT

I was just working with them costing the work items, so the time frame of the release is not in weeks.




Re: Search, Organize and Visualize in Windows Vista TIF files text not indexed in Vista?

Jerry_D

Thanks for the clear answer. I do feel cheated though. By Microsoft, I mean.



Re: Search, Organize and Visualize in Windows Vista TIF files text not indexed in Vista?

Keith Passaur

I have run into the same thing. I have developed a program that is based on text-searchable TIFFs, and now I don't have a search engine for it.

I have found two solutions for you. However, both require that you purchase something. The first is to use dtSearch to search for the text. It is not all that difficult to set up, and the program is not that expensive for what it does. You will find that the search is much quicker than what you are used to.

The program does all kinds of searches: fuzzy, stemming, proximity, etc. The only catch with text-searchable TIFFs is that you have to open the document (which you are used to anyway). With PDFs you don't have to wait for the document to open; you can view the hidden text and then open it (the words will be highlighted). It is my understanding that it is used in more than five hundred different document imaging systems. (It is very big in the legal market, as it searches about one terabyte per second.)

Anyway, if you go this route, you will need to change the settings to index TIF files (by default it does not), then add TIF as a special file type and specify that the IFilter should be used for it.

The other option is one that I sell. It was designed for a different use, but it will export the text from your TIF to a text file. It is called Identify Docs, and it is available at www.edocfile.com. This program will batch-convert all files in a folder structure to PDFs, text-searchable TIFFs, and text files. It is intended for law firms doing discovery.

It uses the OCR engine in Microsoft Document Imaging to do this. So, if your files are already OCR-ed, you can just run it and it will recreate everything you have with two additional document types, PDF and text. You will never be stuck again with this option.

It will create one text file per document or one per page. It will also Bates-stamp the document for you. What it won't do is align the text on a page when creating a PDF. It just places the text on the page.

Anyway, this would quickly resolve your issue without going broke.
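
As a rough illustration of why plain text exports are easy to search with almost any tool, here is a minimal Python sketch of a keyword index over a folder of exported text files. It assumes one UTF-8 `.txt` file per document somewhere under a root folder; the function names (`build_index`, `search`) and that layout are hypothetical and are not part of dtSearch, Identify Docs, or Windows Desktop Search.

```python
import re
from collections import defaultdict
from pathlib import Path

# Words are runs of lowercase letters and digits; everything else is a separator.
WORD_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Split text into lowercase words."""
    return WORD_RE.findall(text.lower())


def build_index(root):
    """Map each word to the set of .txt files under root that contain it."""
    index = defaultdict(set)
    for txt_path in Path(root).rglob("*.txt"):
        # Assumption: exports are UTF-8; undecodable bytes are replaced, not fatal.
        text = txt_path.read_text(encoding="utf-8", errors="replace")
        for word in tokenize(text):
            index[word].add(txt_path)
    return index


def search(index, query):
    """Return the files that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

Calling `search(build_index(r"C:\Archive\text"), "invoice 2006")` (the path is only an example) would return the exported text files containing both words; the matching TIF could then be located by file name, assuming the export keeps the original base name.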





Re: Search, Organize and Visualize in Windows Vista TIF files text not indexed in Vista?

M Fairley

Cheated is not the word. We have been thoroughly screwed by a Microsoft development team that has not thought out its product. They obviously want us to convert all our TIF images to PDF and use a PDF IFilter. BTW, the same problem occurs in the new SharePoint as well as in Vista.



Re: Search, Organize and Visualize in Windows Vista TIF files text not indexed in Vista?

Jerry

Any progress?



Re: Search, Organize and Visualize in Windows Vista TIF files text not indexed in Vista?

Steve Meredith

I have been impacted by this as well. I have hundreds of .tif files that were searchable on XP and are now invisible on Vista.



Re: Search, Organize and Visualize in Windows Vista TIF files text not indexed in Vista?

dwdp

Since the problem is Vista's security system, I suggest you disable the strict User Account Control (UAC) to see whether the IFilter works without this restriction (as a temporary solution while waiting for MS to release a fix).

Another solution is abandoning WDS (MS Windows Desktop Search) and installing GDS (Google Desktop Search). GDS itself does not index the text in TIF files, but you can follow the instructions on this page

http://www.ifiltershop.com/google-desktop-search-plugin.html

to make use of the TIFF IFilter for indexing.

I still use XP so I cannot test the solutions myself.

Let us know if one of these fixes works or if you find another solution.

Regards.




Re: Search, Organize and Visualize in Windows Vista TIF files text not indexed in Vista?

Jerry

Thanks a lot for the advice. Unfortunately, it didn't help. I had also tried other search engines (including Google) with different flavors of IFilters, but couldn't get them to work.




Re: Search, Organize and Visualize in Windows Vista TIF files text not indexed in Vista?

Eric Wolz - MSFT

The UAC setting does not affect the IFilter sandboxing security feature. The Office group, which owns the TIFF IFilter, is aware of this issue and will address it in their next release.






Re: Search, Organize and Visualize in Windows Vista TIF files text not indexed in Vista?

Jerry

When exactly will that be?

Why is it so difficult to allow the user to forfeit the sandboxing if he so desires?

Can you not give instructions on how to do that?

I would very much prefer to have less security (my systems are secured in many other ways anyway) while I wait for Microsoft to correct this flaw than to have to resort to third-party products with all sorts of possible pitfalls.

Please give me a hint on how to get out of the sandbox. A registry tweak or two?