tr7

Hi,

I'm using a regular expression in my Word project and I'm using named groups in the regex. I know that VSTO creates a property bag to hold the key/value info corresponding to the named group. However, one of the named groups in my test document contains 550 characters (with spaces). In my regex editor, the entire expression is recognized with no problem but in VSTO the expression fails. I couldn't understand it until I started deleting some text from the suspect named group. When the text is down to around 450 characters then the entire expression is recognized in VSTO.

Is there a limit on the number of characters a property bag can contain? If so, is this something that can be controlled? How should I resolve this problem?

Thanks for any advice.



Re: Visual Studio Tools for Office property bag limitations

Christa Carpentiere - MSFT

Have you gotten this issue resolved?




Re: Visual Studio Tools for Office property bag limitations

tr7

No. I kind of put the issue on the back-burner but it's still very much an issue. Thanks.



Re: Visual Studio Tools for Office property bag limitations

Misha Shneerson - MSFT

Tr7,

Can you post an example of the regular expression that fails?

In the source code I saw a limitation of 512 characters on the length of the string that you are allowed to attach a smart tag to (see the length param in ISmartTagRecognizerSite::CommitSmartTag), but I did not see any limitation on the length of the property name in the smart tag's property bag. If I could get an example, I would probably be able to understand what is failing and where.
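A length check along these lines could be sketched as follows. This is only an illustration in plain C# with no VSTO dependency: `MaxSmartTagLength` and `FitsSmartTagLimit` are hypothetical names, and the value 512 is simply the figure quoted above, not a documented constant. Whether a string of exactly 512 characters is accepted is also an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

static class SmartTagLengthCheck
{
    // Assumption: 512 is the limit mentioned in this thread, taken on trust.
    public const int MaxSmartTagLength = 512;

    // Returns true when the matched text is short enough to be tagged.
    public static bool FitsSmartTagLimit(Match match)
    {
        return match.Success && match.Length <= MaxSmartTagLength;
    }

    static void Main()
    {
        // Hypothetical pattern with one named group.
        Regex regex = new Regex(@"START(?<body>.*?)END", RegexOptions.Singleline);

        string shortText = "START" + new string('x', 100) + "END"; // 108 characters
        string longText = "START" + new string('x', 550) + "END";  // 558 characters

        Console.WriteLine(FitsSmartTagLimit(regex.Match(shortText))); // True
        Console.WriteLine(FitsSmartTagLimit(regex.Match(longText)));  // False
    }
}
```

A regex editor would report both strings as matches; only a check like this (or a caught exception at commit time) reveals that the longer one cannot be tagged.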






Re: Visual Studio Tools for Office property bag limitations

tr7

Hi Mischa,

Thanks for directing me to exactly the answer I was looking for.

"If the string you want recognized is rather long, it is good practice to have error code to check and trap for cases where commiting the smart tag failed because the smart tag was too long."

It makes sense now since the text I'm trying to recognize in VSTO is 550 characters long. It fails in VSTO but the same regex is successful in my regex editor. So I know that I have to include error code to handle this type of situation. Thanks so much!





Re: Visual Studio Tools for Office property bag limitations

tr7

Additional question:

Since in VSTO2005 I can simply provide a list of regular expressions in order to implement a smart tag, how can I get access to the length parameter of the CommitSmartTag method, given that I'm not actually writing code for this method directly? Does this mean that I will have to move to a custom SmartTag class and then implement all the necessary methods, thus not benefiting from the ease of using VSTO's SmartTag class?





Re: Visual Studio Tools for Office property bag limitations

Misha Shneerson - MSFT

To have more control over how smart tags are persisted you would need to inherit from the Microsoft.Office.Tools.Word.SmartTag class and override the Recognize method. The default implementation is very straightforward though. Going by memory:

It first iterates the collection of Terms, tries to find a match in the passed string and persists the matching strings. Next it does the same for the regular expressions collection. However, regular expression matching has the convenience of persisting named groups as properties inside the smart tag's property bag.

If you want to use the convenience of having named groups being persisted as property names - I can probably help you by posting the code here. Let me know.






Re: Visual Studio Tools for Office property bag limitations

tr7

Thanks Mischa, whatever code you post would help me greatly.



Re: Visual Studio Tools for Office property bag limitations

Misha Shneerson - MSFT

So, assuming regexCollection below represents the collection of regular expressions, your code could look like this.

We basically iterate through the collection of regular expressions and for each one of them we look at every match.

If a match contains one or more groups, we create a property bag.

Once the property bag has been constructed we persist the smart tag with the property bag.

Notice the use of this.PersistTag - this is a wrapper around site.CommitSmartTag(smartTagType, index, length, propertyBag);

Code Snippet

// Uses Regex, Match and Group from System.Text.RegularExpressions
// and COMException from System.Runtime.InteropServices.
if (regexCollection != null && regexCollection.Count > 0)
{
    for (int idxRegEx = 0; idxRegEx < regexCollection.Count; idxRegEx++)
    {
        Regex regex = regexCollection[idxRegEx];
        Match match = regex.Match(theText);
        while (match.Success)
        {
            ISmartTagProperties propBag = null;
            if (match.Groups.Count > 1)
            {
                propBag = site.GetNewPropertyBag();
                // Group 0 is the whole match, so start at 1
                for (int idxReference = 1; idxReference < match.Groups.Count; idxReference++)
                {
                    string referenceName = regex.GetGroupNames()[idxReference];
                    Group group = match.Groups[idxReference];
                    string value = group.Value;

                    // make sure value is not empty - otherwise
                    // ISmartTagProperties.Write throws E_INVALIDARG
                    if (String.IsNullOrEmpty(value) == false)
                    {
                        propBag.Write(referenceName, value);
                    }
                }
            }

            try
            {
                this.PersistTag(match.Index + 1, match.Length, propBag);
            }
            catch (COMException)
            {
                // Oops, committing the smart tag failed
                // (for example, because the matched text was too long).
            }

            match = match.NextMatch();
        }
    }
}
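For experimenting with the named-group part outside of Word, the same idea can be sketched in plain C#. This is only a stand-in under stated assumptions: a `Dictionary<string, string>` takes the place of the smart tag property bag, and `CollectGroups` and the phone-number pattern are hypothetical names made up for the illustration, not part of VSTO.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class NamedGroupDemo
{
    // Collects every non-empty group of a match into a dictionary,
    // which stands in for the smart tag property bag here.
    public static Dictionary<string, string> CollectGroups(Regex regex, Match match)
    {
        Dictionary<string, string> properties = new Dictionary<string, string>();
        foreach (string name in regex.GetGroupNames())
        {
            if (name == "0")
            {
                continue; // group "0" is the whole match, not a named group
            }

            string value = match.Groups[name].Value;

            // Skip empty values, mirroring the check in the snippet above.
            if (!String.IsNullOrEmpty(value))
            {
                properties[name] = value;
            }
        }
        return properties;
    }

    static void Main()
    {
        // Hypothetical pattern: the "ext" group is optional.
        Regex regex = new Regex(@"(?<area>\d{3})-(?<number>\d{4})(?<ext>x\d+)?");
        Match match = regex.Match("Call 555-0123 today");

        foreach (KeyValuePair<string, string> property in CollectGroups(regex, match))
        {
            Console.WriteLine(property.Key + " = " + property.Value);
        }
    }
}
```

Here the `ext` group does not take part in the match, so its value is empty and it is left out of the result, just as an empty value is never written to the property bag.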






Re: Visual Studio Tools for Office property bag limitations

tr7

Thanks very much, Mischa.



Re: Visual Studio Tools for Office property bag limitations

tr7

Mischa, I promise this is the last question on this topic but no one else I've asked seems to know the answer...

in the statement

Code Snippet
regex.Match(theText);

I assume "theText" is the entire document contents (including footnotes) and I also assume that this is one of the three parameters to the Recognize method. I've studied the Recognize method signature and I can't figure out how this variable is able to represent the document contents. I've seen several examples of this on MSDN and other places on the Net and I've never seen an explicit assignment of any kind. Further, when I've tested it, the text string only contains a couple of paragraphs instead of the complete document. That is one of the reasons why I've not been able to use a derived SmartTag class before and have depended on the standard VSTO SmartTag class instead. What's going on under the hood that makes this variable special?

Thanks for your help in advance.





Re: Visual Studio Tools for Office property bag limitations

Misha Shneerson - MSFT

In fact, the text that is passed to the Recognize method in Word is just a single paragraph (in Excel's case it will be just the content of a cell).

It is important to understand the basic principles SmartTags are built upon. Word essentially uses SmartTags to tag text as the user types it. Once Word detects that the user has finished typing a particular paragraph, it activates the SmartTag recognizers on a background thread. Word assumes that a recognizer's function is just fetching "interesting" pieces of text and tagging those - it should not interact with the rest of Word's Object Model, i.e. the Document object, Range objects etc. Recognizers should be fast and efficient. Following this model, Word assumes that the context a SmartTag should be given is within the boundaries of a single paragraph.

Keep in mind that the sample code that I have posted is just copied and pasted from the implementation of VSTO's SmartTag base class (well, except for the try/catch statement, which we do not have). There is nothing special about the theText variable - the text that VSTO gets consists of only a single paragraph as well.
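The practical consequence can be sketched in plain C#, without Word. This is only an illustration under stated assumptions: the document is modeled as a string whose paragraphs are separated by a carriage return, and `CountPerParagraph` and the sample patterns are hypothetical names invented for the demo.

```csharp
using System;
using System.Text.RegularExpressions;

static class ParagraphScopeDemo
{
    // Counts matches when the pattern is applied to each paragraph separately,
    // imitating a recognizer that only ever sees one paragraph at a time.
    public static int CountPerParagraph(Regex regex, string document)
    {
        int count = 0;
        foreach (string paragraph in document.Split('\r'))
        {
            count += regex.Matches(paragraph).Count;
        }
        return count;
    }

    static void Main()
    {
        // Assumption for the demo: '\r' marks the end of each paragraph.
        string document = "Invoice 42 is due\ron 1 May.\rInvoice 7 is paid.";

        Regex withinParagraph = new Regex(@"Invoice (?<number>\d+)");
        Regex acrossParagraphs = new Regex(@"due\s+on");

        Console.WriteLine(CountPerParagraph(withinParagraph, document));  // 2
        Console.WriteLine(acrossParagraphs.IsMatch(document));            // True
        Console.WriteLine(CountPerParagraph(acrossParagraphs, document)); // 0
    }
}
```

The second pattern matches the full text, as it would in a regex editor, but never matches when each paragraph is examined on its own. A pattern that needs to span a paragraph boundary therefore cannot be recognized by a smart tag.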




Re: Visual Studio Tools for Office property bag limitations

tr7

Thanks for reminding me how MS Smart Tags are supposed to be used. I get so involved in my own uses of SmartTags that I completely forgot how MS originally intended them to be used.



Re: Visual Studio Tools for Office property bag limitations

Misha Shneerson - MSFT

:)