Chris Restall

Hi,

I am struggling with an efficient generic solution to handle the disassembly of large messages. What I'd like to do is process messages something like the following:

<root attr="someatt" ns="nspace">

<somenode/>

<order/>
<order/>

</root>

into:

<root attr="someatt" ns="nspace">

<somenode/>

<order/>
</root>

<root attr="someatt" ns="nspace">

<somenode/>

<order/>
</root>

There can be many variations on this: I could have many other "<someNodes>" elements or additional attributes on the root. We take a variety of XML message schemas.

I have tried the envelope approach and can't seem to get the header handling to work out right. After the pipeline, the message will be mapped to a canonical format before being published to the MessageBox.

My second approach has been a custom, per-instance disassembler component that I can pass an XPath into as a property. When the message gets to the component, in Disassemble() I load the stream into an XmlDocument, read the XPath value and use it to create a node list. In the above example, the XPath property might be "root/order". I use that to get a list of orders from the message. I then create a fragment of the original message by removing those nodes, which gives me this:

<root attr="someatt" ns="nspace">

<somenode/>

</root>

Now I can iterate the node list and add each node to a copy of that fragment, producing an individual message in a simple creation method. All property promotion is handled in this message creation method.

This seems to work well for small XML messages, but it of course has major issues with large XML. The culprit is the XmlDocument needed to apply the XPath. I know a streamed approach would be best, and I've looked at XPathMutatorStream and other streams, but I cannot find or understand a way to break up the message generically (property supplied) without some sort of DOM.

Is this a bad approach? Are there any good examples out there for breaking up large messages?

Any feedback is welcome.

Thanks



Re: BizTalk R2 General Disassembly of Large XML messages

Alan Smith MVP

Hi,

Streaming is definitely the way to go in pipeline components. I have done something similar, and I used the XmlReader and XmlWriter classes to read in the inbound stream, convert it, and send the outbound message.

Regards,

Alan





Re: BizTalk R2 General Disassembly of Large XML messages

Saravana_Kumar_

You can make use of the XPathReader class built by Microsoft folks (http://www.gotdotnet.com/workspaces/workspace.aspx?id=a7f2e079-8ca7-4c09-b840-10351156f2eb)

or have you seen this example?

http://bloggingabout.net/blogs/wellink/archive/2006/03/03/11207.aspx

snippet:

private XPathMutatorStream MatchVersion(MemoryStream OriginalStream)
{
    // Define the XPath collection
    XPathCollection VersionQueries = new XPathCollection();
    VersionQueries.Add(new XPathExpression(FormatXpathQuery("/Envelope/Header/SomeHeader/SomeIdentification/BerichtNaam/")));
    VersionQueries.Add(new XPathExpression(FormatXpathQuery("/Envelope/Header/SomeHeader/SomeIdentification/VersieMajor/")));
    VersionQueries.Add(new XPathExpression(FormatXpathQuery("/Envelope/Header/SomeHeader/SomeIdentification/VersieMinor/")));
    VersionQueries.Add(new XPathExpression(FormatXpathQuery("/Envelope/Header/SomeHeader/SomeIdentification/Buildnr/")));

    // Add the mutator to this stream
    ValueMutator VersionMatcher = new ValueMutator(this.VersionMatcher);
    XPathMutatorStream VersionStream = new XPathMutatorStream(OriginalStream, VersionQueries, VersionMatcher);
    VersionStream.AfterLastReadEvent += new AfterLastReadEventHandler(EndOfVersionStreamReached);
    return VersionStream;
}

Regards,
Saravana Kumar
http://www.biztalk247.com/v1/
http://www.digitaldeposit.net/blog





Re: BizTalk R2 General Disassembly of Large XML messages

mitre

Hi Alan,

I guess my big question with streams is: since they are forward-only, how would I be able to generically build the XML around each extracted node?

So in one instance I may have as a message:

<root>

<somenode/>
<order/>

<order/>

</root>

I can see using XmlTextReader methods such as ReadToFollowing and ReadToNextSibling to get to each order node. I don't understand how I would get the wrapping for each message with a reader:
<root>
<somenode/>
--inject the extracted order here
</root>


<root>
<somenode/>
--inject the extracted order here
</root>

I was hoping to use streaming, I just cannot find an example of a disassembling component that uses a stream in a scenario similar to what I'm doing. There are plenty out there where the DOM is used. Do you have any sample links or suggestions regarding the above?
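For the first half of that question (walking to each order node with a forward-only reader), a minimal sketch might look like the following. The class and method names are hypothetical, and the orders are collected as strings only to keep the example short; it does not yet address the wrapping.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of BizTalk or the SDK.
public static class OrderWalker
{
    // Returns the outer XML of every sibling element named elementName,
    // starting from the first one found anywhere in the document.
    public static List<string> ReadOrders(Stream input, string elementName)
    {
        var found = new List<string>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            if (!reader.ReadToFollowing(elementName))
                return found; // no such element in the document

            do
            {
                // ReadSubtree leaves the outer reader on this element's end tag
                // (or on the element itself when it is empty), so the following
                // ReadToNextSibling call cannot jump over an adjacent order.
                using (XmlReader sub = reader.ReadSubtree())
                {
                    sub.Read();
                    found.Add(sub.ReadOuterXml());
                }
            } while (reader.ReadToNextSibling(elementName));
        }
        return found;
    }
}
```

Calling ReadOuterXml directly on the outer reader would advance it past the element, and ReadToNextSibling could then skip an order that follows immediately; that is why the sketch goes through ReadSubtree.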

Thanks,

Chris





Re: BizTalk R2 General Disassembly of Large XML messages

mitre

Hi Saravana,

I've seen that article, and it works great for pulling values out of the stream via XPath or manipulating values in the stream. I don't understand how this would help me split my message up. I'm a little new to this, so maybe I'm missing the obvious.





Re: BizTalk R2 General Disassembly of Large XML messages

Mick Badran - MVP

Mitre - as Alan mentioned, the best place to do this is a custom pipeline component, as it allows the creation of multiple BizTalk messages easily.

(you could also create a multi-part BizTalk message)

The components you need:
1) the BTS SDK - look under there for some sample custom pipeline components.
2) VirtualStream.cs file (also found in the SDK) - as this is a brilliant stream class. If the msg is under a certain size - it's in memory (10KB default). Over that value - the class creates a temp file on disk and uses it.
3) Streaming XPathReader - doesn't work that well with promoted BTS properties, but works OK.

We want to stream, stream, stream here in pipelines - no XmlDocuments and NO stream.Close(), as (boom) any other component down the pipeline gets short-changed.

Your component is relatively straightforward:
(technically - it doesn't matter what 'stage' in the pipeline you target your component for. We can do all operations from all stages. There's no 'reduced' functionality in, e.g., the PartyResolution stage)

The approach you need to take is:
1) create a new class in a VS.NET project and implement the 'Execute' method of the IComponent interface, alongside IBaseComponent (look at SDK examples here to get the 'shell')

public IBaseMessage Execute(IPipelineContext pc, IBaseMessage pMsgIn)
{
....
}

- takes a message in, sends a message(s) out.

2) pass the incoming msg to the XMLDasm component so the initial schema can be applied and some promoted props can be added to the msg context. (you could do this by hand and it's not too hard, just easier using this component)

3) grab the resultant msg (now with a schema + properties) and start to parse + promote.

Here is a sample of how to manipulate message parts from a custom pipeline component - my blog
Along with the SDK examples - that should get you on the right track.








Re: BizTalk R2 General Disassembly of Large XML messages

mitre

Hi Mick,

I appreciate your advice and explanation. I've looked at both your example and the VirtualStream and am still rather confused. Above you mention using a custom pipeline component of any category, but it seems like I would specifically need a disassembler component, with all my message breakup logic in the Disassemble() and GetNext() interface implementations. This is where I am confused regarding the use of the stream.

I'd like this to be generic so I can use it on other partner pipelines. My messages will all be similar in that I'll have a variable number of nodes and a nodelist in the message.

<root>

<node attribute="val"/>

<nodesomethingelse/>

<order/>

<order/>

</root>

another implementation might look like this:

<root att="val">

<custOrder/>

<custOrder/>

<custOrder/>

</root>

In either case, I would like to provide configurable properties in the UI so that an admin could simply enter the path to the nodes they want to use for debatching. In scenario 1, they'd enter root/order as the property value, which would result in two messages with the exact same structure and values as the original document, each with only one order in it. Same approach for scenario 2: you'd get three messages.

Using XmlDocument and a node list in Disassemble(), I was easily able to generically get the "wrapping" for each message. Then, based on the user-entered XPath used to obtain the node list, I create a new message for each node with that wrapping in GetNext(). Very short and simple, but inefficient.

I tried to do something with an XmlTextReader, an XmlTextWriter and the VirtualStream in Disassemble() as suggested. My attempt up to this point has been to use the reader to loop through every node in the inbound message and check each node against the user-supplied path, writing to the VirtualStream via the writer only those nodes (with their attributes and sub-nodes) that are not the nodes of interest. In theory this should give me my message "wrapper", which is saved as a member. When a node of interest is read, it is placed in a member queue.

In the GetNext() method, I should be able to read through the queue and build each message from the node in the queue and the saved message wrapper.

The problem I'm running into is mainly with using the reader/writer on the VirtualStream to create my message "wrapper". It seems like I'm using a lot of code to inspect for the nodes of interest and to copy each node, its attributes, its value, any sub-nodes, etc. I already have an XPath to these nodes supplied by the user, and with an XmlDocument I can build the wrapper document simply by removing the node list in 2-3 lines. I've got very limited experience using XmlTextReader and XmlTextWriter, and it seems very cumbersome to build the wrapper this way.

Is this a bad approach? Is there a better way to read the stream, minus the nodes of interest, to construct the wrapper I need without having to basically rebuild it via Read() with node-by-node inspection, copying, etc.? This seems like something that should be very easy to do.
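One way the wrapper-plus-nodes idea described above could be done without a DOM is a two-pass read, sketched below under several assumptions: the stream is seekable (it is read twice), the configured path is a plain slash-separated list of local names such as "root/order" rather than real XPath, namespaces are ignored when matching, and each extracted node is re-inserted as the last child of its parent. All names (TwoPassDebatcher, Split, and so on) are hypothetical, and each result is built as a string for brevity where a real component would write to something like a VirtualStream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

public static class TwoPassDebatcher
{
    // Yields one document per element at splitPath (needs at least "parent/child").
    public static IEnumerable<string> Split(Stream input, string splitPath)
    {
        string wrapper = BuildWrapper(input, splitPath);
        string parentPath = splitPath.Substring(0, splitPath.LastIndexOf('/'));
        input.Position = 0; // rewind for the second pass
        using (XmlReader reader = XmlReader.Create(input))
        {
            var path = new List<string>();
            reader.Read();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && CurrentPath(reader, path) == splitPath)
                    yield return Wrap(wrapper, parentPath, reader); // consumes the element
                else
                    reader.Read();
            }
        }
    }

    // Pass 1: copy everything except the elements at splitPath.
    static string BuildWrapper(Stream input, string splitPath)
    {
        var sb = new StringBuilder();
        using (XmlReader reader = XmlReader.Create(input))
        using (XmlWriter writer = XmlWriter.Create(sb, new XmlWriterSettings { OmitXmlDeclaration = true }))
        {
            var path = new List<string>();
            reader.Read();
            while (!reader.EOF)
            {
                bool isTarget = reader.NodeType == XmlNodeType.Element && CurrentPath(reader, path) == splitPath;
                if (isTarget) { reader.Skip(); continue; } // Skip jumps past the whole subtree
                CopyShallow(reader, writer);
                reader.Read();
            }
        }
        return sb.ToString();
    }

    // Pass 2 helper: replay the small wrapper and inject one element before the parent's end tag.
    static string Wrap(string wrapper, string parentPath, XmlReader body)
    {
        var sb = new StringBuilder();
        using (XmlReader w = XmlReader.Create(new StringReader(wrapper)))
        using (XmlWriter writer = XmlWriter.Create(sb, new XmlWriterSettings { OmitXmlDeclaration = true }))
        {
            var path = new List<string>();
            bool injected = false;
            while (w.Read())
            {
                string p = CurrentPath(w, path);
                if (!injected && w.NodeType == XmlNodeType.EndElement && p == parentPath)
                {
                    writer.WriteNode(body, false); // streams the subtree and advances body
                    injected = true;
                }
                CopyShallow(w, writer);
            }
        }
        return sb.ToString();
    }

    // Tracks element names by depth and returns the slash-separated path of the current node.
    static string CurrentPath(XmlReader r, List<string> path)
    {
        if (r.NodeType == XmlNodeType.Element)
        {
            path.RemoveRange(r.Depth, path.Count - r.Depth);
            path.Add(r.LocalName);
        }
        int n = Math.Min(r.Depth + 1, path.Count);
        return string.Join("/", path.GetRange(0, n).ToArray());
    }

    // Copies the current node only, not its children.
    static void CopyShallow(XmlReader r, XmlWriter w)
    {
        if (r.NodeType == XmlNodeType.Element)
        {
            w.WriteStartElement(r.Prefix, r.LocalName, r.NamespaceURI);
            w.WriteAttributes(r, true);
            if (r.IsEmptyElement) w.WriteEndElement();
        }
        else if (r.NodeType == XmlNodeType.EndElement) w.WriteFullEndElement();
        else if (r.NodeType == XmlNodeType.Text) w.WriteString(r.Value);
        // Whitespace, comments and the XML declaration are dropped in this sketch.
    }
}
```

The node-by-node copying is confined to CopyShallow for the wrapper; XmlReader.Skip() and XmlWriter.WriteNode() handle whole subtrees, so the nodes of interest never need to be inspected individually. In a disassembler, the first pass would plausibly run in Disassemble() and each GetNext() call would pull the next document from the iterator, though that wiring is not shown here.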





Re: BizTalk R2 General Disassembly of Large XML messages

Leonid Ganeline

Please give us samples of the input message and the disassembled messages.
Maybe we can offer you more interesting solutions :) I don't understand why the envelope does not work.
And how big is this input message?