mogens

Sandcastle (Dec CTP) takes a long long time to build when given a largish assembly to document.
Our project isn't that large: the XML file is 19 MB and the DLL is 6 MB. CopyFromIndexComponent reports 134,494 elements, which generate 36,000 topics. Even so, Sandcastle takes some 10 hours to build using the SHFB tool.

We used to use DocumentX and HelpStudio from Innovasys and it would generate the documentation in a bit over 2 hours.

Is there any chance that the next CTP will be faster?




Re: Developer Documentation and Help System Performance on large assemblies

EWoodruff

Do you know which step is taking the most time? It would also help to know about your build environment (PC or virtual server, memory, etc.) and what type of help file you are producing (HTML Help 1 or 2). BuildAssembler and the actual help file compilation are the two most common slow steps. If it's BuildAssembler, I did add two new build components (CodeBlockComponent and PostTransformComponent) in the 1.3.3.1 release of the help file builder. You could delete their configuration sections from the sandcastle.config template to see if they are having any adverse effect on the build time. I also did some playing around with splitting the CPref data file into individual files, which did help BuildAssembler's memory usage. I can supply that information if removing the components doesn't appear to be the issue.

Eric





Re: Developer Documentation and Help System Performance on large assemblies

Anand Raman - MSFT

Christian,

There is no way a Sandcastle build should take 10 hours for such a small set of topics. Internally we build Framework 3.0 (200,000 topics) in 6 hours. The Sandcastle time is around 2 hours and the remaining time is spent compiling the file to HxS.

Could you try to build it without SHFB, using batch scripts, to see if this is an issue with SHFB? If you are building a CHM, we have noticed perf issues with ReflectionToChmContents.xsl. We are working on fixing this.

Anand...






Re: Developer Documentation and Help System Performance on large assemblies

mogens

I'm using both the SHFB build components in 1.3.3.1, and I'm generating HTML Help 2.
The parts that are taking a large chunk of time are ApplyVisibilityProperties and AddingMissingDocumentationTags.
Memory usage hovers around 400 meg.





Re: Developer Documentation and Help System Performance on large assemblies

EWoodruff

Those options have been there since about the October CTP, so I'd have thought they'd have been an issue earlier as well. However, in order to see if there is a problem, could you do a build and abort it when it hits the ApplyVisibilityProperties step? If you could then send me a copy of your project and the content of the .\Working folder, I'll take a look at it and see what's happening. You don't need to include the assemblies. My e-mail address is in the help file builder's About box and in the footer of the pages in its help file. Thanks.

Eric





Re: Developer Documentation and Help System Performance on large assemblies

EWoodruff

I've resolved most of the slow build issues. The last one remaining is adding missing documentation tags. Most of the issues were related to the size of the reflection information file and performing XPath queries on it. I've reworked the code and those steps now all run in under a minute. Adding missing documentation tags still takes a while (probably about an hour or so to add the auto-documented constructors). I'm going to take a look at that later. There isn't a quick fix for that one and I may end up moving it to a build component so that the queries have less to look at as they would then deal with each topic as it gets generated. I need to make one other fix to the PostTransformComponent as well so that it doesn't load the whole reflection information file too as it only needs the assembly information to do the version number lookups.
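As an illustration of why repeated queries against a large reflection file get expensive, here is a minimal Python sketch. It is not the code from the help file builder. It assumes a reflection file shaped like `<reflection><apis><api id="..."/></apis></reflection>`, and `index_apis` and the sample data are hypothetical. The idea is to walk the document once and keep a dictionary keyed by id, so each later lookup no longer rescans the whole tree.

```python
import xml.etree.ElementTree as ET


def index_apis(reflection_root):
    """Map each api id to its element in a single pass over the tree."""
    return {api.get("id"): api for api in reflection_root.iter("api")}


# Tiny stand-in for a reflection information file (assumed layout).
SAMPLE = """<reflection><apis>
<api id="T:Demo.Widget"><apidata name="Widget" group="type"/></api>
<api id="M:Demo.Widget.#ctor"><apidata name=".ctor" group="member"/></api>
</apis></reflection>"""

root = ET.fromstring(SAMPLE)
apis_by_id = index_apis(root)

# Each lookup is now a dictionary access instead of a full-document query.
widget = apis_by_id["T:Demo.Widget"]
```

With tens of thousands of api elements, building the index costs one traversal, while a separate query per topic costs one traversal each.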
I've released a special build which you can find in the Releases section of http://www.codeplex.com/SHFB. A test release labeled "1.3.4.0 Special" can be found under the Planned Releases option. Give that a try and it should run much faster. You can set the AutoDocumentConstructors option to false to skip that step and save more time as well.
Eric




Re: Developer Documentation and Help System Performance on large assemblies

Bucket

Anand, here is some more information.

I noticed slow performance of BuildAssembler when I first tried using the ApplyVSDocModel transform.

Notes:

  1. I do not use Eric's SHFB; I simply drive Sandcastle via batch files.
  2. The project is fairly small (2100 topics with ApplyVSDocModel or 1400 topics with AddOverloads).

After looking into this I found the following:

  1. Using ApplyVSDocModel, on my desktop system [2GB ram, 3.2GHz] BuildAssembler was very fast (1 min). On my laptop [0.5GB ram, 2.8GHz] it was very slow (14 mins). This was mainly due to the virtual memory use of around 450 to 500MB but fluctuating (sometimes up to 800MB) which caused excessive page faults on the laptop system.
  2. The problem was much worse when using ApplyVSDocModel than it was when using AddOverloads. In fact, using AddOverloads the laptop build time was not significantly longer than the desktop... during initialisation there are some virtual memory fluctuations up to 500MB, then the usage settled at around 190MB and few page faults were produced.
  3. The majority of the virtual memory fluctuations seem to be when building topics which contain links to sub-topics. This is especially when building the members, methods, properties and namespace topics where there are large numbers of items which point to sub-topics. Most of these topic pages disappear when using the AddOverloads transform.
  4. During the periods when excessive page faults are occurring, BuildAssembler is not reading or writing any data, simply processing. It should be able to utilise 100% of CPU time (on one CPU). However, it only manages about 5%; the other 95% is spent waiting for virtual memory to be loaded into real memory from the page file.

I hope this information provides some help.

David

P.S. Due to this, and other problems with using the ApplyVSDocModel transform, I am still using the AddOverloads transform.






Re: Developer Documentation and Help System Performance on large assemblies

EWoodruff

David,

It may be that the CPref data files are now one huge file, which causes BuildAssembler to consume a large amount of memory during initialization. If your build PC doesn't have enough memory, it can cause a lot of swapping. It also seems to hang on to several hundred meg after initialization, which probably doesn't help either. Try the following to see if it helps:

1. Change into the \Program Files\Sandcastle\Data\Cpref_reflection folder and rename cpref_reflection.xml to cpref_reflection.orig so that it won't get picked up.

2. Create a batch file called RefSplit.bat in the folder containing the following (watch out for line breaks in the posted text):

MRefBuilder /out:Accessibility.xml %1\Accessibility.dll
MRefBuilder /out:AspNetMMCExt.xml %1\AspNetMMCExt.dll
MRefBuilder /out:CustomMarshalers.xml %1\CustomMarshalers.dll
MRefBuilder /out:IEExecRemote.xml %1\IEExecRemote.dll
MRefBuilder /out:IEHost.xml %1\IEHost.dll
MRefBuilder /out:IIEHost.xml %1\IIEHost.dll
MRefBuilder /out:ISymWrapper.xml %1\ISymWrapper.dll
MRefBuilder /out:Microsoft.Build.Conversion.xml %1\Microsoft.Build.Conversion.dll
MRefBuilder /out:Microsoft.Build.Engine.xml %1\Microsoft.Build.Engine.dll
MRefBuilder /out:Microsoft.Build.Framework.xml %1\Microsoft.Build.Framework.dll
MRefBuilder /out:Microsoft.Build.Tasks.xml %1\Microsoft.Build.Tasks.dll
MRefBuilder /out:Microsoft.Build.Utilities.xml %1\Microsoft.Build.Utilities.dll
MRefBuilder /out:Microsoft.Build.VisualJSharp.xml %1\Microsoft.Build.VisualJSharp.dll
MRefBuilder /out:Microsoft.CompactFramework.Build.Tasks.xml %1\Microsoft.CompactFramework.Build.Tasks.dll
MRefBuilder /out:Microsoft.JScript.xml %1\Microsoft.JScript.dll
MRefBuilder /out:Microsoft.VisualBasic.xml %1\Microsoft.VisualBasic.dll
MRefBuilder /out:Microsoft.VisualBasic.Compatibility.xml %1\Microsoft.VisualBasic.Compatibility.dll
MRefBuilder /out:Microsoft.VisualBasic.Compatibility.Data.xml %1\Microsoft.VisualBasic.Compatibility.Data.dll
MRefBuilder /out:Microsoft.VisualBasic.Vsa.xml %1\Microsoft.VisualBasic.Vsa.dll
MRefBuilder /out:Microsoft.VisualC.xml %1\Microsoft.VisualC.dll
MRefBuilder /out:Microsoft.Vsa.xml %1\Microsoft.Vsa.dll
MRefBuilder /out:Microsoft.Vsa.Vb.CodeDOMProcessor.xml %1\Microsoft.Vsa.Vb.CodeDOMProcessor.dll
MRefBuilder /out:Microsoft_VsaVb.xml %1\Microsoft_VsaVb.dll
MRefBuilder /out:System.xml %1\System.dll
MRefBuilder /out:System.Configuration.xml %1\System.Configuration.dll
MRefBuilder /out:System.Configuration.Install.xml %1\System.Configuration.Install.dll
MRefBuilder /out:System.Data.xml %1\System.Data.dll
MRefBuilder /out:System.Data.OracleClient.xml %1\System.Data.OracleClient.dll
MRefBuilder /out:System.Data.SqlXml.xml %1\System.Data.SqlXml.dll
MRefBuilder /out:System.Deployment.xml %1\System.Deployment.dll
MRefBuilder /out:System.Design.xml %1\System.Design.dll
MRefBuilder /out:System.DirectoryServices.xml %1\System.DirectoryServices.dll
MRefBuilder /out:System.DirectoryServices.Protocols.xml %1\System.DirectoryServices.Protocols.dll
MRefBuilder /out:System.Drawing.xml %1\System.Drawing.dll
MRefBuilder /out:System.Drawing.Design.xml %1\System.Drawing.Design.dll
MRefBuilder /out:System.EnterpriseServices.xml %1\System.EnterpriseServices.dll
MRefBuilder /out:System.Management.xml %1\System.Management.dll
MRefBuilder /out:System.Messaging.xml %1\System.Messaging.dll
MRefBuilder /out:System.Runtime.Remoting.xml %1\System.Runtime.Remoting.dll
MRefBuilder /out:System.Runtime.Serialization.Formatters.Soap.xml %1\System.Runtime.Serialization.Formatters.Soap.dll
MRefBuilder /out:System.Security.xml %1\System.Security.dll
MRefBuilder /out:System.ServiceProcess.xml %1\System.ServiceProcess.dll
MRefBuilder /out:System.Transactions.xml %1\System.Transactions.dll
MRefBuilder /out:System.Web.xml %1\System.Web.dll
MRefBuilder /out:System.Web.Mobile.xml %1\System.Web.Mobile.dll
MRefBuilder /out:System.Web.RegularExpressions.xml %1\System.Web.RegularExpressions.dll
MRefBuilder /out:System.Web.Services.xml %1\System.Web.Services.dll
MRefBuilder /out:System.Windows.Forms.xml %1\System.Windows.Forms.dll
MRefBuilder /out:System.Xml.xml %1\System.Xml.dll
MRefBuilder /out:VJSSupUILib.xml %1\VJSSupUILib.dll
MRefBuilder /out:VJSharpCodeProvider.xml %1\VJSharpCodeProvider.dll
MRefBuilder /out:VjsWfcBrowserStubLib.xml %1\VjsWfcBrowserStubLib.dll
MRefBuilder /out:cscompmgd.xml %1\cscompmgd.dll
MRefBuilder /out:mscorlib.xml %1\mscorlib.dll
MRefBuilder /out:sysglobl.xml %1\sysglobl.dll
MRefBuilder /out:vjscor.xml %1\vjscor.dll
MRefBuilder /out:vjsjbc.xml %1\vjsjbc.dll
MRefBuilder /out:vjslib.xml %1\vjslib.dll
MRefBuilder /out:vjslibcw.xml %1\vjslibcw.dll
MRefBuilder /out:vjsvwaux.xml %1\vjsvwaux.dll
MRefBuilder /out:vjswfc.xml %1\vjswfc.dll
MRefBuilder /out:vjswfccw.xml %1\vjswfccw.dll
MRefBuilder /out:vjswfchtml.xml %1\vjswfchtml.dll

3. Run it and pass it the path to the .NET 2.0 installation folder on your system. For example:

RefSplit C:\Windows\Microsoft.NET\Framework\v2.0.50727

That will recreate the same information but with one file per assembly. Re-run the build and see if it helps. I did some testing with it and memory usage went from 600MB+ at its peak down to 200MB+ during startup and then averaged around 60MB+ during the remainder of the run. The files used to be separate until the November CTP when they became one file. They are one file in the December CTP too but the size has just about doubled.
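The same per-assembly command lines can be generated rather than typed out. A minimal Python sketch, not part of the original batch file: `refsplit_commands` is a hypothetical helper, and it assumes MRefBuilder is called with only the /out: switch used in RefSplit.bat above. It only builds the command lines; it does not run them.

```python
from pathlib import PureWindowsPath


def refsplit_commands(framework_dir, assembly_names):
    """Return one MRefBuilder command line per assembly name."""
    base = PureWindowsPath(framework_dir)
    commands = []
    for name in assembly_names:
        dll = base / f"{name}.dll"
        # One output file per assembly, named after the assembly.
        commands.append(f"MRefBuilder /out:{name}.xml {dll}")
    return commands


if __name__ == "__main__":
    framework = r"C:\Windows\Microsoft.NET\Framework\v2.0.50727"
    for line in refsplit_commands(framework, ["mscorlib", "System", "System.Xml"]):
        print(line)
```

Passing only the assemblies a project actually references keeps the set of reflection files, and so the memory BuildAssembler needs at startup, as small as possible.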

Eric





Re: Developer Documentation and Help System Performance on large assemblies

Anand Raman - MSFT

Thanks David and Eric. This is great information. We will do additional investigation on ApplyVSDocModel and Cpref_reflection.xml and will make sure to improve the perf.

Also, ReflectionToChmContents.xsl is slow because it uses the XSLT document() function to get each topic title, as shown below. Because of the caching behavior of document(), all of the HTML files are loaded into memory and not released.

<xsl:value-of select="document(concat($html,'/', file/@name, '.htm'),.)/html/head/title"/>

<xsl:value-of select="document(concat($html,'/', key('index',$overloadId)/file/@name, '.htm'),.)/html/head/title"/>

We may save the topic title into a separate XML file to improve the perf in our next release. Cheers.
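One way such a title file could be produced is sketched below in Python. This is an assumption about the approach, not the Sandcastle implementation: the `<titles>`/`<topic file="...">` layout and the function names are hypothetical. Each HTML file is read once, its title is extracted, and the page is dropped from memory before the next one is read.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from pathlib import Path


class _TitleParser(HTMLParser):
    """Collects the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def read_title(html_text):
    """Return the stripped <title> text of one HTML page."""
    parser = _TitleParser()
    parser.feed(html_text)
    return parser.title.strip()


def build_title_index(html_dir):
    """Build a <titles> element with one <topic file="..."> per .htm file."""
    root = ET.Element("titles")
    for page in sorted(Path(html_dir).glob("*.htm")):
        entry = ET.SubElement(root, "topic", file=page.stem)
        entry.text = read_title(page.read_text(encoding="utf-8"))
    return root


def write_title_index(html_dir, out_path):
    """Write the title index so a later step can look titles up in one file."""
    tree = ET.ElementTree(build_title_index(html_dir))
    tree.write(out_path, encoding="utf-8", xml_declaration=True)
```

A transform could then load this single small file instead of opening every topic page through document().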

Anand..







Re: Developer Documentation and Help System Performance on large assemblies

Bucket

Eric, Anand,

The assembly I am documenting only relies on 6 assemblies: mscorlib, system, system.data, system.drawing, system.windows.forms and system.xml.

I made a single cpref_reflection.xml file for these six assemblies and this solved the problem.

I will also try later Eric's suggestion of splitting the .NET reflection information into separate files.

Will let you know if this works when done.

David






Re: Developer Documentation and Help System Performance on large assemblies

Bucket

Eric, Anand,

I have now also tried Eric's suggestion and split the entire .NET assembly reflection into 63 separate files according to the batch file in the post above.

This also worked fine; it reduced the memory usage and thus the page faults and the build time is 'normal'.

So it would seem to be a good idea to make this same split of the cpref_reflection.xml file in the next release of Sandcastle.

Many thanks for your help.

David






Re: Developer Documentation and Help System Performance on large assemblies

Bucket

I have made a more generic version of Eric's batch file which will work on any version of the .NET framework.

This includes checks, messages and warnings.

It is posted on a new thread:
http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1146565&SiteID=1&mode=1

David






Re: Developer Documentation and Help System Performance on large assemblies

Zef

Eric, thank you very much for posting this script!

The last couple of months, it took our build server about 2.5h to compile the documentation project, largely due to the fact that the machine only has 512MB of memory. With the above fix, the compilation time is reduced to under 15 minutes.

Kudos!






Re: Developer Documentation and Help System Performance on large assemblies

Anand Raman - MSFT

Thanks all for this great thread. We have made improvements based on your feedback in February CTP. Please see my blog http://blogs.msdn.com/sandcastle/archive/2007/02/28/sandcastle-performance-improvements-in-february-ctp.aspx.

Cheers.

Anand..