Tania-chan


Hi

I've written this code for retrieving some web pages using the WebRequest and WebResponse classes:

private void webDownload()
{
    foreach (WebPage w in webList)
    {
        // Initialize the WebRequest.
        WebRequest myRequest = WebRequest.Create(w.Url);
        try
        {
            // Get the response; the using block closes it even if decoding throws.
            using (WebResponse myResponse = myRequest.GetResponse())
            {
                // The code for the decodeData method is taken from [1].
                String htmlContent = decodeData(myResponse);

                // Here I'll do some tiny operations with the html...
                // ....
            }
        }
        catch (WebException e)
        {
            // Check the HTTP status code rather than searching the exception message.
            HttpWebResponse errorResponse = e.Response as HttpWebResponse;
            if (errorResponse != null && errorResponse.StatusCode == HttpStatusCode.NotFound)
            {
                w.HasExpired = true;
            }
        }
    }
}

The web pages I have to download are listed here [2] (yes, I have to download ALL of them), and it takes a lot of time. When trying the code above, I saw that the "myRequest.GetResponse()" line takes a good deal of time to execute (5 to 10 seconds). Can anybody tell me why this is happening? Is there any way to make it run faster?

If you know of any other way to download those pages faster, please tell me.

Any help will be appreciated :)

Greetings,

Tania

[1] http://blogs.msdn.com/feroze_daud/archive/2004/03/30/104440.aspx
[2] http://johnfry.org/files/jpen_urls.txt


Re: .NET Framework Networking and Communication GetResponse method from System.Net.WebRequest

timvw

You could use BeginGetResponse instead of GetResponse, and handle the decoding of the data in a callback method.
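
Roughly, the idea could look like the sketch below. It is only a minimal illustration, not a definitive implementation: the names AsyncDownloadSketch, BeginDownload and onDone are made up for this example, Action<string, WebException> assumes .NET 3.5 or later, and StreamReader's default UTF-8 decoding merely stands in for the decodeData method from [1].

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper, for illustration only.
static class AsyncDownloadSketch
{
    // Starts the request and returns immediately. onDone is invoked later,
    // from the request's callback, with either the page text or the WebException.
    public static void BeginDownload(Uri uri, Action<string, WebException> onDone)
    {
        WebRequest request = WebRequest.Create(uri);
        request.BeginGetResponse(delegate(IAsyncResult asyncResult)
        {
            string content = null;
            WebException error = null;
            try
            {
                // EndGetResponse completes the call started by BeginGetResponse.
                using (WebResponse response = request.EndGetResponse(asyncResult))
                using (StreamReader reader = new StreamReader(response.GetResponseStream()))
                {
                    // Assumption: UTF-8 text; real pages may need proper charset detection.
                    content = reader.ReadToEnd();
                }
            }
            catch (WebException e)
            {
                error = e;
            }
            onDone(content, error);
        }, null);
    }
}
```

The calling thread is free to start the next request while this one is still in flight; the caller is responsible for waiting until onDone has run before using the result.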





Re: .NET Framework Networking and Communication GetResponse method from System.Net.WebRequest

Tania-chan

Hello

I've managed to write some code that checks asynchronously whether some URLs have expired or not (let's forget about the encoding for the moment):




Re: .NET Framework Networking and Communication GetResponse method from System.Net.WebRequest

timvw

I don't see how you determine when a request has completed or not. So I presume your code simply continues to run (it does not wait for the results), and therefore you get to see an empty list. Here is a bit of code that could inspire you to implement what you're trying to do:

using System;
using System.Collections.Generic;
using System.Net;
using System.Threading;

class Program
{
    static void Main(string[] args)
    {
        List<Uri> uris = new List<Uri>();
        uris.Add(new Uri("http://www.timvw.be"));
        uris.Add(new Uri("http://example.com/does_not_exist"));

        HttpWebResponse[] httpWebResponses = new HttpWebResponse[uris.Count];

        // One event per request, set at the end of the callback, so that WaitAll
        // returns only after every result has been stored.
        // Note: WaitAll accepts at most 64 handles.
        WaitHandle[] waitHandles = new WaitHandle[uris.Count];
        for (int i = 0; i < uris.Count; ++i)
        {
            HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(uris[i]);
            httpWebRequest.Method = "HEAD";
            httpWebRequest.AllowAutoRedirect = true;

            ManualResetEvent done = new ManualResetEvent(false);
            waitHandles[i] = done;
            httpWebRequest.BeginGetResponse(new AsyncCallback(MyAsyncCallback), new object[] { httpWebRequest, httpWebResponses, i, done });
        }

        Console.WriteLine("Waiting for all requests to complete...");
        WaitHandle.WaitAll(waitHandles);

        for (int i = 0; i < uris.Count; ++i)
        {
            // The response is null when the server could not be reached at all.
            HttpWebResponse response = httpWebResponses[i];
            Console.WriteLine("Result for {0}: {1}", uris[i].AbsoluteUri, response != null ? response.StatusCode.ToString() : "no response");
            if (response != null) response.Close();
        }

        Console.Write("{0}Press any key to continue...", Environment.NewLine);
        Console.ReadKey();
    }

    static void MyAsyncCallback(IAsyncResult asyncResult)
    {
        object[] objects = (object[])asyncResult.AsyncState;
        HttpWebRequest httpWebRequest = (HttpWebRequest)objects[0];
        HttpWebResponse[] httpWebResponses = (HttpWebResponse[])objects[1];
        int index = (int)objects[2];
        ManualResetEvent done = (ManualResetEvent)objects[3];

        try
        {
            httpWebResponses[index] = (HttpWebResponse)httpWebRequest.EndGetResponse(asyncResult);
        }
        catch (WebException webException)
        {
            // For HTTP errors such as 404 the exception still carries the response.
            httpWebResponses[index] = webException.Response as HttpWebResponse;
        }
        finally
        {
            // Signal only after the result has been stored.
            done.Set();
        }
    }
}





Re: .NET Framework Networking and Communication GetResponse method from System.Net.WebRequest

RizwanSharp

GetResponse takes time, and that is to be expected, because it is very similar to opening a web page manually in the browser: your request goes to the web server, the server locates the requested page and returns it to the requester. So I think this all depends on the Internet connection speed and the web server's response time; the WebRequest/WebResponse classes themselves don't create the delay.

So the answer to your question is that you can't get a faster class, because the delay is not caused by the class; it depends on the speed of the Internet connection and on how quickly the web server can respond.

In your second post in this thread you said that you are getting an empty list. That is because you are not waiting for the pages to be downloaded and are trying to access the results before they have arrived.

Next, using the asynchronous methods does not guarantee that a single request completes faster than a synchronous one, but they are callback based, so your thread is not blocked while it waits for the response.
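
To illustrate that last point: each request takes as long as it takes, but several of them can be in flight at once, so the total time for a long list may drop. The sketch below is only one possible way to do that, under stated assumptions: ThrottledFetchSketch, FetchAll and maxInFlight are names invented for this example, and a Semaphore is used to cap the number of simultaneous requests.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Threading;

// Hypothetical helper, for illustration only.
static class ThrottledFetchSketch
{
    // Returns, for each URI, true if a response was obtained and false if the request failed.
    public static bool[] FetchAll(IList<Uri> uris, int maxInFlight)
    {
        bool[] succeeded = new bool[uris.Count];
        if (uris.Count == 0) return succeeded;

        Semaphore slots = new Semaphore(maxInFlight, maxInFlight);
        ManualResetEvent allDone = new ManualResetEvent(false);
        int remaining = uris.Count;

        for (int i = 0; i < uris.Count; ++i)
        {
            int index = i;      // per-iteration copy captured by the callback
            slots.WaitOne();    // block until fewer than maxInFlight requests are pending
            WebRequest request = WebRequest.Create(uris[index]);
            request.BeginGetResponse(delegate(IAsyncResult asyncResult)
            {
                try
                {
                    using (WebResponse response = request.EndGetResponse(asyncResult))
                    {
                        succeeded[index] = true;
                    }
                }
                catch (WebException)
                {
                    succeeded[index] = false;
                }
                finally
                {
                    slots.Release();
                    // The last callback to finish wakes up the waiting caller.
                    if (Interlocked.Decrement(ref remaining) == 0) allDone.Set();
                }
            }, null);
        }

        allDone.WaitOne();
        return succeeded;
    }
}
```

A call such as ThrottledFetchSketch.FetchAll(uris, 10) would keep up to ten requests pending at a time. HTTP requests to the same host are additionally limited by ServicePointManager.DefaultConnectionLimit, so whether this helps in practice depends on how the URLs are spread across servers.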

I hope you understand.

Best Regards,

Rizwan aka RizwanSharp