rtaiss

Hi, I have to maintain this code. Can you please help me by telling me why the regex does not find my sentence in the document?

Regex objAlphaNumericPattern = new Regex(">[&nbsp;]*[\\s]*[a-zA-Z0-9\x0600-\x06ff\x00DF\\r\\n,' .\\-\\(\\):\"'&+!/ aaaaaaaceeeeiiiienooooo°¬ouuuuytyaeouAeACaeEuIIII][a-zA-Z0-9\x0600-\x06ff\x00DF,' .\\-\\(\\):&\"'+!/ \\saaaaaaaceeeeiiiienooooo°¬ouuuuytyaeouAeACaeEuIIII]+[&nbsp;]*<");

Here is the target sentence:

MatchCollection result = objAlphaNumericPattern.Matches("<span class=ms-formvalidation>*</span>\r\n&nbsp;indicates a required field</span>");

It's the \r\n that's causing the problem, because if I remove it, my regex matches the sentence, like this:
MatchCollection result = objAlphaNumericPattern.Matches("<span class=ms-formvalidation>*</span>&nbsp;indicates a required field</span>");

Thank you






Re: Regular Expressions i are not taken into consideration in my regex

OmegaMan

I see these problems with the maintainability of this regex:
  1. The pattern is fighting the language's string requirement of having to escape the escape character, as in "\\r\\n". It would be easier to understand with a verbatim string literal, like this:
    string abc = @"\r\n";
  2. Specify characters like quotes and single quotes by their hexadecimal values, "\x27\x22", which stands for the ' and \" in your example. The same can be done for the others; open the Windows Character Map program to look up other letters' hex values.
  3. This regex seems overly complex, which may actually be leading to its fragility. What is the actual goal of the process? I believe there is a better regex for what is needed.
  4. The regex you have is a list of allowed characters, expressed as ranges. Instead of [a-zA-Z0-9], why can't that be replaced by [\w]?
  5. Or, instead of providing the list of acceptable characters, wouldn't it be easier to specify what is not allowed, such as [^\s]*, which says match everything until a whitespace character is hit? See Character Class Definitions for help in understanding points 4 and 5. Also check out the FAQ "Regex Resources Reference" at the top of this regex forum, which will help in other areas.
I propose you rework the regex with these suggestions to make it more maintainable and less fragile. If the problem still manifests itself, let us know!
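To make suggestions 1, 2 and 4 concrete, here is a minimal sketch. The class and constant names are made up for illustration and are not part of the original code. One thing to watch when moving to a verbatim string: the regex engine then parses the escapes itself, and its \x escape takes exactly two hex digits, so a range such as \x0600-\x06ff would need to be written with \u instead. Also note that \w is broader than [a-zA-Z0-9], since it accepts the underscore and non-ASCII letters too.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical names, for illustration only.
public static class EscapeSketch
{
    // Regular string literal: every regex backslash has to be doubled.
    public const string Escaped = "[\\r\\n]+";

    // Verbatim string literal: the pattern reads exactly as it is typed.
    public const string Verbatim = @"[\r\n]+";

    // Quotes written as regex hex escapes: \x27 is ' and \x22 is ".
    public const string Quotes = @"[\x27\x22]";

    // Arabic block (U+0600 to U+06FF) written with four-digit \u escapes,
    // because the regex engine's \x only reads two hex digits.
    public const string Arabic = @"[\u0600-\u06FF]";

    public static void Main()
    {
        // Both literals produce the same pattern text.
        Console.WriteLine(Escaped == Verbatim);

        // The apostrophe in "it's" is found by the hex-escaped class.
        Console.WriteLine(Regex.IsMatch("it's", Quotes));

        // U+0633 (an Arabic letter) falls inside the \u range.
        Console.WriteLine(Regex.IsMatch("\u0633", Arabic));

        // \w accepts '_' and U+00E9, which [a-zA-Z0-9] rejects.
        Console.WriteLine(Regex.IsMatch("_", @"^\w$"));
        Console.WriteLine(Regex.IsMatch("\u00E9", @"^\w$"));
        Console.WriteLine(Regex.IsMatch("\u00E9", @"^[a-zA-Z0-9]$"));
    }
}
```

Whether the wider reach of \w is acceptable depends on what the pattern is supposed to let through, so treat the replacement as a judgment call rather than a drop-in.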







Re: Regular Expressions i are not taken into consideration in my regex

rtaiss

Thanks a lot for your help, OmegaMan. For some reason, rearranging the ">[&nbsp;]* and [\\s]* parts made my regex work. But I think you're right. I will use all your suggestions in the future. Thank you very much.
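A likely reason the reordering helps: in the target text the \r\n comes before the &nbsp;, while the original pattern only allows whitespace after it. The sketch below shows this with a trimmed-down stand-in for the long character classes. The Body constant and all names are hypothetical, not the original pattern. Note also that [&nbsp;] is a character class (any one of &, n, b, s, p, ;), not the literal entity, so the last variant uses an alternation instead.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical names, for illustration only.
public static class OrderSketch
{
    // Simplified stand-in for the original long character classes.
    const string Body = @"[a-zA-Z0-9 ]+";

    // Original order: entity characters first, then whitespace.
    public const string NbspThenSpace = @">[&nbsp;]*[\s]*" + Body + "<";

    // Swapped order: whitespace first, then entity characters.
    public const string SpaceThenNbsp = @">[\s]*[&nbsp;]*" + Body + "<";

    // Order-independent: any mix of whitespace and literal "&nbsp;".
    public const string AnyOrder = @">(?:\s|&nbsp;)*" + Body + "<";

    public const string Input =
        "<span class=ms-formvalidation>*</span>\r\n&nbsp;indicates a required field</span>";

    public static void Main()
    {
        // Fails: after "\r\n" is consumed, '&' is not allowed by Body.
        Console.WriteLine(Regex.IsMatch(Input, NbspThenSpace));

        // Succeeds: "\r\n" is consumed first, then "&nbsp;".
        Console.WriteLine(Regex.IsMatch(Input, SpaceThenNbsp));

        // Succeeds regardless of which comes first.
        Console.WriteLine(Regex.IsMatch(Input, AnyOrder));
    }
}
```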



Re: Regular Expressions i are not taken into consideration in my regex

OmegaMan

Regexes are fragile, and the simpler a regex is, the better the chance it won't match something unintended. Also, the leaner a regex is, the faster the processor can work on the text. I hope you do try to spend some time reworking the pattern; I feel it could be made a lot simpler. Please repost to the forum with any new questions! Good luck.
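As one possible direction, here is a rough sketch of a leaner pattern. It assumes the goal is simply to pull out the visible text between one tag's '>' and the next '<', which the thread never confirms, so the pattern, the group name and the helper are all hypothetical. It uses a negated class in place of the long list of allowed characters, which means it is more permissive than the original: it also returns the "*" from the first span, and a whitespace-only run between tags would still be captured.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical names, for illustration only.
public static class SimplerSketch
{
    // '>' then optional whitespace or "&nbsp;", then the shortest run of
    // characters that are not '<' or '>', then optional padding and '<'.
    public static readonly Regex TextBetweenTags =
        new Regex(@">(?:\s|&nbsp;)*(?<text>[^<>]+?)(?:\s|&nbsp;)*<");

    // Collects the captured text of every match, in document order.
    public static List<string> Extract(string html)
    {
        var results = new List<string>();
        foreach (Match m in TextBetweenTags.Matches(html))
        {
            results.Add(m.Groups["text"].Value);
        }
        return results;
    }

    public static void Main()
    {
        string input =
            "<span class=ms-formvalidation>*</span>\r\n&nbsp;indicates a required field</span>";

        foreach (string text in Extract(input))
        {
            Console.WriteLine("[" + text + "]");
        }
    }
}
```

If only certain characters should be accepted, the [^<>] class can be narrowed again, but starting from the permissive form keeps the structure of the pattern easy to see.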