Regex For Alphanumeric Plus Punctuation

November 6, 2008

My knowledge of Regular Expressions is marginally below that of a lobotomised axolotl, but I know that Regexes are powerful and good, sort of like King Arthur of the Britons prior to the Lancelot betrayal or a Bex, though perhaps not quite as addictive. So today I made one.

Project Numbers in our Universe here at anonymous megacorp international are no more than 12 characters long and are alphanumeric, but may also contain hyphen, full stop and colon. It took me a few Googles to hit on a good introductory Regex web page that helped me solve the problem, so while this issue didn’t quite match my full criteria for a Journey In Pain, it did come close. So here it is:

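using System.Text.RegularExpressions;

// 1 to 12 characters, each a letter, digit, hyphen, colon or full stop.
Regex projectNumberRegex = new Regex("^[-a-zA-Z0-9:.]{1,12}$");
if (!projectNumberRegex.IsMatch(dto.ProjectNumber))
{
    // ...run away screaming
}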
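If you want to poke at it outside our codebase, here is a minimal, self-contained sketch. The names ProjectNumberValidator, IsValid and ProjectNumberDemo are my own inventions for illustration (the real code just checks dto.ProjectNumber inline), and the sample inputs are made up.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: the class and method names are assumptions for this sketch.
public static class ProjectNumberValidator
{
    // 1 to 12 characters, each a letter, digit, hyphen, colon or full stop.
    private static readonly Regex ProjectNumberRegex =
        new Regex("^[-a-zA-Z0-9:.]{1,12}$");

    public static bool IsValid(string projectNumber)
    {
        // Regex.IsMatch throws on null input, so treat null as invalid up front.
        return projectNumber != null && ProjectNumberRegex.IsMatch(projectNumber);
    }
}

public static class ProjectNumberDemo
{
    public static void Main()
    {
        // Made-up sample inputs: one legal, one with an underscore, one with a space.
        string[] samples = { "AB-12:3.4", "PRJ_001", "A B" };
        foreach (string sample in samples)
        {
            Console.WriteLine("{0} -> {1}", sample, ProjectNumberValidator.IsValid(sample));
        }
    }
}
```

Only the first sample should pass; underscore and space are not on the guest list.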

Here’s Why It Works

a-zA-Z0-9 are the alphanumerics
:. are colon and full-stop (not the Braille symbol for ‘No Smoking’). Inside [] the full-stop is just a full-stop, not the usual ‘any character’ wildcard.

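- (hyphen) is placed at the start of the character list because if I place it between two other characters, say ‘:’ and ‘.’, the Regex Parser will interpret it as a range running from the character on its left to the character on its right. That can cause the pattern to fail outright if the left-hand character has a higher ASCII code than the right-hand one (a reverse-order range), or sneakily introduce a FALSE SUCCESSFUL range condition where none is intended, accepting every character between the two endpoints and not necessarily the hyphen itself.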

Hyphens can in fact be escaped with a backslash inside the brackets (\-), but the simpler habit is to put the hyphen either first or last in the [] list, where it cannot be interpreted as a range. I put it at the front of the allowable characters based on the advice on this page.
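To see the hyphen business in action, here is a small sketch under my own assumptions: HyphenPlacementDemo, AllInClass and IsRejected are hypothetical names, and the cut-down character classes are for demonstration only, not the production pattern. As far as I can tell, .NET rejects a backwards range with an ArgumentException, which is what IsRejected relies on.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; names and sample classes are assumptions for illustration.
public static class HyphenPlacementDemo
{
    // True when every character of the input belongs to the given character class.
    public static bool AllInClass(string characterClass, string input)
    {
        return Regex.IsMatch(input, "^" + characterClass + "+$");
    }

    // True when .NET refuses to build the pattern at all.
    public static bool IsRejected(string pattern)
    {
        try
        {
            new Regex(pattern);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    public static void Main()
    {
        // Hyphen first, last, or escaped: all three treat it as a literal hyphen.
        Console.WriteLine(AllInClass("[-a-z:.]", "a-b"));
        Console.WriteLine(AllInClass("[a-z:.-]", "a-b"));
        Console.WriteLine(AllInClass(@"[a-z\-:.]", "a-b"));

        // ".-:" is a legal range from '.' to ':', so '/' sneaks in and '-' is left out.
        Console.WriteLine(AllInClass("[a-z.-:]", "a/b"));
        Console.WriteLine(AllInClass("[a-z.-:]", "a-b"));

        // ":-." runs backwards (':' has a higher code than '.'), so the pattern is rejected.
        Console.WriteLine(IsRejected("[a-z:-.]"));
    }
}
```

The middle pair is the sneaky one: the pattern compiles happily and then accepts a slash while turning away the very hyphen you meant to allow.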

[] is a character class (not a group; groups use parentheses). It means “any one of”.

^ (caret) is the ‘start of line’ character. I’m telling the Regex parser that the match must begin at the very start of the input string, so the first character must be whatever follows the caret, which is anything in the [] list.

$ means “end of input”. This means that ONLY the sequence prior to the $ is legal. If there are additional characters in the input then the Regex fails. This is what I want. I don’t want ProjectNumber to contain anything else except the stuff in the [] list. (One caveat: in .NET, $ will also match just before a single trailing newline; \z is the stricter end-of-string anchor.)
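A quick sketch of what the two anchors buy you, with the usual caveats: AnchorDemo and its three methods are hypothetical names of mine, and the \z variant is a stricter alternative I am adding for comparison, not what the code above uses.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the method names are assumptions for illustration.
public static class AnchorDemo
{
    // No anchors: succeeds if ANY run of legal characters appears somewhere in the input.
    public static bool Unanchored(string input)
    {
        return Regex.IsMatch(input, "[-a-zA-Z0-9:.]{1,12}");
    }

    // The pattern from this post: the whole input must be legal characters.
    public static bool Anchored(string input)
    {
        return Regex.IsMatch(input, "^[-a-zA-Z0-9:.]{1,12}$");
    }

    // Stricter variant (an assumption, not the original): \z refuses a trailing newline too.
    public static bool StrictEnd(string input)
    {
        return Regex.IsMatch(input, @"^[-a-zA-Z0-9:.]{1,12}\z");
    }

    public static void Main()
    {
        // Without anchors the "AB" at the front is enough to succeed.
        Console.WriteLine(Unanchored("AB 12!"));
        // With anchors the space and '!' sink the whole input.
        Console.WriteLine(Anchored("AB 12!"));
        // $ tolerates a single trailing newline; \z does not.
        Console.WriteLine(Anchored("AB-12\n"));
        Console.WriteLine(StrictEnd("AB-12\n"));
    }
}
```

If project numbers ever arrive with a stray newline on the end, the \z version is the one that will run away screaming.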

{1,12} means a character from the [] list must appear between one and 12 times. That means an empty string will not succeed, and neither will a string of 13 or more chars.
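The length limits are easy to check at the edges. Another small sketch, assuming a hypothetical LengthBoundsDemo class and strings of repeated ‘A’ characters as stand-in project numbers:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names and test strings are assumptions for illustration.
public static class LengthBoundsDemo
{
    private static readonly Regex ProjectNumberRegex =
        new Regex("^[-a-zA-Z0-9:.]{1,12}$");

    // True when a string of 'length' legal characters passes the pattern.
    public static bool AcceptsLength(int length)
    {
        return ProjectNumberRegex.IsMatch(new string('A', length));
    }

    public static void Main()
    {
        // Probe just outside and just inside both ends of {1,12}.
        int[] lengths = { 0, 1, 12, 13 };
        foreach (int length in lengths)
        {
            Console.WriteLine("{0} chars -> {1}", length, AcceptsLength(length));
        }
    }
}
```

Lengths 1 and 12 get through; 0 and 13 do not.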

Glorious.

In your GILLS, Ambystoma Mexicanum
