Daniel works on a web service where users buy and sell secondhand books. In particular, a user can store the title and a description of a book. Before the data is saved to the database, it is validated:
=head2 validate_form
Form validation:
* fields aren't too long
* fields consist only of allowed characters
The subroutine returns '' if everything is OK, and an error code otherwise.
=cut
{
    my %LIMIT = (
        author      => 100,
        title       => 150,
        description => 2000,
    );
    sub validate_form
    {
        my ($form) = @_;
        for my $f (qw/author title description/) {
            # not too long
            return "TOO_LONG_FIELD_".uc($f) if length($form->{$f}) > $LIMIT{$f};
            # consists of allowed characters only
            return "DISALLOWED_SYMBOLS_IN_FIELD_".uc($f) unless $form->{$f} =~ /[-_a-zа-я0-9\.,:!?;'"()\+ ]+/i;
        }
        return '';
    }
}
After a while Daniel finds invalid data in the database. For instance, some books have HTML tags in their descriptions. Why didn't the validation catch them?
Once a programmer had a problem. He thought he could solve it with a regular expression. Now he has two problems ^_^
The source of the problem is the regexp /[-_a-zа-я0-9\.,:!?;'"()\+ ]+/i.
The m// operator searches for a pattern match anywhere in a string, so if the string contains at least one allowed character, it passes the validation.
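A minimal sketch of the failure, using the same character class as validate_form. The sample description is made up for illustration, and `use utf8` assumes the source file is saved as UTF-8 so that а-я is read as a range of characters rather than bytes.

```perl
use strict;
use warnings;
use utf8;    # assumption: this file is UTF-8 encoded

# The unanchored pattern from validate_form, unchanged.
my $allowed = qr/[-_a-zа-я0-9\.,:!?;'"()\+ ]+/i;

# Hypothetical input containing HTML tags.
my $description = '<b>Great book</b>';

# The match succeeds as soon as any substring fits the pattern.
my $passes = ($description =~ $allowed) ? 1 : 0;

# Capture the substring that satisfied the pattern:
# "<" is not allowed, so the match starts at "b" and stops before ">".
my ($matched) = $description =~ /($allowed)/;

print "passes: $passes, matched: '$matched'\n";
```

The single letter "b" inside the opening tag is enough for the check to succeed; the rest of the string is never inspected.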
Daniel should at least anchor the pattern to the beginning and the end of the string: /^[-_a-zа-я0-9\.,:!?;'"()\+ ]+$/i.
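The anchored check can be sketched as follows. This version uses \A and \z instead of ^ and $, because $ also matches just before a trailing newline, so a value ending in "\n" would still pass with $. The helper name has_only_allowed_chars is hypothetical and not part of Daniel's code; `use utf8` again assumes a UTF-8 source file.

```perl
use strict;
use warnings;
use utf8;    # assumption: this file is UTF-8 encoded

# Same character class, anchored to the absolute start (\A) and end (\z).
my $allowed_only = qr/\A[-_a-zа-я0-9\.,:!?;'"()\+ ]+\z/i;

# Hypothetical helper: returns 1 if every character of $value is allowed, 0 otherwise.
# Because of the "+" quantifier, an empty string is rejected as well.
sub has_only_allowed_chars {
    my ($value) = @_;
    return (defined($value) && $value =~ $allowed_only) ? 1 : 0;
}

for my $value ('Great book, really!', '<b>Great book</b>') {
    printf "%-22s => %s\n", $value,
        has_only_allowed_chars($value) ? 'ok' : 'rejected';
}
```

Whether an empty field should be accepted is a separate decision; with this pattern it is rejected, just as with the ^...$ variant.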