user input question

Tue Apr 3 20:36:21 EDT 2007

Kristian Hermansen wrote
> On 4/3/07, Eric C <eric at newmag.org> wrote:
>> It will kick 'em out before anything else gets done.
>> What do you think?
> 
> The rule of thumb in securing user input is *NOT* to blacklist what
> you think is invalid, but to whitelist only that which is acceptable
> input.  If it is a hash of [a-z0-9] only, then make a whitelist on
> this grammar.  You see, the world of inputs is possibly infinite, and
> you don't want to have cases pertaining to all of them.  Also, I
> wouldn't even give an attacker a helpful message like you do in your
> patch.  I would give a more generic error like "Something went
> wrong..." and use that for every error you encounter!

What I tend to do in my LAMP applications is to create a function called
something like formatForSqlQuery($dataField).  Whenever I need to pass
data that came in from the user into a query, it goes through this
function first.  E.g.:

$query = "SELECT * FROM users WHERE id=" .
    formatForSqlQuery($_REQUEST['userId']);

This function serves several purposes:

- It does the whitelisting of input characters Kristian mentions.

- It delimits/formats the input based on data type: single quotes
around strings, but none around numbers or booleans.  It also escapes
any quote characters already in the string.

- It often has DBMS-specific abstractions.  For instance, MySQL may want
quotes and backslashes escaped one way, and PostgreSQL may want them
escaped another way.  Doing it in this function makes for MUCH easier
porting from one DBMS to another.
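
As a rough illustration of the three points above, here is a minimal
sketch of what such a function might look like.  The $type parameter, the
whitelist patterns, and the quote-doubling escape are all assumptions made
for this example, not a description of any particular codebase.  In a real
application you would likely swap the escaping step for your DBMS driver's
own escape function.

```php
<?php
// Hypothetical sketch of formatForSqlQuery(); the $type argument and the
// whitelist patterns below are illustrative assumptions.
function formatForSqlQuery($dataField, string $type = 'string'): string
{
    $genericError = 'Something went wrong...'; // same message for every failure

    switch ($type) {
        case 'int':
            // Whitelist: an optional minus sign followed by digits only.
            if (!is_scalar($dataField)
                || !preg_match('/\A-?[0-9]+\z/', (string) $dataField)) {
                throw new InvalidArgumentException($genericError);
            }
            return (string) $dataField; // numbers are not quoted

        case 'bool':
            // Accepts values such as "1", "true", "yes", "0", "false", "no".
            $bool = filter_var($dataField, FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);
            if ($bool === null) {
                throw new InvalidArgumentException($genericError);
            }
            return $bool ? 'TRUE' : 'FALSE';

        case 'string':
            // Whitelist: letters, digits, space, and a few punctuation marks.
            // Backslashes are deliberately not allowed.
            if (!is_string($dataField)
                || !preg_match('/\A[A-Za-z0-9 _.@\'-]*\z/', $dataField)) {
                throw new InvalidArgumentException($genericError);
            }
            // Escape embedded single quotes by doubling them, then delimit.
            // This is the one DBMS-specific spot to change when porting.
            return "'" . str_replace("'", "''", $dataField) . "'";

        default:
            throw new InvalidArgumentException($genericError);
    }
}

$userId = '42'; // stand-in for $_REQUEST['userId']
$query = "SELECT * FROM users WHERE id=" . formatForSqlQuery($userId, 'int');
```

Because the DBMS-specific part is confined to the one escaping line,
porting means changing this function rather than every query.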

You will also thank yourself later if you write a wrapper around the MySQL
calls, especially the query execution calls, and always call your wrappers
instead of the native calls, even if your wrappers do nothing else but
call the native functions.  I tend to make an OO interface over it,
myself, but that might be overkill for you.  So why do this?

- Once again, it makes it much easier to switch DBMSs later.
- You can insert debug statements to track all DB calls.
- You can put in standardized error checking.
One more thing I almost always do is implement a debugging function that
accepts a debug level and a message.  It outputs the message only if the
current debugging level is >= the level given for that message.  That
allows variably verbose debugging, and you don't have to clean out all
the debugging statements when moving to production.  For bonus points, I
often build in multiple output methods, so debugging statements can be
written as HTML comments, as text in bold red, or to the Apache logs or
another file.
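
A minimal sketch of such a debugging function is below.  The function
name, the two constants, and the output-method names ('comment', 'html',
'log') are assumptions made for this example.

```php
<?php
// Hypothetical settings: 0 = silent; higher values are more verbose.
const DEBUG_LEVEL = 2;
const DEBUG_OUTPUT = 'comment'; // 'comment', 'html', or 'log'

// Emits $message only if the current debugging level is >= $level.
// Returns true if the message was emitted, false if it was suppressed.
function debugMessage(
    int $level,
    string $message,
    int $currentLevel = DEBUG_LEVEL,
    string $output = DEBUG_OUTPUT
): bool {
    if ($currentLevel < $level) {
        return false; // too detailed for the current setting
    }

    // Escape so the message cannot break out of the surrounding markup.
    $safe = htmlspecialchars($message);

    switch ($output) {
        case 'comment':
            echo "<!-- DEBUG: $safe -->\n";
            break;
        case 'html':
            echo "<b style=\"color:red\">DEBUG: $safe</b>\n";
            break;
        default:
            // With PHP running under Apache this normally lands in the
            // server's error log.
            error_log("DEBUG: $message");
    }
    return true;
}

debugMessage(1, 'Connected to database');   // shown, since 2 >= 1
debugMessage(3, 'Full row dump goes here'); // suppressed, since 2 < 3
```

Moving to production is then a matter of setting the level to 0 rather
than removing the calls.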
