Posted by Richard Lynch on 02/02/05 01:30
Graham Cossey wrote:
> The problem is that I want to ensure that the file being uploaded is a
> CSV file, so I test the $_FILES['file']['type'] value.
That only tests a header the client sends you, and the client can put
anything it likes in there.
Anybody with half a clue (okay, a clue and a half) could do that:
telnet example.com 80
POST /your_form.php HTTP/1.1
Host: example.com
Content-type: text/comma-separated-values
INSERT FAVORITE TROJAN WORM HERE
So it's pretty useless as a security measure...
> In Firefox & IE it is returned as "application/octet-stream" but in
> Opera it is returned as "text/comma-separated-values", the latter
> being what I would expect.
Plus, as you have discovered, the browser manufacturers have absolutely no
concept of "standards" when it comes to setting Content-type: on an
uploaded file.
> Can anyone offer some advice on how I can reliably test for a valid CSV
> file?
Actually, you're very lucky on this one, in that you can use
http://php.net/fgetcsv on it, repeatedly, and either PHP has an error, or
PHP doesn't, and then you KNOW it parses as a valid CSV file, from
beginning to end.
So, what you *MIGHT* do would be something like this:
<?php
..
..
..
flush();
ob_start();
// NOTE: display_errors must be on, or PHP's messages never reach the buffer.
$old_reporting = error_reporting(E_ALL | E_STRICT);
$csv = fopen($_FILES['file']['tmp_name'], 'r') or print("ERROR: could not open "
  . $_FILES['file']['tmp_name']);
if ($csv){
  while (!feof($csv)){
    $line = fgetcsv($csv, 1000000); //Lose the 1000000 in PHP 5
  }
  fclose($csv);
}
$php_output = ob_get_clean();
if (stristr($php_output, 'Error') || stristr($php_output, 'Warning') ||
    stristr($php_output, 'Notice')){
  //NOT a valid CSV file
}
else{
  //CSV file is valid
}
//play nice, and set it back to what it was.
error_reporting($old_reporting);
?>
You may not be able to READ $_FILES['file']['tmp_name'] directly, so you'd
have to move_uploaded_file() it to a staging area first, and then read that.
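A rough sketch of that staging step might look like the following. The
function name stage_upload() and the staging directory are made up for
illustration; only is_uploaded_file(), tempnam() and move_uploaded_file() are
real PHP calls, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: move one $_FILES entry into a staging directory.
// Returns the staged path, or null if the entry is not a usable upload.
function stage_upload(array $upload, string $stagingDir): ?string
{
    // 'error' and 'tmp_name' are the standard keys PHP fills in per upload.
    if (!isset($upload['error'], $upload['tmp_name'])
        || $upload['error'] !== UPLOAD_ERR_OK
        || !is_uploaded_file($upload['tmp_name'])) {
        return null;
    }
    // tempnam() creates an empty, uniquely named file that gets overwritten.
    $target = tempnam($stagingDir, 'csv');
    if ($target === false) {
        return null;
    }
    if (!move_uploaded_file($upload['tmp_name'], $target)) {
        unlink($target); // do not leave the empty placeholder behind
        return null;
    }
    return $target;
}
```

You would call it as stage_upload($_FILES['file'], $some_staging_dir), with
$some_staging_dir being whatever directory your script is allowed to write to,
and then fopen() the returned path instead of the tmp_name.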
You might want to play around with the error_reporting setting a bit, and
a bunch of CSV test files from different sources.
You may want to rule that ANY output (strlen($php_output)) is indicative
of an error, rather than checking for 'Error' 'Warning' 'Notice' as I
did... In fact, that would probably be better.
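Here is a minimal sketch of that "any output at all" rule, under a few
assumptions: csv_reads_silently() is a made-up name, display_errors is on (so
that PHP's complaints actually land in the buffer), and the file uses commas
and double quotes.

```php
<?php
// Hypothetical helper: true when the file opens and fgetcsv() reads it to the
// end without PHP printing anything into the output buffer.
function csv_reads_silently(string $path): bool
{
    ob_start();
    $handle = fopen($path, 'r');
    if ($handle !== false) {
        // Every argument is passed explicitly so the check does not depend
        // on fgetcsv() defaults, which differ between PHP versions.
        while (fgetcsv($handle, 0, ',', '"', '\\') !== false) {
            // Rows are thrown away; we only care whether PHP complained.
        }
        fclose($handle);
    }
    $output = ob_get_clean();
    // Any captured output at all counts as a failure.
    return $handle !== false && strlen($output) === 0;
}
```

Treat the result as "PHP did not complain", which is a weaker statement than
"this is the CSV I wanted", so it is still worth trying it against test files
from different sources.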
If the files might be large, you may want to cache the CSV data you read,
and then you can use it later in your script, after you've read the whole
thing in and you know it's kosher... Course, if it's REALLY large, you'll
want to cache that in something like a temp table in MySQL or something,
just so you won't fill up RAM with some monster Array in PHP...
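As a sketch of the caching idea (read_csv_rows() and the row cap are invented
for illustration), you could keep the rows in an array while you read and give
up once the file passes the cap, which is the point where a temp table would
take over:

```php
<?php
// Hypothetical helper: collect the parsed rows in memory. Returns null if the
// file cannot be opened or holds more than $maxRows rows.
function read_csv_rows(string $path, int $maxRows): ?array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return null;
    }
    $rows = [];
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if (count($rows) >= $maxRows) {
            // Too big to hold in RAM; the caller should use another store.
            fclose($handle);
            return null;
        }
        $rows[] = $row;
    }
    fclose($handle);
    return $rows;
}
```

The cap is whatever you are comfortable holding in an Array; pick it by
trying a few realistic files rather than trusting a number from me.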
For a small CSV file, it really won't matter that much if you read it
twice -- It will probably be in the File System cache for you anyway,
depending on server load.
--
Like Music?
http://l-i-e.com/artists.htm