Posted by Andy Hassall on 01/31/06 00:11
On Mon, 30 Jan 2006 21:05:38 +0000, splodge <splodge@blurryfox.com> wrote:
>I am working on a portal, part of which allows users to upload files.
>Part of the array within $_FILES superglobal gives the mime type for the
>file. Is this 100% reliable / accurate?
It is user-supplied data (the browser sends it along with the upload), so it
is not trustworthy.
>If the mime type says the file type is jpeg is it always right?
No.
>Two reasons I want to know:
>
>1. Certain types of files mustn't be uploaded, .exe files for example.
>2. It is unsafe to rely on file extensions, not least because this
>portal will be exposed to Linux.
>
>If the mime type is not reliable what techniques are available to
>discover the type of a file?
There is no reliable way to find the "type" of a file because files don't have
types as such; the data could be consistent with being a certain format of
data, but it ultimately depends on what program you feed it into.
There are functions that use heuristics to make a decent guess as to the format
of the data, using "magic numbers": looking for certain known patterns of
bytes corresponding to headers etc.
http://uk2.php.net/manual/en/ref.mime-magic.php
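As a rough illustration of the magic-number idea, here is a minimal PHP sketch.
The function name guess_format_from_bytes and the signature table are made up
for this example and cover only a handful of formats; PHP's Fileinfo extension
does the same job far more thoroughly. A match only says the leading bytes are
consistent with that format, not that the rest of the file is valid.

```php
<?php
// Hypothetical helper: guess a format from the leading bytes of a file.
// The table is a small illustrative subset, not a complete list.
function guess_format_from_bytes(string $bytes): ?string
{
    $signatures = [
        "\xFF\xD8\xFF"      => 'image/jpeg',
        "\x89PNG\r\n\x1A\n" => 'image/png',
        "GIF87a"            => 'image/gif',
        "GIF89a"            => 'image/gif',
        "%PDF-"             => 'application/pdf',
        // "MZ" starts DOS/Windows executables; the label is a commonly used one
        "MZ"                => 'application/x-msdownload',
    ];

    foreach ($signatures as $magic => $type) {
        // strncmp is binary-safe; compare only as many bytes as the signature has
        if (strncmp($bytes, $magic, strlen($magic)) === 0) {
            return $type;
        }
    }

    return null; // no known signature matched
}
```

You would feed it the first few bytes of the uploaded temp file, e.g.
file_get_contents($_FILES['upload']['tmp_name'], false, null, 0, 16), where
'upload' is a hypothetical form field name.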
The way it's supposed to work is that it doesn't matter what the data is:
provided you send it _out_ with an appropriate Content-type, nothing bad
should happen. Unfortunately Internet Explorer has an "I think I know better"
mode where it guesses MIME types for downloaded files under various
circumstances, even if you've explicitly stated what type it is, potentially
resulting in them opening up in inappropriate applications.
See: http://ppewww.ph.gla.ac.uk/~flavell/www/content-type.html , and then
prepare to lose hair if you want to do apparently simple things like serve up
HTML source code as text/plain.
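For the sending side, something along these lines is a minimal sketch of
stating the type explicitly. download_headers is a made-up name, and forcing
"attachment" is my assumption about what you'd want for uploaded files. The
X-Content-Type-Options header is only honoured by later browsers (IE 8 onward),
so it does nothing for the older IE behaviour described above.

```php
<?php
// Hypothetical helper: build response headers for serving a stored upload.
// $type should come from your own server-side decision (e.g. an allow list),
// never from the type the client supplied in $_FILES.
function download_headers(string $type, string $filename): array
{
    // Drop any path component and characters that would break the quoted value
    $safe = str_replace(['"', '\\', "\r", "\n"], '', basename($filename));

    return [
        'Content-Type: ' . $type,
        // "attachment" asks the browser to save the file rather than open it
        'Content-Disposition: attachment; filename="' . $safe . '"',
        // Asks browsers that support it not to second-guess the declared type
        'X-Content-Type-Options: nosniff',
    ];
}

// Usage (before any output has been sent):
//   foreach (download_headers('text/plain', 'page.html') as $h) { header($h); }
//   readfile($path_to_stored_file);   // $path_to_stored_file is hypothetical
```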
--
Andy Hassall :: andy@andyh.co.uk :: http://www.andyh.co.uk
http://www.andyhsoftware.co.uk/space :: disk and FTP usage analysis tool