 Posted by Bent Stigsen on 03/14/06 04:21 
greg wrote: 
>>AFAIK it depends on what kind of file it is. Not sure, but ascii are 
>>txt, csv, html etc, binary are images, mp3's etc. 
>>Correct me if i'm wrong. 
>> 
>  
> surely, but this means I must think of all the possible file extensions
> to decide whether it's ascii or binary.
> it seems to be limited, but thx anyway. 
 
In a sense he is right: the distinction is not really straightforward
if you strictly mean the ASCII character set.
 
Binary just means that the file consists of bit patterns, or sequences
of bits, of varying length and meaning. The content of a binary file
only makes sense to an application which knows what each sequence of
bits means. When a file is viewed in a text editor, the data is
(possibly mistakenly) chopped up into 8-bit chunks (or whatever), and
the symbol corresponding to each value is displayed, which may or may
not make any sense at all. Strictly speaking, the only difference
between ascii and non-ascii is whether or not each chunk of bits is
*intended* to correspond to a specific symbol in the ASCII character
table.
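
To illustrate the strict reading, here is a minimal sketch that only
checks whether every byte falls in the 7-bit ASCII range. The function
name is_strict_ascii is made up for this example, it is not a PHP
built-in.

```php
<?php
// Hypothetical helper, not a PHP built-in: true when every byte of
// $data is in the 7-bit ASCII range (0x00-0x7F).
function is_strict_ascii(string $data): bool
{
    // Without the /u modifier PCRE works on single bytes, so this
    // looks for any byte above 0x7F. preg_match returns 0 if none is found.
    return preg_match('/[^\x00-\x7F]/', $data) === 0;
}

var_dump(is_strict_ascii("plain text\n"));      // bool(true)
var_dump(is_strict_ascii("caf\xC3\xA9"));       // bool(false), UTF-8 bytes above 0x7F
var_dump(is_strict_ascii("\x89PNG\r\n\x1A\n")); // bool(false), starts with byte 0x89
```

Note that this says nothing about intent: a binary file whose bytes all
happen to be below 0x80 would pass, and perfectly readable UTF-8 text
would fail.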
 
If by ascii you generally mean plain readable/printable text, not
necessarily limited to ASCII, then there are tools that could help you.
 
http://dk2.php.net/mime_content_type 
http://pecl.php.net/package/fileinfo 
 
If you are on Linux/Unix, check the file command:
http://www.freebsd.org/cgi/man.cgi?query=file 
 
You could just ignore the subtype and distinguish only on the media
type, between text and everything else. For example, in "text/html" the
media type is "text" and the subtype is "html".
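
A rough sketch of that idea, assuming the fileinfo extension is
available. The helper name is_text_file is hypothetical, and what
counts as text depends entirely on what the underlying magic database
reports.

```php
<?php
// Hypothetical helper, not from the fileinfo package itself: true when
// the detected media type of the file at $path is "text", whatever the
// subtype. Assumes the fileinfo extension is loaded.
function is_text_file(string $path): bool
{
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime  = $finfo->file($path);   // e.g. "text/plain" or "image/png"
    if ($mime === false) {
        return false;               // detection failed, treat as non-text
    }
    // Keep only the part before the slash (the media type).
    $type = explode('/', $mime, 2)[0];
    return $type === 'text';
}
```

Something like is_text_file('/path/to/upload') would then give you the
text/non-text split without a list of extensions. Be aware that some
readable formats may be reported under another media type (for example
application/...), so you might want to whitelist a few of those too.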
 
/Bent
 
  