Re: [PHP] Correcting contractions

Posted by Dotan Cohen on 06/25/05 05:03

On 6/25/05, Robert Cummings <robert@interjinn.com> wrote:
> On Fri, 2005-06-24 at 21:02, Dotan Cohen wrote:
> > Hi friends, I've got a nice array of contractions (I've, I'd,
> > they'll,...). My intent is to take submitted data and replace, say,
> > every occurrence of 'theyd' with 'they'd'. So far, so good. The trick
> > is doing it if the first character is uppercase. I tried going
> > through the array, one by one, and doing the preg_replace twice, once
> > for each item, and once for each item with the first letter
> > capitalized. It wasn't very successful, so I've been doing this:
> > $the_lyrics=str_replace("\bid\b", "I'd", $the_lyrics);
> > $the_lyrics=str_replace("\bi'd\b", "I'd", $the_lyrics);
> > $the_lyrics=str_replace("\bId\b", "I'd", $the_lyrics);
> > $the_lyrics=str_replace("\bim\b", "I'm", $the_lyrics);
> > $the_lyrics=str_replace("\bi'm\b", "I'm", $the_lyrics);
> > $the_lyrics=str_replace("\bIm\b", "I'm", $the_lyrics);
> > $the_lyrics=str_replace("\bi've\b", "I've", $the_lyrics);
> > $the_lyrics=str_replace("\bive\b", "I've", $the_lyrics);
> > $the_lyrics=str_replace("\bIve\b", "I've", $the_lyrics);
> > $the_lyrics=str_replace("\bi'll\b", "I'll", $the_lyrics);
> > $the_lyrics=str_replace("\bIll\b", "I'll", $the_lyrics);
> > $the_lyrics=str_replace("\bi\b", "I", $the_lyrics);
> > $the_lyrics=str_replace("\byoure\b", "you're", $the_lyrics);
> > $the_lyrics=str_replace("\bYoure\b", "You're", $the_lyrics);
> > $the_lyrics=str_replace("\byoull\b", "you'll", $the_lyrics);
> > $the_lyrics=str_replace("\bYoull\b", "You'll", $the_lyrics);
> > $the_lyrics=str_replace("\byouve\b", "you've", $the_lyrics);
> > $the_lyrics=str_replace("\bYouve\b", "You've", $the_lyrics);
> > $the_lyrics=str_replace("\bits\b", "it's", $the_lyrics);
> > $the_lyrics=str_replace("\bIts\b", "It's", $the_lyrics);
> > $the_lyrics=str_replace("\bwasnt\b", "wasn't", $the_lyrics);
> > $the_lyrics=str_replace("\bWasnt\b", "Wasn't", $the_lyrics);
> > $the_lyrics=str_replace("\bthats\b", "that's", $the_lyrics);
> > $the_lyrics=str_replace("\bThats\b", "That's", $the_lyrics);
> > $the_lyrics=str_replace("\btheyre\b", "they're", $the_lyrics);
> > $the_lyrics=str_replace("\bTheyre\b", "They're", $the_lyrics);
> > $the_lyrics=str_replace("\btheyll\b", "they'll", $the_lyrics);
> > $the_lyrics=str_replace("\bTheyll\b", "They'll", $the_lyrics);
> > $the_lyrics=str_replace("\bcant\b", "can't", $the_lyrics);
> > $the_lyrics=str_replace("\bCant\b", "Can't", $the_lyrics);
> > $the_lyrics=str_replace("\bdidnt\b", "didn't", $the_lyrics);
> > $the_lyrics=str_replace("\bDidnt\b", "Didn't", $the_lyrics);
> > $the_lyrics=str_replace("\bdont\b", "don't", $the_lyrics);
> > $the_lyrics=str_replace("\bDont\b", "Don't", $the_lyrics);
> > $the_lyrics=str_replace("\bdoesnt\b", "doesn't", $the_lyrics);
> > $the_lyrics=str_replace("\bDoesnt\b", "Doesn't", $the_lyrics);
> > $the_lyrics=str_replace("\bweve\b", "we've", $the_lyrics);
> > $the_lyrics=str_replace("\bWeve\b", "We've", $the_lyrics);
> >
> > Which, as you can see, is not exactly optimized code. How would
> > someone more professional than myself go about this? I was thinking
> > about maybe a two-dimensional array, but stopped short to consult with
> > you guys first.
>
> str_replace() supports taking two arrays from which to retrieve the
> needles and the replacements so that you only need to invoke the
> function once. This will speed things up considerably. On that note you
> have a couple of bugs...
>
> "its" is a valid word for possession (its woodwork is exquisite).
>
> "Ill" is also valid (Ill beset by fortune).
>
> Cheers,
> Rob.
>
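
To illustrate the array form Rob describes, here is a minimal sketch. One caveat worth stating: str_replace() matches literal text, so a needle such as "\bdont\b" is searched for as written (backslashes included) and never matches. preg_replace() accepts the same kind of parallel arrays and does treat \b as a word boundary. The sample sentence and the three-entry word list are assumptions made up for the example.

```php
<?php
// Hypothetical sample input; the variable name follows the original post.
$the_lyrics = "youre sure they dont know, but theyll find out";

// str_replace() is a literal search: "\b" is NOT a word boundary here,
// so this call leaves the text untouched.
$unchanged = str_replace("\bdont\b", "don't", $the_lyrics);

// preg_replace() takes parallel arrays of patterns and replacements,
// and \b inside a pattern is a real word boundary.
$patterns     = array('/\byoure\b/', '/\bdont\b/', '/\btheyll\b/');
$replacements = array("you're", "don't", "they'll");
$fixed = preg_replace($patterns, $replacements, $the_lyrics);

echo $fixed, "\n";
```

A single call with arrays replaces the long run of one-line statements, and the word list can then live in one place.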

I knew about "Ill", but not "its". I didn't mean to put "Ill" in there...

Should I enter each contraction twice (once per capitalization), or
should I try to do something smart so that the capitalization
happens automatically? The 'I' contractions are special; I will deal
with those separately.
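
One possible way to get the capitalization automatically, sketched under assumptions: keep a single lowercase entry per contraction, match case-insensitively, and re-capitalize the replacement when the matched word began with an uppercase letter. The function name fix_contractions() and the short word list are hypothetical, and the closure syntax needs a newer PHP than was current when this thread was written. The 'I' contractions are left out, as they are to be handled separately.

```php
<?php
// Hypothetical helper: the name and the word list are illustrative only.
function fix_contractions($text)
{
    // One lowercase entry per contraction. "its" and "ill" are omitted
    // on purpose because both are valid words on their own.
    $map = array(
        'youre'  => "you're",
        'theyd'  => "they'd",
        'theyll' => "they'll",
        'dont'   => "don't",
        'cant'   => "can't",
        'weve'   => "we've",
    );

    // Build one alternation, e.g. /\b(youre|theyd|...)\b/i
    $words   = array_map('preg_quote', array_keys($map));
    $pattern = '/\b(' . implode('|', $words) . ')\b/i';

    return preg_replace_callback($pattern, function ($m) use ($map) {
        $fixed = $map[strtolower($m[1])];
        // Re-capitalize when the matched word started with an uppercase letter.
        return ctype_upper($m[1][0]) ? ucfirst($fixed) : $fixed;
    }, $text);
}

echo fix_contractions("Theyd say you cant, but dont listen."), "\n";
```

With this approach each contraction is entered once. Note that an all-caps word such as "DONT" would come back as "Don't", since only the first letter is inspected.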

 
