More on my boolean search

As you may have read in a previous post, I’ve been busy building a search page to find music in my collection. The collection is still growing steadily, and searching it is often one of the few ways to find the track you want.

The auto-AND function is finished and fully functional now. Next on the menu is excluding an item: the NOT method. I’ve chosen to implement it the way Google does: to exclude a keyword, you prefix it with a "-" (minus sign).
The phrase "uninvited -alanis" finds all songs that contain the term "uninvited" but not "alanis". Yes, it is that straightforward.

The code is below. I parse the search phrase for exclusions and end up with two sets of keywords: the first holds the inclusions (AND), the second the exclusions (NOT). Both are passed to the actual search functions, which first filter on the inclusions and then on the exclusions.
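To make the idea of the two keyword sets concrete, here is a minimal sketch of the split on its own. The helper name `splitQuery()` is hypothetical and not part of the listing further down; it only assumes that keywords are separated by whitespace and that a leading minus marks an exclusion.

```php
<?php
// Hypothetical helper (not in the original code): split a search phrase
// into inclusion keywords and exclusion keywords.
function splitQuery($query) {
    $include = array();
    $exclude = array();
    // Split on any run of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        if ($word[0] === '-') {
            // Strip the leading minus; a lone "-" carries no keyword and is ignored.
            if (strlen($word) > 1) {
                $exclude[] = substr($word, 1);
            }
        } else {
            $include[] = $word;
        }
    }
    return array($include, $exclude);
}

list($include, $exclude) = splitQuery('uninvited -alanis');
// $include is array('uninvited'), $exclude is array('alanis')
```

A keyword with a minus in the middle, such as "foo-bar", stays an inclusion in this sketch, because only the first character is checked.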

// insert some form-stuff

$haystack = file("filelist.txt");
$needles = array_reverse(explode(" ", chop($formData['needle'])));
$i = sizeof($haystack);
// Inclusions (AND): narrow the list with every keyword that has no leading minus.
foreach ($needles as $needle) {
    // Skip empty pieces (caused by double spaces) and exclusion keywords.
    if ($needle !== "" && $needle[0] !== "-") {
        $haystack = searchStack($needle, $haystack);
    }
}
// Exclusions (NOT): drop every entry that contains a minus-prefixed keyword.
$not = parseQueryNot($formData['needle']);
foreach ($not as $token) {
    $haystack = searchStackNot($token, $haystack);
}
$j = sizeof($haystack);
if ($j == 0) {
    $sOut .= "Unfortunately, nothing was found.\n\n";
} else {
    $sOut .= formatFound($haystack);
}
$sOut .= "\nFound: " . $j . " items.\nSearched " . $i . " files and folders.";

function searchStack($needle, $haystack) {
    // Keep only the entries that contain the keyword (case-insensitive).
    $tempstack = array();
    foreach ($haystack as $straw) {
        if (stristr($straw, $needle) !== false) {
            array_push($tempstack, $straw);
        }
    }
    return $tempstack;
}

function parseQueryNot($query) {
    // Find all needles with a leading minus (foo bar -spam -> spam) and return them as an array.
    $query = strtolower($query);
    $matches = array();
    // A minus counts only at the start of the query or right after whitespace;
    // the keyword itself is captured in group 1.
    preg_match_all('/(?:^|\s)-(\S+)/', $query, $matches);
    return $matches[1];
}

function searchStackNot($token, $haystack) {
    $tempstack = array();
    foreach ($haystack as $straw) {
        // Strict comparison: stristr() returns false only when nothing matched.
        if (stristr($straw, $token) === false) {
            // The token to exclude is not found in this track, so keep it.
            array_push($tempstack, $straw);
        }
    }
    return $tempstack;
}

function formatFound($haystack) {
    $sOut = "";
    foreach ($haystack as $straw) {
        // do some magical tricks in modifying the filename & location to a URL
    }
    return $sOut;
}
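
One possible alternative, sketched here under assumptions rather than measured: instead of rebuilding the whole list once per keyword, each line could be tested against all inclusions and exclusions in a single pass. The function name `filterStack()` and the sample filenames are made up for illustration, and the keywords are assumed to be non-empty and already split into two arrays. Whether this is really faster on a large file list would have to be benchmarked.

```php
<?php
// Hypothetical single-pass filter (not in the original code): keep a line only
// if it contains every inclusion keyword and none of the exclusion keywords.
// Matching is case-insensitive; keywords are assumed to be non-empty strings.
function filterStack(array $haystack, array $include, array $exclude) {
    $found = array();
    foreach ($haystack as $straw) {
        foreach ($include as $needle) {
            if (stripos($straw, $needle) === false) {
                continue 2; // a required keyword is missing, skip this line
            }
        }
        foreach ($exclude as $token) {
            if (stripos($straw, $token) !== false) {
                continue 2; // an excluded keyword is present, skip this line
            }
        }
        $found[] = $straw;
    }
    return $found;
}

// Invented sample data, for illustration only.
$files = array(
    'Alanis Morissette - Uninvited.mp3',
    'Freemasons - Uninvited.mp3',
    'Alanis Morissette - Ironic.mp3',
);
$result = filterStack($files, array('uninvited'), array('alanis'));
// $result holds only 'Freemasons - Uninvited.mp3'
```

The early `continue 2` stops checking a line as soon as one keyword rules it out, and only one result array is built instead of one per keyword.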

I’m still open to any (better-performing) solutions!