Deep in the Code

Using PHP to Search for Text in Website Source Code

As I’ve mentioned before, I use the Thesis premium theme on my WordPress site, and I generally have no problems at all. However, since the newest Thesis update came out, I have been getting an HTML5 validation error.

Usually when this sort of thing happens, whether it be with a theme or with a plugin, I’ll try to fix what is causing the error and then report the fix to the author. The validation error I am getting is below.

[Image: HTML5 validation error on Thesis 2.2.1. Caption: The validation error and the rendered HTML.]

The Thesis codebase is fairly complicated and is not easy to decipher if you’ve never seen it before. Even though I’d hacked on it a few times before, I’d never come across the code that generated this bit of HTML.

<style type="text/css">
#thesis_launcher { position: fixed; bottom: 0; left: 0; font: bold 16px/1em "Helvetica Neue", Helvetica, Arial, sans-serif; padding: 12px; text-align: center; color: #fff; background: rgba(0,0,0,0.5); text-shadow: 0 1px 1px rgba(0,0,0,0.75); }
#thesis_launcher input { font-size: 16px; margin-top: 6px; -webkit-appearance: none; }
</style>

My blog runs on a shared server, and I don’t have SSH enabled currently, so there was no way I could use grep to search for the text, and the cPanel search utility only looks at filenames.

After some searching, I found an article with code for finding files by name in all subfolders of a path on your site. That code does not search the text of the files, but it does handle the recursive folder traversal.

function rsearch($folder, $pattern) {
    $dir = new RecursiveDirectoryIterator($folder);
    $ite = new RecursiveIteratorIterator($dir);
    $files = new RegexIterator($ite, $pattern, RegexIterator::GET_MATCH);
    $fileList = array();
    foreach($files as $file) {
        $fileList = array_merge($fileList, $file);
    }
    return $fileList;
}
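
To make the matching behavior concrete, here is a minimal sketch that is not from the original article. The helper name findFiles, the temporary folder, and the two patterns are all assumptions for illustration. It uses the same iterator chain, but in RegexIterator::MATCH mode, and shows that the regular expression is tested against each file's full path, so an unanchored pattern can match more than you expect.

```php
<?php
// Hypothetical helper (not from the article): same iterator chain as rsearch(),
// but in MATCH mode so each result is an SplFileInfo object.
function findFiles(string $folder, string $pattern): array {
    $dir = new RecursiveDirectoryIterator($folder, FilesystemIterator::SKIP_DOTS);
    $ite = new RecursiveIteratorIterator($dir);
    // The regex is tested against the full path of every file.
    $files = new RegexIterator($ite, $pattern, RegexIterator::MATCH);
    $paths = array();
    foreach ($files as $file) {
        $paths[] = $file->getPathname();
    }
    sort($paths);
    return $paths;
}

// Build a throwaway folder tree (assumed names) to search.
$root = sys_get_temp_dir() . '/rsearch_demo_' . uniqid();
mkdir($root . '/sub', 0777, true);
file_put_contents($root . '/index.php', "one\n");
file_put_contents($root . '/sub/skin.php', "two\n");
file_put_contents($root . '/sub/phpinfo.txt', "three\n");

$anchored = findFiles($root, '/\.php$/i'); // only paths ending in .php
$loose    = findFiles($root, '/php/');     // any path containing "php"

echo count($anchored), " anchored matches, ", count($loose), " loose matches\n";

// Remove the throwaway files again ($loose lists all three of them).
foreach ($loose as $path) {
    unlink($path);
}
rmdir($root . '/sub');
rmdir($root);
```

In this sketch the anchored pattern finds the two .php files, while the loose pattern also picks up phpinfo.txt.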

Also, I was able to find another post that explained text searching in a file.

$path_to_check = '';
$needle = 'match';

foreach(glob($path_to_check.'*.txt') as $filename)
{
  foreach(file($filename) as $fli=>$fl)
  {
    if(strpos($fl, $needle)!==false)
    {
      echo $filename.' on line '.($fli+1).': '.$fl;
    }
  }
}

By combining and modifying these, I was able to put together a relatively simple file that searches through all files matching a pattern (in this case, PHP files) and prints every line that contains the search term.

<?php
$path_to_check = "(your folder)";
// Anchored so that only paths ending in .php match.
$pattern = '/^.+\.php$/i';
// Avoid an undefined-index notice when no search term is supplied.
$needle = isset($_GET['needle']) ? $_GET['needle'] : '';

function rsearch($folder, $pattern, $needle) {
    $dir = new RecursiveDirectoryIterator($folder);
    $ite = new RecursiveIteratorIterator($dir);
    // GET_MATCH yields one array of regex matches per matching path.
    $files = new RegexIterator($ite, $pattern, RegexIterator::GET_MATCH);
    foreach ($files as $file) {
        foreach ($file as $filename) {
            // file() loads the whole file into memory as an array of lines.
            foreach (file($filename) as $fli => $fl) {
                if (strpos($fl, $needle) !== false) {
                    // Escape the output so matching HTML is displayed, not rendered.
                    echo htmlspecialchars($filename.' on line '.($fli+1).': '.$fl)."<br /><br />\n\n";
                }
            }
        }
    }
    return 0;
}

if (strlen($needle) > 0) {
    rsearch($path_to_check, $pattern, $needle);
}
echo "Search complete.";

The search term is currently passed in the query string (for example, search.php?needle=yoursearchterm), and the path is hard-coded. The pattern is a regular expression that is matched against each file's full path. I did find that this has the potential to use all of your allotted memory, so use it sparingly. Also, don’t leave this on your site as a PHP file: rename it to TXT when it isn’t in use (or remove it entirely) so that no one can run it without your knowledge. Otherwise it could be used to find database passwords and other sensitive information.
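
If memory is the concern, one possible mitigation is to read each file a line at a time rather than loading it whole with file(). The following is only a sketch under that assumption, not the code I actually ran. The helper name searchFileStreaming and the temporary demo file are made up for illustration.

```php
<?php
// Hypothetical helper: scan one file line by line with fgets() so that only
// the current line (plus any hits) is held in memory.
function searchFileStreaming(string $filename, string $needle): array {
    $hits = array();
    $handle = @fopen($filename, 'r');
    if ($handle === false) {
        return $hits; // unreadable file: skip it
    }
    $lineNumber = 0;
    while (($line = fgets($handle)) !== false) {
        $lineNumber++;
        if (strpos($line, $needle) !== false) {
            // Key the result by its 1-based line number.
            $hits[$lineNumber] = rtrim($line, "\r\n");
        }
    }
    fclose($handle);
    return $hits;
}

// Demo on a temporary file (assumed content).
$tmp = tempnam(sys_get_temp_dir(), 'needle');
file_put_contents($tmp, "first line\n<div id=\"thesis_launcher\">\nlast line\n");

$hits = searchFileStreaming($tmp, 'thesis_launcher');
foreach ($hits as $n => $text) {
    // Escape the line so HTML in the match is shown rather than rendered.
    echo 'line ', $n, ': ', htmlspecialchars($text), "\n";
}

unlink($tmp);
```

A helper like this could stand in for the file() loop inside rsearch(). It does not address the security concern, so the advice about not leaving the script in place still applies.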

Incidentally, I did find the code that generates the CSS above; it’s in the wp-content/themes/thesis/lib/core/skin.php file:

echo
	"<style type=\"text/css\">\n",
	"#thesis_launcher { position: fixed; bottom: 0; $position: 0; font: bold 16px/1em \"Helvetica Neue\", Helvetica, Arial, sans-serif; padding: 12px; text-align: center; color: #fff; background: rgba(0,0,0,0.5); text-shadow: 0 1px 1px rgba(0,0,0,0.75); }\n",
	"#thesis_launcher input { font-size: 16px; margin-top: 6px; -webkit-appearance: none; }\n",
	"</style>\n";

Due to the amount of time it would take for me to suss out how to move this into the head without breaking the site, I’m just going to report this one. It should be fixed in the next minor release.