Using PHP to Search for Text in Website Source Code

HTML5 validation error on Thesis 2.2.1

As I’ve mentioned before, I use the Thesis premium theme on my WordPress site, and I generally have no problems at all. However, the newest Thesis update came out, and I am getting an HTML5 validation error.

Usually when this sort of thing happens, whether it be with a theme or with a plugin, I’ll try to fix what is causing the error and then report the fix to the author. The validation error I am getting is below.

HTML5 validation error on Thesis 2.2.1
The validation error and the rendered HTML.

The Thesis codebase is fairly complicated and is not easy to decipher if you’ve never seen it before. Even though I’d hacked on it a few times before, I’d never come across the code that generated this bit of HTML.

<style type="text/css">
#thesis_launcher { position: fixed; bottom: 0; left: 0; font: bold 16px/1em "Helvetica Neue", Helvetica, Arial, sans-serif; padding: 12px; text-align: center; color: #fff; background: rgba(0,0,0,0.5); text-shadow: 0 1px 1px rgba(0,0,0,0.75); 
#thesis_launcher input { font-size: 16px; margin-top: 6px; -webkit-appearance: none; 
</style>

My blog runs on a shared server, and I don’t have SSH enabled currently, so there was no way I could use grep to search for the text, and the cPanel search utility only looks at filenames.

After some searching, I found an article that had code for finding filenames in all subfolders from a path on your site. This code would not search the text itself, but would allow for recursive folder searching.

function rsearch($folder, $pattern) {
    $dir = new RecursiveDirectoryIterator($folder);
    $ite = new RecursiveIteratorIterator($dir);
    $files = new RegexIterator($ite, $pattern, RegexIterator::GET_MATCH);
    $fileList = array();
    foreach($files as $file) {
        $fileList = array_merge($fileList, $file);
    }
    return $fileList;
}

Also, I was able to find another post that explained text searching in a file.

$path_to_check = '';
$needle = 'match';

foreach(glob($path_to_check.'*.txt') as $filename)
{
  foreach(file($filename) as $fli=>$fl)
  {
    if(strpos($fl, $needle)!==false)
    {
      echo $filename.' on line '.($fli+1).': '.$fl;
    }
  }
}

By combining and modifying these, I was able to put together a relatively simple file that will search through all files matching a pattern (in this case, PHP files) and printing instances of the text that contains the search term.

$path_to_check = "(your folder)";
$pattern = "/.*php/";
$needle = $_GET['needle'];

function rsearch($folder, $pattern, $needle) {
    $dir = new RecursiveDirectoryIterator($folder);
    $ite = new RecursiveIteratorIterator($dir);
    $files = new RegexIterator($ite, $pattern, RegexIterator::GET_MATCH);
    //$fileList = array();
    foreach($files as $file) {    	
        //$fileList = array_merge($fileList, $file);
        foreach($file as $filename) {
           foreach (file($filename) as $fli=>$fl) {
               //echo $filename."<br /><br />\n\n";
               if(strpos($fl, $needle)!==false) {
	           echo $filename.' on line '.($fli+1).': '.$fl."<br /><br />\n\n";
               }   
           }
        }
        
    }
    //return $fileList;
    return 0;
}

//var_dump(rsearch($path_to_check,$pattern,$needle));

if (strlen($needle) > 0) {
    rsearch($path_to_check,$pattern,$needle);
}
echo "Search complete.";

The search term currently is entered using the querystring (such as search.php?needle=yoursearchterm), and the path is currently hard coded. The pattern uses a regular expression. I did find that this has the potential to use all of your allotted memory, so use it sparingly. Also, don’t leave this on your site in PHP form, but rename to TXT when not in use so that no one can use it without your knowledge – it could be used to find passwords for databases and other sensitive information.

Incidentally, I did find the code that generates the CSS above; it’s in the wp-content/themes/thesis/lib/core/skin.php file:

echo
	"<style type=\"text/css\">\n",
	"#thesis_launcher { position: fixed; bottom: 0; $position: 0; font: bold 16px/1em \"Helvetica Neue\", Helvetica, Arial, sans-serif; padding: 12px; text-align: center; color: #fff; background: rgba(0,0,0,0.5); text-shadow: 0 1px 1px rgba(0,0,0,0.75); }\n",
	"#thesis_launcher input { font-size: 16px; margin-top: 6px; -webkit-appearance: none; }\n",
	"</style>\n";

Due to the amount of time it would take for me to suss out how to move this into the head without breaking the site, I’m just going to report this one. It should be fixed in the next minor release.

Adding the Property Attribute to Style Sheet Declarations in WordPress for HTML5 Validation

HTML5 validator error

Of all the blogging platforms I’ve tried, WordPress is my favorite. If it is self-hosted, it is easy to customize.

One of the pitfalls of customization, however, is how easy it can be for a plugin of dubious quality to wreak havoc on your site. One of the problems I have had is that while WordPress does an excellent job at rendering valid HTML5 code, it is not difficult at all for a plugin to cause validation errors, even as a result of calling built-in WP functions.

In this case, a style sheet declaration was being triggered by a plugin using WP’s wp_register_style function. This function and the wp_enqueue_style function end up using the WP_Styles class, which is found in the /wp-includes/class.wp-styles.php file.

This class is called by WordPress and themes to insert link tags in the head section of your page. However, a plugin or theme can use the functions mentioned above to declare a CSS file anywhere on the page, inside or outside the head section.

This is not inherently problematic in that the page will be rendered normally. However, if the link tag is declared in the body of the HTML document rather than the head (as it happened in my case), the HTML5 code will not correctly validate if no property tag is present.

HTML5 validator error
No property attribute in a link tag not in the head? It won’t validate.

This line of code looks like it comes from a plugin, specifically the Comments Evolved (formerly known as “Google+ Comments for WordPress”) plugin. However, it is not – a little hacking on WordPress is in order here. Also, I’m going to report this as an issue so that the fix may be included in the next update.

If you open the file /wp-includes/class.wp-styles.php and add “property=’stylesheet'” to two lines (93 and 103 in my editor) as below, you shouldn’t see this problem again.

		$tag = apply_filters( 'style_loader_tag', &quot;&lt;link rel='$rel' property='stylesheet' id='$handle-css' $title href='$href' type='text/css' media='$media' /&gt;\n&quot;, $handle, $href );
		if ( 'rtl' === $this-&gt;text_direction &amp;&amp; isset($obj-&gt;extra['rtl']) &amp;&amp; $obj-&gt;extra['rtl'] ) {
			if ( is_bool( $obj-&gt;extra['rtl'] ) || 'replace' === $obj-&gt;extra['rtl'] ) {
				$suffix = isset( $obj-&gt;extra['suffix'] ) ? $obj-&gt;extra['suffix'] : '';
				$rtl_href = str_replace( &quot;{$suffix}.css&quot;, &quot;-rtl{$suffix}.css&quot;, $this-&gt;_css_href( $obj-&gt;src , $ver, &quot;$handle-rtl&quot; ));
			} else {
				$rtl_href = $this-&gt;_css_href( $obj-&gt;extra['rtl'], $ver, &quot;$handle-rtl&quot; );
			}

			/** This filter is documented in wp-includes/class.wp-styles.php */
			$rtl_tag = apply_filters( 'style_loader_tag', &quot;&lt;link rel='$rel' property='stylesheet' id='$handle-rtl-css' $title href='$rtl_href' type='text/css' media='$media' /&gt;\n&quot;, $handle, $rtl_href );

			if ( $obj-&gt;extra['rtl'] === 'replace' ) {
				$tag = $rtl_tag;
			} else {
				$tag .= $rtl_tag;
			}
		}

This will not fix any plugins or themes that use another mechanism to declare style sheets outside the head, but it has fixed several errors caused by plugins.