Using PHP to Scrape the Report Card from a Code School Profile – Part 1

I have been manually placing my Master badges from Code School onto my blog which, being a WordPress blog, runs on PHP. I don’t have the script running that shows the fraction completed on Paths that I haven’t yet mastered, but I can get the ones that I’ve completed.

The PHP script I’ve built so far is below. Due to some CSS I want to change, I haven’t implemented it yet. But I can say that it does indeed scrape the page. The jQuery required to display it in the sidebar I’ll share once I get the CSS issues worked out. This is very similar to the code I used to get the CodeEval profile, but it’s been refactored and modified for use with the Code School page.

<?php
    function getClass($classname, $htmltext)
    {
        $dom = new DOMDocument;
        $dom->loadHTML($htmltext);
        $xpath = new DOMXPath($dom);
        $results = $xpath->query("//*[@class='" . $classname . "']");
        return $results;
    }

    function buildContent($results)
    {
        $content = "";
        foreach ($results as $node) {
            $partial_content = innerHTML($node);
            $content = $content . $partial_content;
        }
        return $content;
    }

    /* this function preserves the inner content of the scraped element.
    ** http://stackoverflow.com/questions/5349310/how-to-scrape-web-page-data-without-losing-tags
    ** So be sure to go and give that post an uptick too 🙂
    **/
    function innerHTML(DOMNode $node)
    {
      $doc = new DOMDocument();
      foreach ($node->childNodes as $child) {
        $doc->appendChild($doc->importNode($child, true));
      }
      return $doc->saveHTML();
    }

    $previous_value = libxml_use_internal_errors(TRUE);
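    /* encode the nickname so it cannot alter the URL or the markup it is echoed into */
    $profilename = rawurlencode($_GET['nick'] ?? '');
    $profile_url = 'https://www.codeschool.com/users/' . $profilename . '/';
    /* HTTP context options live under the 'http' key, even for https:// URLs */
    $context = stream_context_create(array(
        'http' => array('ignore_errors' => true),
    ));
    $html = file_get_contents($profile_url, false, $context);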

    $class = 'bucket';
    $resultsBucket = getClass($class,$html);

    $class = 'mbl tac';
    $resultsMaster = getClass($class,$html);

    $class = 'pr-pathStatus';
    $resultsPath = getClass($class,$html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous_value);
?>

<a href="<?php echo $profile_url; ?>" target="_blank">
    <div class="wrapper-scores">
        <?php
        $full_content = "";

        $full_content = $full_content . buildContent($resultsBucket);
        $full_content = $full_content . buildContent($resultsMaster);
        $full_content = $full_content . buildContent($resultsPath);

        /* changing h2 tags to h1 tags and inserting line breaks */
        $full_content = str_replace("<h2", "\n<h1", $full_content);
        $full_content = str_replace("</h2>", "</h1>\n", $full_content);

        /* disabling the anchor tags on each badge by changing to divs */
        $full_content = str_replace("<a rel=\"tooltip\" ","<div ",$full_content);
        $full_content = str_replace("href=\"/learn","data-href=\"http://codeschool.com/learn",$full_content);
        $full_content = str_replace("</a>","</div>",$full_content);

        /* changing text on heading of Path Status */
        $full_content = str_replace("Path Status","Paths In Progress",$full_content);

        /* return the html */
        echo $full_content;
        ?>
    </div>
</a>

PHP code updated on 2017.12.28.
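
One caveat about getClass() above: the query compares the whole class attribute, so 'mbl tac' matches only while the page lists exactly those two classes in that order. Below is a minimal sketch of a more tolerant variant that matches a single class name as one token among several. The function name getByClassToken and the sample markup are made up for illustration, and the sketch assumes the class name contains no quote characters.

```php
<?php
// Hypothetical helper: find elements whose class attribute contains
// $classname as one whitespace-separated token.
function getByClassToken($classname, $htmltext)
{
    // Silence libxml warnings about imperfect HTML while parsing.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($htmltext);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Pad the attribute with spaces so the token also matches at either end.
    $query = "//*[contains(concat(' ', normalize-space(@class), ' '), ' "
        . $classname . " ')]";
    return $xpath->query($query);
}

// Made-up sample markup, not taken from any real profile page.
$sample = '<div class="wrapper-rank highlighted"><h4>42</h4></div>'
    . '<div class="other">x</div>';
$matches = getByClassToken('wrapper-rank', $sample);
echo $matches->length, "\n";
```

The result can be passed to buildContent() in the same way as the result of getClass().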

Using CSS to Create Resizable Columns in WordPress with Thesis Theme

UPDATE: If you are using Thesis 2.4 or newer, you will also need the information in the updated post. Read both posts before making any changes!

Given that most monitors today are not 17″ CRTs, I decided to change the default width for the content column on this blog such that more of the screen is used. I didn’t want to affect the width of the sidebar.

I am using the Social Triggers Skin on the Thesis 2 Theme; YMMV. The blog content width is changed in the Skin CSS under “.landing .container”, the right-hand sidebar is (predictably) in “.sidebar” and both inherit from the block “.container”.

To change the Skin CSS, go to the WordPress admin dashboard and navigate to Thesis -> Skin Content, click the Skin drop-down button, and select Editor. On the resulting page, click the Home button and select Page. Now click the CSS button at the top of the page. This will display the CSS template for the Skin.

The CSS that must be changed is near the top of the template:

.container {
	width: $w_total;
	margin: 0 auto;
}
.landing .container {
	width: $w_content;
}

If you’re new to Thesis, you may find these values strange; they are variables. On the right-hand side of the page, you will see a list of all the variables that can be used here, with the option to create more.

The default values that I was concerned with are as follows:

$w_total: 897px
$w_content: 585px
$w_sidebar: 312px

Since I did not want to change the width of the sidebar, I have not shown the CSS where the last variable, $w_sidebar, is used. That rule stays as it is, but I do need to know the variable's name.

I wanted the original default widths to become the minimum widths, while otherwise letting the content column fill much of the screen. Clicking a variable in the right-hand pane opens a dialog box for changing its value. I changed the existing variables and added new ones like so:

Existing:
$w_total: 80%
$w_content: calc(100% - $w_sidebar)

(Notice the use of the native CSS calc() function in the $w_content value! The minus sign must be a plain hyphen with a space on each side.)

New:
$w_total_min: 897px
$w_content_min: 585px

After changing the value in the dialog box, click Save to save the new value. The CSS template now must be modified. The changed CSS is:

.container {
	width: $w_total;
	min-width: $w_total_min;
	margin: 0 auto;
}
.landing .container {
	width: $w_content;
	min-width: $w_content_min;
}

Nota bene: If you also want to change the width of the sidebar, just change the $w_sidebar variable value. It can be a static number, or a percent value. If the sidebar is set to a percent value, I recommend adding a min-width value for the “.sidebar” class in the CSS template as well.

Be sure to click the Save CSS button on the Skin CSS page once you are finished.

Now, the content and sidebar columns together take up 80% of the width of the window, but if the window is shrunk, the content column will not get smaller than its original default size!

Scraping a DIV Element from a Web Page with PHP

I recently read an article about CodeEval, a free gamified website that ranks developers and brings employers and developers together. Essentially, a developer can sign up, complete coding challenges, and earn badges and a “Hacker Ranking” that will compare his or her skills to others who have signed up on the site. Also, completing some challenges will allow the developer to unlock the ability to apply for jobs with various tech startups through the site.

After completing some of the challenges I decided to see if, like Klout and some other social ranking sites, I could get a widget to put on my blog that would show my “Hacker Rank”. Unlike Kred, CodeEval apparently does not have this functionality as yet. So I decided to make my own.

The ranking information is shown in a div element on the user’s public profile, assuming that the user allows the profile to be shown.

Using the PHP code below, I was able to scrape the information from CodeEval’s site. Next, in a Text widget on WordPress, I created an empty table and used jQuery to populate it with the div I scraped from CodeEval, along with CSS that I included in my PHP file to give the badge a look and feel similar to the CodeEval site. Ultimately, I could create a WordPress plugin for this, so that it could be done without having to create the codeeval.php file on the site, but I haven’t done that yet.

This code could be used to scrape from any site, as long as the element has a unique class name and PHP has allow_url_fopen enabled, which file_get_contents needs in order to open URLs.
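
The fetch step is the part most likely to fail on a given host, so here is a minimal sketch of a wrapper that checks the setting first and returns null instead of false when nothing comes back. The name fetchHtml, the timeout default, and the example path are assumptions for illustration and are not part of the script that follows.

```php
<?php
// Hypothetical wrapper around file_get_contents():
// returns the fetched text, or null when the fetch is not possible.
function fetchHtml($url, $timeoutSeconds = 10)
{
    // Remote URLs only work when allow_url_fopen is switched on.
    $isRemote = (bool) preg_match('#\Ahttps?://#i', $url);
    $urlsAllowed = filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
    if ($isRemote && !$urlsAllowed) {
        return null;
    }

    // Options under the 'http' key apply to https:// URLs as well.
    $context = stream_context_create(array(
        'http' => array('timeout' => $timeoutSeconds),
    ));

    // @ hides the warning PHP raises on failure; the return value is checked instead.
    $html = @file_get_contents($url, false, $context);
    if ($html === false || $html === '') {
        return null;
    }
    return $html;
}

// Example with a path that should not exist, standing in for an unreachable page.
$html = fetchHtml('/path/that/does/not/exist.html');
echo $html === null ? "fetch failed\n" : "fetched\n";
```

A caller can then skip the DOM parsing entirely when the result is null, rather than handing loadHTML an empty value.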

codeeval.php:

<?php
$previous_value = libxml_use_internal_errors(TRUE);
/* encode the ID so it cannot alter the URL or the markup it is echoed into */
$codeeval = rawurlencode($_GET['codeeval'] ?? '');
$score_url = 'https://www.codeeval.com/public/' . $codeeval . '/';
$html = file_get_contents($score_url);
$classname = 'wrapper-rank';
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$results = $xpath->query("//*[@class='" . $classname . "']");

/* this function preserves the inner content of the scraped element.
** http://stackoverflow.com/questions/5349310/how-to-scrape-web-page-data-without-losing-tags
** So be sure to go and give that post an uptick too:)
**/
function innerHTML(DOMNode $node)
{
  $doc = new DOMDocument();
  foreach ($node->childNodes as $child) {
    $doc->appendChild($doc->importNode($child, true));
  }
  return $doc->saveHTML();
}
libxml_clear_errors();
libxml_use_internal_errors($previous_value);
?>
<a href="<?php echo $score_url; ?>" style="text-decoration:none;text-align:center;font-family:Arial Black;color:black;" target="_blank">
<div class="codeeval">
<img src="https://www.codeeval.com/site_media/images/logo-code-eval.png" alt="CodeEval" />
<h3>hacker ranking</h3>
<div class="wrapper-rank">
<?php
foreach ($results as $node) {
    $full_content = innerHTML($node);
    echo $full_content;
}
?>
</div>
</div>
</a>

Here is the CSS I used:

.codeeval img {
display: block;
margin-left: auto;
margin-right: auto;
background-color: white;
}
.codeeval h3 {
text-align: center;
color: #CC240A;
letter-spacing: 0.2em;
text-transform: uppercase;
margin: 0;
padding: 0;
}
.wrapper-rank {
background: none repeat scroll 0 0 #CC240A;
padding: 5px;
width: 258px;
height: 69px;
font-family: Arial;
font-weight: normal;
font-size: 12px;
}
.wrapper-rank .main-rank {
background: none repeat scroll 0 0 #BB2610;
clear: both;
overflow: hidden;
padding: 15px;
text-align: center;
width: 228px;
height: 39px;
}
.wrapper-rank .main-rank h4 {
color: white;
float: left;
font-size: 58px;
font-weight: normal;
margin: 0;
padding: 0;
text-align: center;
}
.wrapper-rank .main-rank span {
color: #FFFF00;
float: left;
font-size: 20px;
margin: 15px 0 0 5px;
text-align: left;
}
.wrapper-rank .main-rank span em {
color: #222222;
display: block;
font-style: normal;
font-size: 16px;
}

After the codeeval.php file is created, create this table in the Text widget:

<table>
   <tr style="vertical-align:middle;text-align:center;">
      <td id="codeeval" style="width:100%;vertical-align:top;text-align:center;">
      </td>
   </tr>
</table>

Lastly, you need to get the unique ID in the URL from your CodeEval public profile for use below. This jQuery statement will populate the table above with the scraped div.

(function($) {
$("#codeeval").load("/codeeval/codeeval.php?codeeval=<<your CodeEval ID>>");
})(jQuery);
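
Since the ID travels in the query string, codeeval.php receives whatever the caller chooses to send. Here is a minimal sketch of how the script might reject unexpected values before building the URL. The helper names and the allowed character set are assumptions for illustration; the real format of CodeEval IDs may differ.

```php
<?php
// Hypothetical helper: returns the value unchanged if it looks like a plain
// identifier, or null otherwise. The allowed characters are an assumption.
function cleanIdentifier($raw)
{
    if (!is_string($raw) || preg_match('/\A[A-Za-z0-9_-]{1,64}\z/', $raw) !== 1) {
        return null;
    }
    return $raw;
}

// Builds the public profile URL used in codeeval.php, or null for bad input.
function buildScoreUrl($raw)
{
    $id = cleanIdentifier($raw);
    if ($id === null) {
        return null;
    }
    return 'https://www.codeeval.com/public/' . rawurlencode($id) . '/';
}

// Literal values standing in for $_GET['codeeval'].
var_dump(buildScoreUrl('abc123'));
var_dump(buildScoreUrl('../admin'));
```

In the script itself, the call would be buildScoreUrl($_GET['codeeval'] ?? null), with a short error message returned when the result is null.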

For further reading about CodeEval and similar sites, read Thoughts on Professional Learning – Inspired by CodeEval & HackerRank.