Using PHP to Scrape the Report Card from a Code School Profile – Part 1


I have been manually placing my Master badges from Code School on my blog, which, being a WordPress blog, runs on PHP. I don’t yet have a script running that shows the fraction completed on Paths I haven’t mastered, but I can get the ones that I’ve completed.

The PHP script I’ve built so far is below. I haven’t put it on the blog yet because there is some CSS I want to change, but it does scrape the page. I’ll share the jQuery that displays it in the sidebar once I get the CSS issues worked out. The code is very similar to what I used to get the CodeEval profile, refactored and modified for the Code School page.

<?php
    function getClass($classname, $htmltext)
    {
        $dom = new DOMDocument;
        $dom->loadHTML($htmltext);
        $xpath = new DOMXPath($dom);
        $results = $xpath->query("//*[@class='" . $classname . "']");
        return $results;
    }

    function buildContent($results)
    {
        $content = "";
        foreach ($results as $node) {
            $partial_content = innerHTML($node);
            $content = $content . $partial_content;
        }
        return $content;
    }

    /* this function preserves the inner content of the scraped element.
    ** http://stackoverflow.com/questions/5349310/how-to-scrape-web-page-data-without-losing-tags
    ** So be sure to go and give that post an uptick too 🙂
    **/
    function innerHTML(DOMNode $node)
    {
        $doc = new DOMDocument();
        foreach ($node->childNodes as $child) {
            $doc->appendChild($doc->importNode($child, true));
        }
        return $doc->saveHTML();
    }

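    $previous_value = libxml_use_internal_errors(true);
    /* the nick comes from the query string, so encode it before building the URL */
    $profilename = isset($_GET['nick']) ? rawurlencode($_GET['nick']) : '';
    $profile_url = 'https://www.codeschool.com/users/' . $profilename . '/';
    /* HTTP context options go under the 'http' key, even for https:// URLs */
    $context = stream_context_create(array(
        'http' => array('ignore_errors' => true),
    ));
    $html = file_get_contents($profile_url, false, $context);
    if ($html === false || $html === '') {
        /* fall back to an empty document so the queries below simply find nothing */
        $html = '<html><body></body></html>';
    }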

    $class = 'bucket';
    $resultsBucket = getClass($class,$html);

    $class = 'mbl tac';
    $resultsMaster = getClass($class,$html);

    $class = 'pr-pathStatus';
    $resultsPath = getClass($class,$html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous_value);
?>

<a href="<?php echo htmlspecialchars($profile_url, ENT_QUOTES, 'UTF-8'); ?>" target="_blank">
    <div class="wrapper-scores">
        <?php
        $full_content = "";

        $full_content = $full_content . buildContent($resultsBucket);
        $full_content = $full_content . buildContent($resultsMaster);
        $full_content = $full_content . buildContent($resultsPath);

        /* changing h2 tags to h1 tags and inserting line breaks */
        $full_content = str_replace("<h2","\n<h1",$full_content);
        $full_content = str_replace("</h2>","</h1>\n",$full_content);

        /* disabling the anchor tags on each badge by changing to divs */
        $full_content = str_replace("<a rel=\"tooltip\" ","<div ",$full_content);
        $full_content = str_replace("href=\"/learn","data-href=\"http://codeschool.com/learn",$full_content);
        $full_content = str_replace("</a>","</div>",$full_content);

        /* changing text on heading of Path Status */
        $full_content = str_replace("Path Status","Paths In Progress",$full_content);

        /* return the html */
        echo $full_content;
        ?>
    </div>
</a>

PHP code updated on 2017.12.28.
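
One thing to watch with getClass: the XPath query compares the whole class attribute, so 'bucket' only matches elements whose class is exactly "bucket", and 'mbl tac' only matches that exact string in that order. If the profile page ever adds another class to those elements, the query will quietly return nothing. Below is a minimal sketch of a token-based match instead. The helper name getClassToken and the sample markup are made up for illustration and are not part of the script above. It also assumes the class name contains no single quotes, since it is concatenated straight into the XPath expression.

```php
<?php
// Hypothetical helper (not part of the original script): returns the elements
// whose class attribute contains $classname as a whole, space-separated token.
function getClassToken($classname, $htmltext)
{
    // Silence libxml warnings about imperfect HTML while parsing.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($htmltext);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Pad the attribute with spaces so 'bucket' does not match 'bucketlist'.
    $query = "//*[contains(concat(' ', normalize-space(@class), ' '), ' "
           . $classname . " ')]";
    return $xpath->query($query);
}

// Made-up sample markup, for illustration only.
$sample = '<div class="bucket wide"><p>one</p></div>'
        . '<div class="bucketlist"><p>two</p></div>'
        . '<div class="bucket"><p>three</p></div>';

$matches = getClassToken('bucket', $sample);
$count = $matches->length;
echo $count . " matching element(s)\n";
```

With the sample markup, the first and third divs match and the "bucketlist" div does not. The returned DOMNodeList can be passed to buildContent in the same way as the result of getClass.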

Code School Will Be Having a Black Friday Special!


I have found Code School to be a great resource both for those just learning to code and for experienced coders who want to learn a new programming language. It has multiple paths: Ruby (including Ruby on Rails), JavaScript (including jQuery and CoffeeScript), HTML/CSS, iOS, Git, and an Electives path for miscellaneous technologies such as R and Chrome DevTools.

The normal price for a subscription at Code School is $29 per month, or $290 per year. However, it appears that they intend to run a Black Friday special for yearly subscriptions. The price will be revealed this Friday. If you don’t already have a subscription, or your subscription is about to run out, this could be a good time to sign up!

Unfortunately, Code School is still lacking Python, though you can learn it at Codecademy (which is free).