Using PHP to Scrape the Report Card from a Code School Profile – Part 2

CodeSchool logo

Earlier this week, I described how to use PHP to scrape the report card from a Code School profile. Now, it must be displayed. I chose to display mine in the sidebar of my blog. To do this, jQuery and CSS will be your friends. It’s pretty simple, and this isn’t the only way to do it. However, in this implementation, it is important that the name of the querystring parameter used in the PHP script (in this case, “nick”) matches the one in the jQuery function call below. Likewise, the id attribute of the div element must also match the one in the jQuery statement.

(Update: the CSS code below should be updated in accordance with this change to prevent bullets from appearing between the badges in the “Master Status” section.)

<style>
#codeschool {
   text-align: center;
   vertical-align: middle;
}

.badge-img {
   display:block !important;
   margin-left: auto;
   margin-right: auto;
}

.pr-avatar {
   display:block;
   margin-left: auto;
   margin-right: auto;
   margin-bottom: 10px;
}
</style>
<div id="codeschool">
</div>
<br />
<script>
(function($) {
$("#codeschool").load("/codeschool/codeschool.php?nick=DeepInTheCode");
})(jQuery);
</script>

Well, that’s it. If you debug the client-side code on both your page and the Code School profile page, you’ll see that there are path elements in the Code School script that cause the partial opacity on uncompleted paths. This is presumably done with other JavaScripts and CSS on the Code School site. I haven’t tried bringing that functionality here as yet. Perhaps for another time…

Using PHP to Scrape the Report Card from a Code School Profile – Part 1

CodeSchool logo

I have been manually placing my Master badges from Code School onto my blog which, being a WordPress blog, runs on PHP. I don’t have the script running that shows the fraction completed on Paths that I haven’t yet mastered, but I can get the ones that I’ve completed.

The PHP script I’ve built so far is below. Due to some CSS I want to change, I haven’t implemented it yet. But I can say that it does indeed scrape the page. The jQuery required to display it in the sidebar I’ll share once I get the CSS issues worked out. This is very similar to the code I used to get the CodeEval profile, but it’s been refactored and modified for use with the Code School page.

<?php
    function getClass($classname, $htmltext)
    {
        $dom = new DOMDocument;
        $dom->loadHTML($htmltext);
        $xpath = new DOMXPath($dom);
        $results = $xpath->query("//*[@class='" . $classname . "']");
        return $results;
    }

    function buildContent($results)
    {
        $content = "";
        foreach ($results as $node) {
            $partial_content = innerHTML($node);
            $content = $content . $partial_content;
        }
        return $content;
    }

    /* this function preserves the inner content of the scraped element.
    ** http://stackoverflow.com/questions/5349310/how-to-scrape-web-page-data-without-losing-tags
    ** So be sure to go and give that post an uptick too 🙂
    **/
    function innerHTML(DOMNode $node)
    {
      $doc = new DOMDocument();
      foreach ($node->childNodes as $child) {
        $doc->appendChild($doc->importNode($child, true));
      }
      return $doc->saveHTML();
    }

    $previous_value = libxml_use_internal_errors(TRUE);
    $profilename = $_GET['nick'];
    $profile_url =  'https://www.codeschool.com/users/' . $profilename . '/';
    $context = stream_context_create(array(
        'https' => array('ignore_errors' => true),
    ));
    $html = file_get_contents($profile_url, false, $context);  

    $class = 'bucket';
    $resultsBucket = getClass($class,$html);

    $class = 'mbl tac';
    $resultsMaster = getClass($class,$html);

    $class = 'pr-pathStatus';
    $resultsPath = getClass($class,$html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous_value);
?>

<a href="<?php echo $profile_url; ?>" target="_blank">
    <div class="wrapper-scores">
        <?php
        $full_content = "";

        $full_content = $full_content . buildContent($resultsBucket);
        $full_content = $full_content . buildContent($resultsMaster);
        $full_content = $full_content . buildContent($resultsPath);

        /* changing h2 tags to h1 tags and inserting line breaks */
        $full_content = str_replace("<h2","
<h1",$full_content);
        $full_content = str_replace("</h2>","</h1>
",$full_content);

	/* disabling the anchor tags on each badge by changing to divs */
        $full_content = str_replace("<a rel=\"tooltip\" ","<div ",$full_content);
        $full_content = str_replace("href=\"/learn","data-href=\"http://codeschool.com/learn",$full_content);
	$full_content = str_replace("</a>","</div>",$full_content);

        /* changing text on heading of Path Status */
        $full_content = str_replace("Path Status","Paths In Progress",$full_content);

        /* return the html */
        echo $full_content;
        ?>
    </div>
</a>

PHP code updated on 2017.12.28.

Accessing the Google Foobar Challenges on Chrome

Google Foobar challenge prompt

Last night I was Googling Python lambda functions and a strange thing happened. The Google search results window broke open near the top and a single line of white text appeared on a black background asking if I wanted to take a test, next to a link to google.com/foobar.

Google Foobar challenge prompt

After clicking on the link, I was able to login and got a shell prompt.

For some reason, some letters could be typed, but others had no effect. Initially, I was using Chrome as my browser. On Safari, I found that I could type anything there.

Ironically, Chrome’s malware defenses affect the functionality of this page. If you want to use Chrome to do the Foobar challenges, you must uncheck the “Enable phishing and malware protection” under Advanced Settings in Chrome. Some other ad blocking or malware-related extensions may also need to be turned off if this doesn’t fix the problem.