End of Another Year! … and a Minor Update to the Code School Profile Scraper

It’s hard to believe 2016 is already drawing to a close. It seems like just yesterday that I was writing about using PHP to search through my source code!

Though I wouldn’t call this an intractable problem, I did notice something annoying when looking at my Code School profile on the sidebar of this site.

[Image: Code School profile with LI bullets]

Between the badges in the “Master Status” section, white dots had appeared! Upon inspecting them, I saw that they were the bullets on the list items that held the badges.

One answer on Stack Overflow suggested that the CSS for the unordered list that holds the list items should include “list-style-type: none;”, but that seemed to have no effect.

After playing with the CSS a bit, I discovered that setting that property on the li tag instead fixed the problem.

Here is the corrected CSS, which updates the code from a past post:

<style>
#codeschool {
   border: 1px solid blue;
   text-align: center;
   vertical-align: middle;
}

#codeschool li {
   list-style-type: none;
}

.badge-img {
   display:block !important;
   margin-left: auto;
   margin-right: auto;
}

.pr-avatar {
   display:block;
   margin-left: auto;
   margin-right: auto;
   margin-bottom: 10px;
}
</style>

With this minor change in place, the bullets disappeared, and the profile looks as it did originally.

Have a Merry Christmas and a Happy New Year!

Using PHP to Scrape the Report Card from a Code School Profile – Part 2

Earlier this week, I described how to use PHP to scrape the report card from a Code School profile. Now, it must be displayed. I chose to display mine in the sidebar of my blog. To do this, jQuery and CSS will be your friends. It’s pretty simple, and this isn’t the only way to do it. However, in this implementation, it is important that the name of the querystring parameter used in the PHP script (in this case, “nick”) matches the one in the jQuery function call below. Likewise, the id attribute of the div element must also match the one in the jQuery statement.

(Update: the CSS code below should be updated in accordance with this change to prevent bullets from appearing between the badges in the “Master Status” section.)

<style>
#codeschool {
   text-align: center;
   vertical-align: middle;
}

.badge-img {
   display:block !important;
   margin-left: auto;
   margin-right: auto;
}

.pr-avatar {
   display:block;
   margin-left: auto;
   margin-right: auto;
   margin-bottom: 10px;
}
</style>
<div id="codeschool">
</div>
<br />
<script>
(function($) {
$("#codeschool").load("/codeschool/codeschool.php?nick=DeepInTheCode");
})(jQuery);
</script>
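
One way to keep the querystring parameter name from drifting out of sync with the PHP script is to build the URL in a single place. The following is only a sketch: the names SCRAPER_PATH, PARAM_NAME, and buildProfileUrl are hypothetical and do not appear in the code above, and it assumes the PHP script still reads the “nick” parameter.

```javascript
// Hypothetical names: SCRAPER_PATH, PARAM_NAME and buildProfileUrl are
// illustrative only and are not part of the original sidebar code.
const SCRAPER_PATH = "/codeschool/codeschool.php";
const PARAM_NAME = "nick"; // assumed to match the key the PHP script reads from $_GET

function buildProfileUrl(nick) {
  // encodeURIComponent keeps spaces, ampersands, etc. from breaking the query string
  return SCRAPER_PATH + "?" + PARAM_NAME + "=" + encodeURIComponent(nick);
}

// Only touch the page when jQuery is actually present (it is not under Node, for example).
if (typeof jQuery !== "undefined") {
  // "#codeschool" must match the id attribute of the target div
  jQuery("#codeschool").load(buildProfileUrl("DeepInTheCode"));
}
```

With this arrangement, renaming the parameter means changing PARAM_NAME here and the matching key in the PHP script, and nothing else.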

Well, that’s it. If you debug the client-side code on both your page and the Code School profile page, you’ll see that there are path elements in the Code School script that cause the partial opacity on uncompleted paths. This is presumably done with other JavaScript and CSS on the Code School site. I haven’t tried bringing that functionality here yet. Perhaps another time…

Using PHP to Scrape the Report Card from a Code School Profile – Part 1

I have been manually placing my Master badges from Code School onto my blog which, being a WordPress blog, runs on PHP. I don’t have the script running that shows the fraction completed on Paths that I haven’t yet mastered, but I can get the ones that I’ve completed.

The PHP script I’ve built so far is below. Due to some CSS I want to change, I haven’t implemented it yet. But I can say that it does indeed scrape the page. The jQuery required to display it in the sidebar I’ll share once I get the CSS issues worked out. This is very similar to the code I used to get the CodeEval profile, but it’s been refactored and modified for use with the Code School page.

<?php
    function getClass($classname, $htmltext)
    {
        $dom = new DOMDocument;
        $dom->loadHTML($htmltext);
        $xpath = new DOMXPath($dom);
        $results = $xpath->query("//*[@class='" . $classname . "']");
        return $results;
    }

    function buildContent($results)
    {
        $content = "";
        foreach ($results as $node) {
            $partial_content = innerHTML($node);
            $content = $content . $partial_content;
        }
        return $content;
    }

    /* this function preserves the inner content of the scraped element.
    ** http://stackoverflow.com/questions/5349310/how-to-scrape-web-page-data-without-losing-tags
    ** So be sure to go and give that post an uptick too 🙂
    **/
    function innerHTML(DOMNode $node)
    {
      $doc = new DOMDocument();
      foreach ($node->childNodes as $child) {
        $doc->appendChild($doc->importNode($child, true));
      }
      return $doc->saveHTML();
    }

    $previous_value = libxml_use_internal_errors(TRUE);
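    /* encode the nickname so it cannot alter the URL or inject markup */
    $profilename = rawurlencode(isset($_GET['nick']) ? $_GET['nick'] : '');
    $profile_url =  'https://www.codeschool.com/users/' . $profilename . '/';
    /* HTTP context options go under the 'http' key, even for https:// URLs */
    $context = stream_context_create(array(
        'http' => array('ignore_errors' => true),
    ));
    $html = file_get_contents($profile_url, false, $context);
    if ($html === false || $html === '') {
        exit; /* nothing to scrape if the request failed */
    }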

    $class = 'bucket';
    $resultsBucket = getClass($class,$html);

    $class = 'mbl tac';
    $resultsMaster = getClass($class,$html);

    $class = 'pr-pathStatus';
    $resultsPath = getClass($class,$html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous_value);
?>

<a href="<?php echo $profile_url; ?>" target="_blank">
    <div class="wrapper-scores">
        <?php
        $full_content = "";

        $full_content = $full_content . buildContent($resultsBucket);
        $full_content = $full_content . buildContent($resultsMaster);
        $full_content = $full_content . buildContent($resultsPath);

        /* changing h2 tags to h1 tags and inserting line breaks */
        $full_content = str_replace("<h2","\n<h1",$full_content);
        $full_content = str_replace("</h2>","</h1>\n",$full_content);

        /* disabling the anchor tags on each badge by changing to divs */
        $full_content = str_replace("<a rel=\"tooltip\" ","<div ",$full_content);
        $full_content = str_replace("href=\"/learn","data-href=\"http://codeschool.com/learn",$full_content);
        $full_content = str_replace("</a>","</div>",$full_content);

        /* changing text on heading of Path Status */
        $full_content = str_replace("Path Status","Paths In Progress",$full_content);

        /* return the html */
        echo $full_content;
        ?>
    </div>
</a>

PHP code updated on 2017.12.28.
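
To see what the string replacements in the script do without requesting the live page, they can be tried on a small sample. The sketch below mirrors the PHP str_replace calls in JavaScript, leaving out the inserted line breaks. The function names and the sample markup are hypothetical; the real Code School markup may differ.

```javascript
// Hypothetical helper: replaces every literal occurrence, like PHP's str_replace.
function replaceAllLiteral(text, search, replacement) {
  return text.split(search).join(replacement);
}

// Hypothetical function mirroring the replacements in the PHP script above.
function rewriteBadgeMarkup(html) {
  let out = html;
  // h2 headings become h1 headings
  out = replaceAllLiteral(out, "<h2", "<h1");
  out = replaceAllLiteral(out, "</h2>", "</h1>");
  // badge anchors become inert divs; the link target moves to data-href
  out = replaceAllLiteral(out, '<a rel="tooltip" ', "<div ");
  out = replaceAllLiteral(out, 'href="/learn', 'data-href="http://codeschool.com/learn');
  out = replaceAllLiteral(out, "</a>", "</div>");
  // heading text of the Path Status section
  out = replaceAllLiteral(out, "Path Status", "Paths In Progress");
  return out;
}

// Made-up sample input, not actual Code School markup.
const sample = '<h2>Path Status</h2><a rel="tooltip" href="/learn/php">PHP</a>';
console.log(rewriteBadgeMarkup(sample));
// prints: <h1>Paths In Progress</h1><div data-href="http://codeschool.com/learn/php">PHP</div>
```

Note that the order matters: the anchor opening tag is rewritten before the href attribute, so the badge links end up as divs carrying a data-href rather than as clickable links nested inside the outer profile link.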