≡ Menu

Using MAX() in a SQL Subquery

One pro bono project I’m working on is improving a school website where parents can sign up for classes, view students’ grades, etc. One of the problems with the website was that the list of user accounts, which includes all parents and teachers, includes parents from previous years who no longer have students there.

The query for pulling this information was very simple:

SELECT * FROM UserAccounts ORDER BY lastName ASC

This query pulled all users and ordered them only by last name. The data was being dumped into an ASP.NET GridView with column sorting, and there were only a couple hundred people there, so it wasn’t completely unmanageable. However, dealing with all the parents who no longer have kids there did make visual searches more difficult. The request was to bring all current parents to the top of the list so that other admins don’t have to search through pages of people who no longer attend.

A quick fix for this would be to sort the list based on last academic year attended. We’ll do this by employing the MAX() function.

In addition to the UserAccounts table, there are other tables called “students” (which includes all current and past students) and “schedule” (which holds student schedules). In the schedule table, a column called “s_year” that holds the academic year (in the format of “2017-18”). We can join these tables based on user account IDs in the UserAccount table, the parentID column in the students table, and the student id (s_id) column in the schedule table.

One more thing: Since this list also includes people who have never had students there, the revised query will have to take that into account.

database diagram showing table relationships

The below query will not only solve this problem, but will also sort by first and last name. Users with no academic year (a NULL value) will be at the bottom of the list.

SELECT DISTINCT ua.*, sc.s_year FROM UserAccounts ua 
LEFT OUTER JOIN students s ON s.parentID = ua.ID 
LEFT OUTER JOIN schedule sc ON sc.s_id = s.ID 
WHERE sc.s_year IS NULL 
OR sc.s_year = (
   SELECT MAX(sc2.s_year) FROM schedule sc2 WHERE sc2.s_id = s.ID
) 
ORDER BY sc.s_year DESC, ua.lastName, ua.firstName

The only remaining thing to do is to add the year column to the GridView and make sure that sorting is enabled.

Using PHP to Scrape the Report Card from a DataCamp Profile

UPDATE: This does not currently work, as DataCamp has changed the structure of profile pages. I will revise this as soon as is feasible. (7/26/17)

Just as I have written scripts for displaying report cards for Code School, CodeEval, and Duolingo on my blog, I have written the below script for displaying a DataCamp profile.

DataCamp logo

datacamp.php:

<style>
#datacamp {
   border: 1px solid blue;
   text-align: center;
   vertical-align: middle;
   width: 100%;
}

#datacamp li {
   list-style-type: none;
}

.image-centered {
   display:block;
   margin-left: auto;
   margin-right: auto;
   margin-bottom: 10px;
}

.image-rounded {
    border-radius: 50%;
}

.progress-bar {
    position: relative;
    border: 1px solid #33aacc;
    width: 100%;
    height: 18px;
    margin-bottom: 1rem;
}

.progress-bar .inner {
   position: absolute;
   left: 0;
   top: 0;
   bottom: 0;
   background-color: #33aacc;
   min-width: 5px;
}

.wrapper-scores .container {
   min-width: 150px;
   max-width: 260px;
   width: 100%;
}

.course-block__completed, .course-block__certificate-download, .btn-linkedin-share,
.course-block__description, .course-block__author {
   display: none;
}

.col-sm-4 img {
   display: inline;
   padding: 1px;
}
</style>
<?php
    date_default_timezone_set('America/Los_Angeles');
    
    function getClass($classname, $htmltext)
    {
        $dom = new DOMDocument;
        $dom->loadHTML($htmltext);
        $xpath = new DOMXPath($dom);
        $results = $xpath->query("//*[@class='" . $classname . "']");
        return $results;
    }
    
    
    function buildContent($results)
    {
        $content = "";
        foreach ($results as $node) {
            $partial_content = innerHTML($node);
            $content = $content . $partial_content;
        }
        return $content;
    }
    
    
    /* this function preserves the inner content of the scraped element. 
    ** http://stackoverflow.com/questions/5349310/how-to-scrape-web-page-data-without-losing-tags
    ** So be sure to go and give that post an uptick too 🙂
    **/
    function innerHTML(DOMNode $node)
    {
      $doc = new DOMDocument();
      foreach ($node->childNodes as $child) {
        $doc->appendChild($doc->importNode($child, true));
      }
      return $doc->saveHTML();
    }
    
    $profilename = $_GET['nick'];
    if (strlen($profilename) == 0)
	exit(1);
    $profile_url =  'https://www.datacamp.com/profile/' . $profilename . '/';
    $filename = "datacamp_" . $profilename . ".txt";
    $full_content = '';   
    $norefresh = FALSE;
    $days = 1;
    $updated = 'no date';
    
    /* checks to see if file exists and is current */
    if (file_exists($filename)) {
	    $stats = stat($filename);
	    /* 86400 seconds in one day */
	    if ($stats[9] > (time() - (86400 * $days))) {
	    	$norefresh = TRUE;
	    	$updated = date("Y-m-d H:i:s", $stats[9]);
	    }
    }
    
    /* if $norefresh is still FALSE, file will be created or updated; otherwise, it will be loaded */
    if ($norefresh) {
    	$full_content = file_get_contents($filename);
    } else {         
	$previous_value = libxml_use_internal_errors(TRUE);
	$context = stream_context_create(array(
	'https' => array('ignore_errors' => true),
	));
	$html = file_get_contents($profile_url, false, $context);  
	
	$class = 'profile-page';
	$resultsBucket = getClass($class,$html);
	

	libxml_clear_errors();
	libxml_use_internal_errors($previous_value);
	        
	$full_content = $full_content . buildContent($resultsBucket);
	
	/* making sure correct path exists on images */
	$full_content = str_replace("src=\"/","src=\"http://datacamp.com/",$full_content);
	
	/* changing h2 tags to h1 tags and inserting line breaks */
	$full_content = str_replace("<h2","<br /><h1",$full_content);
	$full_content = str_replace("</h2>","</h1><br />",$full_content);
	
	/* disabling the anchor tags on each badge by changing to divs */
	$full_content = str_replace("<a href","<div class",$full_content);
	$full_content = str_replace("<a class","<div class",$full_content);
	$full_content = str_replace("</a>","</div>",$full_content);
	
	/* adding line breaks */
	$full_content = str_replace("<div class=\"stats\">","<br /><div class=\"stats\">",$full_content);
	$full_content = str_replace("Earned</span>","Earned</span><br />",$full_content);
	$full_content = str_replace("Completed</span>","Completed</span><br />",$full_content);
	$full_content = str_replace("Aced</span>","Aced</span><br /><br />",$full_content);
		
	file_put_contents($filename,$full_content);
	$updated = date("Y-m-d H:i:s");
    }
?>
<a href="<?php echo $profile_url; ?>" target="_blank">
	<div class="wrapper-scores">
	<!-- <?php echo "Last updated: $updated" ?> -->
	<?php
	    	/* return the html */
		echo $full_content;
	?>
	</div>
</a>

If you wish to display this in a WordPress widget, create a Text widget and add this code, replacing “NICKNAME” with a DataCamp username.

<div id="datacamp"></div>
<script>
(function($) {
$("#datacamp").load("/datacamp/datacamp.php?nick=NICKNAME");
})(jQuery);
</script>

Since my primary reason for writing these has been to populate widgets on this WordPress blog, at some point I’ll probably incorporate these into WP plugins.

Java VM Issues While Installing Eclipse Neon on macOS Sierra

When installing the Eclipse IDE, one must be running a version of the Java Virtual Machine (JVM) that is compatible with the version of Eclipse that is being installed.

I was trying to install Eclipse Neon on macOS Sierra, which is the most recent major release. Sierra ships with Java, but not the version which is required by Neon, which is 1.8.

As a result, upon trying to install Eclipse Neon, you will likely see a box that looks like this (except it will say 1.8 rather than 1.7):

unsuitable JVM

Though there are multiple ways to upgrade Java on the Mac, it seems getting it from Oracle is the only one that works from what I’ve seen.

To upgrade the JVM on macOS to 1.8, go to the Oracle Java SE Runtime Environment downloads page and get the file for Mac; I prefer getting the dmg package file:

Java SE downloads page

Install the Java SE package like any other Mac application and Eclipse can then be installed.

Connecting to Oracle 12g with the Instant Client and Excel VBA

One rather old application I’ve supported for several years loads data from Excel spreadsheets into a reporting database. These Excel spreadsheets have always been manually updated by several users. However, because the data that the users are entering into these spreadsheets are first being entered into another separate application database, these users have been doing double-entry – a redundant process which can be easily remedied by various means.

Ideally, the solution for this problem would be to extract the data from the application database and load it into the reporting database using an SSIS package. Unfortunately, that would require some redevelopment of the application which loads data into the reporting database, and we (and the customers) have no bandwidth for that. So I came up with a quick workaround that made everyone happy – using a VBA macro to automatically populate the spreadsheets with data when the users open them.

The tricky part here was getting Excel to connect to Oracle with the least amount of work having to be done on the users’ PCs, which are remote from my location.

First of all, since these users don’t require SQL Plus or any development tools for Oracle, the full client software was unnecessary. Also, the Oracle Instant Client could be pushed to the users with Altiris Deployment Solution.

I had the Instant Client software installed on the PCs, then I added the requisite database connection information to the tnsnames.ora file.

Nota bene: In the Instant Client (or at least in our setup, using version 11.2.0.4) the tnsnames file is in
C:\oracle\instantclient_11_2_0_4 rather than in C:\oracle\product\11.2.0.4\client_1\NETWORK\ADMIN as it often would be in the full Oracle client.

The connection in VBA was simple enough, but not immediately obvious – notice that the connection string includes “Microsoft ODBC Driver for Oracle” rather than an Oracle driver; even though this is used, no ODBC connection needs to be set up in the ODBC Data Source Administrator. It is only imperative that the proper entries exist in the tnsnames.ora file, and that the Microsoft ActiveX Data Object Library is installed and referenced in Excel. (Add References by navigating to Tools –> References in the VBA editor in Excel.)

Excel References

In a subroutine, this code was used to connect to the database and pull data.

    Dim SQL_String As String
    Dim dbConnectStr As String
    Dim con As New ADODB.Connection
    Dim recset As New ADODB.Recordset
    Dim strUid As String
    Dim strPwd As String
    Dim strEnv As String
    Dim strDSN As String
    Dim iRow As Integer    
    

    strEnv = "prod"
    strUid = "username"
    strPwd = "password"

    If strEnv = "prod" Then
        strDSN = "(prod database net_service_name* from tnsnames)"
    Else
        strDSN = "(dev database net_service_name* from tnsnames)"
    End If
          
    dbConnectStr = "Driver={Microsoft ODBC for Oracle}; " & _
            "Server=" & strDSN & ";" & _
            "uid=" & strUid & ";pwd=" & strPwd & ";"
       
    con.ConnectionString = dbConnectStr    
    con.Open   
    
    SQL_String = "(insert SQL query here)"
           
    recset.Open SQL_String, con

    iRow = 0 
    Do While Not recset.EOF
         'Have a loop here to go through all the fields
        Sheet1.Range("A" & iRow).Value = recset.Fields("colname") ' colname = Column Name from SQL query
        
        ' &c. ...

        iRow = iRow + 1
        recset.MoveNext
    Loop

    recset.Close
    con.Close

* net_service_name