Which is More Important: Newest Data, or Fastest Load Time?

Happy New Year!

As we developers often do, I have recently looked at some code that had aged a bit – more like vinegar and less like wine – and thought, “Did I write that? What was I thinking?” Constantly becoming aware of and implementing best practices, avoiding bad practices, and recovering from failures are all part of growing in many careers, and most certainly must be a part of getting better at software development.

Many of my posts have dealt with Web scraping and using APIs to display profile data from various sites. One question that I’ve never answered is whether or not the data needs to be real-time, or if it can be delayed a bit. The benefits of real-time data are obvious, but the drawbacks can be considerable.

For example, until recently this blog retrieved profile data from several sites on every page load. This made the page load slowly (aside from my other performance issues), and had my traffic been very heavy, those sites might well have stopped allowing me to query them.

Since it’s far from critical that this profile data be less than a day old, and since I would like my blog to load as quickly as possible, I decided to refactor and then enhance the PHP code I had written to pull this profile data. The new code would store the profile data on my site, and reload the data only if it was older than a given number of days.

Figure: YSlow Chrome Extension. Use YSlow to diagnose performance issues on your website.

This data could be stored either as text in a file, or as a value in a database. In this case, I stored it in a text file.

/* initialize variables */
$filename = "whatever.txt";
$html = '';
$norefresh = FALSE;
$days = 1;

/* check whether the file exists and is current */
if (file_exists($filename)) {
    /* filemtime() returns the last-modified time; 86400 seconds in one day */
    if (filemtime($filename) > (time() - (86400 * $days))) {
        $norefresh = TRUE;
    }
}

/* if $norefresh is still FALSE, file will be created or updated; otherwise, it will be loaded */
if ($norefresh) {
    $html = file_get_contents($filename);
} else {
    /* do whatever needs to be done to build the $html variable */
    // ...
    // ...
    // ...

    /* put the $html value into the file; LOCK_EX keeps concurrent requests from interleaving writes */
    file_put_contents($filename, $html, LOCK_EX);
}

/* display the $html variable contents */
echo $html;

The above code will check to see if a file with the expected data exists, and if so, whether it is new enough – in this case, less than one day old. If not, the data is retrieved and stored in a file for future use. Lastly, the data – whether cached or newly retrieved – is displayed.
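If the same pattern is needed for more than one block of profile data, it could be wrapped in a function. What follows is only a rough sketch, not the code this blog runs: the name cached_fragment, its parameters, and the idea of falling back to a stale copy when the rebuild comes back empty are all my own assumptions, and it assumes PHP 7 or later for the type declarations.

```php
<?php
/**
 * Hypothetical helper: return the cached content of $filename if it is
 * newer than $maxAgeSeconds; otherwise call $build, store its result,
 * and return that instead.
 */
function cached_fragment(string $filename, int $maxAgeSeconds, callable $build): string
{
    /* PHP caches stat() results, so clear them before checking the file */
    clearstatcache(true, $filename);
    $exists = is_file($filename);

    /* fresh cache: serve it without querying the other sites */
    if ($exists && filemtime($filename) > time() - $maxAgeSeconds) {
        $cached = file_get_contents($filename);
        if ($cached !== false) {
            return $cached;
        }
    }

    /* missing or stale cache: rebuild the content */
    $html = (string) $build();

    if ($html !== '') {
        /* LOCK_EX keeps concurrent requests from interleaving writes */
        file_put_contents($filename, $html, LOCK_EX);
        return $html;
    }

    /* the rebuild produced nothing: fall back to the stale copy, if any */
    if ($exists) {
        $stale = file_get_contents($filename);
        if ($stale !== false) {
            return $stale;
        }
    }

    return '';
}
```

A call might look like echo cached_fragment("whatever.txt", 86400, $build); where $build is whatever function does the scraping or API work. Returning the stale copy when the rebuild yields an empty string is a design choice on my part: it trades newest data for a page that still shows something when a remote site is down.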
