Use “IS NULL” Rather Than “= NULL” When Upgrading to Newer Versions of SQL Server

Microsoft SQL Server logo

I’ve been upgrading an old Sybase PowerBuilder 6.5 app to a much newer version, SAP PowerBuilder 12.6.

During this little adventure, I’ve had to update much of the hard-coded SQL code that was originally written for SQL Server 4.2, and was later upgraded to SQL Server 2000.

This application’s database had been running on SQL Server 2008 since it was still compatible with SQL Server 2000 via the code below:

USE master
GO
ALTER DATABASE appdatabase
SET COMPATIBILITY_LEVEL = 80

Since we’re about to hit Microsoft’s End of Life for any version of SQL Server that will allow compatibility with SQL Server 2000, any code which is not obsolete must be rewritten.

I knew that using “*=” and “=*” instead of LEFT JOIN and RIGHT JOIN would have to change. What I had never seen before is that using “= NULL” rather than “IS NULL” in a WHERE clause would no longer work. Why that is makes sense; it’s just that older versions of SQL were more forgiving for straying from the ANSI standard.

If you have an older program and you upgrade the SQL Server database and all of a sudden you get “no rows returned” on queries that should be returning data, this could be your problem.

Comparing Multiple Patterns with LIKE Operator

pic of Facebook Like button

I really like the LIKE operator! (See what I did there?)

I use it on a daily basis. It does have one frustrating shortcoming, however. Unlike the IN operator (where a list is used for comparison), you can only compare one pattern with each LIKE statement.

If you have only a few patterns, this isn’t too much of a problem.

SELECT * FROM tbl_ThatTableOverThere
WHERE ThisColumn LIKE '%pattern1%'
OR ThisColumn LIKE '%pattern2%'
OR ThisColumn LIKE '%pattern3%'
-- [...]
OR ThisColumn LIKE '%patternX%';

But what if X is a very large number? Hundreds? Thousands? Maybe even millions?

I wrote the code below to allow for multiple patterns to be compared. The patterns are inserted into a table variable (@patterns) and are selected using a fast forward cursor. Unless the number of patterns is truly astronomical, this should suffice.

USE DatabaseName
GO
SET NOCOUNT ON
DECLARE @random INT, @upper INT, @lower INT, @randomTable NVARCHAR(20)
DECLARE @SQL NVARCHAR(MAX)
DECLARE @whereClause NVARCHAR(MAX)
DECLARE @sourceTable NVARCHAR(255)
DECLARE @columnName NVARCHAR(255)
DECLARE @pattern NVARCHAR(255)
DECLARE @patterns TABLE (pattern NVARCHAR(255) NOT NULL)
DECLARE @deleteTempTable NVARCHAR(MAX)

/* These variables should remain unchanged unless the name of the
temporary table needs to have different text or range of random numbers
for the suffix of the temp table.
[Reference: https://stackoverflow.com/questions/14756780/creating-a-temporary-table-name-with-a-randomly-generated-number] */
SET @lower = 1
SET @upper = 9999999
SELECT @random = ROUND(((@upper - @lower) * RAND() + @lower),0)
SELECT @randomTable = '##search'+CAST(@random AS NVARCHAR(20))+''

-- Name of source table holding data for comparison
SET @sourceTable = 'tbl_Name'

-- Column that is being compared with patterns using LIKE operator
SET @columnName = 'ColumnName'

-- Additional WHERE clause may be specified here
SET @whereClause = '' -- e.g., 'AND AnyColumnName LIKE ''%b%''' 

-- Insert patterns into table variable below
INSERT INTO @patterns VALUES
('%pattern1%')
-- ,('%pattern2%')
-- ,(%pattern3%) and so on

-- Deletes temp table if it already exists
SET @deleteTempTable = 'IF object_id(''tempdb..' + @randomTable + ''') IS NOT NULL DROP TABLE ' + @randomTable
EXEC(@deleteTempTable)

-- This line creates the temporary table using the same schema as the source table without inserting data.
-- The UNION ALL operator prevents the IDENTITY property from being applied to any of the columns in the temp table.
-- [Reference: https://dba.stackexchange.com/questions/916/how-do-i-copy-a-table-with-select-into-but-ignore-the-identity-property]
EXEC('SELECT TOP 0 * INTO ' + @randomTable + ' FROM ' + @sourceTable + ' UNION ALL SELECT TOP 0 * FROM ' + @sourceTable)

SET @SQL = 'INSERT INTO ' + @randomTable + CHAR(13) + 'SELECT TOP 0 * FROM ' + @sourceTable

DECLARE patternsCursor CURSOR FAST_FORWARD FOR
SELECT pattern FROM @patterns

OPEN patternsCursor
FETCH NEXT FROM patternsCursor INTO @pattern

WHILE @@FETCH_STATUS = 0
BEGIN
	IF LEN(@pattern) > 0
	BEGIN
		--Note: UNION operator (as opposed to UNION ALL) is used here to prevent duplicate rows from being inserted into table variable
		--[Reference: https://www.techonthenet.com/sql/union_all.php]
		SET @SQL = @SQL + CHAR(13) + 'UNION' + CHAR(13) + 'SELECT * FROM ' + @sourceTable + CHAR(13) + 'WHERE ' + @columnName + ' LIKE ''' + @pattern + ''''
		IF LEN(@whereClause) > 0 SET @SQL = @SQL + CHAR(13) + @whereClause
	END

	FETCH NEXT FROM patternsCursor INTO @pattern
END
CLOSE patternsCursor
DEALLOCATE patternsCursor
SET @SQL = @SQL + ';'

-- Prints final SQL statement on Results tab
SELECT @SQL
SET NOCOUNT OFF
-- Executes SQL Statement showing number of rows inserted on Messages tab
EXEC(@SQL)
SET NOCOUNT ON

-- Executes SQL Statement showing rows in Results tab
EXEC('SELECT * FROM ' + @randomTable + ';')

-- Deletes temp table
EXEC(@deleteTempTable)

Deleting Duplicate Rows in a SQL Server Table Using a CTE

SQL Server logo

Finding duplicate rows in a table is quite simple. Deleting only the duplicates – as opposed to deleting all rows that have duplicates – is a bit more complicated.

I have a table variable called “@Readings” with the following schema:

DECLARE @Readings TABLE
(
	TagName VARCHAR(100),
	ScanTime DATETIME,
	TagValue NUMERIC(10,3)
)

First, I load the table with data from another database on a linked server called “OtherDB”:

INSERT INTO @Readings(TagName,ScanTime,TagValue)
SELECT DISTINCT tag,[time],value
	FROM
	OPENQUERY(OtherDB,'select tag, time, value from archive.comp where (tag LIKE ''%test%'') AND time = DATE(''*'') ')

Then, I load that same data with some updates, both to the ScanTime column, and then to the TagValue column – always holding the TagName column constant.

DECLARE @DateTimeStamp DATETIME
SET @DateTimeStamp = GETDATE()

INSERT INTO @Readings(TagName,ScanTime,TagValue)
SELECT TagName,@DateTimeStamp,TagValue FROM @Readings

INSERT INTO @Readings(TagName,ScanTime,TagValue)
SELECT TagName,ScanTime,12345 FROM @Readings

To see what rows duplicate the TagName column alone, this query will suffice:

SELECT TagName FROM @PI_Readings GROUP BY TagName HAVING ( COUNT(*) > 1 )

By adding column names, you can look where the duplication is the TagName and the ScanTime:

SELECT TagName, ScanTime FROM @PI_Readings GROUP BY TagName, ScanTime HAVING ( COUNT(*) > 1 )

(If I were to add also the TagValue column, I would get zero rows returned, since there are currently no duplicates where all three columns are identical.)

So, what if you wanted to delete only the duplicates where TagName and ScanTime matched, irrespective of the TagValue?

The easiest way I have found is to use what is called a Common Table Expression, or CTE.

To use a CTE to delete these duplicates let’s first construct it to only show the duplicates:

WITH CTE AS
(
	SELECT TagName, ScanTime, TagValue,
	RN = ROW_NUMBER()OVER(PARTITION BY TagName, ScanTime ORDER BY TagName, ScanTime)
	FROM @Readings
)
SELECT * FROM CTE WHERE RN > 1

To now delete the offending rows, change the last line from a SELECT statement to a DELETE statement:

CTE example

When creating the CTE statement, make sure there is a semicolon terminating the previous statement. This is not usually required in SQL Server, but it is in this case.

Another good example of this is on Stack Overflow.

While I do like SQL Fiddle, it seems that the DELETE function does not work on CTEs there, though SELECT statements do.