It’s T-SQL Tuesday time folks. Bob
Pusateri (blog|twitter) is the host and has chosen Common Table Expressions (CTEs) as the topic of the month.
Several ideas came to mind for writing about CTEs, one of the best uses I’ve seen for one recently was to grab the name of the most recent backup file for a database. You’ll have to ask Aaron Nelson (blog|twitter) to hook you up with that one though.
I thought I’d write about an interesting problem that was posed to me in an interview a couple of months ago.
Here’s a table
I was given a table with two columns; ManagerID, EmployeeID.
This table was populated with a few values thusly:
USE TempDB
GO
create table #ManagersEmployees (ManagerID int, EmployeeID int)
insert into #ManagersEmployees
values(1,2), (2,3), (2,4), (2,5), (3,6), (3,7), (3,8)
, (4,10),(5,11),(5,12), (12,13), (12,14)
GO
I was asked to write a recursive procedure to pull out the manager, employee tree for a given ManagerID.
CTEs to the rescue
Having done a little work with CTEs and understanding that I could easily write a recursive query using them I was able to quite quickly put together a script to pull the information needed. By throwing it into a procedure it could quickly and easily be executed.
CREATE PROCEDURE ManagerRecursion_CTE @ManagerID INT
AS
SET NOCOUNT ON
;WITH Managers_CTE (ManagerID, EmployeeID )
AS ( SELECT ManagerID, EmployeeID FROM #ManagersEmployees
WHERE ManagerID = @ManagerID
UNION ALL
SELECT e.ManagerID, e.EmployeeID
FROM #ManagersEmployees e
INNER JOIN Managers_CTE c on e.ManagerID = c.EmployeeID)
SELECT * FROM Managers_CTE ORDER BY ManagerID, EmployeeID
GO
I tested and this worked nicely, it was a simple solution and provided the requested results.
That’s not recursion
The trouble is that while the results were not correct I was advised that this was not recursive and did not meet the criteria. Back to the drawing board then.
After a lot more work I came up with the following:
CREATE PROCEDURE ManagerRecursion_NonCTE @ManagerID INT
AS
SET NOCOUNT ON
DECLARE @rowcnt INT, @lastrow INT
DECLARE @Tbl_Results TABLE (rowid INT IDENTITY(1,1), ManagerID INT, EmployeeID INT)
INSERT INTO @Tbl_Results (ManagerID, EmployeeID)
SELECT ManagerID, EmployeeID
FROM #ManagersEmployees
WHERE ManagerID = @ManagerID
SET @rowcnt = @@ROWCOUNT
SET @lastrow = 0
WHILE @rowcnt > 0
BEGIN
INSERT INTO @Tbl_Results (ManagerID, EmployeeID)
SELECT m.ManagerID, m.EmployeeID
FROM #ManagersEmployees m
INNER JOIN @Tbl_Results t
ON m.ManagerID = t.EmployeeID
WHERE rowid > @lastrow
SELECT @rowcnt = @@ROWCOUNT
SELECT @lastrow = @@IDENTITY - @rowcnt
END
SELECT ManagerID, EmployeeID FROM @Tbl_Results ORDER BY ManagerID, EmployeeID
GO
I tested this and got back the same results as with the first procedure with all the values I passed in. Deep breath on this one as it was pushing the limits of what I could produce on the spot in an interview.
That’s still not recursion
Again, while the results we correct this was not recursive. It was back to the drawing board once more. This time I had to admit defeat, however did tell the interviewer that I would work on a solution at home and email it in. He gave me his contact information, we completed the rest of the interview and I went home determined to get the right data in the manner that the interviewer wanted.
After a whole bunch of reading and a lot of work I finally came up with correct results in a recursive procedure which I emailed in to get feedback.
CREATE PROC ManagerRecursion @EmpID INT, @InnerLoop INT = 0
AS
BEGIN
IF @InnerLoop = 0
BEGIN
CREATE TABLE #Tbl_Results (ManagerID INT, EmployeeID INT)
END
INSERT INTO #Tbl_Results (ManagerID, EmployeeID)
SELECT ManagerID, EmployeeID FROM #ManagersEmployees WHERE ManagerID = @EmpID
SET @EmpID = (SELECT MIN(EmployeeID) FROM #ManagersEmployees WHERE ManagerID = @EmpID)
WHILE @EmpID IS NOT NULL
BEGIN
EXEC ManagerRecursion @EmpID, 1
SET @EmpID = (SELECT MIN(EmployeeID) FROM #ManagersEmployees WHERE EmployeeID > @EmpID
AND EmployeeID NOT IN (SELECT ManagerID FROM #Tbl_Results)
AND EmployeeID IN (SELECT EmployeeID FROM #Tbl_Results))
END
IF @InnerLoop = 0
BEGIN
SELECT * FROM #Tbl_Results order by ManagerID, EmployeeID
DROP TABLE #Tbl_Results
END
END
GO
Unfortunately no feedback was forthcoming. I felt good about providing this solution despite that. I enjoy a challenge and this was certainly one of those.
So what was the right way?
That depends on who you ask. For me the first way was the right was. It performed well, the code was clean and easy and required a minimum amount of development. I feel that the solution here was exactly the reason that CTEs were created in the first place.
The second solution was a lot more work, the query got more complex and it does not perform as well as the first.
The final procedure was true recursion in that the procedure calls itself over and over again until all of the results are returned. It’s a loop that makes cursors look like they perform well. It was easily the worst performing of the three.
It all goes to show there’s more than one way to get the results you need. It’s also interesting how an example like this shows just how much work the SQL development team have done to help reduce complexity and improve performance.