Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Interesting issue with duplicates

Author  Topic 

Bex
Aged Yak Warrior

580 Posts

Posted - 2010-11-02 : 10:24:45
I have 2 tables, 1 to record order headers, and the second to record the operator actions. The 2 tables are joined to ultimately obtain performance information about each cashier (how long are they idle between each transaction (chatting for instance)).

This works fine as long as the 3 fields to join the tables remain unique (StoreNo, POSNo, TicketNo). However, this is not always the case. For example, half way through the day, a till can be restarted, causing the TicketNo to reseed back to 1. Therefore, this causes all sorts of issues when trying to report on idle time:

The first block of code demonstrates the perfect world....
USE [master]
GO
IF EXISTS(SELECT 1 FROM sys.databases WHERE [name] = 'TEST')
DROP DATABASE TEST
GO

CREATE DATABASE TEST
GO
USE [TEST]
GO
CREATE TABLE Header
(
StoreNo INT
, POSNo INT
, TicketNo INT
, StartTransDateTime DATETIME
, CashierNo INT
, CONSTRAINT PK_Header PRIMARY KEY
( StoreNo
, POSNo
, TicketNo
, StartTransDateTime )
)

CREATE TABLE OperatorEvent
(
StoreNo INT
, POSNo INT
, TicketNo INT
, EventDateTime DATETIME
, CashierNo INT
, CONSTRAINT PK_OperatorEvent PRIMARY KEY
( StoreNo
, POSNo
, TicketNo
, EventDateTime )
)

CREATE TABLE TransactionSequence
(
ID INT IDENTITY
, StoreNo INT
, POSNo INT
, TicketNo INT
, StartTransDateTime DATETIME
, EventDateTime DATETIME
, CashierNo INT
, CONSTRAINT PK_Perf PRIMARY KEY
(
ID
)
)
GO

INSERT INTO Header
SELECT 1,1,1,'2010-11-02 12:00:00.000',1
UNION
SELECT 1,1,2,'2010-11-02 12:15:00.000',1
UNION
SELECT 1,1,3,'2010-11-02 12:20:00.000',1
UNION
SELECT 1,1,4,'2010-11-02 12:30:00.000',2
UNION
SELECT 1,1,5,'2010-11-02 12:37:00.000',2
--UNION
--SELECT 1,1,1,'2010-11-02 23:10:00.000',1
--UNION
--SELECT 1,1,2,'2010-11-02 23:15:00.000',2

INSERT INTO OperatorEvent
SELECT 1,1,1,'2010-11-02 12:00:01.000',1
UNION
SELECT 1,1,2,'2010-11-02 12:15:01.000',1
UNION
SELECT 1,1,3,'2010-11-02 12:20:01.000',1
UNION
SELECT 1,1,4,'2010-11-02 12:30:01.000',2
UNION
SELECT 1,1,5,'2010-11-02 12:37:01.000',2
--UNION
--SELECT 1,1,1,'2010-11-02 23:10:01.000',1
--UNION
--SELECT 1,1,2,'2010-11-02 23:15:02.000',2
GO

-- Obtain all the transactions and events for the Cashier
INSERT INTO TransactionSequence (StoreNo, POSNo, TicketNo, StartTransDateTime, EventDateTime, CashierNo)
SELECT h.StoreNo, h.POSNo, h.TicketNo, h.StartTransDateTime, EventDateTime, h.CashierNo
FROM Header h
INNER JOIN OperatorEvent o
ON o.StoreNo = h.StoreNo
AND o.POSNo = h.POSNo
AND o.TicketNo = h.TicketNo
AND o.CashierNo = h.CashierNo
GO
SELECT * FROM TransactionSequence

-- Get the amount of time between each transaction (idle time)
SELECT s1.ID, s2.ID, s1.StoreNo, s1.POSNo, s1.TicketNo, s1.EventDateTime, s2.StartTransDateTime, DATEDIFF(ss,s1.EventDateTime, s2.StartTransDateTime), s1.CashierNo
FROM TransactionSequence s1
INNER JOIN TransactionSequence s2
ON s1.ID = s2.ID-1
GO


However, as we all know, the world is never perfect. Here is what happens when some duplicates are introduced:

-- However, the till is reset at somepoint throughout the day. This cause the ticketNumber to seed back to 1
-- Create more transactions
INSERT INTO Header
SELECT 1,1,1,'2010-11-02 23:10:00.000',1
UNION
SELECT 1,1,2,'2010-11-02 23:15:00.000',1

INSERT INTO OperatorEvent
SELECT 1,1,1,'2010-11-02 23:10:01.000',1
UNION
SELECT 1,1,2,'2010-11-02 23:15:02.000',1
GO

-- REDO the queries

TRUNCATE TABLE TransactionSequence
INSERT INTO TransactionSequence (StoreNo, POSNo, TicketNo, StartTransDateTime, EventDateTime, CashierNo)
SELECT h.StoreNo, h.POSNo, h.TicketNo, h.StartTransDateTime, EventDateTime, h.CashierNo
FROM Header h
INNER JOIN OperatorEvent o
ON o.StoreNo = h.StoreNo
AND o.POSNo = h.POSNo
AND o.TicketNo = h.TicketNo
AND o.CashierNo = h.CashierNo
GO
SELECT * FROM TransactionSequence

SELECT s1.ID, s2.ID, s1.StoreNo, s1.POSNo, s1.TicketNo, s1.EventDateTime, s2.StartTransDateTime, DATEDIFF(ss,s1.EventDateTime, s2.StartTransDateTime), s1.CashierNo
FROM TransactionSequence s1
INNER JOIN TransactionSequence s2
ON s1.ID = s2.ID-1
GO


The result is complete nonsense. I need to be able to produce a similar report as to the first instance, and to be able to cope with a duplicate scenario (so if ticketno remained unique, this would not be a problem). Please help

Hearty head pats

Bex
Aged Yak Warrior

580 Posts

Posted - 2010-11-02 : 11:32:49
I think I have a solution, although I need to test this a bit more. However, it currently works....

If anyone can think of a better way, please tell as this may run like a dog. (I have to encorporate the changes within the actual code which is a lot more complex)

USE [master]
GO
IF EXISTS(SELECT 1 FROM sys.databases WHERE [name] = 'TEST')
DROP DATABASE TEST
GO

CREATE DATABASE TEST
GO
USE [TEST]
GO
CREATE TABLE Header
(
StoreNo INT
, POSNo INT
, TicketNo INT
, StartTransDateTime DATETIME
, CashierNo INT
, CONSTRAINT PK_Header PRIMARY KEY
( StoreNo
, POSNo
, TicketNo
, StartTransDateTime )
)

CREATE TABLE OperatorEvent
(
StoreNo INT
, POSNo INT
, TicketNo INT
, EventDateTime DATETIME
, CashierNo INT
, CONSTRAINT PK_OperatorEvent PRIMARY KEY
( StoreNo
, POSNo
, TicketNo
, EventDateTime )
)

CREATE TABLE TransactionSequence
(
ID INT IDENTITY
, StoreNo INT
, POSNo INT
, TicketNo INT
, StartTransDateTime DATETIME
, EventDateTime DATETIME
, CashierNo INT
, CONSTRAINT PK_Perf PRIMARY KEY
(
ID
)
)
GO

INSERT INTO Header
SELECT 1,1,1,'2010-11-02 12:00:00.000',1
UNION
SELECT 1,1,2,'2010-11-02 12:15:00.000',1
UNION
SELECT 1,1,3,'2010-11-02 12:20:00.000',1
UNION
SELECT 1,1,4,'2010-11-02 12:30:00.000',2
UNION
SELECT 1,1,5,'2010-11-02 12:37:00.000',2
--UNION
--SELECT 1,1,1,'2010-11-02 23:10:00.000',1
--UNION
--SELECT 1,1,2,'2010-11-02 23:15:00.000',2

INSERT INTO OperatorEvent
SELECT 1,1,1,'2010-11-02 12:00:01.000',1
UNION
SELECT 1,1,2,'2010-11-02 12:15:01.000',1
UNION
SELECT 1,1,3,'2010-11-02 12:20:01.000',1
UNION
SELECT 1,1,4,'2010-11-02 12:30:01.000',2
UNION
SELECT 1,1,5,'2010-11-02 12:37:01.000',2
--UNION
--SELECT 1,1,1,'2010-11-02 23:10:01.000',1
--UNION
--SELECT 1,1,2,'2010-11-02 23:15:02.000',2
GO

-- Obtain all the transactions and events for the Cashier
INSERT INTO TransactionSequence (StoreNo, POSNo, TicketNo, StartTransDateTime, EventDateTime, CashierNo)
SELECT h.StoreNo, h.POSNo, h.TicketNo, h.StartTransDateTime, EventDateTime, h.CashierNo
FROM Header h
INNER JOIN OperatorEvent o
ON o.StoreNo = h.StoreNo
AND o.POSNo = h.POSNo
AND o.TicketNo = h.TicketNo
AND o.CashierNo = h.CashierNo
GO
SELECT * FROM TransactionSequence

-- Get the amount of time between each transaction (idle time)
SELECT s1.ID, s2.ID, s1.StoreNo, s1.POSNo, s1.TicketNo, s1.EventDateTime, s2.StartTransDateTime, DATEDIFF(ss,s1.EventDateTime, s2.StartTransDateTime), s1.CashierNo
FROM TransactionSequence s1
INNER JOIN TransactionSequence s2
ON s1.ID = s2.ID-1
AND s1.StoreNo = s2.StoreNo
AND s1.CashierNo = s2.CashierNo
GO

-- However, the till is reset at somepoint throughout the day. This cause the ticketNumber to seed back to 1
-- Create more transactions
INSERT INTO Header
SELECT 1,1,1,'2010-11-02 23:10:00.000',1
UNION
SELECT 1,1,2,'2010-11-02 23:15:00.000',1

INSERT INTO OperatorEvent
SELECT 1,1,1,'2010-11-02 23:10:01.000',1
UNION
SELECT 1,1,2,'2010-11-02 23:15:02.000',1
GO

-- Step 1: Identify the duplicates and save to a temp table
DROP TABLE #Dupes
SELECT COUNT(*)AS Dupes, StoreNo, POSNo, TicketNo
INTO #Dupes
FROM Header
GROUP BY StoreNo, POSNo, TicketNo
HAVING COUNT(*) > 1

TRUNCATE TABLE TransactionSequence

INSERT INTO TransactionSequence (StoreNo, POSNo, TicketNo, StartTransDateTime, EventDateTime, CashierNo)
-- Step 2: Select all the results from the tables minus the dupes
SELECT StoreNo, POSNo, TicketNo, StartTransDateTime, EventDateTime, CashierNo
FROM
(
SELECT h.StoreNo, h.POSNo, h.TicketNo, h.StartTransDateTime, EventDateTime, h.CashierNo
FROM Header h
INNER JOIN OperatorEvent o
ON o.StoreNo = h.StoreNo
AND o.POSNo = h.POSNo
AND o.TicketNo = h.TicketNo
AND o.CashierNo = h.CashierNo
LEFT JOIN #Dupes d
ON d.StoreNo = h.StoreNo
AND d.POSNo = h.POSNo
AND d.TicketNo = h.TicketNo
WHERE d.StoreNo IS NULL
AND d.POSNo IS NULL
AND d.TicketNo IS NULL
UNION
-- Step 3: Select the duplicates doing a sort on the datetime to create a new RANK identity value
-- This will then enable a join on the TicketNo and the RANK id to prevent the duplicate issue
-- Shall we include a check on the datedifference????
SELECT header.StoreNo, header.POSNo, header.TicketNo, header.StartTransDateTime, opevent.EventDateTime, header.CashierNo
FROM
(
SELECT RANK() OVER(PARTITION BY h.StoreNo, h.POSNo, h.TicketNo ORDER BY StartTransDateTime) AS [Rank],h.StoreNo, h.POSNo, h.TicketNo, StartTransDateTime, CashierNo
FROM Header h
INNER JOIN #Dupes d
ON d.StoreNo = h.StoreNo AND d.POSNo = h.POSNo AND d.TicketNo = h.TicketNo
)AS header
INNER JOIN
(
SELECT RANK() OVER(PARTITION BY h.StoreNo, h.POSNo, h.TicketNo ORDER BY EventDateTime) AS [Rank],h.StoreNo, h.POSNo, h.TicketNo, EventDateTime, CashierNo
FROM OperatorEvent h
INNER JOIN #Dupes d
ON d.StoreNo = h.StoreNo AND d.POSNo = h.POSNo AND d.TicketNo = h.TicketNo
)AS opevent
ON header.StoreNo = opevent.StoreNo
AND header.POSNo = opevent.POSNo
AND header.TicketNo = opevent.TicketNo
AND header.[Rank] = opevent.[Rank]
)AS Results
ORDER BY StoreNo, CashierNo, StartTransDateTime, POSNo

SELECT * FROM TransactionSequence

SELECT s1.ID, s2.ID, s1.StoreNo, s1.POSNo, s1.TicketNo, s1.EventDateTime, s2.StartTransDateTime, DATEDIFF(ss,s1.EventDateTime, s2.StartTransDateTime), s1.CashierNo
FROM TransactionSequence s1
INNER JOIN TransactionSequence s2
ON s1.ID = s2.ID-1
AND s1.StoreNo = s2.StoreNo
AND s1.CashierNo = s2.CashierNo

GO


Hearty head pats
Go to Top of Page
   

- Advertisement -