How to delete duplicated records in MS SQL Server?

First, create one demo table

Let’s create one demo table first. It will be called Person and will hold some fictive person data.

SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE dbo.Person
(
  Id int IDENTITY(1,1) NOT NULL,
  Firstname nvarchar(50) NULL,
  Lastname nvarchar(50) NULL,
  Email varchar(100) NULL,
  Created datetime NULL,
  CONSTRAINT PK_Person PRIMARY KEY CLUSTERED
  (
    Id ASC
  )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON PRIMARY
) ON PRIMARY
GO
SET ANSI_PADDING OFF
GO

Now, fill the data

Insert some fictive person data and do some duplicated records that have the same values for fields  firstname, lastname and email.

INSERT INTO Person (Firstname, Lastname, Email, Created) VALUES ('John', 'Belvien', 'john.belvien@example.com', GETDATE())
INSERT INTO Person (Firstname, Lastname, Email, Created) VALUES ('Peter', 'Lumberjack', 'peter.lumberjack@example.com', GETDATE())
INSERT INTO Person (Firstname, Lastname, Email, Created) VALUES ('Peter', 'Lumberjack', 'peter.lumberjack@example.com', GETDATE())
INSERT INTO Person (Firstname, Lastname, Email, Created) VALUES ('Peter', 'Lumberjack', 'peter.lumberjack@example.com', GETDATE())
INSERT INTO Person (Firstname, Lastname, Email, Created) VALUES ('Suzy', 'Leghorn', 'suzy.leghorn@example.com', GETDATE())
INSERT INTO Person (Firstname, Lastname, Email, Created) VALUES ('Suzy', 'Leghorn', 'suzy.leghorn@example.com', GETDATE())
INSERT INTO Person (Firstname, Lastname, Email, Created) VALUES ('Daisy', 'Sommerfield', 'daisy.sommerfield@example.com', GETDATE())
INSERT INTO Person (Firstname, Lastname, Email, Created) VALUES ('Daisy', 'Sommerfield', 'daisy.sommerfield@example.com', GETDATE())
INSERT INTO Person (Firstname, Lastname, Email, Created) VALUES ('Daisy', 'Sommerfield', 'daisy.sommerfield@example.com', GETDATE())
INSERT INTO Person (Firstname, Lastname, Email, Created) VALUES ('Daisy', 'Sommerfield', 'daisy.sommerfield@example.com', GETDATE())

Now the magic

Use this subquery trick to extract duplicate records and then delete them. Notice the use of PARTITTION and WITH Transact-SQL keywords. All records that have RowNumber > 1 are deleted.

WITH CTE AS
(
  SELECT  Id,
          Firstname,
          Lastname,
          Email,
          ROW_NUMBER() OVER (PARTITION BY Firstname, Lastname, Email
          ORDER BY	Created) AS RowNumber
  FROM Person
)
DELETE FROM Person WHERE Id IN (SELECT Id FROM CTE WHERE RowNumber > 1)
Continue Reading

Using SQL Server APPLY command in queries

APPLY clause in Transact-SQL is interesting because it can be user in the following scenario:
For example, you might create a query that returns a list of payments (in this case last 2 payment amounts and dates) for each subscriber inside your SaaS product.

SELECT c.Id AS CustomerId,
       c.Firstname + ' ' + c.Lastname AS Name,
       i.PaidDate,
       i.Amount
FROM   Customer c
OUTER APPLY
(
  SELECT TOP 2 i.PaidDate, i.Amount
  FROM Invoice i
  WHERE i.CustomerId = c.Id
  ORDER BY i.PaidDate DESC
) AS i
WHERE YEAR(c.Registered) = YEAR(GETDATE())

You get something like this

958 Usain Jordan 2014-06-01 299.00
958 Usain Jordan 2014-07-18 458.00
110 Seagal KungFu 2014-01-12 15.16
110 Seagal KungFu 2014-03-06 17.45
Continue Reading

Using SQL Server PIVOT feature to transform rows into columns

PIVOT queries in SQL Server are useful when we need to transform data from row-level to column-level.

Example: Imagine having the table named Invoice (payments per month) – for simplicity let’s say that we have PaidDate and Amount.

Id	PaidDate                    Amount
1	2014-01-05 12:34:22.000	    318.22
2	2014-01-06 11:44:00.000	    128.33
3	2014-04-07 16:44:00.000	     55.50
4	2014-02-08 15:12:00.000	     88.55
5	2014-02-16 18:39:00.000     115.45
6	2014-03-01 16:27:00.000	      9.99
7	2014-03-08 13:13:00.000	     16.99
8	2014-04-27 16:44:00.000	    568.69
9	2014-05-10 19:40:00.000	     43.65

What we wanted now is to get total money collected from the beginning of the year 2014, and in order to do this we can use this PIVOT query:

SELECT *
FROM (
  SELECT  YEAR(PaidDate) AS [Year],
          LEFT(DATENAME(month, PaidDate), 3) AS [Month],
          Amount FROM Invoice
) as PaymentsByYear
PIVOT
(
  SUM(Amount)
  FOR [Month] IN (JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC)
) AS PaymentsByYear
ORDER BY [Year] DESC

And we finally get

Year    JAN     FEB     MAR     APR     MAY     JUN     JUL     AUG     SEP     OCT     NOV     DEC
2014	446.55  204.00  26.98   624.19  43.65   NULL    NULL    NULL    NULL    NULL    NULL    NULL
Continue Reading

How to get list of sequential dates in MS SQL Transact SQL and cursors alternative

Let’s imagine that we have table where we store some service prices and that this price can vary regarding period of the year.

You would then have something like this in your table:

$50 from January 1st to May 31th
$70 from June 1st to October 31th
$60 from November 1st to December 31th

What we want is to get all daily prices for given date region. So, if we give range from May 15th to 15th June we want to get this:

May 15th $50
May 16th $50
May 17th $50
May 18th $50
May 19th $50
May 20th $50
May 21th $50
May 22th $50
May 23th $50
May 24th $50
May 25th $50
May 26th $50
May 27th $50
May 28th $50
May 29th $50
May 30th $50
May 31th $50
June 1st $70
June 2nd $70
June 3rd $70
June 4th $70
June 5th $70
June 6th $70
June 7th $70
June 8th $70
June 9th $70
June 10th $70
June 11th $70
June 12th $70
June 13th $70
June 14th $70
June 15th $70

In following piece of code is explained how to do it, plus there is a nice bonus how you can avoid using slow cursors for iterating through SQL records.

DECLARE @FromDate datetime, @UntilDate datetime, @Date datetime, @Id int
SET @FromDate = '2014-05-15'
--Trick to make date independant from year (always use the current year)
SET @FromDate = CAST(CAST(YEAR(GETUTCDATE()) AS varchar) + '-' + CAST(MONTH(@FromDate) AS varchar) + '-' + CAST(DAY(@FromDate) AS varchar) AS DATETIME)
SET @UntilDate = '2014-06-15'
--Trick to make date independant from year (always use the current year)
SET @UntilDate = CAST(CAST(YEAR(GETUTCDATE()) AS varchar) + '-' + CAST(MONTH(@UntilDate) AS varchar) + '-' + CAST(DAY(@UntilDate) AS varchar) AS DATETIME)

--Declare your temporary table for sequential date list with price
DECLARE @results TABLE
(
  aId int IDENTITY(1, 1),
  aDate datetime,
  aPrice decimal(15,2)
)

--Fill temporary table with sequential dates
;WITH T(date)
AS
(
  SELECT @FromDate
  UNION ALL
  SELECT DATEADD(day, 1, T.date) FROM T WHERE T.date < @UntilDate
)
INSERT INTO @results (aDate)
SELECT date FROM T OPTION (MAXRECURSION 32767)

--This temp table will be used for iterating through the records (do not use slow cursors)
DECLARE @temp TABLE
(
  bId int,
  bDate datetime
)

INSERT INTO @temp
SELECT  aId,
        aDate
FROM    @results

WHILE EXISTS(SELECT * FROM @temp)
BEGIN
  SELECT TOP 1 @Id = bId FROM @temp
  SELECT @Date = aDate FROM @results WHERE aId = @Id

  --Do you logic here (find specific row record from SQL table using @Date) and get price
  UPDATE @results SET
    aPrice = Price
  FROM YourTable
  WHERE IntervalStart <= @Date AND @Date <= IntervalEnd

  --"Decrease" counter
  DELETE FROM @temp WHERE bId = @Id
END

SELECT   aDate AS [Date],
        aPrice AS Price
FROM    @results

Note 1

If you want to make yearly independate date comparisons in MS SQL then you should create new temporary table where you would copy interval records (with prices) from YourTable and then update these record (IntervalStart and IntervalEnd dates) by changing their year date component to current year (see at the top for the syntax)

Note 2

If @FromDate and @UntilDate have different years, i.e. @FromDate = ‘2013-12-15’ and @UntilDate = ‘2014-01-15’ then, when doing sequential dates (days) you should call the described transforming function (search WITH keywoard) twice: first from @FromDate until the end of the year ‘2013-12-31’ and then from the first day of the year ‘2014-01-01’ until the @UntilDate

Check. Mate. 🙂

chess-samuel-rachevsky

(source: Wikipedia)

Continue Reading

How to create Linked MS SQL Server in Management Studio?

Creating MS SQL Linked Server means that you can access your remote MS SQL database (on other server) like it is local.

Syntax to access some data is as simple as that:

[linked server name].[database name].dbo.[table name]

Official Microsoft documentation is really sometimes hard to use. A lot of information but none practical example of usage.

microsoft-documentation-linked-server

Google search also didn’t helped much on the first try. After few tries I finally found out how you can add MS SQL Linked server to your existing database server.

So, you need first to add new linked server going to Server Objects > Linked Servers and then right click with mouse and choose New Linked Server… and you will end up with something like this (what to fill here is the real mistery):

addiing-linked-server

I gave up with this and I finally found the SQL command to do this properly:

EXEC master.dbo.sp_addlinkedserver
@server = N'YOURSERVER\SQL2008,1444', --You can include Your server, DB instance, port in not 1433
@srvproduct=N'MSSQL',
@provider=N'SQLNCLI',
@provstr=N'PROVIDER=SQLOLEDB;SERVER=YOURSERVER\SQL2008,1444'

--Now set the name of your linked server and access (we used 'Linked')
EXEC master.dbo.sp_addlinkedsrvlogin @rmtsrvname=N'Linked', @useself=N'False',@locallogin=NULL,@rmtuser=N'sa',@rmtpassword='YourP@@sw0rd'

That’s all folks 🙂

Continue Reading

MongoDB vs MS SQL Server Comparison

We are using both databases for our projects and we can finalize some thoughts on both of them…

Mongo Facts

  • Free – official website is here and you can download it here
  • Because it is using NoSQL concept it is really extremely fast when it comes to the inserts into database. It is almost fast as writing directly into the disk. It is at least 10 times faster when doing inserts.
  • Because is is using NoSQL there is no scheme, you just store your object into database (JSON serialization is done automatically) and when you later add some fields into your business object you don’t need to update scheme for table, just instert it and DB will take care of  everything
  • It is not user friendly when it comes to writing more complex queries especially for grouping especially when you figure our that standard group option is limited and for returning more then 20,000 records you need to use map-reduce
  • When it comes to complex reading, especially grouping it is much slower when this operation need to be done on large group of records. If number of records is small it is still faster then MS SQL.
  • Disk consumption will remain unchanged, even when you delete some records so you will need to do manually optimization to reduce the database size – it can be long operation
  • The more memory you have and tha faster HDD you have (SSD is recommended – this expensive Intel 700 server series) it will work better because it tries to keep all indexes in memory and if number of records is big and you have more then one index then the size of these indexes will grow. We are talking here about maximum RAM capabilities you are able to put into the server.
  • Still there is some issue with a thrust and questions “Is your data safe?”, “What happen when it crash?”
  • You can write “store procedures” – almost exactly as Javascript syntax
  • Desinged for easy sharding

MS SQL Facts

  • Expensive when compared to free MongoDB, also there is a complicated system of Microsoft licensing (per CPU core)
  • Slower inserts because or relational structure
  • Every table should be designed first, all columns need to have type defined and every change of your business object need that also database need to be changed (not 100% true, but almost)
  • It is very user friendly when it comes to writing SQL queries and joins and very fast (compared to Mongo) when it comes to grouping on a large tables
  • Possibility to use Store Procedures and do complex calculations directly in MS SQL (Mongo has also possibility to write these – similar to Javascript syntax)
  • More easy to lookup your data using nice GUI such as MS SQL Server Managament Studio

Do you need some help in implementing your database model?

We have 15 years of experience and if you need any help please contact us.

Continue Reading