Technology Understood
Tuesday, September 28, 2010
Wednesday, September 1, 2010
Differences between Temporary Tables (#tmp) and Table Variables (@tmp) MS SQL
For small to medium volumes of data and simple usage scenarios you should use table variables.
Some points to consider when choosing between them:
* Temporary Tables are real tables so you can do things like CREATE INDEXes, etc. If you have large amounts of data for which accessing by index will be faster then temporary tables are a good option.
* Table variables don't participate in transactions, logging or locking. This means they're faster as they don't require the overhead, but conversely you don't get those features. So for instance if you want to ROLLBACK midway through a procedure then table variables populated during that transaction will still be populated!
* Temp tables might result in stored procedures being recompiled, perhaps often. Table variables will not.
* You can create a temp table using SELECT INTO, which can be quicker to write (good for ad-hoc querying) and may allow you to deal with changing datatypes over time, since you don't need to define your temp table structure upfront.
* You can pass table variables back from functions, enabling you to encapsulate and reuse logic much easier (eg make a function to split a string into a table of values on some arbitrary delimiter).
* Using Table Variables within user-defined functions enables those functions to be used more widely (see CREATE FUNCTION documentation for details). If you're writing a function you should use table variables over temp tables unless there's a compelling need otherwise.
* Both table variables and temp tables are stored in tempdb. This means you should be aware of issues such as COLLATION problems if your database collation is different to your server collation; temp tables and table variables will by default inherit the collation of the server, causing problems if you want to compare data in them with data in your database.
* Global Temp Tables (##tmp) are another type of temp table available to all sessions and users.
Some points to consider when choosing between them:
* Temporary Tables are real tables so you can do things like CREATE INDEXes, etc. If you have large amounts of data for which accessing by index will be faster then temporary tables are a good option.
* Table variables don't participate in transactions, logging or locking. This means they're faster as they don't require the overhead, but conversely you don't get those features. So for instance if you want to ROLLBACK midway through a procedure then table variables populated during that transaction will still be populated!
* Temp tables might result in stored procedures being recompiled, perhaps often. Table variables will not.
* You can create a temp table using SELECT INTO, which can be quicker to write (good for ad-hoc querying) and may allow you to deal with changing datatypes over time, since you don't need to define your temp table structure upfront.
* You can pass table variables back from functions, enabling you to encapsulate and reuse logic much easier (eg make a function to split a string into a table of values on some arbitrary delimiter).
* Using Table Variables within user-defined functions enables those functions to be used more widely (see CREATE FUNCTION documentation for details). If you're writing a function you should use table variables over temp tables unless there's a compelling need otherwise.
* Both table variables and temp tables are stored in tempdb. This means you should be aware of issues such as COLLATION problems if your database collation is different to your server collation; temp tables and table variables will by default inherit the collation of the server, causing problems if you want to compare data in them with data in your database.
* Global Temp Tables (##tmp) are another type of temp table available to all sessions and users.
Tuesday, August 31, 2010
Common Table Expression - MS SQL
Sometimes we need to do some processes over temporaray data just like a View.
A CTE (Common Table Expression) is the best choice for the same.
Common Table Expressions, or CTE, are a new construct introduced in Microsoft SQL Server 2005 that offer a more readable form of the derived table that can be declared once and referenced multiple times in a query. Moreover, CTEs can be recursively defined, allowing a recursive entity to be enumerated without the need for recursive stored procedures.
A simple example
with demo as (
SELECT col1, col2, col2,
ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY col3) as 'RowNumber'
FROM DBO..tablename
)
Select bpt from dbo..tablename bpt where bpt.col2 in (select col2 from demo where RowNumber > 1)
There are 3 things 1)CTE 2) ROW_NUMBER() 3)PARTITION BY
Here 'demo' is the CTE name. The thing to remember is CTE only applies to the first query following it.
The rank functions are as follows
Row_number(): It will produce row number of each rows on the basis of a column value.
example:
Rank(): will produce the (list of) ranks on the basis of a column value. this will leave gap between groups of same values.
Dense_rank() this is same as rank but it will not leave that gap like rank().
Ntile(n): It will partition the table into n number of parts.
Partition will basically create a partition on the column name.
A CTE (Common Table Expression) is the best choice for the same.
Common Table Expressions, or CTE, are a new construct introduced in Microsoft SQL Server 2005 that offer a more readable form of the derived table that can be declared once and referenced multiple times in a query. Moreover, CTEs can be recursively defined, allowing a recursive entity to be enumerated without the need for recursive stored procedures.
A simple example
with demo as (
SELECT col1, col2, col2,
ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY col3) as 'RowNumber'
FROM DBO..tablename
)
Select bpt from dbo..tablename bpt where bpt.col2 in (select col2 from demo where RowNumber > 1)
There are 3 things 1)CTE 2) ROW_NUMBER() 3)PARTITION BY
Here 'demo' is the CTE name. The thing to remember is CTE only applies to the first query following it.
The rank functions are as follows
Row_number(): It will produce row number of each rows on the basis of a column value.
example:
Rank(): will produce the (list of) ranks on the basis of a column value. this will leave gap between groups of same values.
Dense_rank() this is same as rank but it will not leave that gap like rank().
Ntile(n): It will partition the table into n number of parts.
Partition will basically create a partition on the column name.
Subscribe to:
Comments (Atom)