In this SQL Server tutorial, you will learn about the VAR Function in SQL Server, where you will learn about the variance and how to compute it using the VAR() function.
Also, you will learn how to use this function when your dataset has duplicate values; as you know, duplicate values or unwanted data in your data create noise, and your analysis doesn’t give good results as expected.
You will apply the VAR() function to the real-world table data.
VAR Function in SQL Server
The VAR() function in SQL Server computes the variance of the data set (sample data). The variance is a statistical measure that shows how much values in a data set differ from the mean of the set. It helps us know the data dispersion or degree of spread.
The syntax is given below.
VAR(ALL | DISTINCT expression)
- VAR(): The function computes the variance of the given data set.
- expression: The expression can be the columns, set of values, or any numeric data set. But it can’t be bit type data type; remember, as an expression, you can’t use the aggregate function or subqueries.
- ALL | DISTINCT: When you use the VAR function in SQL Server with different options, such as ALL or DISTINCT, it computes the variance of the set of data differently.
- VAR(ALL expression): When you use the VAR() function with the ALL option, it applies the VAR() function on all the values of the given expression (columns values), including the duplicated values in the columns of the table.
- VAR(DISTINCT expression): This one only applies the VAR() function on the distinct values of the columns while excluding the duplicate ones.
The VAR () function return type is a float, which indicates that after calculating the variance of the given set of values, it returns the number, and the data type of that number is always a float type.
Let’s take an example and understand the workings of the VAR function in SQL Server. Consider you have a table ProductSales with columns ProductName and Sales, as shown below.
Here, you have a task to find the variance of all the sales values in the table ProductSales. For that, use the query below.
SELECT VAR(Sales) StdevSales FROM ProductSales;
From the output of the above query, you can see the sales variance in the Sales column is 146.68. This is the variability in the sale amount of product.
This is how to use the VAR function in SQL Server to compute the variance of the values in the table’s column.
VAR Function in SQL Server with ALL and DISTINCT
When the VAR() function is used with options ALL and DISTINCT, it returns a different variance.
First, use the ALL option within the VAR(), as it will take all the column values, including the duplicate values.
The syntax is given below.
SELECT VAR(ALL column_name)
For example, use the below Product_Sales table.
The table SalePrice column contains the duplicate values, as shown in the above picture. Use the VAR() with option ALL to compute the variability in the sale price of the products. For that, use the below query.
SELECT VAR(ALL SalePrice) AS VarianceSalePrice
From the output of the above query, you can see the variability in the product’s sale price is very large, which is 122373333333.333. But here, you don’t need to worry about the variability number; understand how to use the VAR() with option ALL for computing the variance by including the duplicate values within a column.
Next, use the DISTINCT option with the VAR() function using the syntax below.
SELECT VAR(DISTINCT column_name)
Again, compute the variability in the sale price of the product with the DISTINCT option using the query below.
SELECT VAR(DISTINCT SalePrice) AS VarianceSalePrice
The output shows that the variance is distinct this time because you are applying the VAR() functions on unique values of the SalePrice column using the DISTINCT option. As a result, it returns the value 129536111111.11.
This is how you can use the ALL or DISTINCT option with the VAR function in SQL Server to calculate the variance of the column values.
VAR Function in SQL Server using GROUP BY clause
If you want to apply the VAR() function on the data group, you can use the GROUP BY clause in your query to divide the result set into groups based on the specified column or criteria. The VAR() function is applied separately for each group.
As shown below, imagine a Sales table with columns sale_id, product_id, sale_amount, and sale_date.
So here, you need to calculate the variability in the sale amount of each product_id separately. To find that, use the below query.
SELECT product_id, VAR(sale_amount) AS VarianceSaleAmount
GROUP BY product_id;
The output of the above query shows the variability in sales for each product separately. For example, a product with ID P101 has a variability of 175, P105 has 12.5, etc.
Also, the variance for some of the product IDs is NULL because it returns the null when you pass the NULL values to the VAR() function.
But here, you need to understand from the above example that you can apply the VAR() function on a group of data or group-wise using the GROUP BY clause.
This is how to use the VAR function in SQL Server with GROUP BY clause to apply the VAR() function on each group independently.
In this SQL Server tutorial, you learned how to calculate the variability of the table column values using the VAR function in SQL Server. You also computed the variance using the GROUP BY clause to apply the VAR() function group-wise.
You may like to read:
- STDEV Function in SQL Server
- How to use MIN Function in SQL Server
- How to use MAX Function in SQL Server
After working for more than 15 years in the Software field, especially in Microsoft technologies, I have decided to share my expert knowledge of SQL Server. Check out all the SQL Server and related database tutorials I have shared here. Most of the readers are from countries like the United States of America, the United Kingdom, New Zealand, Australia, Canada, etc. I am also a Microsoft MVP. Check out more here.