In this SQL Server tutorial, you will learn about the VARP Function in SQL Server, where you will learn how to compute the variance of all data using the VARP() function.
When your dataset has duplicate values, it can create noise and lead to inaccurate analysis. This function will help you deal with such unwanted items. You will understand the syntax of the VARP() function with options like ALL and DISTINCT.
Then, you take the table column values and compute the variance of those values using the VARP() function.
VARP Function in SQL Server
VARP function in SQL Server computes the variance of the population (dataset, table data). Whether you are a data analyst or a data science enthusiast, this function benefits data analysis.
The syntax of VARP is given below.
VARP(ALL | DISTINCT expression)
- VARP(): The function computes the variance of the given data set.
- Expression: The expression can be the columns, set of values, or any numeric data set. But it can’t be bit type data type; remember, as an expression, you can’t use the aggregate function or subqueries.
- ALL | DISTINCT: When you use the VAR function in SQL Server with different options, such as ALL or DISTINCT, it computes the variance of the set of data differently.
- VARP(ALL expression): When you use the VARP() function with the ALL option, it applies the VARP() function on all the values of the given expression (columns values), including the duplicated values in the columns of the table.
- VARP(DISTINCT expression): This one only applies the VARP() function on the distinct values of the columns while excluding the duplicate ones.
The VARP() function return type is a float, which signifies that after calculating the variance of the given set of values, it returns the number, and the data type of that number is always a float type.
Let’s take an example and understand how the VARP() function works in SQL Server. Consider a table SalesData with columns ProductID, Region, and SalesAmount.
If you want to compute the sale amount variance of all the values in the SalesAmount column, you can use the VARP() function on the SalesAmount column, as shown below.
SELECT VARP(SalesAmount) AS SaleAmountVar FROM SalesData;
In the result, the variance of the values in the SalesAmount column is 56422.75, as you can see in the above output. Look how the VARP() function in SQL Server computed the variance of the values in the SalesAmount.
Now you know how the VARP function in SQL Server calculates the population variance or the table column values.
VARP Function in SQL Server with ALL and DISTINCT
The VARP() function with options ALL and DISTINCT returns a different variance value.
When using the ALL option, the VARP() function will take all column values, including duplicates.
The syntax is given below.
SELECT VAR(ALL column_name)
For example, use the below Product_Sales table.
As you can see, the SalePrice column contains duplicate values, so to include the same values while computing the variance, use the ALL option with the VARP() function, as shown in the below query.
SELECT VARP(ALL SalePrice) AS VarSalePrice
From the output of the above query, you can see the considerable variability in the product’s sale price, which is 110136000000.
But here, you don’t need to worry about the variability number; understand how to use the VARP() with option ALL for computing the variance by including the duplicate values within a column.
You can use the DISTINCT option with the VAR() function to remove duplicates. Use the following syntax to do so.
SELECT VARP(DISTINCT column_name)
Compute the product sale price variability using the DISTINCT option with the below query.
SELECT VARP(DISTINCT SalePrice) AS VarianceSalePrice
The variance is now distinct because the VARP() function is applied to unique values of the SalePrice column using the DISTINCT option. As a result, it returns the value 115143209876.543.
This is how you can use the ALL or DISTINCT option with the VARP function in SQL Server to calculate the variance of the column values.
VARP Function in SQL Server using GROUP BY clause
To apply the VARP() function on a data group, you can use the GROUP BY clause in your query. It will divide the result set into different groups based on the specified column or criteria, and the VARP() function will be applied separately for each group.
For example, use the SalesData table that you have used in the first section of this tutorial.
SELECT product_id, VARP(sale_amount) AS VarSaleAmount
GROUP BY product_id;
The query output displays the sales variability for each product separately. For instance, a product with ID P101 has a variability of 116.66, while P105 has 6.25.
The variance is returned as NULL when NULL values are passed to the VARP() function for specific product IDs.
From the above example, you can apply the VARP function in SQL Server to a group of data using the GROUP BY clause.
In this SQL Server tutorial, you learned how to use the VARP function in SQL Server to calculate the population variance (table column values). you also learned how to deal with duplicate values in a column while computing the variance. Finally, the VARP() function is on the group-based data.
You may like to read:
After working for more than 15 years in the Software field, especially in Microsoft technologies, I have decided to share my expert knowledge of SQL Server. Check out all the SQL Server and related database tutorials I have shared here. Most of the readers are from countries like the United States of America, the United Kingdom, New Zealand, Australia, Canada, etc. I am also a Microsoft MVP. Check out more here.