There is no shortcut to filter or get the top n percent scores from each group in Google Sheets.
One of my readers, Lorraine, asked me about the possibility of filtering the top 10% scores from each group.
As far as I know, in Docs Sheets, there is no built-in command or function for the same. I have workarounds. So I thought about writing this tutorial.
We can use combinations of four functions – Percentile, Filter, Vlookup, and IF – for the said purpose. There are also four steps involved.
In my last post, I have already covered the first step. I’ll link to that post when we start writing the formula below.
Without further ado, I am going to the steps below.
Get the Top N Percent Scores From Each Group – Step-by-Step Instructions
Sample Data (in cell range B1:C17) and Expected Result (in cell range in G7:H10):
There are two groups in my sample, and they are “a” and “b”.
In my result, there are three scores. Two from the first group and one from the last group. Because in the example, we are filtering the top 10% values from each group.
To get the top n percent scores from each group in Google Sheets, we should first find the k-th percentile value.
That means to get the top 10% scores, first, we should find the 90th percentile. We can use the Percentile function for the same. That’s step 1.
Here are the four steps involved.
Group Wise 90th Percentile of Scores (Step 1)
We are writing the formula to get the top n percent scores from each group in Google Sheets. Let’s consider the n is 10%.
So what we should do is to find the 90th percentile value from each group.
That means the scores above the 90th percentile values are the top 10% scores in each group.
In cell G2, unique the items using the following Unique formula.
=unique(B2:B)
In cell H2, key in the following formula and drag to H3 to get the 90th percentile of each item.
=round(percentile(filter($C$2:$C,$B$2:$B=G2),90%))
For details (formula explanation), you can read this post – Calculating Percentile for Each Group in Google Sheets.
Note:-
Rounding the percentile is a must, and I have opted for that in the formula. But it may sometimes cause issues if you have decimal scores.
For example, if the 90th percentile is 6.5 when you round, it will become 7. So if you have no scores above 7, the formula may not return any rows from that group.
Example:
The 90th percentile of the values 5, 5, 5, 5, 6.5, 6.5, and 6.5 will be 6.5. If you round this percentile value, it will become 7. So you won’t be able to filter the top 10% values from this.
That means, if you have decimal scores, either round down or do not round the percentile value.
Lookup and Assign Percentiles to Groups (Step 2)
Step 2 is the key in filtering the top n percent scores from each group.
The above formula returns two scores in cell range H2:H3. Using a Vlookup Array Formula, we can assign those values against the items.
Insert the following Vlookup in cell D2.
=ArrayFormula(IFNA(vlookup(B2:B,G2:H3,2,0)))
Vlookup searches the items (B2:B) in the first column of the range (G2:H3) and returns the 90th percentile values from the second column (H2:H3).
Now you may please take a look at the column C and D in the screenshot above.
As you can understand, we only want to filter the rows wherever the score is greater than or equal to the corresponding percentile value.
We can apply this condition in the next column using a logical statement. That’s the next step.
Logical Test Scores and Percentile Values (Step 3)
We can insert the following IF statement for the said purpose.
=ArrayFormula(if(C2:C="",,if(C2:C>=D2:D,true,false)))
This step 3 formula should go in cell E2.
What does this formula do?
It tests whether the scores (C2:C) are >= percentile values (D2:D). If it satisfies, the formula returns TRUE else FALSE.
Now to the final part.
Filter TRUE Values to Get the Top N Percent Scores from Each Group (Step 4)
Since we have already completed the groundworks, it’s now easy to get the top n (here 10) percent values from each group.
Just insert the following FILTER formula in cell G8, and voila!
=filter(B2:C,E2:E=true)
The formula filters the table (B2:C) if the column E2:E contains TRUE.
Conclusion
The only roadblock in combining all the above four steps into a single formula is the step 1 formula.
We can’t make that an array formula.
That means we can get the result in 2 steps.
The first one is, of course, the step 1 formula. The second one is the combined step 2 to 4 formulas.
You can skip the steps 2 to 4 formulas and instead use the below formula in G8.
=filter(B2:C,if(C2:C="",,if(C2:C>=IFNA(vlookup(B2:B,G2:H3,2,0)),true,false))=true)
Additional Notes:-
In the above sample data, there are only two groups. So the step 1 formula returns only two rows in G2:H3.
If you have growing data, you can’t predict the number of groups.
So better insert the above formula in cell J2 and replace G2:H3 in the formula with G2:H. Then, drag the H2 formula further down depending on the number of groups.
That’s all about how to get top n percent scores from each group in Google Sheets.
Thanks for the stay. Enjoy!
Resources:
- Average of Top N Percent of the Values in Google Sheets.
- How to Randomly Extract a Certain Percentage of the Rows in Google Sheets.
- Query to Filter a Column Contains Percentage Values in Google Sheets.
- Calculating the Percentage of Total in Google Sheets [How To].
- Percent Distribution of Grand Total in Google Sheets Query.
- Percentage Change Array Formula in Google Sheets.