# You can easily get started with Excel. Python data analysis package pandas (2): advanced filtering (2)

2021-08-23 04:46:59

Series articles ：

Preface

I often listen to others Python How powerful in the data field , As a result, I studied for a long time , Even data processing is a death of trouble . Only later , It's not Python Data processing is powerful , But he has a data analysis artifact —— pandas .

Followed by the advanced screening in the previous article , But the example in the previous article is too simple , It's a little difficult this time .

data

Continue to use the data in the previous article ：

• - An examination report form of a school (8 Academic achievements )

pandas Need to load Excel data , as follows ：

Complex filtering

" People who are above the grade average ",Excel The condition area settings for advanced filtering are as follows ：

• - This time you need to use the function formula in the condition value
• - =K2>AVERAGE(K:K) , You can imagine ,Excel Will traverse each line , Ergodic time , I'll put... In the function formula K2 , Replace... With the value of the current line

pandas The implementation idea is the same , as follows ：

The above requirements are still a little simple , More stuff .

" A school whose total score is higher than the average score of its class ",Excel The condition area settings for advanced filtering are as follows ：

• - You can't simply use AVERAGE , This is because we need to be based on class Get the average score of the class , This use AVERAGEIF(B:B,B2,K:K)

pandas Implementation is a little troublesome . as follows ：

• - .groupby(' class ') , Group by class
• - .apply , For each group, query the records whose total score exceeds the average score . there query The string is the same as the previous example

Some tasks use Excel The built-in function is very cumbersome , such as ：

" In the whole class ,8 Students whose grades exceed the grade average "

If this task uses Excel Realization , Each account must be averaged one by one , Direct look at pandas The implementation of the ：

• - Define a method , This facilitates reuse of logic
• - front 2 sentence , First find the average score of each subject
• - Then find out the number of subjects above average for each student count
• - Finally, make a simple judgment , You get the result

Look at the call result ：

Now , If you need to see "8 Students whose grades exceed the average of the class ", It's very simple ：

• - These are all in the class " Three good students "

Is it right? so easy？！

Let's see which students need to stay in class to see their parents .

"8 Students whose grades are below the average in the class ", It's still simple ：

• - Just add one parameter sjs=0 , You get the result

The source and reprint of the original text are detailed in the text , If there is any infringement , Please contact the [email protected] Delete .

Original publication time ： 2019-08-21

Participation of this paper Tencent cloud media sharing plan , You are welcome to join us , share .