current position:Home>You can easily get started with Excel. Python data analysis package pandas (2): advanced filtering (2)

You can easily get started with Excel. Python data analysis package pandas (2): advanced filtering (2)

2021-08-23 04:46:59 Excel catalyst

Series articles :

Preface

I often listen to others Python How powerful in the data field , As a result, I studied for a long time , Even data processing is a death of trouble . Only later , It's not Python Data processing is powerful , But he has a data analysis artifact —— pandas .

Followed by the advanced screening in the previous article , But the example in the previous article is too simple , It's a little difficult this time .

data

Continue to use the data in the previous article :

  • - An examination report form of a school (8 Academic achievements )

Load data

pandas Need to load Excel data , as follows :

Complex filtering

" People who are above the grade average ",Excel The condition area settings for advanced filtering are as follows :

  • - This time you need to use the function formula in the condition value
  • - =K2>AVERAGE(K:K) , You can imagine ,Excel Will traverse each line , Ergodic time , I'll put... In the function formula K2 , Replace... With the value of the current line

pandas The implementation idea is the same , as follows :

The above requirements are still a little simple , More stuff .

" A school whose total score is higher than the average score of its class ",Excel The condition area settings for advanced filtering are as follows :

  • - You can't simply use AVERAGE , This is because we need to be based on class Get the average score of the class , This use AVERAGEIF(B:B,B2,K:K)

pandas Implementation is a little troublesome . as follows :

  • - .groupby(' class ') , Group by class
  • - .apply , For each group, query the records whose total score exceeds the average score . there query The string is the same as the previous example

Embarrassment Excel The task of

Some tasks use Excel The built-in function is very cumbersome , such as :

" In the whole class ,8 Students whose grades exceed the grade average "

If this task uses Excel Realization , Each account must be averaged one by one , Direct look at pandas The implementation of the :

  • - Define a method , This facilitates reuse of logic
  • - front 2 sentence , First find the average score of each subject
  • - Then find out the number of subjects above average for each student count
  • - Finally, make a simple judgment , You get the result

Look at the call result :

Now , If you need to see "8 Students whose grades exceed the average of the class ", It's very simple :

  • - These are all in the class " Three good students "

Is it right? so easy?!

Let's see which students need to stay in class to see their parents .

"8 Students whose grades are below the average in the class ", It's still simple :

  • - Just add one parameter sjs=0 , You get the result

This article is from WeChat official account. - Excel catalyst (ExcelCuiHuaJi)

The source and reprint of the original text are detailed in the text , If there is any infringement , Please contact the [email protected] Delete .

Original publication time : 2019-08-21

Participation of this paper Tencent cloud media sharing plan , You are welcome to join us , share .

copyright notice
author[Excel catalyst],Please bring the original link to reprint, thank you.
https://en.pythonmana.com/2021/08/20210823044656698P.html