# Pandas: how to group and calculate by index

2022-01-30 11:52:26

You can use the following methods in pandas Group by one or more index columns and perform some calculations .

Method 1： Group by an index column

``````df.groupby('index1')['numeric_column'].max()

Copy code ``````

Method 2： Group by multiple index columns

``````df.groupby(['index1', 'index2'])['numeric_column'].sum()
Copy code ``````

Method 3： Group by index column and general column

``````df.groupby(['index1', 'numeric_column1'])['numeric_column2'].nunique()
Copy code ``````

The following example shows how to use the following multi indicator pandas DataFrame To use each method .

``````import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
'position': ['G', 'G', 'G', 'F', 'F', 'G', 'G', 'F', 'F', 'F'],
'points': [7, 7, 7, 19, 16, 9, 10, 10, 8, 8],
'rebounds': [8, 8, 8, 10, 11, 12, 13, 13, 15, 11]})

#set 'team' column to be index column
df.set_index(['team', 'position'], inplace=True)

#view DataFrame
df

points	 rebounds
team	position
A	G	 7	 8
G	 7	 8
G	 7	 8
F	 19	 10
F	 16	 11
B	G	 9	 12
G	 10	 13
F	 10	 13
F	 8	 15
F	 8	 11

Copy code ``````

### Method 1： Group by an index column

The following code shows how to find 'points' The maximum value of the column , And pass 'position' Index columns are grouped .

``````#find max value of 'points' grouped by 'position index column
df.groupby('position')['points'].max()

position
F    19
G    10
Name: points, dtype: int64

Copy code ``````

### Method 2： Group by multiple index columns

The following code shows how to pass " The team " and " Location " Index columns to find " integral " The sum of the columns .

``````#find max value of 'points' grouped by 'position index column
df.groupby(['team', 'position'])['points'].sum()

team  position
A     F           35
G           21
B     F           26
G           19
Name: points, dtype: int64
Copy code ``````

### Method 3： Group by index column and general column

The following code shows how to ' rebounds ' The number of unique values found in the column , And through the index column ' The team ' And normal columns ' integral ' grouping .

``````#find max value of 'points' grouped by 'position index column
df.groupby(['team', 'points'])['rebounds'].nunique()

team  points
A     7         1
16        1
19        1
B     8         2
9         1
10        1
Name: rebounds, dtype: int64
Copy code ``````