Introduction to pandas operations
2022-02-02 02:23:32 【Xiao Wang is not serious】
Indexes
Create and add
Method 1:
import pandas as pd
df=pd.read_excel('text.xlsx',index_col='name')
print(df)
Method 2:
import pandas as pd
df=pd.read_excel('text.xlsx')
df=df.set_index('name')
print(df)
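Both methods should produce the same frame. A minimal sketch, assuming a small made-up CSV string in place of text.xlsx (read_csv accepts index_col in the same way as read_excel; the data here is hypothetical):

```python
import io
import pandas as pd

# Hypothetical stand-in for text.xlsx
csv_text = "name,team,Q1\nLiver,E,89\nArry,C,36\n"

# Method 1: choose the index column while reading
df1 = pd.read_csv(io.StringIO(csv_text), index_col="name")

# Method 2: read first, then move the column into the index
df2 = pd.read_csv(io.StringIO(csv_text)).set_index("name")

same = df1.equals(df2)
print(same)
```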
Multi-level index
import pandas as pd
df=pd.read_excel('text.xlsx')
df=df.set_index(['name','team'])
print(df)
Keep the column used as the index (drop=False)
import pandas as pd
df=pd.read_excel('text.xlsx')
df=df.set_index('name',drop=False)
print(df)
Append to the existing index (keep the original index)
import pandas as pd
df=pd.read_excel('text.xlsx')
df=df.set_index('name',append=True)
print(df)
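The set_index variants above differ mainly in what ends up in the index and what stays in the columns. A small sketch on a made-up frame (the names and scores are hypothetical) that makes the difference visible:

```python
import pandas as pd

# Hypothetical data standing in for text.xlsx
df = pd.DataFrame({
    "name": ["Liver", "Arry", "Ack"],
    "team": ["E", "C", "A"],
    "Q1": [89, 36, 57],
})

by_name = df.set_index("name")                # name leaves the columns
multi = df.set_index(["name", "team"])        # two-level index
kept = df.set_index("name", drop=False)       # name is both index and column
appended = df.set_index("name", append=True)  # original integer index is kept as level 0

print(list(by_name.columns))
print(list(multi.index.names))
print("name" in kept.columns)
print(list(appended.index.names))
```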
Reset (restore)
By default, reset_index() moves every index level back into the columns.
import pandas as pd
df=pd.read_excel('text.xlsx')
df=df.set_index('name')
print(df)
df=df.reset_index()
# To reset only one level, pass its position or its name instead:
# df=df.reset_index(level=0)
# df=df.reset_index(level='name')
print(df)
After the reset, name is an ordinary column again instead of the index.
Note:
If the index was set with drop=False, the column still exists, so reset_index() raises an error
reporting that the column already exists.
import pandas as pd
df=pd.read_excel('text.xlsx')
df=df.set_index('name',drop=False)
print(df)
df=df.reset_index() # raises ValueError because the name column already exists
print(df)
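One way around this error, sketched on a made-up frame (hypothetical data), is to discard the index instead of reinserting it, since the column is already present. This is only one possible workaround, not necessarily what the author intended:

```python
import pandas as pd

# Hypothetical data standing in for text.xlsx
df = pd.DataFrame({"name": ["Liver", "Arry"], "team": ["E", "C"], "Q1": [89, 36]})
kept = df.set_index("name", drop=False)

message = None
try:
    kept.reset_index()
except ValueError as err:
    # pandas refuses to insert a column that already exists
    message = str(err)

# drop=True throws the index away rather than turning it into a column
restored = kept.reset_index(drop=True)
print(message)
print(restored)
```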
Attributes
import pandas as pd
df=pd.read_excel('text.xlsx')
# Name of the index
print(df.index.name)
# Values as a pandas array
print(df.index.array)
# Data type
print(df.index.dtype)
# Number of elements
print(df.index.size)
# Values as a NumPy array
print(df.index.values)
Common operations
df.index.astype('int64') # Convert the dtype
df.index.isin(['Liver','Arry']) # Test whether each label is in the given list
df.index.rename('number') # Return a copy of the index with a new name
df.index.rename(['name', 'team']) # Rename the levels of a multi-level index
df.index.nunique() # Number of distinct values
df.index.sort_values(ascending=False) # Sort in descending order
df.index.map(lambda x:x+'_') # Apply a function to each label
df.index.str.replace('_', '') # String replacement
df.index.str.split('_') # String split
df.index.to_list() # Convert to a list
df.index.to_frame(index=False, name='a') # Convert to a DataFrame
df.index.to_series() # Convert to a Series
df.index.to_numpy() # Convert to a NumPy array
df.index.unique() # Distinct values
df.index.value_counts() # Count the occurrences of each distinct value
df.index.where(df.index=='adf') # Keep labels where the condition holds; the others become NaN
df.index.max() # Maximum
df.index.argmax() # Integer position of the maximum
df.index.min() # Minimum
df.index.argmin() # Integer position of the minimum
df.index.T # Transpose (an Index is its own transpose)
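A few of these operations, tried on small hand-made Index objects (the values are hypothetical and chosen only to keep the results easy to check). Note that these methods return new objects and leave the original index unchanged:

```python
import pandas as pd

# Hypothetical numeric index
idx = pd.Index([3, 1, 3, 2], name="number")

distinct = idx.nunique()                       # number of distinct values
membership = idx.isin([1, 2])                  # boolean array, one entry per label
descending = idx.sort_values(ascending=False)  # largest first
max_pos = idx.argmax()                         # position of the first maximum
renamed = idx.rename("id")                     # new Index; idx keeps its old name

# Hypothetical string index for the str and map operations
labels = pd.Index(["a_", "b_"])
cleaned = labels.str.replace("_", "")
suffixed = cleaned.map(lambda x: x + "!")

print(distinct, max_pos, renamed.name, idx.name)
print(descending.tolist(), cleaned.tolist(), suffixed.tolist())
```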
Rename the index
import pandas as pd
df=pd.read_excel('text.xlsx')
df=df.set_index('name')
print(df)
df=df.rename_axis('index')
print(df)
Rename a multi-level index
import pandas as pd
df=pd.read_excel('text.xlsx')
df=df.set_index(['name','team'])
print(df)
df=df.rename_axis(['index1','index2'])
print(df)
Modify index content
Change column names
import pandas as pd
df=pd.read_excel('text.xlsx')
df=df.set_index('name')
print(df)
# Rename individual columns
df=df.rename(columns={'team':'0'})
print(df)
# Replace all column labels; the list length must equal the number of columns
df=df.set_axis(['0','1','2','3','4'],axis=1)
print(df)
Modify the row index
import pandas as pd
df=pd.read_excel('text.xlsx')
df=df.set_index('name')
print(df)
# Rename individual index labels
df=df.rename(index={'Liver':'1'})
print(df)
# Replace all index labels; the list length must equal the number of rows
df=df.set_axis(list(range(len(df))),axis='index')
print(df)
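rename changes only the labels named in the mapping, while set_axis replaces every label at once and therefore needs exactly one new label per row or column. A short sketch on made-up data (the frame and the new label 'group' are hypothetical):

```python
import pandas as pd

# Hypothetical data standing in for text.xlsx
df = pd.DataFrame({"name": ["Liver", "Arry", "Ack"], "team": ["E", "C", "A"], "Q1": [89, 36, 57]})
by_name = df.set_index("name")

# Mapping-based rename: labels not in the mapping stay as they are
renamed_col = by_name.rename(columns={"team": "group"})
renamed_row = by_name.rename(index={"Liver": "L"})

# Full replacement: derive the length from the frame instead of hard-coding it
relabeled = by_name.set_axis(list(range(len(by_name))), axis="index")

print(list(renamed_col.columns))
print(list(renamed_row.index))
print(list(relabeled.index))
```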
Data
Preview data
import pandas as pd
df=pd.read_excel('text.xlsx')
# First rows (default 5)
print(df.head())
# Last rows (default 5)
print(df.tail())
# Random rows (default 1)
print(df.sample())
Specify the number of rows
import pandas as pd
df=pd.read_excel('text.xlsx')
print(df.head(2))
print(df.tail(2))
print(df.sample(2))
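Attributes
import pandas as pd
df=pd.read_excel('text.xlsx')
# Dimensions as (rows, columns)
print(df.shape)
# Summary of columns, non-null counts and dtypes; info() is a method and prints the summary itself
df.info()
# Data type of each column
print(df.dtypes)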
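Because info() prints its summary and returns None, wrapping it in print() adds nothing. If the text is needed as a string, one option is to pass a buffer through the buf parameter. A minimal sketch on a made-up frame (hypothetical data):

```python
import io
import pandas as pd

# Hypothetical data standing in for text.xlsx
df = pd.DataFrame({"name": ["Liver", "Arry", "Ack"], "team": ["E", "C", "A"], "Q1": [89, 36, 57]})

shape = df.shape      # (rows, columns)
result = df.info()    # prints the summary; the return value is None

# Capture the same summary as a string instead of printing it
buf = io.StringIO()
df.info(buf=buf)
summary = buf.getvalue()

print(shape)
print(df.dtypes)
```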
Data statistics and processing
Summary statistics
describe() reports the count, mean, standard deviation, minimum, quartiles (25%, 50%, 75%) and maximum of each numeric column.
import pandas as pd
df=pd.read_excel('text.xlsx')
print(df.describe())
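A quick check of what describe() returns, using a made-up frame (hypothetical data). By default only numeric columns appear in the result:

```python
import pandas as pd

# Hypothetical data: one text column, one numeric column
df = pd.DataFrame({"team": ["E", "C", "A", "C"], "Q1": [1, 2, 3, 4]})

stats = df.describe()
print(stats)
print(list(stats.index))
```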
Functions
df.mean() # Mean of each column
df.corr() # Pairwise correlation coefficients between columns
df.count() # Number of non-null values in each column
df.max() # Maximum of each column
df.min() # Minimum of each column
df.abs() # Absolute value of every element
df.median() # Median of each column
df.std() # Sample standard deviation of each column (Bessel-corrected, ddof=1)
df.var() # Unbiased sample variance (ddof=1)
df.sem() # Standard error of the mean
df.mode() # Mode (most frequent value) of each column
df.prod() # Product of the values in each column
df.mad() # Mean absolute deviation (removed in pandas 2.0)
df.cumprod() # Cumulative product
df.cumsum(axis=0) # Cumulative sum down each column
df.nunique() # Number of distinct values in each column
df.idxmax() # Index label of the maximum in each column
df.idxmin() # Index label of the minimum in each column
df.cummax() # Cumulative maximum
df.cummin() # Cumulative minimum
df.skew() # Sample skewness (third moment)
df.kurt() # Sample kurtosis (fourth moment)
df.quantile() # Sample quantile (default q=0.5, the median)
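Two details from the list are worth a sketch: restricting aggregations to numeric columns when the frame also holds text, and computing the mean absolute deviation by hand where df.mad() is no longer available. The frame below is made up, and the manual formula is one straightforward way to get the same statistic:

```python
import pandas as pd

# Hypothetical data: one text column and two numeric columns
df = pd.DataFrame({
    "name": ["Liver", "Arry", "Ack", "Eorge"],
    "Q1": [1, 2, 3, 4],
    "Q2": [10, 20, 30, 40],
})

# numeric_only=True skips the text column
col_means = df.mean(numeric_only=True)

scores = df[["Q1", "Q2"]]

# Mean absolute deviation: average distance of each value from its column mean
mad = (scores - scores.mean()).abs().mean()

# var() defaults to ddof=1 (sample variance); ddof=0 gives the population variance
sample_var = scores["Q1"].var()
population_var = scores["Q1"].var(ddof=0)

print(col_means)
print(mad)
print(sample_var, population_var)
```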
Specify a single column
import pandas as pd
df=pd.read_excel('text.xlsx')
print(df)
print(df['Q1'].mean())
Specify a single row
The first two values in the row are strings, so slice them off before averaging.
import pandas as pd
df=pd.read_excel('text.xlsx')
print(df.loc[0])
print(df.loc[0][2:].mean())
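Instead of counting how many leading text columns to skip, the numeric columns can also be selected by label or by dtype. A sketch on made-up data (the names and scores are hypothetical):

```python
import pandas as pd

# Hypothetical data standing in for text.xlsx
df = pd.DataFrame({
    "name": ["Liver", "Arry"],
    "team": ["E", "C"],
    "Q1": [89, 36],
    "Q2": [21, 37],
    "Q3": [24, 37],
    "Q4": [64, 57],
})

# Mean of one row, choosing the score columns by label
row_mean = df.loc[0, "Q1":"Q4"].mean()

# Mean of every row, keeping only the numeric columns
all_row_means = df.select_dtypes(include="number").mean(axis=1)

print(row_mean)
print(all_row_means)
```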
df.round(2) # Round every numeric column to 2 decimal places
df.nunique() # Number of distinct values in each column
s.nunique() # Number of distinct values in a single column (s is a Series)
Differences with diff()
import pandas as pd
s=pd.Series([2,12,6,5,10])
# Difference between each value and the previous one
print(s.diff())
# Difference between each value and the next one
print(s.diff(-1))
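The periods argument of diff() decides which neighbour is subtracted: a positive value looks backwards, a negative value looks forwards. A sketch with the same five numbers, dropping the NaN entries so the results are easy to compare:

```python
import pandas as pd

s = pd.Series([2, 12, 6, 5, 10])

prev_diff = s.diff()         # value minus the previous value
next_diff = s.diff(-1)       # value minus the next value
two_ahead = s.diff(-2)       # value minus the value two positions later

print(prev_diff.dropna().tolist())
print(next_diff.dropna().tolist())
print(two_ahead.dropna().tolist())
```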
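diff() on a DataFrame
import pandas as pd
df=pd.read_excel('text.xlsx')
print(df.loc[:5,'Q1':'Q4'])
# Row-wise differences: each row minus the previous row
print(df.loc[:5,'Q1':'Q4'].diff())
# Column-wise differences: each column minus the column to its left
print(df.loc[:5,'Q1':'Q4'].diff(1,axis=1))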
Shifting data with shift()
import pandas as pd
df=pd.read_excel('text.xlsx')
# Shift down 2 rows
print(df.shift(2))
# Shift up 2 rows
print(df.shift(-2))
# Shift right 2 columns
print(df.shift(2,axis=1))
# Shift left 2 columns
print(df.shift(-2,axis=1))
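The direction of a shift is easiest to see on a tiny frame. In this sketch (hypothetical all-numeric data) a positive period moves values down or to the right, a negative period moves them up or to the left, and the vacated cells become NaN:

```python
import pandas as pd

# Hypothetical all-numeric data
df = pd.DataFrame({"Q1": [1, 2, 3], "Q2": [4, 5, 6], "Q3": [7, 8, 9]})

down = df.shift(1)            # each row takes the values of the row above
right = df.shift(1, axis=1)   # each column takes the values of the column to its left
left = df.shift(-1, axis=1)   # each column takes the values of the column to its right

print(down)
print(right)
print(left)
```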
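Ranking with rank()
import pandas as pd
df=pd.read_excel('text.xlsx')
print(df.head())
# Rank the values within each column
print(df.head().rank())
# Rank the values within each row, using only the numeric columns
print(df.head().rank(axis=1,numeric_only=True))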
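rank() assigns 1 to the smallest value by default and gives tied values the average of their ranks. A sketch on made-up numbers (hypothetical data) showing column-wise ranks, row-wise ranks and the effect of the method parameter on ties:

```python
import pandas as pd

# Hypothetical scores
df = pd.DataFrame({"Q1": [89, 36, 57], "Q2": [21, 37, 60]})

by_col = df.rank()         # rank within each column
by_row = df.rank(axis=1)   # rank within each row

ties = pd.Series([10, 20, 20])
avg_ties = ties.rank()               # tied values share the average rank
min_ties = ties.rank(method="min")   # tied values share the lowest rank

print(by_col)
print(by_row)
print(avg_ties.tolist(), min_ties.tolist())
```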
Data selection
Operation | Syntax |
---|---|
Select a column | df[x] |
Select rows by label | df.loc[x] |
Select rows by integer position | df.iloc[x] |
Select rows with a slice | df[0:x] |
Filter rows with a boolean expression | df[df[x]>=0] |
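The rows of the table can be tried side by side on a small frame. The data below is made up, and the threshold 60 is an arbitrary value picked for illustration:

```python
import pandas as pd

# Hypothetical data standing in for text.xlsx
df = pd.DataFrame({
    "name": ["Liver", "Arry", "Ack", "Eorge", "Oah"],
    "team": ["E", "C", "A", "C", "D"],
    "Q1": [89, 36, 57, 93, 65],
})

col = df["name"]               # one column as a Series
by_label = df.loc[2]           # row whose index label is 2
by_pos = df.iloc[2]            # row at integer position 2
first_two = df[0:2]            # rows at positions 0 and 1
high_q1 = df[df["Q1"] >= 60]   # rows where the condition is True

print(col.tolist())
print(by_label["name"], by_pos["name"])
print(first_two)
print(high_q1)
```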
Select column
import pandas as pd
df=pd.read_excel('text.xlsx')
print(df['name'])
Select rows with a slice
import pandas as pd
df=pd.read_excel('text.xlsx')
print(df[:4])
print(df[5:10])
print(df[0::2])
Select rows by label with loc
import pandas as pd
df=pd.read_excel('text.xlsx',index_col='name')
print(df.loc['Arry'])
Select several rows with a slice or a list
import pandas as pd
df=pd.read_excel('text.xlsx',index_col='name')
print(df.loc['Arry':'Oah'])
print(df.loc[['Arry','Oah']])
Specify which columns to return
import pandas as pd
df=pd.read_excel('text.xlsx',index_col='name')
print(df.loc['Arry':'Oah',['Q1','Q2']])
Select rows by integer position with iloc
import pandas as pd
df=pd.read_excel('text.xlsx',index_col='name')
print(df.iloc[0:3])
print(df.iloc[0:10:2])
# Also choose the columns by position
print(df.iloc[0:5,[0,1]])
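One difference between the two accessors is easy to miss: a label slice with loc includes its end label, while a position slice with iloc excludes its end position. A sketch on made-up data (hypothetical names and scores):

```python
import pandas as pd

# Hypothetical data standing in for text.xlsx
df = pd.DataFrame({
    "name": ["Liver", "Arry", "Ack", "Eorge", "Oah"],
    "Q1": [89, 36, 57, 93, 65],
    "Q2": [21, 37, 60, 96, 49],
})
by_name = df.set_index("name")

label_slice = by_name.loc["Arry":"Eorge"]             # both end labels are included
pos_slice = by_name.iloc[1:3]                         # position 3 is excluded
subset = by_name.loc[["Arry", "Oah"], ["Q1", "Q2"]]   # explicit rows and columns

print(label_slice.index.tolist())
print(pos_slice.index.tolist())
print(subset)
```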
Get a single value
at[] takes a row label followed by a column name; iat[] takes the integer positions of the row and the column.
import pandas as pd
df=pd.read_excel('text.xlsx',index_col='name')
print(df)
print(df.at['Arry','Q1'])
print(df.iat[1,1])
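at[] and iat[] can address the same cell in two ways. In this sketch (made-up data), the row labelled 'Arry' sits at position 1 and the column Q1 sits at position 1 once name has become the index:

```python
import pandas as pd

# Hypothetical data standing in for text.xlsx
df = pd.DataFrame({
    "name": ["Liver", "Arry", "Ack"],
    "team": ["E", "C", "A"],
    "Q1": [89, 36, 57],
})
by_name = df.set_index("name")   # remaining columns: team (position 0), Q1 (position 1)

by_label = by_name.at["Arry", "Q1"]   # row label, column name
by_position = by_name.iat[1, 1]       # row position, column position

print(by_label, by_position)
```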
Get a column with get() (the second argument is the default returned when the column is missing)
import pandas as pd
df=pd.read_excel('text.xlsx',index_col='name')
print(df.get('team',0))
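The default only matters when the column does not exist. A sketch on made-up data, where 'missing' is a hypothetical column name that is deliberately absent:

```python
import pandas as pd

# Hypothetical data standing in for text.xlsx
df = pd.DataFrame({"name": ["Liver", "Arry"], "team": ["E", "C"]})

team = df.get("team", 0)        # column exists, so the Series is returned
fallback = df.get("missing", 0) # column absent, so the default 0 is returned
nothing = df.get("missing")     # without a default the result is None

print(team.tolist(), fallback, nothing)
```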
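Truncate data
Note: before and after are index labels and both ends are included. The index must be sorted; with the default integer index the labels are simply row numbers.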
import pandas as pd
df=pd.read_excel('text.xlsx')
print(df.truncate(before=2,after=6))
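A sketch of truncate() on made-up frames (hypothetical data): one with the default integer index, one with sorted string labels, and one with an unsorted index, which pandas rejects:

```python
import pandas as pd

# Default integer index: before and after act as row numbers, both included
numbers = pd.DataFrame({"Q1": list(range(10, 20))})
cut = numbers.truncate(before=2, after=6)

# Sorted string labels work the same way
labelled = pd.DataFrame({"Q1": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])
cut_labels = labelled.truncate(before="b", after="c")

# An unsorted index raises ValueError
unsorted = numbers.iloc[[3, 1, 2]]
error = None
try:
    unsorted.truncate(before=1, after=2)
except ValueError as err:
    error = str(err)

print(cut.index.tolist())
print(cut_labels.index.tolist())
print(error)
```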
Copyright notice
Author: Xiao Wang is not serious. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/02/202202020223314244.html