V. pandas based on Python
2022-01-29 18:10:59 【Favor 316】
1, A brief introduction to pandas
pandas is a tool built on NumPy, created to solve data analysis tasks. It incorporates a number of libraries and some standard data models, and provides the tools needed to work with large datasets efficiently. pandas also offers a large number of functions and methods for processing data quickly and conveniently.
2, Series object
- pandas is built around two data types: Series and DataFrame.
- Series is the most basic object in pandas. It resembles a one-dimensional array and is in fact built on NumPy's array object. Unlike a NumPy array, a Series lets you attach custom labels to the data, known as the index, and then access the data through those labels.
- DataFrame is a two-dimensional table structure. A pandas DataFrame can store many different data types, and each axis has its own labels. You can think of it as a dictionary of Series.
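As a minimal sketch of the two types described above (the labels and values are made up for illustration and are not from the original article):

```python
import pandas as pd

# A Series is a one-dimensional array with a custom index (labels).
prices = pd.Series([3.5, 4.0, 2.25], index=['apple', 'pear', 'plum'])
print(prices['pear'])        # access by label
print(prices.iloc[0])        # access by position

# A DataFrame can be built from a dictionary of Series that share an index.
stock = pd.Series([10, 0, 7], index=['apple', 'pear', 'plum'])
table = pd.DataFrame({'price': prices, 'stock': stock})
print(table)
print(table.dtypes)          # each column keeps its own data type
```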
3, Arithmetic operations on Series
import pandas as pd

"""
All arithmetic operations on Series are aligned on the index.
The operators +, -, * and / can be applied to two Series;
pandas matches the values by index label and computes the result for each label.
If a label appears in only one of the two Series, the result at that label is NaN,
and the result is then stored as floating point, since NaN is a float.
"""
# The labels must match exactly (including whitespace) to be aligned
series1 = pd.Series([1, 2, 3, 4], ['London', 'HongKong', 'Mumbai', 'lagos'])
series2 = pd.Series([1, 3, 6, 4], ['London', 'Accra', 'lagos', 'Delhi'])
print(series1 - series2)
print('*' * 30)
print(series1 + series2)
print('*' * 30)
print(series1 * series2)
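The alignment rule can be checked on a smaller pair of Series. The sketch below uses shortened, illustrative data; it also shows `fill_value`, an optional argument of `Series.sub` (and of `add`, `mul`, `div`) that is not covered in the text above but may help when NaN is not wanted:

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=['London', 'HongKong', 'Lagos'])
b = pd.Series([1, 3, 6], index=['London', 'Accra', 'Lagos'])

# Labels present in only one Series produce NaN, so the result is float.
diff = a - b
print(diff)
print(diff.dtype)

# When every label matches, integer results stay integers.
same = a + a
print(same.dtype)

# fill_value stands in for a label that is missing from one of the two Series.
filled = a.sub(b, fill_value=0)
print(filled)
```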
4, Creating a DataFrame
A DataFrame (data table) is a 2-dimensional data structure that stores data in tabular form, divided into rows and columns. A DataFrame makes data easy to process: common operations include selecting or replacing row and column data, reorganizing the table, modifying the index, and filtering on multiple conditions. A DataFrame can be understood as a collection of Series that share the same index. Calling DataFrame() converts data in various formats into a DataFrame object; its three parameters data, index and columns are the data, the row index and the column index, respectively.
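A short sketch of the three parameters in use; the names and numbers are invented for illustration:

```python
import numpy as np
import pandas as pd

# data: a 2 x 3 array; index: row labels; columns: column labels.
scores = pd.DataFrame(
    data=np.array([[90, 85, 77], [62, 71, 88]]),
    index=['alice', 'bob'],
    columns=['math', 'physics', 'art'],
)
print(scores)

# A list of dictionaries also works; the keys become the column labels.
records = pd.DataFrame([{'x': 1, 'y': 2}, {'x': 3, 'y': 4}], index=['r1', 'r2'])
print(records)
```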
5, Common DataFrame attributes
import pandas as pd
from pandas import Series,DataFrame
import numpy as np
# Common DataFrame attributes
df_dict = {
    'name': ['James', 'Curry', 'Iversion'],
    'age': ['18', '20', '19'],
    'national': ['us', 'China', 'us']
}
df = pd.DataFrame(data=df_dict, index=['0', '1', '2'])
print(df)
# Get the number of rows and columns
print(df.shape)
# Get the row index
print(df.index.tolist())
# Get the column index
print(df.columns.tolist())
# Get the data type of each column
print(df.dtypes)
# Get the number of dimensions
print(df.ndim)
# The values attribute returns the DataFrame's data as a two-dimensional ndarray
print(df.values)
# Show a summary of df (info() prints the summary itself and returns None)
df.info()
# Show the first few rows; 5 by default
print(df.head(2))
# Show the last few rows
print(df.tail(1))
# Get a column of the DataFrame
print(df['name'])
# Only one column is selected, so a Series is returned
print(type(df['name']))
# If several columns are selected, a DataFrame is returned:
print(df[['name','age']])
print(type(df[['name','age']]))
# Get one row
print(df[0:1])
# Get several rows
print(df[1:3])
# Get certain columns from several rows (df[] cannot select rows and columns in one call, so two selections are chained)
print(df[1:3][['name','age']])
# Note: df[] can select either rows or columns, but not rows and columns at the same time.
'''
df.loc selects data by label
df.iloc selects data by integer position
'''
# Get the value at one row and one column
print(df.loc['0','name'])
# All columns of one row
print(df.loc['0',:])
# One row, several columns
print(df.loc['0',['name','age']])
# Select several non-adjacent rows and columns
print(df.loc[['0','2'],['name','national']])
# Select consecutive rows and non-adjacent columns
print(df.loc['0':'2',['name','national']])
# Get one row
print(df.iloc[1])
# Get consecutive rows
print(df.iloc[0:2])
# Get several non-adjacent rows
print(df.iloc[[0,2],:])
# Get one column
print(df.iloc[:,1])
# Get a single value
print(df.iloc[1,0])
# Modify a value
df.iloc[0,0]='panda'
print(df)
# Sorting a DataFrame
df = df.sort_values(by='age',ascending=False)
# ascending=False sorts in descending order; the default is ascending
print(df)
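Two details of the listing above are easy to miss, and the sketch below illustrates them on a reduced copy of the same table: label slices with loc include the end label while position slices with iloc do not, and the age column holds strings, so sorting it compares text. The conversion with `astype(int)` is an addition for illustration, not part of the original listing.

```python
import pandas as pd

df = pd.DataFrame(
    {'name': ['James', 'Curry', 'Iversion'], 'age': ['18', '20', '19']},
    index=['0', '1', '2'],
)

# Label slices with loc include the end label ...
by_label = df.loc['0':'1']
# ... while position slices with iloc exclude the end position.
by_position = df.iloc[0:1]
print(len(by_label), len(by_position))

# loc selects rows and columns in one step, which avoids chaining
# df[1:3][['name', 'age']].
subset = df.loc['1':'2', ['name', 'age']]
print(subset)

# age holds strings here, so sorting compares text; converting to int
# first sorts numerically.
ordered = df.assign(age=df['age'].astype(int)).sort_values(by='age', ascending=False)
print(ordered)
```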
6, Modifying a DataFrame's index and columns
df1 = pd.DataFrame(np.arange(9).reshape(3, 3), index = ['bj', 'sh', 'gz'], columns=['a', 'b', 'c'])
print(df1)
# Modify the index of df1
print(df1.index) # The index can be read, and it can also be assigned to
df1.index = ['beijing', 'shanghai', 'guangzhou']
print(df1)
# Custom mapping function (x is an original row or column label)
def test_map(x):
    return x + '_ABC'
# inplace: Boolean, False by default. Controls whether a new DataFrame is returned. If True, df1 is modified in place and the return value is None.
print(df1.rename(index=test_map, columns=test_map, inplace=True))
print(df1)
# rename also accepts a dictionary, to rename individual labels
# (the labels now carry the '_ABC' suffix added above)
df3 = df1.rename(index={'beijing_ABC': 'beijing'}, columns={'a_ABC': 'aa'})
print(df3)
# Use a column as the index
df1=pd.DataFrame({'X':range(5),'Y':range(5),'S':list("abcde"),'Z':[1,1,2,2,2]})
print(df1)
# Set a column as the index (drop=False keeps that column in the data as well)
result = df1.set_index('S',drop=False)
result.index.name=None
print(result)
# Use a row as the column labels
# (set_axis returns a new DataFrame; its inplace argument was removed in pandas 2.0)
result = df1.set_axis(df1.iloc[0], axis=1)
result.columns.name=None
print(result)
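To complement the listing above, here is a small sketch on made-up data. It shows that rename without inplace=True leaves the original untouched, and that `reset_index` (not used in the original listing) moves an index created by set_index back into a column:

```python
import pandas as pd

df = pd.DataFrame({'S': list('abc'), 'X': [0, 1, 2]})

# Without inplace=True, rename returns a new DataFrame and leaves df as it was.
renamed = df.rename(columns={'X': 'value'})
print(list(df.columns), list(renamed.columns))

# A function is applied to every label; a dictionary changes only listed labels.
lower = df.rename(columns=str.lower)
print(list(lower.columns))

# set_index moves a column into the index; reset_index moves it back.
indexed = df.set_index('S')
restored = indexed.reset_index()
print(restored.equals(df))
```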
Copyright notice
Author: [Favor 316]. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2022/01/202201291810568430.html