[Python | Kaggle] Machine Learning Series: Pandas Basic Exercises (6)

2021-08-23 10:18:58 Haibang Pro

Preface

Hello, friends!
Thank you very much for reading Haihong's article. If you find any mistakes in the text, you are welcome to point them out ~
 
About the author ଘ(੭ˊᵕˋ)੭
Nickname: Haihong
Tags: Programmer | C++ player | Student
Bio: Got into programming through the C language, then switched to computer science. Lucky enough to win some national and provincial awards … and has secured recommended admission to graduate school. Currently learning C++/Linux/Python.
Study tips: solid fundamentals + take more notes + write more code + think more + learn English well!
 
Currently at the Python beginner stage.
These articles are only my own study notes, written to build up a knowledge system and for review.
There are not many questions: learn one question, understand one problem.
Know what it is, and know why!

Previous articles

[Python | Kaggle] Machine Learning Series: Pandas Basic Exercises (1)

[Python | Kaggle] Machine Learning Series: Pandas Basic Exercises (2)

[Python | Kaggle] Machine Learning Series: Pandas Basic Exercises (3)

[Python | Kaggle] Machine Learning Series: Pandas Basic Exercises (4)

[Python | Kaggle] Machine Learning Series: Pandas Basic Exercises (5)

Introduction

Run the following cell to load your data and some utility functions.

In other words: run the code below to import the libraries and load the dataset needed for these exercises.

import pandas as pd

reviews = pd.read_csv("../input/wine-reviews/winemag-data-130k-v2.csv", index_col=0)

from learntools.core import binder; binder.bind(globals())
from learntools.pandas.renaming_and_combining import *
print("Setup complete.")

Exercises

View the first several lines of your data by running the cell below:

reviews.head()


1.

Question

region_1 and region_2 are pretty uninformative names for locale columns in the dataset. Create a copy of reviews with these columns renamed to region and locale, respectively.

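Answer

What the question asks:

Rename the columns region_1 and region_2 to region and locale, respectively.
In other words, only the column names change; rename returns a new DataFrame and leaves reviews untouched.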

renamed = reviews.rename(columns={
    'region_1':'region','region_2':'locale'})

Alternative reference solution:

renamed = reviews.rename(columns=dict(region_1='region', region_2='locale'))
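To make the behaviour easier to check without the Kaggle data, here is a minimal sketch on a toy DataFrame. The names toy and renamed_toy and all cell values are made up for illustration; only the rename call mirrors the answer above.

```python
import pandas as pd

# Toy stand-in for the wine reviews data (values are invented for illustration).
toy = pd.DataFrame({
    "country": ["Italy", "US"],
    "region_1": ["Etna", "Willamette Valley"],
    "region_2": [None, "Willamette Valley"],
})

# rename returns a new DataFrame by default; the original is not modified.
renamed_toy = toy.rename(columns={"region_1": "region", "region_2": "locale"})

print(list(renamed_toy.columns))  # ['country', 'region', 'locale']
print(list(toy.columns))          # ['country', 'region_1', 'region_2']
```

Columns that are not listed in the mapping (country here) keep their names.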

2.

Question

Set the index name in the dataset to wines.

Answer

What the question asks:

Give the index axis the name wines.

reindexed = reviews.rename_axis('wines', axis='rows')

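As a small sketch of what rename_axis does, the example below uses a toy DataFrame (the names toy, reindexed_toy and named_both and the values are assumptions for illustration). It only names the axis; the index labels themselves are left as they are.

```python
import pandas as pd

# Toy data with a default integer index (values are invented).
toy = pd.DataFrame({"points": [87, 90]})

# Give the row axis a name; the original DataFrame is left unchanged.
reindexed_toy = toy.rename_axis("wines", axis="rows")
print(reindexed_toy.index.name)  # wines
print(toy.index.name)            # None

# The column axis can be named in the same way.
named_both = reindexed_toy.rename_axis("fields", axis="columns")
print(named_both.columns.name)   # fields
```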

3.

Question

The Things on Reddit dataset includes product links from a selection of top-ranked forums ("subreddits") on reddit.com. Run the cell below to load a dataframe of products mentioned on the /r/gaming subreddit and another dataframe for products mentioned on the /r/movies subreddit.

Run the following code to load the two datasets needed for this question.

gaming_products = pd.read_csv("../input/things-on-reddit/top-things/top-things/reddits/g/gaming.csv")
gaming_products['subreddit'] = "r/gaming"
movie_products = pd.read_csv("../input/things-on-reddit/top-things/top-things/reddits/m/movies.csv")
movie_products['subreddit'] = "r/movies"
gaming_products

Create a DataFrame of products mentioned on either subreddit.

Answer

What the question asks:

Concatenate the two datasets, stacking their rows into a single DataFrame.

combined_products = pd.concat([gaming_products, movie_products])

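A minimal sketch of the same concatenation on toy data follows. The names gaming_toy, movies_toy, stacked and renumbered and the product names are made up. It also shows one detail worth knowing: by default pd.concat keeps each frame's original index labels, so labels can repeat unless ignore_index=True is passed.

```python
import pandas as pd

# Toy stand-ins for the two subreddit tables (values are invented).
gaming_toy = pd.DataFrame({"name": ["Controller", "Headset"], "subreddit": "r/gaming"})
movies_toy = pd.DataFrame({"name": ["Projector"], "subreddit": "r/movies"})

# Stack the rows of both frames; the original index labels are kept.
stacked = pd.concat([gaming_toy, movies_toy])
print(len(stacked))            # 3
print(stacked.index.tolist())  # [0, 1, 0]

# Optionally renumber the rows from 0 after concatenating.
renumbered = pd.concat([gaming_toy, movies_toy], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2]
```

Whether to renumber depends on whether the original row labels carry meaning; the exercise answer above keeps them.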

4.

Question

The Powerlifting Database dataset on Kaggle includes one CSV table for powerlifting meets and a separate one for powerlifting competitors. Run the cell below to load these datasets into dataframes:

Run the following code to load the datasets needed for this question.

powerlifting_meets = pd.read_csv("../input/powerlifting-database/meets.csv")
powerlifting_competitors = pd.read_csv("../input/powerlifting-database/openpowerlifting.csv")
powerlifting_meets,powerlifting_competitors

[Image: the first dataset, powerlifting_meets; note the number of columns]
[Image: the second dataset, powerlifting_competitors; note the number of columns]
Both tables include references to a MeetID, a unique key for each meet (competition) included in the database. Using this, generate a dataset combining the two tables into one.

Answer

What the question asks:

Join the two datasets horizontally, matching rows on MeetID.

powerlifting_combined = powerlifting_meets.set_index("MeetID").join(powerlifting_competitors.set_index("MeetID"))

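The join can be sketched on toy data as follows. The frames meets_toy and lifters_toy, their non-key columns and all values are assumptions for illustration; only the set_index("MeetID").join(...) pattern comes from the answer above. The sketch also shows a merge call that should produce the same rows while keeping MeetID as an ordinary column.

```python
import pandas as pd

# Toy stand-ins for the meets and competitors tables (values are invented).
meets_toy = pd.DataFrame({
    "MeetID": [0, 1],
    "MeetName": ["Spring Open", "Fall Classic"],
})
lifters_toy = pd.DataFrame({
    "MeetID": [0, 0, 1],
    "Name": ["Ann", "Bo", "Cy"],
    "TotalKg": [300.0, 410.0, 365.5],
})

# join matches rows on the index, so MeetID is moved into the index first.
# The default is a left join: every meet is kept, once per matching competitor.
joined = meets_toy.set_index("MeetID").join(lifters_toy.set_index("MeetID"))
print(joined.shape)  # (3, 3)

# A comparable result with merge, which matches on a column instead.
merged = meets_toy.merge(lifters_toy, on="MeetID", how="left")
print(merged.shape)  # (3, 4)
```

Note that join raises an error when the two frames share a non-index column name, unless lsuffix and/or rsuffix are supplied; the toy frames here have no such overlap.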

Conclusion

This article is only a set of study notes, recording the journey from 0 to 1.

I hope it is of some help to you. If you find any mistakes, you are welcome to correct them ~

I am Haihong ଘ(੭ˊᵕˋ)੭

If you found it useful, please give it a like.

Thank you for your support!


copyright notice
Author: Haibang Pro. Please include the original link when reprinting, thank you.
https://en.pythonmana.com/2021/08/20210823101855477j.html
