Problem description
I am trying to plot a line graph comparing the murder rates of particular states over the years 1960-1962, using pandas in a Jupyter Notebook.
A little context about where I am now, and how I arrived here:
I'm using a crime CSV file.
I'm only interested in 3 columns for the time being: State, Year, and Murder Rate. Specifically, I'm interested in only 5 states: Alaska, Michigan, Minnesota, Maine, and Wisconsin.
To produce the desired table, I did this:
al_mi_mn_me_wi = crimes[(crimes['State'] == 'Alaska') | (crimes['State'] == 'Michigan') | (crimes['State'] == 'Minnesota') | (crimes['State'] == 'Maine') | (crimes['State'] == 'Wisconsin')]
control_df = al_mi_mn_me_wi[['State', 'Year', 'Murder Rate']]
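The selection above can also be written more compactly. The following is only a sketch: the small `crimes` frame is a hypothetical stand-in for the CSV (most of its values are made up for illustration), and the 1960-1962 year filter is an assumption inferred from the `control_1960_to_1962` name used later.

```python
import pandas as pd

# Hypothetical stand-in for the crimes CSV (illustrative values only).
crimes = pd.DataFrame({
    'State': ['Alaska', 'Alaska', 'Texas', 'Maine', 'Maine', 'Ohio'],
    'Year': [1960, 1963, 1960, 1961, 1959, 1962],
    'Murder Rate': [10.2, 6.5, 8.6, 1.6, 1.5, 3.0],
})

states = ['Alaska', 'Michigan', 'Minnesota', 'Maine', 'Wisconsin']

# Keep rows whose State is in the list and whose Year falls in 1960-1962
# (Series.between is inclusive on both ends by default).
mask = crimes['State'].isin(states) & crimes['Year'].between(1960, 1962)

# Select the matching rows and the three columns of interest in one step.
control_1960_to_1962 = crimes.loc[mask, ['State', 'Year', 'Murder Rate']]
print(control_1960_to_1962)
```

`isin` replaces the chain of `==` comparisons joined with `|`, so adding or removing a state only means editing the list.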
From here I used the pivot function:
df = control_1960_to_1962.pivot(index='Year', columns='State', values='Murder Rate')
This is where I got stuck. I received a KeyError (on 'Year') when running:
df.plot(x='Year', y='Murder Rate', kind='line')
and when I tried just
df.plot()
I got a wonky graph.
How do I get my desired graph?
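Given a dataframe in long (tidy) format, pandas.DataFrame.pivot transforms it to wide format, which can be plotted directly with pandas.DataFrame.plot. After the pivot, 'Year' is the index and the state names are the columns, so there is no 'Year' or 'Murder Rate' column left to pass as x or y, hence the KeyError.
Tested in Python 3.8.11, pandas 1.3.3, matplotlib 3.4.3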
import numpy as np
import pandas as pd
control_1960_to_1962 = pd.DataFrame({
    'State': np.repeat(['Alaska', 'Maine', 'Michigan', 'Minnesota', 'Wisconsin'], 3),
    'Year': [1960, 1961, 1962]*5,
    'Murder Rate': [10.2, 11.5, 4.5, 1.7, 1.6, 1.4, 4.5, 4.1, 3.4, 1.2, 1.0, .9, 1.3, 1.6, .9]
})
df = control_1960_to_1962.pivot(index='Year', columns='State', values='Murder Rate')
# display(df)
State  Alaska  Maine  Michigan  Minnesota  Wisconsin
Year
1960     10.2    1.7       4.5        1.2        1.3
1961     11.5    1.6       4.1        1.0        1.6
1962      4.5    1.4       3.4        0.9        0.9
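To see why the `x='Year'` call from the question fails, it may help to inspect the pivoted frame. The sketch below rebuilds the same sample data; the `flat` name and the choice of two states for `y` are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

control_1960_to_1962 = pd.DataFrame({
    'State': np.repeat(['Alaska', 'Maine', 'Michigan', 'Minnesota', 'Wisconsin'], 3),
    'Year': [1960, 1961, 1962]*5,
    'Murder Rate': [10.2, 11.5, 4.5, 1.7, 1.6, 1.4, 4.5, 4.1, 3.4, 1.2, 1.0, .9, 1.3, 1.6, .9]
})
df = control_1960_to_1962.pivot(index='Year', columns='State', values='Murder Rate')

# After the pivot, 'Year' names the index; it is no longer a column.
print(df.index.name)         # Year
print('Year' in df.columns)  # False
print(list(df.columns))      # ['Alaska', 'Maine', 'Michigan', 'Minnesota', 'Wisconsin']

# One alternative: move the index back into a column, so x='Year' is valid,
# and pass state names (which are now the value columns) as y.
flat = df.reset_index()
ax = flat.plot(x='Year', y=['Alaska', 'Maine'], kind='line')
```

Plotting `df` directly, as shown next, is usually simpler because pandas uses the index for the x axis automatically.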
The plots
You can explicitly tell pandas (and, through it, the matplotlib package that actually does the plotting) which xticks you want. Passing the index puts one tick on each year:
ax = df.plot(xticks=df.index, ylabel='Murder Rate')
ax is a matplotlib.axes.Axes object, and there are many, many customizations you can make to your plot through it.
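As a rough illustration of such customizations, the sketch below adjusts the title, axis label, legend, and grid through the returned Axes. The title text and styling choices are arbitrary assumptions made for the example.

```python
import numpy as np
import pandas as pd

control_1960_to_1962 = pd.DataFrame({
    'State': np.repeat(['Alaska', 'Maine', 'Michigan', 'Minnesota', 'Wisconsin'], 3),
    'Year': [1960, 1961, 1962]*5,
    'Murder Rate': [10.2, 11.5, 4.5, 1.7, 1.6, 1.4, 4.5, 4.1, 3.4, 1.2, 1.0, .9, 1.3, 1.6, .9]
})
df = control_1960_to_1962.pivot(index='Year', columns='State', values='Murder Rate')

# df.plot returns a matplotlib Axes; one line is drawn per state column.
ax = df.plot(xticks=df.index, ylabel='Murder Rate')

# Illustrative tweaks made through the Axes object.
ax.set_title('Murder Rate by State, 1960-1962')  # hypothetical title
ax.set_xlabel('Year')
ax.legend(title='State', loc='upper right')
ax.grid(True, linestyle=':')
```

The underlying figure is available through `ax.get_figure()` if it needs to be resized or saved.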
Here's how to plot with the states on the x axis, by transposing the dataframe first:
ax = df.T.plot(kind='bar', ylabel='Murder Rate')