Pandas Multicolumn Groupby Plotting

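Question:
I have a pandas DataFrame that I want to group by year-month and rule_name. After grouping, I want the count of each rule in each period, along with that rule's percentage of all rules in the same period. So far I can get the count for each period, but not the percentage.

The goal is to draw a plot similar to the one at the bottom, but with the percentage for each period on the right y-axis as well.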

Target DataFrame:
For rule_name A:

date       counts (rule_name)   %_rule_name
Jan 16     1                    50
Feb 16     0                     0
Mar 16     2                    66

I want to do the same for every rule_name (i.e., for B and C as well).
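
One possible way to build a table of this shape for all rules at once is a cross-tabulation of month against rule_name, followed by a row-wise normalisation. The sketch below is only an illustration under that assumption, using the sample data given in the question; the names month, counts, pct and target_a are introduced here and are not part of the original code.

```python
import pandas as pd

# Sample data copied from the question.
d = {'date': ['1/1/2016', '2/1/2016', '3/5/2016', '2/5/2016',
              '1/15/2016', '3/3/2016', '3/4/2016'],
     'rule_name': ['A', 'B', 'C', 'C', 'B', 'A', 'A']}
df = pd.DataFrame(d)
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y', errors='coerce')

# Year-month key such as '2016-01' (the name 'month' is an illustrative choice).
month = df['date'].dt.strftime('%Y-%m').rename('month')

# Rows are months, columns are rule names, cells are counts.
# A rule that does not occur in a month gets 0 rather than a missing row.
counts = pd.crosstab(month, df['rule_name'])

# Each rule's share of that month's total, as a percentage.
pct = counts.div(counts.sum(axis=1), axis=0) * 100

# Target-style table for one rule; repeat with 'B' and 'C' for the others.
target_a = pd.DataFrame({'counts': counts['A'], '%_rule_name': pct['A']})
print(target_a)
```

Because the cross-tabulation keeps a zero for months in which a rule does not appear, no separate merge or fillna step is needed to get the 0 row for February.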

Code so far:

import pandas as pd

# Sample data: one row per rule hit
d = {'date': ['1/1/2016', '2/1/2016', '3/5/2016', '2/5/2016', '1/15/2016', '3/3/2016', '3/4/2016'],
     'rule_name': ['A', 'B', 'C', 'C', 'B', 'A', 'A']}

df = pd.DataFrame(d)

Output:

[image not preserved]

# Convert the date strings to datetime
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y', errors='coerce')

rule_names = df['rule_name'].unique().tolist()
for i in rule_names:
    print()
    print('dataframe for', i, ':')
    df_temp = df[df['rule_name'] == i]
    # Group by a 'YYYY-MM' key and count the rows in each month
    df_temp = df_temp.groupby(df_temp['date'].dt.strftime('%Y-%m')).count()
    df_temp.plot(kind='line', title='Rule Name: ' + str(i))
    print(df_temp)

Output:

[image not preserved]

import matplotlib.pyplot as plt
import seaborn as sns  # kept from the original; not called directly below

# Count of all rule hits per year-month, across every rule
df_all = df.groupby(df['date'].dt.strftime('%Y-%m')).count()
df_all['rule_name_all_count'] = df_all['rule_name']

rule_names = df['rule_name'].unique().tolist()
for i in rule_names:
    print()
    print('dataframe for', i, ':')
    df_temp = df[df['rule_name'] == i]
    df_temp = df_temp.groupby(df_temp['date'].dt.strftime('%Y-%m')).count()
    df_merge = pd.merge(df_all, df_temp, right_index=True, left_index=True, how='left')
    # The original called drop_x(df_merge) and rename_y(df_merge), helpers that are
    # not shown in the post. The two lines below ASSUME they drop the '_x' columns
    # and strip the '_y' suffix.
    df_merge = df_merge.drop(columns=['date_x', 'rule_name_x'])
    df_merge = df_merge.rename(columns={'date_y': 'date', 'rule_name_y': 'rule_name'})
    df_merge = df_merge.drop(columns='date')
    # Fraction (0 to 1) of the month's rule hits that belong to rule i
    df_merge['rule_name_%'] = df_merge['rule_name'].astype(float) / df_merge['rule_name_all_count'].astype(float)
    df_merge = df_merge.fillna(0)

    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax2 = ax.twinx()

    # Counts go on the left axis, fractions on the right axis
    df_merge['rule_name'].plot(ax=ax)
    df_merge['rule_name_%'].plot(ax=ax2)
    plt.show()
    print(df_temp)

[image not preserved]
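
For the plot described in the question (counts on the left y-axis, percentages on the right), a minimal sketch with matplotlib's twinx might look like the following. It assumes the same sample data and a month-by-rule cross-tabulation; plot_rule, counts, pct and figures are hypothetical names chosen for this example, and the colours and axis labels are arbitrary.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample data copied from the question.
d = {'date': ['1/1/2016', '2/1/2016', '3/5/2016', '2/5/2016',
              '1/15/2016', '3/3/2016', '3/4/2016'],
     'rule_name': ['A', 'B', 'C', 'C', 'B', 'A', 'A']}
df = pd.DataFrame(d)
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y', errors='coerce')

# Month-by-rule counts and each rule's percentage of its month's total.
month = df['date'].dt.strftime('%Y-%m').rename('month')
counts = pd.crosstab(month, df['rule_name'])
pct = counts.div(counts.sum(axis=1), axis=0) * 100


def plot_rule(rule):
    """Hypothetical helper: counts on the left axis, percentages on the right."""
    fig, ax = plt.subplots()
    ax2 = ax.twinx()  # second y-axis that shares the x-axis with ax
    x = list(range(len(counts.index)))
    ax.plot(x, counts[rule], color='tab:blue', marker='o')
    ax2.plot(x, pct[rule], color='tab:red', marker='s')
    ax.set_xticks(x)
    ax.set_xticklabels(counts.index)
    ax.set_ylabel('count')
    ax2.set_ylabel('% of all rules in month')
    ax2.set_ylim(0, 100)
    ax.set_title('Rule Name: ' + str(rule))
    return fig, ax, ax2


# One figure per rule; call plt.show() afterwards in an interactive session.
figures = {rule: plot_rule(rule) for rule in counts.columns}
```

Passing each axis explicitly to the plotting call is what keeps the two series on separate scales; plotting without it draws both lines on whichever axis is current.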