Pandas condition on DataFrame returns TypeError: '>' not supported between instances of 'str' and 'int'
I am working with a pandas DataFrame and need to add a new column based on some conditions.
My DataFrame is:

    discount  tax  total  subtotal  productid
    3         0    20     13        002
    10        3    106    94        003
    46.49     6    21     20        004

I need to apply some conditions when adding a new column named Class to the DataFrame.
The conditions are:
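If discount > 20, tax == 0 and total > 100 (the same checks as in my code), Class should be 1;
otherwise it should be 0.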
This is what I have tried:

    def conditions(s):
        if (s['discount'] > 20) and (s['tax'] == 0) and (s['total'] > 100):
            return 1
        else:
            return 0

    df_full['Class'] = df_full.apply(conditions, axis=1)

But it returns this error:
TypeError: ("'>' not supported between instances of 'str' and 'int'", 'occurred at index 18')
How can I solve this?
Please help!
Thanks in advance!
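I suggest creating a boolean mask and casting it to integers with `astype(int)`, so that `True` becomes 1 and `False` becomes 0. Using this sample data: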

    print(df_full)
       discount  tax  total  subtotal productid
    0      3.00    0     20        13       002
    1     40.00    0    106        94       003
    2     46.49    6     21        20       004

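The error means that at least one value in a compared column is a string. You can list all non-numeric values and then convert the column: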

    # show the rows whose 'discount' cannot be parsed as a number
    print(df_full[pd.to_numeric(df_full['discount'], errors='coerce').isnull()])

    # convert to numeric; non-numeric values are converted to NaN
    df_full['discount'] = pd.to_numeric(df_full['discount'], errors='coerce')

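To see what `errors='coerce'` does in isolation, here is a small sketch on a made-up Series. The values, including the stray `'n/a'`, are assumptions for illustration and are not taken from the question's data.

```python
import pandas as pd

# Hypothetical column that was read in as strings, with one non-numeric entry
raw = pd.Series(['3', '40', 'n/a', '46.49'])

# Entries that cannot be parsed become NaN instead of raising an error
converted = pd.to_numeric(raw, errors='coerce')

# Boolean mask marking the entries that failed to parse
bad = converted.isnull()

print(raw[bad])         # the original values that were not numeric
print(converted > 20)   # the comparison now works; NaN compares as False
```

Rows that turn into `NaN` never satisfy `> 20`, so they end up with Class 0 unless you clean or drop them first.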
With the column numeric, build the mask with `&` and cast it to integers:

    df_full['Class'] = ((df_full['discount'] > 20) &
                        (df_full['tax'] == 0) &
                        (df_full['total'] > 100)).astype(int)

    print(df_full)
       discount  tax  total  subtotal productid  Class
    0      3.00    0     20        13       002      0
    1     40.00    0    106        94       003      1
    2     46.49    6     21        20       004      0

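Putting the pieces together, here is a self-contained sketch. The frame is rebuilt from the sample rows above, with `discount` deliberately stored as strings on the assumption that this is what triggers the error; the actual offending value at index 18 of the asker's data is not shown in the question.

```python
import pandas as pd

# Sample rows from the answer; 'discount' is stored as strings on purpose
# to mimic the assumed cause of the TypeError
df_full = pd.DataFrame({
    'discount': ['3', '40', '46.49'],
    'tax': [0, 0, 6],
    'total': [20, 106, 21],
    'subtotal': [13, 94, 20],
    'productid': ['002', '003', '004'],
})

# Comparing a str with an int raises the TypeError from the question
try:
    df_full['discount'].iloc[0] > 20
    raised = False
except TypeError:
    raised = True
print('TypeError raised:', raised)

# Convert to numeric; values that cannot be parsed would become NaN
df_full['discount'] = pd.to_numeric(df_full['discount'], errors='coerce')

# Element-wise mask built with & (not `and`), then True/False -> 1/0
mask = (
    (df_full['discount'] > 20)
    & (df_full['tax'] == 0)
    & (df_full['total'] > 100)
)
df_full['Class'] = mask.astype(int)
print(df_full)
```

The mask is evaluated on whole columns at once, so it avoids the row-by-row `apply` and is usually faster on large frames.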