How to delete NaN from a row in a df without losing the whole row?
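G'day, how can I drop NaN values without losing the whole row?
This is my df:
I have already tried:
But I have to keep these values, because in the end I want them to move up.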
This might help. You can try it!
Based on your problem statement, you want the NaN values to take lower priority and the non-NaN values to come first.
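```python
import functools

import numpy as np
import pandas as pd


def drop_and_roll(col, na_position='last', fillvalue=np.nan):
    """Pack the non-null values of col at one end and pad the rest with fillvalue."""
    result = np.full(len(col), fillvalue, dtype=col.dtype)
    mask = col.notnull()
    N = mask.sum()  # number of non-null values in this column
    if na_position == 'last':
        # Non-null values move to the top; the padding ends up at the bottom.
        result[:N] = col.loc[mask]
    elif na_position == 'first':
        # Slice from len(col) - N rather than -N, because -0 would select
        # the whole array when the column has no non-null values.
        result[len(col) - N:] = col.loc[mask]
    else:
        raise ValueError('na_position {!r} unrecognized'.format(na_position))
    return result


# A regex separator requires the python parsing engine.
df = pd.read_table('data', sep=r'\s{2,}', engine='python')
# fillvalue='' only works for object (string) columns; use np.nan for numeric ones.
print(df.apply(functools.partial(drop_and_roll, fillvalue='')))
```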
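To try the idea without the `data` file, here is a minimal sketch of the same "push the NaNs to the bottom" behaviour using `dropna` and `reset_index`. It is a different technique from the function above, and the sample frame and the helper name `push_up` are hypothetical, since the original df is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the df from the question.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan],
    "b": [np.nan, "x", "y", "z"],
})


def push_up(col):
    # Drop the NaNs, then renumber from 0 so the survivors sit at the top.
    return col.dropna().reset_index(drop=True)


# Building a frame from a dict of Series aligns them on the union of their
# indexes, so shorter columns are padded with NaN at the bottom.
compact = pd.DataFrame({name: push_up(col) for name, col in df.items()})
print(compact)
```

Rows that would contain only NaN never appear in the result, because no compacted column has an entry at those positions.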
Assuming you want to backfill the values and then drop every row that duplicates a value in any column, this example should work:
```python
import pandas as pd

data = [
    ['POINT_1.1', 'POINT_1.2', pd.NA],
    [pd.NA, pd.NA, 'POINT_1.3'],
    ['POINT_2.1', 'POINT_2.2', pd.NA],
    [pd.NA, pd.NA, 'POINT_2.3']
]
df = pd.DataFrame(data)
df
#            0          1          2
# 0  POINT_1.1  POINT_1.2       <NA>
# 1       <NA>       <NA>  POINT_1.3
# 2  POINT_2.1  POINT_2.2       <NA>
# 3       <NA>       <NA>  POINT_2.3

# Backfill across each row first, then down each column.
t = df.T.bfill().T.bfill()
t
#            0          1          2
# 0  POINT_1.1  POINT_1.2  POINT_1.3
# 1  POINT_1.3  POINT_1.3  POINT_1.3
# 2  POINT_2.1  POINT_2.2  POINT_2.3
# 3  POINT_2.3  POINT_2.3  POINT_2.3

# Dropping duplicates column by column removes the filler rows 1 and 3.
for column in t.columns:
    t = t.drop_duplicates(column)
t
#            0          1          2
# 0  POINT_1.1  POINT_1.2  POINT_1.3
# 2  POINT_2.1  POINT_2.2  POINT_2.3
```
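If each record is known to be split across exactly two consecutive rows (an assumption; the question does not say so), a shorter alternative is to group the rows in pairs and take the first non-null value per column. The sketch below uses `np.nan` in place of `pd.NA`, and the name `pair_id` is made up for illustration.

```python
import numpy as np
import pandas as pd

# Same layout as the example above, with np.nan marking the gaps.
data = [
    ['POINT_1.1', 'POINT_1.2', np.nan],
    [np.nan, np.nan, 'POINT_1.3'],
    ['POINT_2.1', 'POINT_2.2', np.nan],
    [np.nan, np.nan, 'POINT_2.3'],
]
df = pd.DataFrame(data)

# Rows 0-1 get group 0, rows 2-3 get group 1, and so on.
pair_id = np.arange(len(df)) // 2

# first() returns the first non-null entry of each column within a group.
merged = df.groupby(pair_id).first()
print(merged)
```

Unlike the backfill approach, this never copies a value into a neighbouring record, but it relies entirely on the fixed two-row layout.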