python - pandas drop duplicates of one column with criteria
I have a data file like this:

    A          B
    239616412  None
    239616414  Name2
    239616417  None
    239616417  None
    239616417  None
    239616418  Name1
    239616418  None
    239616428  Name1
    239616429  None
    239616429  None
    239616429  Name1
I want to drop the duplicates in column A, keeping the row that has a name in column B (that is, B is not None). If B has no value in any of the duplicates, I still want to keep one of them (as with 239616417).
It should be reduced to:

    A          B
    239616412  None
    239616414  Name2
    239616417  None
    239616418  Name1
    239616428  Name1
    239616429  Name1
First sort by column 'B', so that rows with a name come before the NaN rows (df.sort was removed from later pandas versions; use sort_values):

    df.sort_values('B', inplace=True)
    df
    Out[24]:
                A      B
    5   239616418  Name1
    7   239616428  Name1
    10  239616429  Name1
    1   239616414  Name2
    0   239616412    NaN
    2   239616417    NaN
    3   239616417    NaN
    4   239616417    NaN
    6   239616418    NaN
    8   239616429    NaN
    9   239616429    NaN
Then drop the duplicates with respect to column 'A'; by default the first occurrence of each value is kept:

    df.drop_duplicates('A', inplace=True)
    df
    Out[26]:
                A      B
    5   239616418  Name1
    7   239616428  Name1
    10  239616429  Name1
    1   239616414  Name2
    0   239616412    NaN
    2   239616417    NaN
You can re-sort the data frame by its index to get exactly the desired result (sort_index replaces the bare df.sort of older pandas versions):

    df.sort_index(inplace=True)
    df
    Out[30]:
                A      B
    0   239616412    NaN
    1   239616414  Name2
    2   239616417    NaN
    5   239616418  Name1
    7   239616428  Name1
    10  239616429  Name1
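
For reference, here is a self-contained sketch of the same three steps, written for current pandas. Several things in it are my assumptions rather than part of the answer: the frame is rebuilt by hand from the sample in the question, the names are spelled Name1/Name2, and kind="mergesort" is added so that tied rows keep their original order. Chaining the calls instead of using inplace=True is only a style choice.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (assumed values);
# None marks a row without a name.
df = pd.DataFrame({
    "A": [239616412, 239616414, 239616417, 239616417, 239616417,
          239616418, 239616418, 239616428, 239616429, 239616429, 239616429],
    "B": [None, "Name2", None, None, None,
          "Name1", None, "Name1", None, None, "Name1"],
})

result = (
    # Missing values sort last by default (na_position="last"),
    # so named rows come before unnamed ones.
    df.sort_values("B", kind="mergesort")
      # keep="first" is the default: the first row per value of A survives,
      # which is a named row whenever one exists.
      .drop_duplicates("A", keep="first")
      # Restore the original row order.
      .sort_index()
)

print(result)
```

If the file was read so that the missing names are the literal string "None" rather than real missing values, they would sort as ordinary text, so they would need to be converted first, for example with df["B"].replace("None", pd.NA).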