I have a dataframe like the one below, which represents search results. There are six items in total (a2, b3, b7, b9, c6, and c8), and each row is one pairwise search result. For example, the row a2 b3 means that a2 finds b3 in the search.
From the dataframe, we know that a2 finds b3, and that c6 finds both a2 and b3. So a2, b3, and c6 are good friends, and good friends need to be output on the same row of a new dataframe, as a2 b3 c6. The same logic applies to the b7 and c8 case. b9 has only one friend (a2), so it can be left out of this step.
The logic for generating the output is easy to describe, but I don't know how to turn it into code. My first thought was to subset all the rows that contain only a and b items. Then, for each a and b row, we could compare the c results against it. But I have been thinking about this for a while and still have no clear idea how to do it.
dataframe:
G1 G2 # header
a2 b3
c6 a2
c6 b3
b7 a2
c8 b7
c8 a2
b9 a2
expected result:
G1 G2 G3 # header
a2 b3 c6
a2 b7 c8
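One way to read the "good friends" rule is that three items belong on the same row when every pair among them appears somewhere in the dataframe, in either order. The following is only a minimal sketch of that reading in plain Python, not a definitive solution. The names `pairs` and `find_triples` are made up for illustration, the rows are typed in by hand rather than read from a dataframe, and it assumes that sorting the names alphabetically gives the G1, G2, G3 order (true here because the items start with a, b, and c).

```python
# The rows of the two-column dataframe, typed in by hand (G1, G2).
pairs = [
    ("a2", "b3"),
    ("c6", "a2"),
    ("c6", "b3"),
    ("b7", "a2"),
    ("c8", "b7"),
    ("c8", "a2"),
    ("b9", "a2"),
]


def find_triples(pairs):
    """Return sorted 3-tuples of items that are all linked to each other.

    Assumption: a row (x, y) is treated as an undirected link, so
    "x finds y" also counts as "y is linked to x".
    """
    # Build a lookup: item -> set of items it is linked to.
    neighbours = {}
    for left, right in pairs:
        neighbours.setdefault(left, set()).add(right)
        neighbours.setdefault(right, set()).add(left)

    triples = set()
    for left, right in pairs:
        # Any item linked to both ends of this row completes a group of three.
        for third in neighbours[left] & neighbours[right]:
            # Sorting makes the same group found from different rows
            # collapse into a single entry in the set.
            triples.add(tuple(sorted((left, right, third))))
    return sorted(triples)


triples = find_triples(pairs)
print("G1 G2 G3")
for row in triples:
    print(" ".join(row))
```

With the sample rows this gives a2 b3 c6 and a2 b7 c8, and b9 is dropped because nothing else is linked to both b9 and a2. If four or more items were all linked to each other, this sketch would report every three-item combination separately, which may or may not be what is wanted.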
source https://stackoverflow.com/questions/74502748/turn-a-two-column-dataframe-into-three-columns-by-matching-the-items-in-each-ro