I have two dataframes: df_DD holds all my data, and df_GS holds the ranges I want to split df_DD into. df_GS is much shorter than df_DD. I want to group the rows of df_DD by the df_GS range that contains them, i.e. the range with the same DHID whose From and To bounds enclose the row's From and To.
A small sample of df_GS:
    From     To      DHID
0   69.0   88.5  CR22-200
1   88.5   90.0  CR22-200
2   90.0   99.0  CR22-200
3   99.0  100.5  CR22-200
4  100.5  112.5  CR22-200
5  112.5  114.0  CR22-200
6  114.0  165.0  CR22-200
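# Label each df_DD row with the 1-based position of the df_GS range that contains it
for i in range(len(df_GS)):
    mask = (df_DD['From'] >= df_GS['From'].iloc[i]) & (df_DD['To'] <= df_GS['To'].iloc[i]) & (df_DD['DHID'] == df_GS['DHID'].iloc[i])
    df_DD.loc[mask, 'Samples'] = i + 1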
Here is the start of the resulting df_DD:
    Samples  From    To      DHID
0         1  69.0  70.5  CR22-200
1         1  70.5  72.0  CR22-200
2         1  72.0  73.5  CR22-200
3         1  73.5  75.0  CR22-200
4         1  75.0  76.5  CR22-200
5         1  76.5  78.0  CR22-200
6         1  78.0  79.5  CR22-200
7         1  79.5  81.0  CR22-200
8         1  81.0  82.5  CR22-200
9         1  82.5  84.0  CR22-200
10        1  84.0  85.5  CR22-200
11        1  85.5  87.0  CR22-200
12        1  87.0  88.5  CR22-200
13        2  88.5  90.0  CR22-200
14        3  90.0  91.5  CR22-200
15        3  91.5  93.0  CR22-200
The code above does what I want: it creates a new column named Samples that gives each row a sample index, after which I can use groupby. But I would like to know if there is a better way to do this, because looping over every row of df_GS is quite cumbersome.
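For comparison, one possible loop-free variant is sketched below. It merges the two frames on DHID and keeps only the pairs where the df_DD interval lies inside the df_GS range. This is only a sketch under assumptions: the sample frames are rebuilt by hand from the tables above, the helper name assign_samples and the temporary columns GS_From, GS_To and _row are made up for illustration, and the ranges in df_GS are assumed not to overlap within a DHID (if they did, this version keeps the first match, whereas the loop keeps the last).

```python
import pandas as pd

# Sample data rebuilt by hand from the tables in the question
df_GS = pd.DataFrame({
    "From": [69.0, 88.5, 90.0, 99.0, 100.5, 112.5, 114.0],
    "To": [88.5, 90.0, 99.0, 100.5, 112.5, 114.0, 165.0],
    "DHID": ["CR22-200"] * 7,
})
df_DD = pd.DataFrame({
    "From": [69.0 + 1.5 * k for k in range(16)],
    "To": [70.5 + 1.5 * k for k in range(16)],
    "DHID": ["CR22-200"] * 16,
})


def assign_samples(df_dd, df_gs):
    """Hypothetical helper: return df_dd with a Samples column added."""
    # Number the ranges 1..n, matching the i + 1 used in the loop
    gs = df_gs.reset_index(drop=True)
    gs = gs.assign(Samples=gs.index + 1).rename(
        columns={"From": "GS_From", "To": "GS_To"}
    )
    # Tag each df_dd row with its position so results can be mapped back
    left = df_dd[["From", "To", "DHID"]].assign(_row=range(len(df_dd)))
    # Pair every df_dd row with every range that shares its DHID
    pairs = left.merge(gs, on="DHID")
    inside = (pairs["From"] >= pairs["GS_From"]) & (pairs["To"] <= pairs["GS_To"])
    # Keep one matching range per row (assumes ranges do not overlap)
    matched = (
        pairs.loc[inside]
        .drop_duplicates("_row")
        .set_index("_row")["Samples"]
    )
    # Rows that fall in no range get NaN
    samples = matched.reindex(range(len(df_dd)))
    return df_dd.assign(Samples=samples.to_numpy())


result = assign_samples(df_DD, df_GS)
```

After this, result.groupby("Samples") can be used in the same way as with the loop version. The merge builds one intermediate row for every df_DD row and df_GS range that share a DHID, so it trades memory for the loop; with many ranges per DHID the loop may still be the more practical choice.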
source https://stackoverflow.com/questions/71503736/python-pandas-group-one-dataframe-by-the-results-from-another-of-different-size