I have 2 DataFrames, and I would like to find the rows in the second one that share more than 2 numbers with the row I'm looking up from the first.
import pandas as pd
cols = ['Num1','Num2','Num3','Num4','Num5','Num6']
df1 = pd.DataFrame([[2,4,6,8,9,10]], columns=cols)
df2 = pd.DataFrame([[1,1,2,4,5,6,8],
[2,5,6,20,22,23,34],
[3,8,12,13,34,45,46],
[4,9,10,14,29,32,33],
[5,1,22,13,23,33,35],
[6,1,6,7,8,9,10],
[7,0,2,3,5,6,8]],
columns = ['Id','Num1','Num2','Num3','Num4','Num5','Num6'])
I have this code that finds matches, but I would like to enhance it by matching more than 2 numbers in the row.
# convert the values in the first dataframe to a list
vals_to_find = df1.iloc[0].tolist()
# Print the values to find
print("Vals to find:", vals_to_find)
# Create an empty list to hold the matching IDs
matching_ids = []
# iterate through the big dataframe
for index, row in df2.iterrows():
    rowlist = row.tolist()  # convert the row to a list
    # keep the id for later, and extract the other values for evaluation
    id = rowlist[0]
    vals = rowlist[1:]
    # count the number of values in one list against another list
    counter = sum(elem in vals_to_find for elem in vals)
    # If the number of matches is greater than 2, then grab the ID
    if counter > 2:
        matching_ids.append({'ID': id})
# Print the matching IDs
print('Matching IDS:', matching_ids)
I would like my result to look something like this:
df3 = pd.DataFrame([[6,1,6,7,8,9,10],
[7,0,2,3,5,6,8]],
columns = ['Id', 'Num1','Num2','Num3','Num4','Num5','Num6'])
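One possible way to get a DataFrame back instead of a list of IDs is sketched below. It is not necessarily the intended solution: it assumes that "match" means a cell value in df2 appears anywhere in the df1 row, regardless of column, which mirrors the counting in the loop above. The name match_counts is introduced here only for illustration.

```python
import pandas as pd

cols = ['Num1', 'Num2', 'Num3', 'Num4', 'Num5', 'Num6']
df1 = pd.DataFrame([[2, 4, 6, 8, 9, 10]], columns=cols)
df2 = pd.DataFrame([[1, 1, 2, 4, 5, 6, 8],
                    [2, 5, 6, 20, 22, 23, 34],
                    [3, 8, 12, 13, 34, 45, 46],
                    [4, 9, 10, 14, 29, 32, 33],
                    [5, 1, 22, 13, 23, 33, 35],
                    [6, 1, 6, 7, 8, 9, 10],
                    [7, 0, 2, 3, 5, 6, 8]],
                   columns=['Id'] + cols)

# Values to look up, taken from the single row of df1
vals_to_find = df1.iloc[0].tolist()

# For each row of df2, count how many of the Num columns hold a value
# that also appears in vals_to_find (assumption: column position is ignored)
match_counts = df2[cols].isin(vals_to_find).sum(axis=1)

# Keep only the rows with more than 2 matching values
df3 = df2[match_counts > 2].reset_index(drop=True)
print(df3)
```

With the sample data above, the rows with Id 6 and 7 are kept, and so is the row with Id 1, because it shares 2, 4, 6 and 8 with df1. If that row should be excluded, the matching rule would need an extra condition that the question does not state.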
source https://stackoverflow.com/questions/74867149/how-to-match-more-than-2-number-in-a-row