
Turn a two-column dataframe into three columns by matching the items in each row

I have a dataframe like the one below, which holds search results. There are six items in total (a2, b3, b7, b9, c6, and c8), and each row is one pairwise search hit. For example, the row a2 b3 means that a2 finds b3 in the search.

From the dataframe, we know that a2 finds b3, and that c6 finds both a2 and b3. So a2, b3, and c6 are "good friends", and good friends should be output on the same row of a new dataframe, as a2 b3 c6. The same logic applies to a2, b7, and c8. As for b9, it has only one friend (a2), so it can be left out at this step.

The logic for generating the output is easy to describe, but I don't know how to turn it into code. My first thought was to subset all the rows that contain only a and b items, and then, for each such row, compare the c results against it. But I have been thinking about this for a while and still have no clear idea how to do it.

dataframe:

G1 G2 # header
a2 b3 
c6 a2
c6 b3
b7 a2
c8 b7
c8 a2
b9 a2

expected result:

G1 G2 G3 # header
a2 b3 c6
a2 b7 c8
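
One way to read the "good friends" rule is that three items belong on one output row when every pair among them appears somewhere in the input, in either column order. The sketch below works under that assumption and uses plain Python, since the question does not name a language or library. The function name `find_triples` and the variable names are hypothetical, and checking every combination of three items is only practical for a small number of items.

```python
from itertools import combinations


def find_triples(pairs):
    """Return every group of three items in which each pair is linked.

    `pairs` is an iterable of (G1, G2) tuples. A link is treated as
    undirected, so ("c6", "a2") also links a2 to c6 (an assumption).
    """
    # Map each item to the set of items it is linked with.
    neighbours = {}
    for left, right in pairs:
        neighbours.setdefault(left, set()).add(right)
        neighbours.setdefault(right, set()).add(left)

    triples = []
    # Try every combination of three items, in sorted order.
    for x, y, z in combinations(sorted(neighbours), 3):
        if y in neighbours[x] and z in neighbours[x] and z in neighbours[y]:
            triples.append((x, y, z))
    return triples


# The rows of the dataframe from the question.
pairs = [
    ("a2", "b3"),
    ("c6", "a2"),
    ("c6", "b3"),
    ("b7", "a2"),
    ("c8", "b7"),
    ("c8", "a2"),
    ("b9", "a2"),
]

result = find_triples(pairs)
print(result)  # [('a2', 'b3', 'c6'), ('a2', 'b7', 'c8')]
```

Here b9 drops out because it is linked only to a2. If the data lives in a pandas dataframe named `df` (an assumption), the pairs could be read with `df[["G1", "G2"]].itertuples(index=False, name=None)` and the result wrapped with `pd.DataFrame(result, columns=["G1", "G2", "G3"])`.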


source https://stackoverflow.com/questions/74502748/turn-a-two-column-dataframe-into-three-columns-by-matching-the-items-in-each-ro
