
Apply a function to each level of a grouping factor and create a new column in an existing data frame

I have a data frame (daily_df) that looks like this:

           timestamp                  datetime        date      time    open  \
0      1667520000000 2022-11-04 00:00:00+00:00  2022-11-04  00:00:00  0.2186   
1      1667606400000 2022-11-05 00:00:00+00:00  2022-11-05  00:00:00  0.2589   
2      1667692800000 2022-11-06 00:00:00+00:00  2022-11-06  00:00:00  0.2459   
3      1667779200000 2022-11-07 00:00:00+00:00  2022-11-07  00:00:00  0.2315   
4      1667865600000 2022-11-08 00:00:00+00:00  2022-11-08  00:00:00  0.2353   
              ...                       ...         ...       ...     ...   
15012  1675728000000 2023-02-07 00:00:00+00:00  2023-02-07  00:00:00  0.2449   
15013  1675814400000 2023-02-08 00:00:00+00:00  2023-02-08  00:00:00  0.2610   
15014  1675900800000 2023-02-09 00:00:00+00:00  2023-02-09  00:00:00  0.2555   
15015  1675987200000 2023-02-10 00:00:00+00:00  2023-02-10  00:00:00  0.2288   
15016  1676073600000 2023-02-11 00:00:00+00:00  2023-02-11  00:00:00  0.2317   
         high     low   close        volume              symbol  
0      0.2695  0.2165  0.2588  1.239168e+09  1000LUNC/USDT:USDT  
1      0.2788  0.2414  0.2458  1.147000e+09  1000LUNC/USDT:USDT  
2      0.2554  0.2292  0.2315  5.137089e+08  1000LUNC/USDT:USDT  
3      0.2398  0.2263  0.2352  4.754763e+08  1000LUNC/USDT:USDT  
4      0.2404  0.1320  0.1895  1.618936e+09  1000LUNC/USDT:USDT  
       ...     ...     ...           ...                 ...  
15012  0.2627  0.2433  0.2611  8.097549e+07       ZRX/USDT:USDT  
15013  0.2618  0.2432  0.2554  7.009100e+07       ZRX/USDT:USDT  
15014  0.2651  0.2209  0.2287  1.217487e+08       ZRX/USDT:USDT  
15015  0.2361  0.2279  0.2317  6.072029e+07       ZRX/USDT:USDT  
15016  0.2418  0.2300  0.2409  2.178281e+07       ZRX/USDT:USDT 

I want to apply the bbands function from pandas_ta to each level of symbol separately, using the 'close' column as the input. The function returns multiple columns, but I only want to keep the one labeled 'BBM_20_2.0' and store it as a new column in the data frame.

If I were to apply the function to the entire data frame, ignoring the fact that each symbol has to be treated separately, I would do this:

daily_df['bbm'] = bbands(daily_df.close, 20, 2)['BBM_20_2.0']

I have tried to use groupby like this:

daily_df['bbm'] = daily_df.groupby(["symbol"]).apply(bbands(daily_df.close, 20, 2)['BBM_20_2.0'])

but I'm getting errors. Can anyone help?
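A likely cause of the errors is that apply expects a callable, whereas the attempt above passes the result of bbands computed on the whole close column. One possible way to run the calculation per symbol is to select the close column from the grouped object and pass a function to transform, which returns a result aligned to the original rows. The sketch below is only an illustration under assumptions: the toy data is made up, and middle_band is a hypothetical stand-in (a plain 20-period rolling mean) used so the example runs without pandas_ta. With pandas_ta installed, the stand-in would presumably be replaced by something like lambda s: bbands(s, 20, 2)['BBM_20_2.0'].

```python
import pandas as pd


def middle_band(close: pd.Series, length: int = 20) -> pd.Series:
    """Hypothetical stand-in for bbands(close, 20, 2)['BBM_20_2.0'].

    Assumption: the middle band is treated here as a simple moving
    average of `close` over `length` rows.
    """
    return close.rolling(length).mean()


# Toy data (made up for illustration): two symbols, 25 daily closes each.
daily_df = pd.DataFrame({
    "symbol": ["AAA"] * 25 + ["BBB"] * 25,
    "close": [float(i) for i in range(25)] + [float(100 + i) for i in range(25)],
})

# transform calls middle_band once per symbol and aligns the output with
# the original index, so it can be assigned directly as a new column.
daily_df["bbm"] = daily_df.groupby("symbol")["close"].transform(middle_band)

print(daily_df.groupby("symbol").tail(3))
```

Because each symbol is processed on its own, the first 19 rows of every symbol are NaN: the rolling window never borrows values from the preceding symbol.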



source https://stackoverflow.com/questions/75421579/apply-a-function-to-each-level-of-grouping-factor-and-create-new-column-in-exist
