Pandas: Count the number of consecutive rows that are greater than the current row's value but less than the value in another column

Say I have the following sample DataFrame (the real one has about 25k rows):

import pandas as pd

df = pd.DataFrame({'A' : [0,3,2,9,1,0,4,7,3,2], 'B': [9,8,3,5,5,5,5,8,0,4]})
df
   A  B
0  0  9
1  3  8
2  2  3
3  9  5
4  1  5
5  0  5
6  4  5
7  7  8
8  3  0
9  2  4

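For each row, I need to know how many consecutive next rows and how many consecutive previous rows have a value in column A that is greater than the current row's A value but less than that row's own value in column B. Counting stops at the first row that fails the condition.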

So my expected output is:

A  B  next count  previous count
0  9           2               0
3  8           0               0
2  3           0               1
9  5           0               0
1  5           0               0
0  5           2               1
4  5           1               0
7  8           0               0
3  0           0               2
2  4           0               0

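Explanation:

First row: next count is 2, since the next values 3 and 2 are greater than 0 and less than their corresponding B values 8 and 3. The value after that, 9, is not less than its B value 5, so counting stops there.
Second row: next count is 0, since the next value 2 is not greater than 3.
Third row: next count is 0, since the next value 9 is greater than 2 but not less than its corresponding B value 5.

The previous count is calculated the same way, walking backwards from the current row.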
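To make the rule unambiguous, here it is as a plain Python loop. This is only a sketch of the rule as described above, not a solution I am happy with; the helper name count_run is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 3, 2, 9, 1, 0, 4, 7, 3, 2],
                   'B': [9, 8, 3, 5, 5, 5, 5, 8, 0, 4]})

def count_run(a, b, i, step):
    """Count consecutive rows starting next to row i, moving in direction
    `step` (+1 for next rows, -1 for previous rows), whose A value is
    greater than a[i] and less than their own B value."""
    count = 0
    j = i + step
    # Stop at the edge of the data or at the first row that fails the condition.
    while 0 <= j < len(a) and a[i] < a[j] < b[j]:
        count += 1
        j += step
    return count

a = df['A'].tolist()
b = df['B'].tolist()
next_count = [count_run(a, b, i, 1) for i in range(len(a))]
prev_count = [count_run(a, b, i, -1) for i in range(len(a))]

print(next_count)  # [2, 0, 0, 0, 0, 2, 1, 0, 0, 0]
print(prev_count)  # [0, 0, 1, 0, 0, 1, 0, 0, 2, 0]
```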

Note: I know how to solve this problem by looping with a list comprehension or by using the pandas apply method, but I would still welcome a clear and concise apply approach. Ideally, I am looking for a more idiomatic (vectorized) pandas approach.

Below is my apply solution, which I think is inefficient. People have also said that there might be no vectorized solution to this question, so, as mentioned, a more efficient apply solution will be accepted.

This is what I have tried.

This function gets the number of previous/next rows that satisfy the condition.

def get_prev_next_count(row):
    # Rows after the current one, and rows before it in reverse order (nearest first).
    next_nrow = df.loc[row['index']+1:, ['A', 'B']]
    prev_nrow = df.loc[:row['index']-1, ['A', 'B']][::-1]

    def run_length(nrow):
        # Number of leading rows with row.A < A < B; stops at the first failure.
        cond = ((nrow.A > row.A) & (nrow.A < nrow.B)).to_numpy()
        # argmin gives the position of the first False, but it returns 0 when
        # every row passes and fails on empty input, so handle both explicitly.
        return len(cond) if cond.all() else int(cond.argmin())

    return run_length(next_nrow), run_length(prev_nrow)

Generating the output:

df[['next count', 'previous count']] = df.reset_index().apply(get_prev_next_count, axis=1, result_type="expand")

Output (this matches the expected output):

df
   A  B  next count  previous count
0  0  9           2               0
1  3  8           0               0
2  2  3           0               1
3  9  5           0               0
4  1  5           0               0
5  0  5           2               1
6  4  5           1               0
7  7  8           0               0
8  3  0           0               2
9  2  4           0               0
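One direction I have considered is doing the same per-row scan on NumPy arrays instead of slicing the DataFrame inside apply. The sketch below assumes that the `A < B` part of the condition can be computed once up front, since it depends only on each row itself. The names run_length and prev_next_counts are made up for illustration, and I have not benchmarked this on the real 25k-row data. It is still a Python loop with quadratic worst-case cost, not a truly vectorized solution.

```python
import numpy as np
import pandas as pd

def run_length(mask):
    """Length of the leading run of True values in a boolean array."""
    if mask.size == 0 or mask.all():
        return int(mask.size)
    return int(mask.argmin())  # position of the first False

def prev_next_counts(df):
    a = df['A'].to_numpy()
    ok = a < df['B'].to_numpy()  # row-local part of the condition, computed once
    n = len(a)
    nxt = np.zeros(n, dtype=int)
    prv = np.zeros(n, dtype=int)
    for i in range(n):
        # Rows after i, in order.
        nxt[i] = run_length((a[i + 1:] > a[i]) & ok[i + 1:])
        # Rows before i, nearest first.
        prv[i] = run_length((a[:i][::-1] > a[i]) & ok[:i][::-1])
    return nxt, prv

df = pd.DataFrame({'A': [0, 3, 2, 9, 1, 0, 4, 7, 3, 2],
                   'B': [9, 8, 3, 5, 5, 5, 5, 8, 0, 4]})
nxt, prv = prev_next_counts(df)
df['next count'] = nxt
df['previous count'] = prv
print(df)
```

On the sample data this produces the same two columns as the apply version.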

source https://stackoverflow.com/questions/73039690/pandas-count-number-of-consecutive-rows-that-are-greater-than-current-row-valu
