Adding count information to MapReduce output

I have data that a previous mapper.py program wrote to sys.stdout, as follows:

Input (stdout of the previous mapper.py)

chevy, {mod: spark | col: brown}
chevy, {mod: equinox | col: red}
honda, {mod:civic | col:black}
honda, {mod:accord | col:white} 
honda, {mod:crv | col:pink} 
honda, {mod:hrv | col:gray} 
toyota, {mod:corola | col:white}

I would like to write a reducer.py or maybe even another mapper that takes this information and produces an output such as:

Expected output

chevy, {mod: spark | col: brown | total:2}
chevy, {mod: equinox | col: red | total:2}
honda, {mod:civic | col:black | total:4}
honda, {mod:accord | col:white | total:4} 
honda, {mod:crv | col:pink | total:4} 
honda, {mod:hrv | col:gray | total:4} 
toyota, {mod:corola | col:white | total:1}

The total counts occurrences of each key (the car brand) only: chevy appears twice, honda four times, and toyota once.

I tried writing a reducer.py, but it did not work. The program looks like this:

import sys

curr_k = None
curr_v = None
k = None
curr_count = 0

for car in sys.stdin:
    car_split = car.split('|')
    k = car_split[0]
    v = car_split[1]

    if curr_k == k:
        print(curr_k, curr_v, 'total:',curr_count)
        curr_count += 1

    else:
        if curr_k:
            print(curr_k, curr_v, 'total:',curr_count)
        curr_k = k
        curr_count = 1
    
if curr_k == k:
    print(curr_k, curr_v, 'total:',curr_count)

The above code gave me the following output:

chevy, {mod: spark | col: brown | total:1}
chevy, {mod: equinox | col: red | total:2}
honda, {mod:civic | col:black | total:1}
honda, {mod:accord | col:white | total:2} 
honda, {mod:crv | col:pink | total:3} 
honda, {mod:hrv | col:gray | total:4} 
toyota, {mod:corola | col:white | total:1}

But that is not what I am looking for: the total here is a running count within each brand, not the final count for the brand.
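One possible approach, sketched below, is to hold all records for a brand until the brand changes, because the final count is only known once the whole group has been read. This is a minimal sketch rather than a definitive reducer. It assumes that records for the same brand arrive on adjacent lines (Hadoop streaming sorts mapper output by key), that the brand is the text before the first comma, and that each record ends with "}". The names brand_of and add_totals are hypothetical helpers introduced for illustration.

```python
from itertools import groupby


def brand_of(record):
    # Assumption: the key is everything before the first comma, e.g. "honda".
    return record.split(',', 1)[0].strip()


def add_totals(lines):
    """Yield each record with ' | total:N' added before the closing brace.

    N is the number of records that share the record's brand. Records for
    one brand must be adjacent, since groupby only merges neighbouring items.
    """
    records = (line.strip() for line in lines)
    records = (r for r in records if r)  # skip blank lines
    for _, group in groupby(records, key=brand_of):
        buffered = list(group)  # hold the group until its size is known
        total = len(buffered)
        for record in buffered:
            if record.endswith('}'):
                # Drop the closing brace, append the total, close again.
                yield f"{record[:-1]} | total:{total}}}"
            else:
                # Fallback for a record without a closing brace.
                yield f"{record} | total:{total}"


# Small demonstration on three of the records from the question.
sample = [
    "chevy, {mod: spark | col: brown}",
    "chevy, {mod: equinox | col: red}",
    "toyota, {mod:corola | col:white}",
]
for out in add_totals(sample):
    print(out)
# chevy, {mod: spark | col: brown | total:2}
# chevy, {mod: equinox | col: red | total:2}
# toyota, {mod:corola | col:white | total:1}
```

To use this as reducer.py, import sys and pass sys.stdin to add_totals in place of the sample list. Buffering one group at a time keeps memory use proportional to the largest brand rather than to the whole input.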

Source: https://stackoverflow.com/questions/74173010/adding-count-information-to-mapreduce-output
