Skip to main content

Quick way to create a pivot-table with totals from 2 dimensional data-points (part-id, year, units)?

I read out a set of data from an SQL table using pypyodbc... The result from cur.fetchall() is this :

sqldata=[
('part00', '2023', 15), ('part01', '2023', 4), ('part02', '2023', 1), ('part00', '2022', 72), 
('part03', '2022', 48), ('part04', '2022', 22), ('part05', '2022', 16), ('part06', '2022', 16), 
('part07', '2022', 15), ('part08', '2022', 12), ('part09', '2022', 6), ('part10', '2022', 4), 
('part01', '2022', 4), ('part11', '2022', 3), ('part12', '2022', 3), ('part13', '2022', 2), 
('part14', '2022', 1), ('part15', '2022', 1), ('part02', '2022', 1), ('part16', '2022', 1), 
('part04', '2021', 35), ('part00', '2021', 20), ('part08', '2021', 18), ('part05', '2021', 3), 
('part07', '2021', 3), ('part09', '2021', 1), ('part16', '2021', 1), ('part03', '2021', 1), 
('part04', '2020', 43), ('part08', '2020', 15), ('part10', '2020', 9), ('part00', '2020', 9), 
('part07', '2020', 6), ('part05', '2020', 4), ('part17', '2020', 4), ('part03', '2020', 4), 
('part18', '2020', 3), ('part19', '2020', 2), ('part16', '2020', 2), ('part12', '2020', 2), 
('part20', '2020', 1), ('part09', '2020', 1), ('part21', '2020', 1), ('part00', '2019', 37), 
('part04', '2019', 18), ('part05', '2019', 8), ('part11', '2019', 5), ('part10', '2019', 4), 
('part22', '2019', 4), ('part16', '2019', 4), ('part23', '2019', 3), ('part03', '2019', 3), 
('part24', '2019', 3), ('part25', '2019', 2), ('part09', '2019', 2), ('part21', '2019', 2), 
('part20', '2019', 1), ('part13', '2019', 1), ('part14', '2019', 1), ('part19', '2019', 1), 
('part12', '2019', 1), ('part01', '2019', 1), ('part26', '2019', 1), ('part15', '2019', 1), 
('part04', '2018', 47), ('part00', '2018', 27), ('part13', '2018', 10), ('part12', '2018', 6), 
('part07', '2018', 4), ('part21', '2018', 3), ('part09', '2018', 2), ('part05', '2018', 2), 
('part15', '2018', 1), ('part04', '2017', 37), ('part00', '2017', 17), ('part12', '2017', 15), 
('part13', '2017', 7), ('part19', '2017', 6), ('part25', '2017', 2), ('part07', '2017', 2), 
('part03', '2017', 2), ('part27', '2017', 2), ('part21', '2017', 1), ('part09', '2017', 1), 
('part28', '2017', 1), ('part04', '2016', 49), ('part00', '2016', 29), ('part12', '2016', 12), 
('part02', '2016', 5), ('part15', '2016', 2), ('part24', '2016', 2), ('part08', '2016', 2), 
('part29', '2016', 1), ('part01', '2016', 1), ('part21', '2016', 1), ('part13', '2016', 1), 
('part28', '2016', 1), ('part25', '2016', 1), ('part04', '2015', 30), ('part00', '2015', 15), 
('part15', '2015', 8), ('part28', '2015', 5), ('part09', '2015', 4), ('part12', '2015', 2), 
('part02', '2015', 2), ('part01', '2015', 1), ('part13', '2015', 1), ('part29', '2015', 1), 
('part00', '2014', 6), ('part12', '2014', 2), ('part28', '2014', 1)
]
 

Based on this, I currently have my second attempt (a lot faster than the first) to create a table on the data.
This time by first creating "order" of the above - using a loop - inserting the data into a numpy array, then looping through the array to get:

          2023  2022  2021  2020  2019  2018  2017  2016  2015
Part00:      -    72    20     9    37    27    17    29    15  . 226
Part01:     15     4     -     -     1     -     -     1     1  . 22
Part02:      4     1     -     -     -     -     -     5     2  . 12
Part03:      1     -     1     4     3     -     2     -     -  . 11
Part04:      -    48    35    43    18    47    37    49    30  . 307
Part05:      -    22     3     4     8     2     -     -     -  . 39
Part06:      -    16     -     -     -     -     -     -     -  . 16
Part07:      -    16     3     6     -     4     2     -     -  . 31
Part08:      -    15    18    15     -     -     -     2     -  . 50
Part09:      -    12     1     1     2     2     1     -     4  . 23
Part10:      -     6     -     9     4     -     -     -     -  . 19
Part11:      -     4     -     -     5     -     -     -     -  . 9
...
Total: 908

Not all ^^^ shown as The S.O. Editor somehow barfed on that portion (WHY?!)

... but I still feel that my code is "non-pythonic"; as it essentially is looping over the the data items several times... might there be a more compact way to actually create the above table,

Part of the problem is sorting: so additionally have it EASILY sorted by the numbers; most basically by the right hand side sums, *or by some other "smart" order, even parametrically selectable in the end - that might display it in a more "overview-able" manner?

Might it be numpy isn't the best suited tool?
Or is there a quicker way to insert the data?

Maybe pandas is easier, more convenient?
I have limited experience with either of these.

So.

  1. Avoiding looping, a more pythonic way to create a python data structure of the source data?
  2. Given 1: Simple means to display the table, sorted and summed in a reconfigurable way - e.g. avoiding loops, with the least possible code.

I'm currently working on this using pandas, a simple

import pandas as pd
df=pd.DataFrame(sqldata)
print(df)

is a good start i believe.

From here rearranging the data in the df DataFrame (to get the table above) seems doable.

Working on it...


... so a list of parts and years; do I REALLY have to create a double-loop based on these?
partslist=sorted(list(set(df[0])))
print(partslist)

years=sorted(list(set(df[1])))
print(years)

... I imagine there is some slice-based transformation in pandas that will create the above "Matrix". Then; allowing that matrix to be manipulated based on parameters e.g. given on the command line at start.

Reading https://pandas.pydata.org/docs/user_guide/indexing.html, but cannot see the right tree because of the woods...



source https://stackoverflow.com/questions/75175477/quick-way-to-create-a-pivot-table-with-totals-from-2-dimensional-data-points-pa

Comments

Popular posts from this blog

How to show number of registered users in Laravel based on usertype?

i'm trying to display data from the database in the admin dashboard i used this: <?php use Illuminate\Support\Facades\DB; $users = DB::table('users')->count(); echo $users; ?> and i have successfully get the correct data from the database but what if i want to display a specific data for example in this user table there is "usertype" that specify if the user is normal user or admin i want to user the same code above but to display a specific usertype i tried this: <?php use Illuminate\Support\Facades\DB; $users = DB::table('users')->count()->WHERE usertype =admin; echo $users; ?> but it didn't work, what am i doing wrong? source https://stackoverflow.com/questions/68199726/how-to-show-number-of-registered-users-in-laravel-based-on-usertype

Why is my reports service not connecting?

I am trying to pull some data from a Postgres database using Node.js and node-postures but I can't figure out why my service isn't connecting. my routes/index.js file: const express = require('express'); const router = express.Router(); const ordersCountController = require('../controllers/ordersCountController'); const ordersController = require('../controllers/ordersController'); const weeklyReportsController = require('../controllers/weeklyReportsController'); router.get('/orders_count', ordersCountController); router.get('/orders', ordersController); router.get('/weekly_reports', weeklyReportsController); module.exports = router; My controllers/weeklyReportsController.js file: const weeklyReportsService = require('../services/weeklyReportsService'); const weeklyReportsController = async (req, res) => { try { const data = await weeklyReportsService; res.json({data}) console...

How to split a rinex file if I need 24 hours data

Trying to divide rinex file using the command gfzrnx but getting this error. While doing that getting this error msg 'gfzrnx' is not recognized as an internal or external command Trying to split rinex file using the command gfzrnx. also install'gfzrnx'. my doubt is I need to run this program in 'gfzrnx' or in 'cmdprompt'. I am expecting a rinex file with 24 hrs or 1 day data.I Have 48 hrs data in RINEX format. Please help me to solve this issue. source https://stackoverflow.com/questions/75385367/how-to-split-a-rinex-file-if-i-need-24-hours-data