Here is my code:
import boto3
import pandas as pd
import json

s3 = boto3.resource('s3')
my_bucket = s3.Bucket('bucket-name-here')

def verify_unique_col():
    with open(r'unique_columns.json', 'r') as f:
        config = json.load(f)
    for my_bucket_object in my_bucket.objects.filter(Prefix='decrypted/'):
        if my_bucket_object.key.endswith('.csv'):
            filename = my_bucket_object.key.split('/')[-1]
            # this inner loop runs over every table, regardless of which CSV was found
            for table in config['Unique_Column_Combination']['Table_Name']:
                unique_col_comb = config['Unique_Column_Combination']['Table_Name'][table]
                df = pd.read_csv(f's3://path/to/{filename}', sep='|')
                df_unique = df.set_index(unique_col_comb.split(", ")).index.is_unique
                print(df_unique)

verify_unique_col()
I am trying to iterate through each CSV file in my bucket, reading each one with df = pd.read_csv(f's3://path/to/{filename}', sep='|'). After reading a file, I want to determine whether the data in certain columns of that CSV is unique. The columns that are supposed to be unique are listed per table in my config file (a short sketch of how one entry is parsed follows the config):
{
    "Unique_Column_Combination": {
        "Table_Name": {
            "TABLE_1": "CLIENT, ADDRNUMBER, PERSNUMBER, CONSNUMBER, LAST_LOAD_DT",
            "TABLE_2": "CLIENT, ADDRNUMBER, CONSNUMBER, LAST_LOAD_DT",
            "TABLE_3": "CLIENT, ADDRNUMBER, DATE_FROM, NATION, LAST_LOAD_DT",
            "TABLE_4": "COMM_TYPE, CONSNUMBER, COMM_USAGE, VALID_TO, LOAD_DT"
        }
    }
}
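For clarity, this is a minimal sketch of how one of these comma-separated strings becomes the column list I pass to set_index; it assumes the separator is always ", " as in the config above:

import json

# load the config shown above (same filename as in my script)
with open('unique_columns.json') as f:
    config = json.load(f)

tables = config['Unique_Column_Combination']['Table_Name']
cols = tables['TABLE_1'].split(', ')
print(cols)  # ['CLIENT', 'ADDRNUMBER', 'PERSNUMBER', 'CONSNUMBER', 'LAST_LOAD_DT']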
So I want my code to read TABLE_1.csv, check whether the columns listed above for TABLE_1 are unique in it, then move on to TABLE_2.csv and check the columns listed for TABLE_2, and so on. The way my code works right now, it iterates through each Table_Name in my config just fine, but it only reads the first CSV in my S3 bucket and compares that first CSV against every table listed in my config. I want a one-to-one comparison: TABLE_1.csv against TABLE_1, TABLE_2.csv against TABLE_2, and so on. Is that possible? In theory I thought it was, but it appears to be too muddled...
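For what it's worth, here is a minimal sketch of the one-to-one pairing I have in mind; it assumes each CSV is named exactly after its config key (TABLE_1.csv -> TABLE_1) and keeps my placeholder s3://path/to/ path:

import json

import boto3
import pandas as pd

s3 = boto3.resource('s3')
my_bucket = s3.Bucket('bucket-name-here')  # same bucket as in my script

with open('unique_columns.json') as f:
    config = json.load(f)
table_map = config['Unique_Column_Combination']['Table_Name']

for my_bucket_object in my_bucket.objects.filter(Prefix='decrypted/'):
    if not my_bucket_object.key.endswith('.csv'):
        continue
    filename = my_bucket_object.key.split('/')[-1]
    table = filename.rsplit('.csv', 1)[0]  # derive the config key from the filename
    if table not in table_map:
        continue  # no unique-column entry for this CSV, so skip it
    unique_col_comb = table_map[table]
    df = pd.read_csv(f's3://path/to/{filename}', sep='|')
    print(table, df.set_index(unique_col_comb.split(', ')).index.is_unique)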
Source: https://stackoverflow.com/questions/77398445/how-do-i-compare-my-csv-to-my-config-as-i-am-iterating-through-each-one