How to merge two very large csv files in chunks using pandas?

I have two large CSV files that have to be merged on the column user_id which is common to both CSV files.

I would like to load these files into dataframes using pandas, merge the dataframes on that column (pd.merge(df1, df2, on='user_id')), and store the merged result in a database (Postgres).

Since the files are too large to load in full, I would like to read them in chunks, merge the chunks, and load the result into the DB. How can I do this?
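For reference, here is a minimal sketch of one way this could look. It assumes an inner join is wanted and that the rows are not sorted by user_id, so merging chunk 1 with chunk 1, chunk 2 with chunk 2, and so on would miss matches. Instead, each chunk of the first file is merged against every chunk of the second file. The function names, the table name merged_users, and the default chunk size are hypothetical, chosen only for illustration.

```python
from typing import Iterator

import pandas as pd


def merge_csv_in_chunks(left_path, right_path, on="user_id",
                        chunksize=100_000) -> Iterator[pd.DataFrame]:
    """Yield pieces of the inner join of two CSV files.

    Neither file is loaded whole; at most one chunk of each is in memory.
    """
    for left_chunk in pd.read_csv(left_path, chunksize=chunksize):
        # Re-read the right file for every left chunk so that no matching
        # pair of rows is missed, whatever the row order in the files.
        for right_chunk in pd.read_csv(right_path, chunksize=chunksize):
            merged = pd.merge(left_chunk, right_chunk, on=on)
            if not merged.empty:
                yield merged


def load_merged_to_db(left_path, right_path, con, table="merged_users",
                      on="user_id", chunksize=100_000) -> int:
    """Append each merged piece to `table` and return the rows written.

    `con` is whatever DataFrame.to_sql accepts: for Postgres, typically a
    SQLAlchemy engine (assumption: created elsewhere with create_engine).
    """
    total = 0
    for merged in merge_csv_in_chunks(left_path, right_path, on, chunksize):
        merged.to_sql(table, con, if_exists="append", index=False)
        total += len(merged)
    return total
```

The trade-off is that the second file is read from disk once per chunk of the first file, so it helps to pass the smaller file as right_path and to use the largest chunk size that fits in memory. If that is too slow, an alternative is to load both CSVs into separate Postgres tables in chunks and do the join in SQL.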



source https://stackoverflow.com/questions/70073457/how-to-merge-two-very-large-csv-files-in-chunks-using-pandas
