I am new to Python, so thank you for your patience with me.
I am converting a very large txt file to a CSV file in Python so I can use it in MySQL. The file contains 14,549,030 records. I am running into this error:
ParserError: Error tokenizing data. C error: Expected 19 fields in line 13297995, saw 22
It happens when I try to load the file with pandas. This is the code I run to convert the CSV to a DataFrame:
import pandas as pd
import pymysql
print('convert CSV to dataframe')
# read_csv already returns a DataFrame, so wrapping it in pd.DataFrame() is unnecessary
df = pd.read_csv('mydata.csv', delimiter=',', header=None)
I can't manually edit the text to see what the specific issue is on line 13297995, and the txt file has no headers. I am worried about losing valuable information if I skip lines, but I am not sure whether skipping is the only solution (and I don't know how to do it). Any help fixing the error is appreciated. Thank you!
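One way to look at the reported line without opening the whole file in an editor is to stream the file and print only the lines around it. The following is a minimal sketch, not a definitive fix: `show_lines` is a hypothetical helper name, and it assumes the file is UTF-8 text and that the line number in the pandas error matches a physical line in the file (it can be off if a quoted field contains a line break, which is why neighbouring lines are printed too).

```python
from itertools import islice


def show_lines(path, line_number, context=1, encoding="utf-8"):
    """Return (line_number, text) pairs around a 1-based line number.

    The file is streamed, so it is never fully loaded into memory.
    `show_lines` is a hypothetical helper, not part of pandas.
    """
    first = max(1, line_number - context)
    last = line_number + context
    with open(path, "r", encoding=encoding, errors="replace") as handle:
        # islice skips to the first wanted line and stops after the last one
        selected = islice(handle, first - 1, last)
        return [
            (first + offset, text.rstrip("\r\n"))
            for offset, text in enumerate(selected)
        ]


# Example usage (assumes 'mydata.csv' is the file from the question):
# for number, text in show_lines("mydata.csv", 13297995):
#     print(number, text.count(","), repr(text))
```

Printing the comma count next to each line makes it easy to see whether the reported line really has more separators than its neighbours. If dropping such lines turns out to be acceptable, `pd.read_csv(..., on_bad_lines='skip')` (pandas 1.3 and later) skips them instead of raising, at the cost of losing those records.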
For context: the columns in the text file were originally separated by || instead of commas, and I had help converting the || separators into commas.
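A possible cause, stated here as an assumption and not something the error confirms, is that some field values already contained commas, so replacing || with a plain comma turned one field into several. If the original || file is still available, a minimal sketch of a safer conversion splits each line on || and lets the standard `csv` module quote any field that contains a comma. The function name and the file names in the usage comment are hypothetical.

```python
import csv


def convert_double_pipe_to_csv(src_path, dst_path, encoding="utf-8"):
    """Rewrite a '||'-separated text file as a properly quoted CSV file.

    Returns a dict mapping field count -> number of lines with that count,
    which shows whether every record has the same number of columns.
    Assumes one record per line and that '||' never occurs inside a value.
    """
    counts = {}
    with open(src_path, "r", encoding=encoding, newline="") as src, \
            open(dst_path, "w", encoding=encoding, newline="") as dst:
        # QUOTE_MINIMAL wraps a field in quotes only when it contains
        # a comma, a quote character, or a line break
        writer = csv.writer(dst, quoting=csv.QUOTE_MINIMAL)
        for line in src:
            fields = line.rstrip("\r\n").split("||")
            writer.writerow(fields)
            counts[len(fields)] = counts.get(len(fields), 0) + 1
    return counts


# Example usage (hypothetical file names):
# print(convert_double_pipe_to_csv("mydata.txt", "mydata.csv"))
```

The file is processed line by line, so memory use stays small even with millions of records. If the returned counts show more than one field count, the source file itself has irregular rows, and those lines need inspecting before the data goes into MySQL.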
source https://stackoverflow.com/questions/72222062/python-pandas-parsererror-error-tokenizing-data-c-error-with-very-large-dataset