This might be a simple question, but I can't find an answer or work out why it isn't working in this specific case.
I want to read large files, they can be compressed or not. I used contextlib to write a contextmanager function to handle this. Then using the with statement I read the files in the main script.
My problem is that the script uses a lot of memory and then gets killed (testing with a compressed file). What am I doing wrong? Should I approach this differently?
import gzip
import logging
from contextlib import contextmanager

def process_vcf(location):
    logging.info('Processing vcf')
    logging.debug(location)
    with read_compressed_or_not(location) as vcf:
        for line in vcf.readlines():
            if line.startswith('#'):
                logging.debug(line)

@contextmanager
def read_compressed_or_not(location):
    if location.endswith('.gz'):
        try:
            file = gzip.open(location)
            yield file
        finally:
            file.close()
    else:
        try:
            file = open(location, 'r')
            yield file
        finally:
            file.close()
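The likely culprit is `vcf.readlines()`: it reads the entire file into a list of lines before the loop starts, so the whole (decompressed) file sits in memory at once. Iterating over the file object itself streams one line at a time. A second, smaller issue is that `gzip.open` defaults to binary mode, so `line.startswith('#')` would compare bytes against a str; opening in text mode (`'rt'`) avoids that. A minimal sketch applying both fixes (same function names as above, consolidating the two branches):

```python
import gzip
import logging
from contextlib import contextmanager

@contextmanager
def read_compressed_or_not(location):
    # Pick the opener by extension; 'rt' gives str lines from gzip files
    # as well, so line.startswith('#') works in both cases.
    opener = gzip.open if location.endswith('.gz') else open
    file = opener(location, 'rt')
    try:
        yield file
    finally:
        file.close()

def process_vcf(location):
    logging.info('Processing vcf')
    logging.debug(location)
    with read_compressed_or_not(location) as vcf:
        for line in vcf:  # streams line by line; no readlines()
            if line.startswith('#'):
                logging.debug(line)
```

Since file objects are their own context managers, the `try/finally` could also be replaced by `with opener(location, 'rt') as file: yield file`, but the explicit version mirrors the structure of the original question.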
source https://stackoverflow.com/questions/70159969/reading-large-compressed-files-in-python