So far I have:
- names[] - a list of dicts with the CSV data (one dict per person)
- str[] - column names from the CSV, used to access the STR names
- sequence[] - the DNA sequence from the TXT file
- checked_seq[] - list with the STR counts from the sequence
I'm now stuck on the final task:
- Compare the STR counts against each person's data from the CSV
- Output the match
Here's my code:
import csv
import sys

# longest_match() is the helper supplied with the CS50 starter code (not shown)

# Read database file into a variable
names = []
# Read data from the file
with open(sys.argv[1], "r") as file:
    # Loop through the names
    reader = csv.DictReader(file)
    for name in reader:
        names.append(name)
# Read STRs
with open(sys.argv[1], "r", newline='') as file:
    readstr = csv.reader(file)
    rows = list(readstr)
    str = rows[0]
# Read DNA sequence file into a variable
sequence = []
with open(sys.argv[2], "r") as file:
    sequence = file.read()
# TODO: Find longest match of each STR in DNA sequence
checked_seq = []
for i in range(len(str)):
    subsequence = str[i]
    reps = longest_match(sequence, subsequence)
    checked_seq.append(reps)
I printed every data structure along the way, and the STR counting looks like it works.
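One thing worth checking when printing these structures is the type of each value: csv.DictReader returns every field as a string, while a count computed from the sequence is an int. The sketch below illustrates this with a made-up two-row database; the sample data and the names sample_csv, str_names, people and count_from_sequence are assumptions for illustration only, not part of the original code.

```python
import csv
import io

# Hypothetical two-row database, standing in for the CSV passed as sys.argv[1].
sample_csv = "name,AGATC,AATG\nAlice,2,8\nBob,4,1\n"

reader = csv.DictReader(io.StringIO(sample_csv))
str_names = reader.fieldnames[1:]   # STR columns only, skipping "name"
people = list(reader)

# Every value read from the CSV is a string, including the counts.
print(type(people[0]["AGATC"]))

# A count computed from the sequence is an int, so a direct comparison
# with the CSV value is False until the CSV value is converted.
count_from_sequence = 2
print(people[0]["AGATC"] == count_from_sequence)
print(int(people[0]["AGATC"]) == count_from_sequence)
```

Taking the STR names from reader.fieldnames also avoids opening the CSV a second time just to read the header row.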
Now here was my train of thought for the last task:
for i in range(1, len(str) - 1):
    match = 0
    while True:
        if checked_seq[i] == names[i - 1][str[i]]:
            match += 1
        else:
            break
    if match == len(str) - 1:
        print(names[i - 1][str[0]])
    else:
        print("No match")
My plan was to loop through every person's data and compare their STR counts to the STR counts from the checked TXT file. Each time an STR matches, I increment the match count by one and check the next STR for the same person; if an STR doesn't match, I break out.
I then compare the match count against the number of STRs and, if these values are the same, print that person's name.
Can someone please give me a clue where I went wrong?
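A few things stand out in the loop above: the same index i is used both for the person and for the STR, the while True loop never advances to the next STR (so it either breaks immediately or never ends), and the counts from the CSV are strings while the counts from the sequence are ints. One possible way to structure the plan described above is an outer loop over people and an inner check over STRs. This is only a minimal sketch under assumptions: find_match is a hypothetical helper, the sample data is made up, and the counts are held in a dict keyed by STR name rather than in the checked_seq list.

```python
def find_match(people, str_names, counts):
    """Return the name of the first person whose STR counts all match, else None.

    people    -- list of dicts as produced by csv.DictReader (values are strings)
    str_names -- STR column names only, without the "name" column
    counts    -- dict mapping each STR name to its int count in the sequence
    """
    for person in people:
        # Convert the CSV value to int before comparing with the computed count.
        if all(int(person[s]) == counts[s] for s in str_names):
            return person["name"]
    return None


# Hypothetical data for illustration only.
people = [
    {"name": "Alice", "AGATC": "2", "AATG": "8"},
    {"name": "Bob", "AGATC": "4", "AATG": "1"},
]
str_names = ["AGATC", "AATG"]

# "No match" is decided once, after every person has been checked.
print(find_match(people, str_names, {"AGATC": 4, "AATG": 1}) or "No match")
print(find_match(people, str_names, {"AGATC": 9, "AATG": 9}) or "No match")
```

Returning None and printing "No match" once at the end avoids printing it for every person who fails to match.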
source https://stackoverflow.com/questions/74381478/cs50-dna-py-compare-str-counts-with-database