
How to separate tokens (parentheses, colons, etc.) when a scanner is scanning a file?

I have created a Python scanner, a lexical analyzer built with dictionaries of token patterns. A file is passed as a command-line argument; the scanner then reads it and prints out each token. The problem is that when there is no space between tokens, they are counted as a single string, which is obviously not what I want. Is there a way to write a single rule that detects the missing whitespace and splits the tokens there, or will that require a separate conditional for each case?
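One common way to avoid the whitespace problem entirely (a sketch, not the asker's code) is to stop relying on `str.split()` and instead combine the token patterns into a single regex alternation, then walk the line with `re.finditer`. The regex engine matches one token at a time, so adjacent tokens like `foo(bar` separate cleanly even with no spaces between them. The pattern and `tokens` helper below are illustrative, assuming a token set similar to the one in the question:

```python
import re

# Minimal sketch: one alternation covering all token shapes.
# Multi-character patterns ("/*", "*/", "//", "[]") must come before
# single-character ones so the longer match wins.
TOKEN_RE = re.compile(r'/\*|\*/|//|\[\]|-?\d+|[a-zA-Z]+|[():,]|\s+')

def tokens(line):
    # finditer yields one match per token, even with no spaces between them;
    # whitespace matches are dropped here rather than emitted as tokens.
    return [m.group(0) for m in TOKEN_RE.finditer(line)
            if not m.group(0).isspace()]

print(tokens("foo(bar,baz):42"))
# -> ['foo', '(', 'bar', ',', 'baz', ')', ':', '42']
```

This replaces the "insert a whitespace and re-split" idea with a single rule: the alternation itself defines where one token ends and the next begins.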

# Use split() to separate tokens on whitespace
for word in line.split():

    # Handle block-comment delimiters: skip anything inside a comment
    value = tokenize(word).value
    if value == 3006:
        self.insideComment = True
        return ""
    if value == 3007:
        self.insideComment = False
        return ""
    if value == 3008:
        return tokenizedLine
    if value == 3014:
        self.insideComment = False  # assignment, not the '==' comparison
        return ""

    # If the program encounters ':', insert whitespace
    if value == 3013:
        return ...


# Special characters and other tokens
"otherTokens": {
    r'\"': 3001,
    # bracketed names cannot start with a number; '.' must be escaped
    r'\[([a-zA-Z]|(-?[0-9]?\.[0-9]))+\]': 3002,
    r'-?[0-9]': 3003,
    r'[a-zA-Z]+': 3004,
    r'\(': 3005,
    r'\)': 3006,
    r'/\*': 3007,
    r'\*/': 3008,
    r'//': 3009,
    r'\[\]': 3010,
    r',': 3011,
    r'\s+': 3012,
    r':': 3013,  # colon
}
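A dictionary like the one above can drive a "master pattern" tokenizer directly, as described in the `re` module documentation: each pattern becomes a named group, the groups are joined with `|`, and `match.lastgroup` tells you which token matched. The sketch below reuses a subset of the asker's numeric codes; the names (`IDENT`, `LPAREN`, etc.) and the `tokenize_line` helper are assumptions for illustration:

```python
import re

# "Master pattern" technique: longer patterns ("/*") listed before
# single-character ones so they win when both could match.
TOKEN_SPEC = [
    ("COMMENT_OPEN",  r"/\*",       3007),
    ("COMMENT_CLOSE", r"\*/",       3008),
    ("LINE_COMMENT",  r"//",        3009),
    ("NUMBER",        r"-?[0-9]+",  3003),
    ("IDENT",         r"[a-zA-Z]+", 3004),
    ("LPAREN",        r"\(",        3005),
    ("RPAREN",        r"\)",        3006),
    ("COMMA",         r",",         3011),
    ("COLON",         r":",         3013),
    ("WS",            r"\s+",       3012),
]
CODES = {name: code for name, _, code in TOKEN_SPEC}
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat, _ in TOKEN_SPEC))

def tokenize_line(line):
    """Yield (lexeme, code) pairs, splitting tokens with or without spaces."""
    for m in MASTER.finditer(line):
        if m.lastgroup != "WS":  # drop whitespace instead of emitting it
            yield m.group(0), CODES[m.lastgroup]

print(list(tokenize_line("f(x):3")))
```

Because the codes live in one spec table, adding a token means adding one row rather than another conditional in the scanning loop.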


source https://stackoverflow.com/questions/71213025/how-to-separate-tokensparantheses-colons-etc-when-a-scanner-is-scanning-a-file
