Skip to main content

Count how many times an ip has accessed a url

I can print the ip and url from a massive log file, but I need to list how many times an ip has visited that url. I have done some research about throwing the log in a database, but I specifically need to do all of this in Python. any help is very appreciated.

My Code so far:

#!/usr/bin/python3
count = 0
log = open("access.log-20201019", "r")
arr = []
frequency_array = []

for i in log.readlines():
        ip = i[0:14]
        ip2 = ip.split(' ')
        ip3 = ip2[0]
        #print(ip3)
        url =i[53:87]
        url2 = url.split()
        url3 = url2[0]
        print(ip3,url3)

Snippet of Log file:

66.177.237.17 - - [18/Oct/2020:03:06:07 -0400] "GET /webcam/1/latest.jpeg HTTP/2.0" 304 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36" "-"
158.136.64.65 - - [18/Oct/2020:03:06:07 -0400] "GET /webcam/rwis/littlebay/latest.jpeg HTTP/1.1" 301 169 "-" "curl/7.46.0" "-"
158.136.64.65 - - [18/Oct/2020:03:06:07 -0400] "GET /webcam/rwis/littlebay/latest.jpeg HTTP/1.1" 200 37145 "-" "curl/7.46.0" "-"
112.198.71.230 - - [18/Oct/2020:03:06:09 -0400] "GET /precip/raingauge2.gif HTTP/2.0" 200 10078 "https://www.google.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36" "-"
173.9.45.97 - - [18/Oct/2020:03:06:10 -0400] "GET /NHPR/NHPR_rad_an.gif HTTP/2.0" 200 587317 "-" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36" "-"
173.9.45.97 - - [18/Oct/2020:03:06:11 -0400] "GET /favicon.ico HTTP/2.0" 200 27877 "https://vortex.plymouth.edu/NHPR/NHPR_rad_an.gif" "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36" "-"
158.136.64.65 - - [18/Oct/2020:03:06:11 -0400] "GET /webcam/1/nograph.1.jpeg HTTP/1.1" 301 169 "-" "curl/7.46.0" "-"
158.136.64.65 - - [18/Oct/2020:03:06:11 -0400] "GET /webcam/1/nograph.1.jpeg HTTP/1.1" 200 242804 "-" "curl/7.46.0" "-"
158.136.64.65 - - [18/Oct/2020:03:06:12 -0400] "GET /webcam/rwis/echolake/latest.jpeg HTTP/1.1" 301 169 "-" "curl/7.46.0" "-"
158.136.64.65 - - [18/Oct/2020:03:06:12 -0400] "GET /webcam/rwis/echolake/latest.jpeg HTTP/1.1" 404 2256 "-" "curl/7.46.0" "-"
158.136.64.65 - - [18/Oct/2020:03:06:14 -0400] "GET /webcam/rwis/lafeyette/latest.jpeg HTTP/1.1" 301 169 "-" "curl/7.46.0" "-"
158.136.64.65 - - [18/Oct/2020:03:06:14 -0400] "GET /webcam/rwis/lafeyette/latest.jpeg HTTP/1.1" 200 36974 "-" "curl/7.46.0" "-"

I am able to run my current code, but will output the ip and url multiple times for the same ip. I just want the number of times an ip visited a certain url.



source https://stackoverflow.com/questions/74451296/count-how-many-times-an-ip-has-accessed-a-url

Comments