
web scraping with BS4 returning None

I have a list of movies whose genres I want to scrape from Google. I've built this code:

import requests
from bs4 import BeautifulSoup

# Renamed from `list` so the built-in list type is not shadowed
movies = ['Psychological thriller','Mystery','Crime film','Neo-noir','Drama','Crime Thriller','Indie film']
gen2 = {}
for i in movies:
  user_query = i +'movie genre'
  URL = 'https://www.google.co.in/search?q=' + user_query
  headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.63 Safari/537.36'}
  page = requests.get(URL, headers=headers)
  soup = BeautifulSoup(page.content, 'html.parser')
  # Look up the result box by its CSS class
  c = soup.find(class_='EDblX DAVP1')
  print(c)
  if c is not None:
    genres = c.find_all('a')
    gen2[i] = genres

But gen2 ends up as an empty dict. When I ran the queries one at a time, it worked, for example:

import requests
from bs4 import BeautifulSoup

user_query = 'Se7en movie genre'
URL = "https://www.google.co.in/search?q=" + user_query
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.63 Safari/537.36'}
page = requests.get(URL, headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')
v = soup.find(class_='KKHQ8c')
# Collect the text of every link inside the matched element
h = []
genres = v.find_all('a')
for genre in genres:
  h.append(genre.get_text())

So I found out that inside the for loop the variable c is None. I can't figure out why! It is only None inside the loop.
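Two differences between the snippets may be worth checking. The loop concatenates i +'movie genre' with no space in between, and it searches for the class 'EDblX DAVP1', while the single-query version searches for 'KKHQ8c'. Below is a minimal sketch of building the query so that the space and the URL encoding are handled explicitly. The helper name build_search_url is hypothetical and only for illustration, and whether this alone makes the lookup succeed depends on the markup Google actually returns.

```python
from urllib.parse import urlencode

# Base URL taken from the snippets above
SEARCH_URL = "https://www.google.co.in/search"


def build_search_url(title: str) -> str:
    """Return a search URL for '<title> movie genre' (hypothetical helper)."""
    # Note the space before 'movie genre'; urlencode escapes spaces as '+'
    query = f"{title} movie genre"
    return f"{SEARCH_URL}?{urlencode({'q': query})}"


if __name__ == "__main__":
    # Query built with the helper
    print(build_search_url("Se7en"))
    # What the loop's plain concatenation produces for comparison
    print("Mystery" + "movie genre")
```

Printing the query built inside the loop next to the one typed by hand is a quick way to see whether the two requests are really the same.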



source https://stackoverflow.com/questions/72706609/web-scraping-with-bs4-returning-none
