I would like to format tables from a website as a dataframe with rows and columns.
In this example the URL is https://www.soccerstats.com/pmatch.asp?league=england&stats=418-17-15-2022, but the same applies to the other match stats pages on https://www.soccerstats.com.
Code
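import requests
from bs4 import BeautifulSoup

url = 'https://www.soccerstats.com/pmatch.asp?league=england&stats=418-17-15-2022'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36'}
session = requests.Session()
session.headers.update(headers)  # sent with every request made through this session
response = session.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

# Find the table that follows the 'Goal statistics' heading
goal_stats_table = None
for ta in soup.find_all('table'):
    for sibling in ta.find_previous_siblings():
        if sibling.name == 'h2':
            if sibling.text == 'Goal statistics':
                goal_stats_table = ta
            # the nearest preceding h2 decides, so stop looking further back
            break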
The expected output is a dataframe that covers all the stats in the table.
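One possible way to get from a located table to a dataframe is sketched below. It runs on a small made-up HTML fragment rather than the real page, so SAMPLE_HTML, its column names and values, and the helper table_to_dataframe are assumptions for illustration; the actual markup on soccerstats.com may differ (nested tables or rows of unequal length would need extra handling).

```python
import pandas as pd
from bs4 import BeautifulSoup

# Made-up HTML standing in for the real page; the actual markup may differ.
SAMPLE_HTML = """
<div>
  <h2>Goal statistics</h2>
  <table>
    <tr><th>Stat</th><th>Home</th><th>Away</th></tr>
    <tr><td>Goals scored per match</td><td>1.50</td><td>1.20</td></tr>
    <tr><td>Goals conceded per match</td><td>0.90</td><td>1.40</td></tr>
  </table>
</div>
"""


def table_to_dataframe(table):
    """Turn a bs4 <table> Tag into a DataFrame (hypothetical helper).

    Assumes the first non-empty row is the header and that every
    row has the same number of cells.
    """
    rows = [
        [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        for tr in table.find_all("tr")
    ]
    rows = [r for r in rows if r]  # skip rows that contain no cells
    header, body = rows[0], rows[1:]
    return pd.DataFrame(body, columns=header)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
# Locate the heading, then take the first table that follows it as a sibling.
heading = soup.find("h2", string="Goal statistics")
goal_stats_table = heading.find_next_sibling("table")
df = table_to_dataframe(goal_stats_table)
print(df)
```

All cell values come out as strings here; converting numeric columns (for example with pd.to_numeric) would be a separate step.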
source https://stackoverflow.com/questions/70174886/formatting-table-in-a-dataframe-pandas-beautifulsoup