Websites often change the format of their pages, so these examples may not always work. If they don't, rework the examples after examining the HTML content of the page. Most browsers will let you see the HTML source (look for a "page source" option), though you might have to turn on developer mode in your browser preferences first. For example, in Chrome you need to click the "developer mode" check box under Extensions in the Preferences/Options menu.

Import necessary modules


The HTTP request-response cycle
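A minimal sketch of the cycle, assuming the requests library is the HTTP client. The function name fetch_html, the example.com URL, and the q parameter are placeholders for illustration, not values from the original notebook.

```python
import requests


def fetch_html(url, timeout=10):
    """Send a GET request and return the body of the response as text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on a 4xx or 5xx status
    return response.text


# Preparing a request without sending it shows what would go over the wire.
# The URL and query parameter below are placeholders.
prepared = requests.Request(
    "GET", "https://www.example.com/search", params={"q": "tofu"}
).prepare()
print(prepared.method, prepared.url)
```

Calling fetch_html with a real URL performs the full round trip: the request goes out, the server answers with a status code, headers, and a body, and the body is what gets passed to the parser.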


Set up the BeautifulSoup object
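One way this step might look, using an inline HTML string in place of a downloaded page so the sketch runs offline. The "html.parser" backend is an assumption chosen because it ships with Python; the original notebook may have used a different parser.

```python
from bs4 import BeautifulSoup

# Stand-in for the text of a real HTTP response.
html = """
<html>
  <head><title>Sample results</title></head>
  <body><p>Hello</p></body>
</html>
"""

# The second argument names the parser that builds the tree.
soup = BeautifulSoup(html, "html.parser")
print(soup.title.string)
```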


BS4 functions


find returns the first instance of a specified tag, as a bs4 element
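As a small illustration of find, the sketch below runs it against a made-up HTML fragment and contrasts it with find_all, which returns every match instead of just the first.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

html = '<div><a href="/first">First</a><a href="/second">Second</a></div>'
soup = BeautifulSoup(html, "html.parser")

first_link = soup.find("a")     # only the first <a> tag
all_links = soup.find_all("a")  # every <a> tag, in document order
missing = soup.find("table")    # None when nothing matches

print(isinstance(first_link, Tag), first_link.get_text(), len(all_links), missing)
```

Because find returns None when there is no match, it is worth checking the result before calling methods on it, especially on pages whose layout may change.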


bs4 functions can be applied recursively to the elements they return

  • using selector=value
  • using a dictionary
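The sketch below shows both filter styles and a nested search on a made-up fragment. The tag and class names (results, recipe, footer) are invented for the example.

```python
from bs4 import BeautifulSoup

html = """
<div class="results">
  <article class="recipe"><a href="/r/1">Soup</a></article>
  <article class="recipe"><a href="/r/2">Stew</a></article>
</div>
<div class="footer"><a href="/about">About</a></div>
"""
soup = BeautifulSoup(html, "html.parser")

# selector=value style: class_ has a trailing underscore because
# class is a reserved word in Python.
results = soup.find("div", class_="results")

# Dictionary style: attribute names and values as key/value pairs.
footer = soup.find("div", {"class": "footer"})

# Recursive use: search inside an element that an earlier search returned.
names = [
    article.find("a").get_text()
    for article in results.find_all("article", class_="recipe")
]
print(names)
```

The dictionary form is the one to reach for when an attribute name collides with a Python keyword or with one of find's own parameters.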

get returns the value of a tag attribute

  • returns a string
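A short sketch of get on a made-up link tag. It also shows two edge cases worth knowing: a missing attribute yields None (or a supplied default), and a multi-valued attribute such as class comes back as a list rather than a string.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<a class="title big" href="/recipes/1">Soup</a>', "html.parser")
link = soup.find("a")

href = link.get("href")           # the attribute value as a string
missing = link.get("id")          # None, because the tag has no id
fallback = link.get("id", "n/a")  # a default can be supplied instead
classes = link.get("class")       # multi-valued attributes return a list

print(href, missing, fallback, classes)
```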

A function that returns a list containing recipe names, recipe descriptions (if any), and recipe URLs
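A sketch of such a function, under loud assumptions: the page structure (article tags with class recipe, a p tag with class description), the base URL, and the name get_recipes are all hypothetical. A real site needs its own selectors, found by inspecting the page source.

```python
from bs4 import BeautifulSoup


def get_recipes(html, base_url="https://www.example.com"):
    """Return a list of (name, description, url) tuples found in html."""
    soup = BeautifulSoup(html, "html.parser")
    recipes = []
    for article in soup.find_all("article", class_="recipe"):
        link = article.find("a")
        if link is None:
            continue  # skip entries that have no link at all
        description = article.find("p", class_="description")
        recipes.append((
            link.get_text(strip=True),
            description.get_text(strip=True) if description else "",
            base_url + link.get("href", ""),
        ))
    return recipes


# Hypothetical search-results markup, used only to exercise the function.
sample_html = """
<article class="recipe">
  <a href="/recipes/tomato-soup">Tomato Soup</a>
  <p class="description">A quick weeknight soup.</p>
</article>
<article class="recipe">
  <a href="/recipes/plain-rice">Plain Rice</a>
</article>
"""
recipe_list = get_recipes(sample_html)
print(recipe_list)
```

The empty string for a missing description keeps every tuple the same shape, which makes later processing simpler.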


Let's write a function that, given a recipe link, returns a dictionary containing the ingredients and preparation instructions
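One possible shape for this function, split in two so the parsing can be tried without a network connection. The class names ingredient and preparation-step and the function names are hypothetical; the real selectors depend on the recipe site's current markup.

```python
import requests
from bs4 import BeautifulSoup


def parse_recipe(html):
    """Extract ingredients and preparation steps from a recipe page."""
    soup = BeautifulSoup(html, "html.parser")
    ingredients = [
        li.get_text(strip=True) for li in soup.find_all("li", class_="ingredient")
    ]
    preparation = [
        li.get_text(strip=True) for li in soup.find_all("li", class_="preparation-step")
    ]
    return {"ingredients": ingredients, "preparation": preparation}


def get_recipe_info(recipe_link, timeout=10):
    """Download the page at recipe_link and parse it."""
    response = requests.get(recipe_link, timeout=timeout)
    response.raise_for_status()
    return parse_recipe(response.text)


# Hypothetical recipe-page markup, used only to exercise the parser.
sample_page = """
<ul><li class="ingredient">2 cups rice</li><li class="ingredient">4 cups water</li></ul>
<ol><li class="preparation-step">Rinse the rice.</li>
    <li class="preparation-step">Simmer for 20 minutes.</li></ol>
"""
recipe_info = parse_recipe(sample_page)
print(recipe_info)
```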


Construct a list of dictionaries for all recipes
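A sketch of the assembly step, assuming the recipe list holds (name, description, url) tuples and that some callable maps a URL to a dictionary of details. Both the function name build_recipe_list and the stand-in fake_details are hypothetical; in practice the callable would download and parse each page.

```python
def build_recipe_list(recipes, get_details):
    """Merge each (name, description, url) tuple with its page details."""
    all_recipes = []
    for name, description, url in recipes:
        record = {"name": name, "description": description, "url": url}
        try:
            record.update(get_details(url))
        except Exception as error:
            # One page that fails to download or parse should not stop the loop.
            record["error"] = str(error)
        all_recipes.append(record)
    return all_recipes


def fake_details(url):
    """Stand-in for a real page fetch; fails for one URL on purpose."""
    if url.endswith("/broken"):
        raise ValueError("page layout not recognised")
    return {"ingredients": ["water"], "preparation": ["Boil."]}


recipes = [
    ("Tea", "Hot drink", "https://www.example.com/tea"),
    ("Mystery", "", "https://www.example.com/broken"),
]
all_recipes = build_recipe_list(recipes, fake_details)
print(all_recipes)
```

Recording the error instead of raising keeps the results for the pages that did parse, which helps when a site has changed the layout of only some pages.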


Logging in to a web server


Get username and password

  • Best to store in a file for reuse
  • You will need to set up your own login and password and place them in a file called wikidata.txt
  • Line one of the file should contain your username
  • Line two your password
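Reading the two-line file described above could be sketched as follows. The function name read_credentials is hypothetical, and the demonstration writes a throwaway file with made-up values rather than touching a real wikidata.txt.

```python
import tempfile
from pathlib import Path


def read_credentials(path="wikidata.txt"):
    """Return (username, password) from the first two lines of the file."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    if len(lines) < 2:
        raise ValueError("expected the username on line one and the password on line two")
    return lines[0].strip(), lines[1].strip()


# Demonstration with a temporary file and made-up values.
with tempfile.TemporaryDirectory() as folder:
    demo_file = Path(folder) / "wikidata.txt"
    demo_file.write_text("example_user\nexample_password\n", encoding="utf-8")
    username, password = read_credentials(demo_file)

print(username)
```

Keeping the credentials in a separate file means they stay out of the notebook itself; that file should not be shared or committed to version control.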

Construct an object that contains the data to be sent to the login page
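The data sent to a login page is typically a dictionary whose keys match the name attributes of the login form's input fields. The field names below (username, password, login_token) are placeholders, not the names any particular site uses; the real ones come from inspecting the form in the page source.

```python
def build_login_payload(username, password, token=None):
    """Return the form data for a login request.

    The keys are hypothetical; replace them with the input names
    found in the target site's login form.
    """
    payload = {"username": username, "password": password}
    if token is not None:
        # Many sites require a per-session token from a hidden form field.
        payload["login_token"] = token
    return payload


payload = build_login_payload("example_user", "example_password")
print(sorted(payload))
```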


Get the value of the login token
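Login tokens usually sit in a hidden input field on the login page, so the same find and get calls apply. In this sketch the field name login_token and the sample form are assumptions; the dictionary filter is used because name= as a keyword would be taken as the tag name.

```python
from bs4 import BeautifulSoup


def get_login_token(login_page_html, field_name="login_token"):
    """Return the value of the hidden input whose name is field_name."""
    soup = BeautifulSoup(login_page_html, "html.parser")
    token_input = soup.find("input", {"name": field_name})
    if token_input is None:
        raise ValueError(f"no input named {field_name!r} on the login page")
    return token_input.get("value")


# Hypothetical login form, used only to exercise the function.
sample_form = """
<form method="post">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="hidden" name="login_token" value="abc123">
</form>
"""
token = get_login_token(sample_form)
print(token)
```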


Set up a session, log in, and get data
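A sketch of the final step, assuming requests.Session is used so that cookies set at login are sent with later requests. The function name login_and_fetch and its parameters are hypothetical, and the URLs in the commented usage are placeholders.

```python
import requests


def login_and_fetch(session, login_url, payload, data_url):
    """Post the login form, then request a page that needs the login."""
    login_response = session.post(login_url, data=payload)
    login_response.raise_for_status()
    # The session now holds any cookies the server set during login.
    data_response = session.get(data_url)
    data_response.raise_for_status()
    return data_response.text


# Typical usage (not executed here because it needs a real site):
# with requests.Session() as session:
#     text = login_and_fetch(
#         session,
#         "https://www.example.com/login",
#         {"username": "example_user", "password": "example_password"},
#         "https://www.example.com/private",
#     )
```

If the site issues a login token, the login page should be fetched with the same session before posting, since the token is normally tied to that session's cookies.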