Bradley Thompson

I Created a Library To Scrape Product Information From Online Retailers

Python
Figure: Simplified UML diagram of the components

I have created a library that scrapes basic product information from online retailers, including Argos and The Entertainer. The information scraped includes the product's name, price, categories and primary image. The source code is available on my GitHub.

Supported Websites

  • www.argos.co.uk
  • www.thetoyshop.com
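
The "parser" field in the example outputs further down suggests that each URL is matched to a site-specific parser. As a rough illustration only, and not necessarily how StoreScraper does it internally, matching on the hostname could look like the sketch below. PARSERS_BY_HOST and pick_parser are hypothetical names, and the hostnames are taken from the example URLs in this post.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical lookup table: hostname -> parser name.
# The parser names mirror the "parser" field in the example outputs.
PARSERS_BY_HOST = {
    "www.argos.co.uk": "Argos",
    "www.thetoyshop.com": "TheEntertainer",
}


def pick_parser(url: str) -> Optional[str]:
    """Return the parser name for a URL, or None if the site is unsupported."""
    host = urlparse(url).hostname or ""
    return PARSERS_BY_HOST.get(host)


print(pick_parser("https://www.argos.co.uk/product/8992866"))  # Argos
print(pick_parser("https://example.com/some-product"))  # None
```

A plain dictionary keeps the list of supported sites in one place, so adding a retailer would only mean adding one entry and one parser.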

The code below scrapes one product page from each store and prints the results:

from StoreScraper.StoreScraper import StoreScraper

scraper = StoreScraper()

# Scrape one product page from each supported retailer
result1 = scraper.get_information("https://www.argos.co.uk/product/8992866")
result2 = scraper.get_information("https://www.thetoyshop.com/collectibles/adult-collectibles/DC-Comics-30cm-Superman-Figure/p/545842?queryId=83dec2295943aff8586c518c6f031006")

print(result1)
print(result2)

Result 1 Output

{
	"url": "https://www.argos.co.uk/product/8992866",
	"start_datetime": "2023-10-12 12:31:09.102859",
	"end_datetime": "2023-10-12 12:31:09.660612",
	"status": "success",
	"errors": 0,
	"error_messages": [],
	"parser": "Argos",
	"item_name": "DC 12-inch Superman Figure",
	"category_one": "Toys",
	"category_two": "Playsets and figures",
	"category_three": None,
	"price": "£10.00",
	"image_url": "//media.4rgos.it/i/Argos/8992866_R_Z001A?w=750&h=440&qlt=70"
}
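
Two fields in this result usually need tidying before they can be used: price is a string with a currency symbol, and image_url is protocol-relative (it starts with //). Here is a minimal sketch of one way to normalise them, assuming the result behaves like a dict with the keys shown above. parse_price and absolute_image_url are hypothetical helpers, not part of the library.

```python
from decimal import Decimal
from urllib.parse import urljoin


def parse_price(price: str) -> Decimal:
    """Turn a string such as '£10.00' into Decimal('10.00')."""
    return Decimal(price.replace("£", "").replace(",", "").strip())


def absolute_image_url(page_url: str, image_url: str) -> str:
    """Resolve a protocol-relative or relative image URL against the page URL."""
    return urljoin(page_url, image_url)


# Sample values taken from the Argos example (only the fields used here)
result = {
    "url": "https://www.argos.co.uk/product/8992866",
    "price": "£10.00",
    "image_url": "//media.4rgos.it/i/Argos/8992866_R_Z001A?w=750&h=440&qlt=70",
}

print(parse_price(result["price"]))  # 10.00
print(absolute_image_url(result["url"], result["image_url"]))
# https://media.4rgos.it/i/Argos/8992866_R_Z001A?w=750&h=440&qlt=70
```

Decimal is used rather than float so that prices compare and add up exactly.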

Result 2 Output

{
	"url": "https://www.thetoyshop.com/collectibles/adult-collectibles/DC-Comics-30cm-Superman-Figure/p/545842?queryId=83dec2295943aff8586c518c6f031006",
	"start_datetime": "2023-10-12 12:31:09.698567",
	"end_datetime": "2023-10-12 12:31:10.002384",
	"status": "success",
	"errors": 0,
	"error_messages": [],
	"parser": "TheEntertainer",
	"item_name": "DC Comics 30cm Superman Figure",
	"category_one": "Collectible Toys",
	"category_two": "Adult Collectibles",
	"category_three": None,
	"price": "£10.00",
	"image_url": "https://www.thetoyshop.com/medias/545842-Primary-515Wx515H?context=bWFzdGVyfGltYWdlc3wzNzgxM3xpbWFnZS9qcGVnfGFXMWhaMlZ6TDJnNE55OW9PRFF2T1RJeE9ESTVPRFV4TVRNNU1DNXFjR2N8NmYxOGNkNmIwZWYxNjE0YzVjZWU5N2FhNGRiZGIyMmM0Zjk0NmY4NDhiNzU3YTc3MWVkODkwZWM4NDllNjA1ZA"
}
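
Each result carries its own status, error count and timestamps, so it is straightforward to check whether a scrape worked and how long it took. A small sketch, again assuming dict-like results with the fields shown above. summarise is a hypothetical helper, the sample dicts reproduce only the relevant fields, and because "success" is the only status value shown in this post, anything else is treated as a failure.

```python
from datetime import datetime


def summarise(result: dict) -> str:
    """Build a one-line summary: parser, outcome and elapsed seconds."""
    start = datetime.fromisoformat(result["start_datetime"])
    end = datetime.fromisoformat(result["end_datetime"])
    elapsed = (end - start).total_seconds()
    # Assumption: any status other than "success" means the scrape failed
    if result["status"] != "success" or result["errors"]:
        detail = "; ".join(result["error_messages"]) or "unknown error"
        return f'{result["parser"]}: failed after {elapsed:.3f}s ({detail})'
    return f'{result["parser"]}: ok in {elapsed:.3f}s'


# Timing and status fields copied from the two example outputs
results = [
    {
        "parser": "Argos",
        "status": "success",
        "errors": 0,
        "error_messages": [],
        "start_datetime": "2023-10-12 12:31:09.102859",
        "end_datetime": "2023-10-12 12:31:09.660612",
    },
    {
        "parser": "TheEntertainer",
        "status": "success",
        "errors": 0,
        "error_messages": [],
        "start_datetime": "2023-10-12 12:31:09.698567",
        "end_datetime": "2023-10-12 12:31:10.002384",
    },
]

for r in results:
    print(summarise(r))
# Argos: ok in 0.558s
# TheEntertainer: ok in 0.304s
```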