Python Requests ‘User-Agent’ – Web Scraping

The ‘User-Agent’ HTTP request header is a string that a web browser sends to a web server with each request to identify itself.

The ‘User-Agent’ string contains information about which browser is being used, its version and the operating system it runs on.

To prevent web scraping, some websites block requests whose ‘User-Agent’ does not belong to a web browser, including the default ‘User-Agent’ of Python’s requests library.

In this note I will show how to set the ‘User-Agent’ HTTP request header while using Python’s requests library.

Set ‘User-Agent’ using Python’s Requests Library

Use the following code snippet to set the ‘User-Agent’ header while sending an HTTP request using Python’s requests library:

import requests

url = "https://www.shellhacks.com"
headers = {
  'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:91.0) Gecko/20100101 Firefox/91.0'
}

response = requests.get(url, headers=headers)
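
To check which ‘User-Agent’ would actually be sent, without contacting the server, the request can be prepared but not sent. This is only a minimal sketch: the helper name build_prepared_request is made up for this note, while requests.Request and Session.prepare_request are part of the requests library.

```python
import requests

def build_prepared_request(url, user_agent):
    """Prepare (but do not send) a GET request with a custom 'User-Agent'.

    Hypothetical helper for illustration, not part of the requests library.
    """
    session = requests.Session()
    request = requests.Request("GET", url, headers={"User-Agent": user_agent})
    # prepare_request() merges the session defaults with the request headers;
    # headers passed with the request take precedence over the defaults
    return session.prepare_request(request)

prepared = build_prepared_request(
    "https://www.shellhacks.com",
    "Mozilla/5.0 (Windows NT 10.0; rv:91.0) Gecko/20100101 Firefox/91.0",
)

# Inspect the header that would go out on the wire
print(prepared.headers["User-Agent"])
```

No network traffic is generated here, so this is a convenient way to debug headers before running a real scraper.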

To randomly select a ‘User-Agent’ from a list:

import random
import requests

url = "https://www.shellhacks.com"
user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; rv:91.0) Gecko/20100101 Firefox/91.0",
    "Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Firefox/78.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:95.0) Gecko/20100101 Firefox/95.0"
]
random_user_agent = random.choice(user_agents)
headers = {
    'User-Agent': random_user_agent
}

response = requests.get(url, headers=headers)
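
The snippet above picks the ‘User-Agent’ once. If a script sends many requests, it may be useful to pick a new one on every call. The sketch below wraps this in two small functions; the names random_headers and fetch, as well as the 10-second timeout, are assumptions made for this example.

```python
import random
import requests

# Same sample 'User-Agent' strings as in the list above
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; rv:91.0) Gecko/20100101 Firefox/91.0",
    "Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Firefox/78.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:95.0) Gecko/20100101 Firefox/95.0",
]

def random_headers(user_agents):
    """Return request headers with a randomly chosen 'User-Agent'."""
    return {"User-Agent": random.choice(user_agents)}

def fetch(url, user_agents=USER_AGENTS, timeout=10):
    """Send a GET request with a freshly chosen 'User-Agent' on every call.

    Hypothetical helper; the timeout value is an arbitrary assumption.
    """
    return requests.get(url, headers=random_headers(user_agents), timeout=timeout)
```

Each call such as fetch("https://www.shellhacks.com") then goes out with one of the listed strings, chosen independently of the previous call.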

To select a random ‘User-Agent’ from a file user_agents.txt:

import random
import requests

url = "https://www.shellhacks.com"
# Read one 'User-Agent' string per line; the file is closed automatically
with open('user_agents.txt') as f:
    user_agents = f.read().splitlines()
random_user_agent = random.choice(user_agents)

headers = {
    'User-Agent': random_user_agent
}

response = requests.get(url, headers=headers)
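
A blank line in user_agents.txt would end up as an empty ‘User-Agent’ string, and an empty file would make random.choice fail with an IndexError. The sketch below shows one possible way to guard against both. The function names parse_user_agents and load_user_agents are made up for this note, and the one-string-per-line file format is the same assumption as above.

```python
import random

def parse_user_agents(text):
    """Split text into 'User-Agent' strings, dropping blank lines and
    surrounding whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]

def load_user_agents(path="user_agents.txt"):
    """Read 'User-Agent' strings from a file, one per line.

    Hypothetical helper; raises ValueError if the file has no usable lines.
    """
    with open(path, encoding="utf-8") as f:
        user_agents = parse_user_agents(f.read())
    if not user_agents:
        raise ValueError(f"no 'User-Agent' strings found in {path}")
    return user_agents

# The parsing step can be tried without a file
sample = (
    "Mozilla/5.0 (Windows NT 10.0; rv:91.0) Gecko/20100101 Firefox/91.0\n"
    "\n"
    "  Mozilla/5.0 (X11; Linux x86_64; rv:95.0) Gecko/20100101 Firefox/95.0  \n"
)
cleaned = parse_user_agents(sample)
print(len(cleaned))
print(random.choice(cleaned))
```

With a real file, random.choice(load_user_agents()) can be used in place of the list read with splitlines().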

Python’s Requests Default ‘User-Agent’

If you don’t specify the ‘User-Agent’ request header, the default ‘User-Agent’ of Python’s requests library will be used:

>>> import requests
>>> requests.utils.default_headers()
- sample output -
{'User-Agent': 'python-requests/2.27.1', 'Accept-Encoding': ...}
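
The default value can also be read on its own with requests.utils.default_user_agent(), and a requests.Session starts with the same default headers. The sketch below compares the two and then overrides the header once for every request made through that session; the replacement string is just the Firefox example used earlier in this note.

```python
import requests

# The default 'User-Agent' string; the version part depends on the
# installed release of the requests library
default_ua = requests.utils.default_user_agent()
print(default_ua)

# A new session carries the same default until it is overridden
session = requests.Session()
session_default = session.headers["User-Agent"]
print(session_default == default_ua)

# Override the header once; all requests sent through this session use it
custom_ua = "Mozilla/5.0 (Windows NT 10.0; rv:91.0) Gecko/20100101 Firefox/91.0"
session.headers.update({"User-Agent": custom_ua})
print(session.headers["User-Agent"])
```

Setting the header on the session avoids passing headers=... to every single call.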
