Scraping Booking.com Property Listings with JavaScript in 2023

In this article, we will learn how to scrape property listings from Booking.com using JavaScript. We will use two common libraries, Axios to fetch the HTML and Cheerio to parse it, and extract details such as property name, location, rating, review count, and description.

Prerequisites

To follow along, you will need:

Node.js installed on your system

Basic knowledge of JavaScript and HTML

Installing Dependencies

We will use Axios for sending HTTP requests and Cheerio for parsing HTML.

Install them using npm:

npm install axios cheerio

This will download the packages into the node_modules folder.

Importing Dependencies

At the top of your JavaScript file, import the packages:

const axios = require('axios');
const cheerio = require('cheerio');

Defining the Target URL

We will scrape listings from this URL on Booking.com:

const url = 'https://www.booking.com/searchresults.html?ss=New+York&checkin=2023-03-01&checkout=2023-03-05&group_adults=2';

You can modify the parameters as needed.
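If you change the search often, it may be easier to build the query string in code than to edit it by hand. The sketch below uses Node's built-in URL class. The helper name buildSearchUrl and its option names are made up for illustration, and it only sets the four query parameters that already appear in the URL above.

```javascript
// Hypothetical helper: assembles the search URL from named options.
// It only uses the query parameters that appear in the article's URL.
function buildSearchUrl({ destination, checkin, checkout, adults }) {
  const url = new URL('https://www.booking.com/searchresults.html');
  url.searchParams.set('ss', destination);     // free-text destination
  url.searchParams.set('checkin', checkin);    // YYYY-MM-DD
  url.searchParams.set('checkout', checkout);  // YYYY-MM-DD
  url.searchParams.set('group_adults', String(adults));
  return url.toString();
}

const searchUrl = buildSearchUrl({
  destination: 'New York',
  checkin: '2023-03-01',
  checkout: '2023-03-05',
  adults: 2
});

console.log(searchUrl);
```

Spaces and other special characters in the destination are encoded automatically, so values can be passed in as plain strings.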

Setting User Agent

Some sites block or alter responses to requests that lack a browser-like User-Agent header, so we set one explicitly:

const userAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36';

Fetching the HTML Page

Use Axios to send a GET request and get the response:

const options = {
  method: 'GET',
  url: url,
  headers: {
    'User-Agent': userAgent
  }
};

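axios(options).then(res => {

  // res.data holds the response body, here the page HTML as a string
  const html = res.data;

  // Parse HTML here

}).catch(err => {
  // Network errors and non-2xx responses end up here
  console.error('Request failed:', err.message);
});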

We pass the User Agent header and fetch the page HTML.
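By default Axios sets no timeout, so a stalled request can hang. As a minimal sketch, the options object can come from a small function that also sets a timeout and an Accept-Language header. The name buildRequestOptions, the 10-second timeout, and the en-US language value are assumptions chosen for illustration, not values Booking.com is known to require.

```javascript
// Hypothetical helper that returns an Axios request config.
// `timeout` is a standard Axios option, given in milliseconds.
function buildRequestOptions(url, userAgent, timeoutMs = 10000) {
  return {
    method: 'GET',
    url,
    timeout: timeoutMs, // give up if no response arrives in time
    headers: {
      'User-Agent': userAgent,
      'Accept-Language': 'en-US,en;q=0.9' // assumed preference, adjust as needed
    }
  };
}

const requestOptions = buildRequestOptions(
  'https://www.booking.com/searchresults.html?ss=New+York',
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
);

console.log(requestOptions.timeout); // 10000
```

The returned object can be passed to axios() in the same way as the options object shown earlier.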

Parsing the HTML

Use Cheerio to parse and traverse the HTML:

const $ = cheerio.load(html);

This loads the HTML into a Cheerio object.

Extracting Property Cards

The property cards have a data-testid of property-card:

const cards = $('div[data-testid="property-card"]');

This extracts all divs with that attribute into a Cheerio collection.

Looping Through Cards

Loop through the cards:

cards.each((i, card) => {

  // Extract data from card

});

Inside the loop we can extract details from each card node.

Extracting Property Name

The title is in an h3 element:

const title = $(card).find('h3').text();

Get the h3 inside card and extract its text.
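Text pulled out of nested markup often carries line breaks and runs of spaces. A small normalizing function, sketched below, can tidy any of the extracted strings. The name cleanText is made up for this example, and the sample input is invented to show the effect.

```javascript
// Hypothetical helper: collapse runs of whitespace and trim both ends.
// Returns an empty string when the input is missing.
function cleanText(raw) {
  return (raw || '').replace(/\s+/g, ' ').trim();
}

// Invented sample resembling text taken from a heading with nested elements
const sample = '\n   Hotel Example\n      Opens in new window  ';
console.log(cleanText(sample)); // "Hotel Example Opens in new window"
```

Inside the loop it would wrap the extraction call, for example cleanText($(card).find('h3').text()).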

Extracting Location

The location is in a span:

const location = $(card).find('span[data-testid="address"]').text();

Filter by the data-testid attribute to find the span.

Extracting Rating

Get the aria-label attribute of the star rating div:

const rating = $(card).find('div.e4755bbd60').attr('aria-label');

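Here we filter by CSS class name. Class names like e4755bbd60 look auto-generated and may change when the site is updated, so verify them in your browser's developer tools before running the script.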
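Cheerio's .attr() returns undefined when the element is missing, and an aria-label is a phrase rather than a number, so it can help to convert the value before storing it. The following is a sketch only: parseRating is a hypothetical helper, and the label text "4 out of 5" is an assumed format, so check the real attribute value in the page source.

```javascript
// Hypothetical helper: pull the first number out of a label string.
// Returns null when the label is missing or contains no number.
function parseRating(label) {
  if (typeof label !== 'string') return null;
  const match = label.match(/\d+(?:\.\d+)?/);
  return match ? Number(match[0]) : null;
}

console.log(parseRating('4 out of 5')); // 4
console.log(parseRating(undefined));    // null
```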

Extracting Review Count

Get text of the review count div:

const reviewCount = $(card).find('div.abf093bdfe').text();

Again, filter by class name.
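The review count comes back as text, so sorting or filtering by it needs an integer. Here is a minimal sketch, assuming the text looks like "1,234 reviews". Both that format and the name parseReviewCount are assumptions made for illustration.

```javascript
// Hypothetical helper: extract an integer from text such as "1,234 reviews".
// Assumes commas are thousands separators; returns null if no digits are found.
function parseReviewCount(text) {
  const match = (text || '').match(/\d[\d,]*/);
  if (!match) return null;
  return parseInt(match[0].replace(/,/g, ''), 10);
}

console.log(parseReviewCount('1,234 reviews')); // 1234
console.log(parseReviewCount(''));              // null
```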

Extracting Description

Get the description div text:

const description = $(card).find('div.d7449d770c').text();

Printing the Data

Print out the extracted details:

console.log(`
  Name: ${title}
  Location: ${location}
  Rating: ${rating}
  Review Count: ${reviewCount}
  Description: ${description}
`);

You can also store the data in an array instead of printing.
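As a sketch of that alternative, each card's fields can be pushed into an array of plain objects and serialized once at the end. The records below are invented placeholders standing in for scraped values, and toCsv is a hypothetical helper written for this example rather than a library function.

```javascript
// Placeholder records; in the scraper these would be pushed inside cards.each()
const results = [];
results.push({ name: 'Hotel Example', location: 'Manhattan, New York', reviewCount: 1234 });
results.push({ name: 'The "Sample" Inn', location: 'Brooklyn, New York', reviewCount: 56 });

// Hypothetical helper: turn an array of flat objects into CSV text.
// Every value is quoted, and embedded double quotes are doubled.
function toCsv(rows) {
  if (rows.length === 0) return '';
  const headers = Object.keys(rows[0]);
  const quote = value => `"${String(value ?? '').replace(/"/g, '""')}"`;
  const lines = rows.map(row => headers.map(h => quote(row[h])).join(','));
  return [headers.join(','), ...lines].join('\n');
}

console.log(JSON.stringify(results, null, 2)); // JSON output
console.log(toCsv(results));                   // CSV output
```

Either string can then be written to disk with Node's built-in fs.writeFileSync.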

Full Script

Here is the full scraping script:

const axios = require('axios');
const cheerio = require('cheerio');

const url = 'https://www.booking.com/searchresults.html?ss=New+York&checkin=2023-03-01&checkout=2023-03-05&group_adults=2';

const userAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36';

const options = {
  method: 'GET',
  url: url,
  headers: {
    'User-Agent': userAgent
  }
};

axios(options).then(res => {

  const html = res.data;
  const $ = cheerio.load(html);

  const cards = $('div[data-testid="property-card"]');

  cards.each((i, card) => {

    const title = $(card).find('h3').text();
    const location = $(card).find('span[data-testid="address"]').text();
    const rating = $(card).find('div.e4755bbd60').attr('aria-label');
    const reviewCount = $(card).find('div.abf093bdfe').text();
    const description = $(card).find('div.d7449d770c').text();

    console.log(`
      Name: ${title}
      Location: ${location}
      Rating: ${rating}
      Review Count: ${reviewCount}
      Description: ${description}
    `);

  });

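}).catch(err => {
  // Network errors and non-2xx responses end up here
  console.error('Request failed:', err.message);
});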

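This script scrapes and prints key details from Booking.com listings using JavaScript. The same approach works for other sites that include their listing data in the initial HTML response, though the selectors will differ for each site.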

While these examples are great for learning, scraping production-level sites can pose challenges like CAPTCHAs, IP blocks, and bot detection. Rotating proxies and automated CAPTCHA solving can help.

Proxies API offers a simple API for rendering pages with built-in proxy rotation, CAPTCHA solving, and evasion of IP blocks. You can fetch rendered pages in any language without configuring browsers or proxies yourself.

This allows scraping at scale without headaches of IP blocks. Proxies API has a free tier to get started. Check out the API and sign up for an API key to supercharge your web scraping.

With Proxies API combined with JavaScript libraries like Axios and Cheerio, you can scrape data at scale with less risk of getting blocked.
