Yellow Pages Scraper - Scrape Yellow Pages Data

The Yellow Pages Scraper is a Real Data API actor that extracts information from Yellow Pages listings. You can search for records by keyword and location, or supply a list of URLs to crawl. It is built on the Real Data API SDK and can run either on the Real Data API platform or locally.

With powerful yellow pages data scraping capabilities, this tool helps businesses, researchers, and marketers efficiently gather structured business listing data for analysis, lead generation, and competitive research.

Supported Countries: Australia, Canada, Germany, France, Singapore, USA, UK, UAE, and India.

Why Choose Our Yellow Pages Scraper?

  • Simple & User-Friendly – No complex setup required: enter your search term and location, and the scraper does the rest.
  • Flexible & Scalable – Extract business details using keywords, location, or a list of URLs for precision targeting.
  • Multi-Platform Support – Deploy on Real Data API or run it locally based on your preferences.
  • Customizable Output – Use the Extend Output Function to modify and enhance data fields based on your needs.
  • Proxy Support – Bypass restrictions with Real Data API proxies for smoother data extraction.

How It Works

  • Enter a search term (e.g., "Dentist") and location (e.g., "New York").
  • Or, provide a list of start URLs for deeper, targeted crawling.
  • Adjust the maximum number of records you want to scrape.
  • Run the scraper and access structured business data in your dataset.

Input

| Field | Type | Description | Default Value |
|---|---|---|---|
| search | string | Query string to search for on the site | "Dentist" |
| location | string | Location in which to search for records | "New York" |
| startUrls | array | A list of Request objects to crawl. Each URL should point to a record list page on Yellowpages.com | None |
| maxItems | number | Maximum number of pages that will be scraped | 200 |
| extendOutputFunction | string | A function that receives the Cheerio object for the page and a Cheerio representation of the record element as inputs ($, record). The data it returns is merged with the default output, as described under Extend Output | ($, record) => { return {}; } |
| proxyConfiguration | object | Proxy settings for the run. If you have access to the Real Data API proxy, set { "useRealDataAPIProxy": true } to enable it | { "useRealDataAPIProxy": false } |

Set either the search and location attributes or the startUrls attribute.
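
As a rough illustration of the rule above, the sketch below checks an input object before a run and fills in the documented defaults. The field names come from the input table. The validateInput helper, the { url } shape of a Request object, and the example search URL are assumptions made for illustration and are not part of the actor itself.

```javascript
// Hypothetical helper (not part of the actor): enforces the rule that an input
// must contain search + location, or a non-empty startUrls array.
function validateInput(input) {
  const isNonEmptyString = (v) => typeof v === 'string' && v.trim() !== '';

  const hasSearch = isNonEmptyString(input.search) && isNonEmptyString(input.location);
  const hasStartUrls = Array.isArray(input.startUrls) && input.startUrls.length > 0;

  if (!hasSearch && !hasStartUrls) {
    throw new Error('Provide either search and location, or startUrls.');
  }

  // Fall back to the defaults listed in the input table for optional fields.
  return {
    ...input,
    maxItems: input.maxItems ?? 200,
    proxyConfiguration: input.proxyConfiguration ?? { useRealDataAPIProxy: false },
  };
}

// Keyword + location input.
const searchInput = validateInput({ search: 'Dentist', location: 'New York', maxItems: 50 });

// URL-list input. The { url } object shape and the URL are assumed examples.
const urlInput = validateInput({
  startUrls: [{ url: 'https://www.yellowpages.com/search?search_terms=Dentist&geo_location_terms=New+York' }],
});

console.log(searchInput.maxItems, urlInput.maxItems);
```

Validating up front like this keeps a misconfigured run from starting with neither a search nor any start URLs.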

Output

Data is stored in a dataset, with each entry containing information about a record.


{
    "isAd": true,
    "url": "https://www.yellowpages.com/compton-ca/mip/golden-state-dental-group-18768214?lid=1001760866489",
    "name": "Golden State Dental Group",
    "address": "1601 N Long Beach Blvd, Compton, CA 90221",
    "phone": "(310) 507-7718",
    "rating": 4,
    "ratingCount": 6,
    "infoSnippet": "*Please contact us for more information",
    "image": "https://i4.ypcdn.com/blob/00a40d49e577606be9d82ced5404696022a7e2a0",
    "categories": [
        "Dentists",
        "Pediatric Dentistry",
        "Implant Dentistry",
        "Periodontists",
        "Cosmetic Dentistry"
    ]
}
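
Once records are in the dataset, a small post-processing step can turn them into a lead list. The sketch below assumes the records have already been loaded into an array and uses only the field names from the sample record. The toLeads helper and the second sample record (including its placeholder phone number) are hypothetical.

```javascript
// Sample records in the documented output shape. The first mirrors the example
// above; the second is a made-up placeholder used only to exercise the filter.
const records = [
  {
    isAd: true,
    name: 'Golden State Dental Group',
    phone: '(310) 507-7718',
    rating: 4,
    ratingCount: 6,
    categories: ['Dentists', 'Pediatric Dentistry'],
  },
  {
    isAd: false,
    name: 'Example Dental Care', // hypothetical
    phone: '(555) 010-0000',     // placeholder number
    rating: 5,
    ratingCount: 12,
    categories: ['Dentists'],
  },
];

// Hypothetical helper: keep organic (non-ad) listings at or above a minimum
// rating and reduce each one to the fields a lead list typically needs.
function toLeads(items, minRating = 4) {
  return items
    .filter((r) => !r.isAd && typeof r.rating === 'number' && r.rating >= minRating)
    .map(({ name, phone, categories }) => ({
      name,
      phone,
      categories: (categories ?? []).join(', '),
    }));
}

const leads = toLeads(records);
console.log(leads);
```

Filtering on isAd separates sponsored listings from organic results, which matters when ratings are compared across businesses.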


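Extend Output

The extendOutputFunction input lets you add, change, or remove fields in each record. For example:

($, record) => {
    return {
        additionalField: 'exampleValue',
        modifiedField: 'newModifiedValue',
        unwantedField: undefined,
    };
}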

This function extends the default output by adding "additionalField", overriding the value of "modifiedField", and removing "unwantedField" by setting it to undefined.
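
The merge behavior described above can be pictured with the following sketch. The actor's actual merge logic is not shown in this document, so mergeOutput is an assumed, simplified stand-in: it overlays the returned object on the default record and drops any field whose value is undefined. The defaultRecord values are illustrative.

```javascript
// Illustrative stand-in for the merge step (an assumption, not the actor's code).
function mergeOutput(defaultRecord, extension) {
  // Later keys win, so fields returned by the extension override the defaults.
  const merged = { ...defaultRecord, ...extension };

  // Fields explicitly set to undefined are removed from the final record.
  for (const key of Object.keys(merged)) {
    if (merged[key] === undefined) delete merged[key];
  }
  return merged;
}

// A trimmed default record with illustrative values.
const defaultRecord = {
  name: 'Golden State Dental Group',
  modifiedField: 'originalValue',
  unwantedField: 'to be removed',
};

// Same shape as the example function above; $ and record are unused here.
const extendOutputFunction = ($, record) => ({
  additionalField: 'exampleValue',
  modifiedField: 'newModifiedValue',
  unwantedField: undefined,
});

const result = mergeOutput(defaultRecord, extendOutputFunction(null, null));
console.log(result);
```

In a real run the function would typically read values from the record element through the Cheerio arguments rather than return constants.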
