Disclaimer: Real Data API only extracts publicly available data while maintaining a strict policy against collecting any personal or identity-related information.
Are you looking for a tool to scrape data on the most popular items in Amazon's Best Sellers categories? Our Amazon Best Sellers Data Scraper can help! It scrapes information on the top 100 products, including their name, price, URL, and thumbnail image. Plus, it's optimized for Amazon's .com, .co.uk, .de, .fr, .es, and .it domains. You can save your data in convenient formats like HTML table, JSON, CSV, and Excel.
The free Amazon Best Sellers Scraper by Real Data API allows you to extract the hundred top-selling Amazon products. It scrapes information from the Amazon Best Sellers pages into a structured format like XML, JSON, Excel, or CSV. Using this Amazon Best Sellers actor, you can.
If you want a more typical data scraper for Amazon products, try other Amazon data scrapers by Real Data API.
For a step-by-step guide on how to extract Amazon Best Sellers, check out our tutorial on the Amazon Best Sellers Scraper API.
You can scrape Amazon Best Sellers in two ways: by Amazon URL or by domain.
Check this input example for Amazon URL.
The proxy configuration lets you set up the proxy servers that the scraper uses to avoid detection by the target website. You can use your own SOCKS5 or HTTP proxy servers or the Real Data API proxy.
You can configure the proxy in any of the following ways.
The API will crawl each webpage through the Real Data API proxy automatically. In this mode, the proxy uses all the existing proxy groups. For every new page, it selects a proxy that has not been used for that particular hostname, which reduces the probability of detection by the source website. You can check the list of existing proxy groups on the proxy page of our platform.
The actor will crawl all the pages through Real Data API proxy servers from particular proxy groups only.
The API uses a custom list of proxies. Specify each proxy in the form scheme://user:password@host:port and separate multiple proxies with a new line or a space. The URL scheme can be HTTP or SOCKS5; the port is always required, while the user and password can be omitted.
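The custom proxy format described above can be checked locally before a run. The following is a minimal sketch in Python and is not part of the Real Data API client: the function name `parse_proxy_list` and the example hosts are hypothetical, and the code only encodes the rules stated here (HTTP or SOCKS5 scheme, mandatory port, optional user and password, entries separated by spaces or new lines).

```python
from urllib.parse import urlsplit

# Schemes named in the text above; anything else is rejected in this sketch.
ALLOWED_SCHEMES = {"http", "socks5"}


def parse_proxy_list(raw: str) -> list[dict]:
    """Split a whitespace-separated proxy list and validate each entry.

    Hypothetical helper: it mirrors the scheme://user:password@host:port
    form described in the text, nothing more.
    """
    proxies = []
    for entry in raw.split():  # split() handles spaces and new lines alike
        parts = urlsplit(entry)
        if parts.scheme not in ALLOWED_SCHEMES:
            raise ValueError(f"unsupported scheme in {entry!r}")
        if not parts.hostname:
            raise ValueError(f"missing host in {entry!r}")
        try:
            port = parts.port  # raises ValueError if the port is not a valid number
        except ValueError:
            raise ValueError(f"invalid port in {entry!r}") from None
        if port is None:
            raise ValueError(f"missing port in {entry!r}")
        proxies.append(
            {
                "scheme": parts.scheme,
                "host": parts.hostname,
                "port": port,
                "user": parts.username,      # None when omitted
                "password": parts.password,  # None when omitted
            }
        )
    return proxies


if __name__ == "__main__":
    example = "http://user:secret@proxy.example.com:8000\nsocks5://10.0.0.1:1080"
    for proxy in parse_proxy_list(example):
        print(proxy)
```

Rejecting an entry early, rather than letting the run fail, is a design choice of this sketch; the scraper itself may validate the list differently.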
You can export the output datasets in multiple usable formats like HTML, JSON, Excel, or CSV. Each item in the dataset describes one Amazon product in the format below.
[{
"category": "Amazon.co.uk Best Sellers: The most popular items in Books",
"categoryUrl": "https://www.amazon.co.uk/best-sellers-books-Amazon/zgbs/books/ref=zg_bs_nav_0",
"ID": 0,
"name": "The Bullet That Missed: (The Thursday Murder Club 3)",
"price": null,
"url": "https://www.amazon.co.uk/Bullet-that-Missed-Thursday-Mystery/dp/0241512425/ref=zg_bs_books_sccl_1/261-2733972-0388621?pd_rd_i=0241512425&psc=1",
"thumbnail": "https://images-eu.ssl-images-amazon.com/images/I/71xfjR3QXyL._AC_UL600_SR600,400_.jpg"
},
{
"category": "Amazon.co.uk Best Sellers: The most popular items in Books",
"categoryUrl": "https://www.amazon.co.uk/best-sellers-books-Amazon/zgbs/books/ref=zg_bs_nav_0",
"ID": 1,
"name": "One: Simple One-Pan Wonders",
"price": null,
"url": "https://www.amazon.co.uk/One-One-Pan-Wonders-Jamie-Oliver/dp/0241431107/ref=zg_bs_books_sccl_2/261-2733972-0388621?pd_rd_i=0241431107&psc=1",
"thumbnail": "https://images-eu.ssl-images-amazon.com/images/I/81CBtopMxOL._AC_UL600_SR600,400_.jpg"
},
{
"category": "Amazon.co.uk Best Sellers: The most popular items in Books",
"categoryUrl": "https://www.amazon.co.uk/best-sellers-books-Amazon/zgbs/books/ref=zg_bs_nav_0",
"ID": 2,
"name": "Verity: The thriller that will capture your heart and blow your mind",
"price": null,
"url": "https://www.amazon.co.uk/Verity-thriller-that-capture-heart/dp/1408726602/ref=zg_bs_books_sccl_3/261-2733972-0388621?pd_rd_i=1408726602&psc=1",
"thumbnail": "https://images-eu.ssl-images-amazon.com/images/I/91e9kVbpfZL._AC_UL600_SR600,400_.jpg"
}
...
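As a short sketch of working with items in the shape shown above, the snippet below flattens a list of such records into CSV text using only the Python standard library. The helper name `items_to_csv` is hypothetical; the field names are the ones in the sample, and the example record reuses sample values with the long `url` and `thumbnail` fields left out for brevity.

```python
import csv
import io

# Field names taken from the sample output above.
FIELDS = ["category", "categoryUrl", "ID", "name", "price", "url", "thumbnail"]


def items_to_csv(items: list[dict]) -> str:
    """Render dataset items as CSV text; missing or null values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for item in items:
        # item.get() tolerates records that lack a field.
        writer.writerow({field: item.get(field) for field in FIELDS})
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        {
            "category": "Amazon.co.uk Best Sellers: The most popular items in Books",
            "ID": 1,
            "name": "One: Simple One-Pan Wonders",
            "price": None,
        }
    ]
    print(items_to_csv(sample))
```

Note that a `price` of `null` in the JSON, as in the sample, ends up as an empty cell rather than the text "None".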
During execution, the scraper reports which page it is currently scraping. After extracting items, it reports that the scraped data has been saved. Because pages are scraped in parallel, you may not see these notifications in order.
If an error occurs, the scraper ends its run immediately and does not add any data.
If you are scraping Amazon product data for market research or retail analytics, the list of Amazon Best Sellers can update you about the top eCommerce trends. You may find it challenging to compete directly against Amazon Best Sellers, but you can also get inspiration for new product development. You can get the following benefits of scraping bestseller data from Amazon.
Lastly, you can connect the Amazon Best Sellers Scraper with any web application or cloud service using integrations on the Real Data API platform. The scraper integrates with several platforms, such as GitHub, Zapier, Google Sheets, Airbyte, Make, and Slack. You can also use webhooks to trigger an action when an event occurs.
The Real Data API gives you programmatic access to the platform. It is organized around RESTful HTTP endpoints that allow you to schedule, manage, and execute our actors. It also allows you to use datasets, fetch outputs, track performance, develop and update actor versions, etc.
To use the API with Python, use our client PyPI package, and to use it in Node.js, use our client NPM package.
Visit the API tab to check out code examples.
Scraping a hundred pages consumes about 0.6 compute units, which works out to roughly 160 pages for 25 cents of platform credit. You can check our plans, offers, and platform credits on the pricing page.
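The pricing figures above can be turned into a rough estimate for a planned run. This is a back-of-the-envelope sketch that assumes usage scales linearly with the number of pages, which the text does not guarantee; the function name `estimate_run` is hypothetical, and the constants are only the two figures quoted here.

```python
# Figures quoted in the text above.
CU_PER_100_PAGES = 0.6    # compute units consumed per 100 pages
PAGES_PER_25_CENTS = 160  # pages obtained for 25 cents of platform credit


def estimate_run(pages: int) -> tuple[float, float]:
    """Return (compute_units, cost_in_dollars), assuming linear scaling."""
    compute_units = pages * CU_PER_100_PAGES / 100
    cost = pages * 0.25 / PAGES_PER_25_CENTS
    return compute_units, cost


if __name__ == "__main__":
    for pages in (100, 160, 1000):
        units, dollars = estimate_run(pages)
        print(f"{pages} pages: ~{units:.2f} compute units, ~${dollars:.2f}")
```

Treat the result as an order-of-magnitude guide; actual consumption depends on the plan and on how the run behaves.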
You should have a Real Data API account to execute the program examples. Replace <YOUR_API_TOKEN> in the program with your API token. Read about the live APIs in the Real Data API docs for more explanation.
import { RealdataAPIClient } from 'RealdataAPI-Client';

// Initialize the RealdataAPIClient with API token
const client = new RealdataAPIClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare actor input
const input = {
    "categoryUrls": [
        "https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics/"
    ],
    "depthOfCrawl": 1,
    "proxy": {
        "useRealdataAPIProxy": true
    }
};

(async () => {
    // Run the actor and wait for it to finish
    const run = await client.actor("junglee/amazon-bestsellers").call(input);

    // Fetch and print actor results from the run's dataset (if any)
    console.log('Results from dataset');
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    items.forEach((item) => {
        console.dir(item);
    });
})();
from RealdataAPI_client import RealdataAPIClient

# Initialize the RealdataAPIClient with your API token
client = RealdataAPIClient("<YOUR_API_TOKEN>")

# Prepare the actor input
run_input = {
    "categoryUrls": ["https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics/"],
    "depthOfCrawl": 1,
    "proxy": {"useRealdataAPIProxy": True},
}

# Run the actor and wait for it to finish
run = client.actor("junglee/amazon-bestsellers").call(run_input=run_input)

# Fetch and print actor results from the run's dataset (if there are any)
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
# Set API token (quoted so the shell does not treat < and > as redirections)
API_TOKEN="<YOUR_API_TOKEN>"

# Prepare actor input
cat > input.json <<'EOF'
{
  "categoryUrls": [
    "https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics/"
  ],
  "depthOfCrawl": 1,
  "proxy": {
    "useRealdataAPIProxy": true
  }
}
EOF

# Run the actor
curl "https://api.RealdataAPI.com/v2/acts/junglee~amazon-bestsellers/runs?token=$API_TOKEN" \
  -X POST \
  -d @input.json \
  -H 'Content-Type: application/json'
categoryUrls
Optional Array
Choose as many Best Sellers categories as you want to extract. If you enter a top-level product category, the scraper will fetch all of its subcategories. Category links must include one of the keywords bestsellers, best-sellers, or Best-Sellers.
depthOfCrawl
Optional Integer
Amazon Best Sellers includes around thirty-seven main categories, each with at least one subcategory. You can set this value from 1 to 4. If you set it to 2, you will get around 550 subcategories with 100 items in each subcategory.
proxy
Required Object
You can set up proxy server groups from a particular country. Amazon displays the products that can be delivered to your location, which depends on the proxy you use. You don't need to worry about this if globally delivered products are sufficient for you.
{
"categoryUrls": [
"https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics/"
],
"depthOfCrawl": 1,
"detailedInformation": false,
"proxy": {
"useRealdataAPIProxy": true
}
}
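The rule that category links must contain one of the Best Sellers keywords can be checked locally before a run. The following is a minimal sketch, assuming a plain case-sensitive substring check (the text does not say exactly how the scraper matches the keywords); `is_bestsellers_url` and `build_input` are hypothetical helpers, and the input keys are the ones used in the examples in this document.

```python
# Keywords listed in the categoryUrls description.
KEYWORDS = ("bestsellers", "best-sellers", "Best-Sellers")


def is_bestsellers_url(url: str) -> bool:
    """Return True if the URL contains one of the required keywords (assumed substring match)."""
    return any(keyword in url for keyword in KEYWORDS)


def build_input(category_urls: list[str], depth_of_crawl: int = 1) -> dict:
    """Build an actor input dict, rejecting URLs that lack a Best Sellers keyword."""
    rejected = [url for url in category_urls if not is_bestsellers_url(url)]
    if rejected:
        raise ValueError(f"not Best Sellers category links: {rejected}")
    return {
        "categoryUrls": list(category_urls),
        "depthOfCrawl": depth_of_crawl,
        "proxy": {"useRealdataAPIProxy": True},
    }


if __name__ == "__main__":
    urls = ["https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics/"]
    print(build_input(urls))
```

The resulting dictionary can be passed as the run input in the Python example, or dumped to `input.json` for the cURL example.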