Disclaimer: Real Data API extracts only publicly available data and maintains a strict policy against collecting personal or identity-related information.
Real Data API offers seamless integration with the Zepto Grocery Data Scraping API, providing precise insights through the Zepto Data API. With the Zepto Grocery Scraper and Zepto Data Scraper, you can effortlessly scrape Zepto grocery data to track pricing, monitor trends, and analyze grocery availability. Utilize the Zepto Grocery Delivery Data Scraper to gain actionable insights. Expand your operations with Grocery Data Scraping solutions in Australia, Canada, Germany, France, Singapore, USA, UK, UAE, and India, tailored to your needs.
This API enables you to efficiently scrape data from Zepto, filling the gap as the platform doesn’t offer built-in tools for this purpose. The Zepto Data Scraper supports the following features:
Don't worry if you see different products while browsing the page: Zepto customizes its product offerings for each user.
This API is currently under development. If you have any suggestions or feature requests, feel free to email us! We are continuously working to improve and update the system.
Provide input through a JSON file with Zepto page URLs and required data fields like product name, description, price, images, metadata, reviews, and Q&A. This helps the scraper collect accurate data from the specified pages.
Field | Type | Description |
---|---|---|
Page URL | String | The URL of the Zepto page to scrape. |
Product Name | String | The name of the grocery item. |
Product Description | String | A detailed description of the grocery item in HTML format. |
Product Price | Float | The price of the grocery item. |
Product Images | Array (String) | URLs of images related to the grocery item. |
Product Metadata | Object | Additional metadata such as SKU, weight, size, etc. |
User Feedback | Array (Object) | Customer reviews, including country, translated content, and original content. |
Q&A Data | Array (Object) | Questions and answers related to the product. |
Currency | String | The currency used for the product price (e.g., USD, INR). |
Language | String | The language used for product descriptions and reviews. |
Region | String | The region for grocery delivery (e.g., Australia, UK). |
Product Category | String | The category to which the product belongs (e.g., Dairy, Vegetables). |
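The fields in the table above can be mirrored as a typed record on the consumer side. The following is a minimal sketch, not part of Real Data API: the class name `ZeptoProduct` and the helper `from_item` are hypothetical, and the camelCase key names are assumed to match the sample JSON shown elsewhere on this page.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ZeptoProduct:
    """Hypothetical consumer-side record mirroring the field table."""
    product_name: str
    product_price: float
    currency: str = ""
    product_description: str = ""
    product_images: List[str] = field(default_factory=list)
    product_metadata: Dict[str, Any] = field(default_factory=dict)
    user_feedback: List[Dict[str, Any]] = field(default_factory=list)
    qa_data: List[Dict[str, Any]] = field(default_factory=list)

    @classmethod
    def from_item(cls, item: Dict[str, Any]) -> "ZeptoProduct":
        # Assumption: keys follow the camelCase names used in the sample JSON.
        return cls(
            product_name=item["productName"],
            product_price=float(item["productPrice"]),
            currency=item.get("currency", ""),
            product_description=item.get("productDescription", ""),
            product_images=list(item.get("productImages", [])),
            product_metadata=dict(item.get("productMetadata", {})),
            user_feedback=list(item.get("userFeedback", [])),
            qa_data=list(item.get("qaData", [])),
        )
```

Only the name and price are treated as mandatory here; that choice is an assumption made for the sketch, not a documented guarantee.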
To access this solution efficiently, it's recommended to use proxy servers. You can either use your own proxies or try proxies provided by Real Data API.
When scraping a specific URL, use it as the startUrl.
For category links, set up the startUrls with the start and end page parameters, and configure the subcategory logic to ensure the scraper extracts data from all relevant subcategories.
The scraper is optimized to handle high-volume data extraction. If the API is not frequently blocked, you can scrape up to 100 listings in 120 seconds.
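As a rough worked example of the stated rate (up to 100 listings in 120 seconds), the sketch below estimates a best-case run time for a larger job. The function name `min_scrape_seconds` is hypothetical, and the result is only a lower bound, since blocking, retries, and pagination overhead are not modeled.

```python
LISTINGS_PER_BATCH = 100  # from the text: up to 100 listings ...
SECONDS_PER_BATCH = 120   # ... in 120 seconds


def min_scrape_seconds(listings: int) -> float:
    """Best-case run time at the stated peak rate (no blocking or retries)."""
    if listings < 0:
        raise ValueError("listings must be non-negative")
    return listings * SECONDS_PER_BATCH / LISTINGS_PER_BATCH


if __name__ == "__main__":
    seconds = min_scrape_seconds(5000)
    print(f"at least {seconds:.0f} s ({seconds / 60:.0f} min)")
```

At that rate, 5,000 listings would need at least 6,000 seconds, or about 100 minutes.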
{
  "startUrls": [
    {
      "url": "https://www.zepto.com/grocery-category/vegetables",
      "startPage": 1,
      "endPage": 5,
      "subCategories": [
        "leafy-vegetables",
        "root-vegetables"
      ]
    },
    {
      "url": "https://www.zepto.com/grocery-category/dairy",
      "startPage": 1,
      "endPage": 3,
      "subCategories": [
        "milk",
        "cheese",
        "yogurt"
      ]
    }
  ],
  "dataFields": {
    "productName": "string",
    "productDescription": "string",
    "productPrice": "float",
    "productImages": ["string"],
    "productMetadata": {
      "sku": "string",
      "weight": "float",
      "size": "string"
    },
    "userFeedback": [
      {
        "country": "string",
        "originalContent": "string",
        "translatedContent": "string"
      }
    ],
    "qaData": [
      {
        "question": "string",
        "answer": "string"
      }
    ]
  },
  "proxySettings": {
    "useRealDataApiProxies": true,
    "region": "US",
    "currency": "USD",
    "language": "en"
  }
}
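Hand-written JSON is easy to get subtly wrong, because a missing or trailing comma makes the whole file invalid. One option is to build the input in code and serialize it. The sketch below assumes the key names shown in the input example (`startUrls`, `startPage`, `endPage`, `subCategories`, `proxySettings`); the helper names `category_entry` and `build_input` are hypothetical and not part of Real Data API.

```python
import json
from typing import Any, Dict, List


def category_entry(url: str, start_page: int, end_page: int,
                   sub_categories: List[str]) -> Dict[str, Any]:
    """Build one startUrls entry, checking the page range first."""
    if start_page < 1 or end_page < start_page:
        raise ValueError(f"invalid page range: {start_page}..{end_page}")
    return {
        "url": url,
        "startPage": start_page,
        "endPage": end_page,
        "subCategories": list(sub_categories),
    }


def build_input(entries: List[Dict[str, Any]],
                use_proxies: bool = True) -> Dict[str, Any]:
    """Assemble the input object; only a subset of the documented keys is set."""
    return {
        "startUrls": entries,
        "proxySettings": {"useRealDataApiProxies": use_proxies},
    }


payload = build_input([
    category_entry("https://www.zepto.com/grocery-category/dairy",
                   1, 3, ["milk", "cheese", "yogurt"]),
])
# json.dumps always emits syntactically valid JSON.
print(json.dumps(payload, indent=2))
```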
While it runs, the API stores results in a custom dataset, with each product saved as a separate entry. You can process the output in a variety of programming languages. For details on retrieving data, see our API documentation or FAQs.
The format for each product in Zepto looks as follows:
{
  "productId": "123456",
  "productName": "Fresh Organic Tomatoes",
  "productDescription": "Fresh organic tomatoes sourced from local farms. Rich in flavor and nutrients.",
  "productPrice": 3.99,
  "currency": "USD",
  "productImages": [
    "https://www.zepto.com/images/tomatoes1.jpg",
    "https://www.zepto.com/images/tomatoes2.jpg"
  ],
  "productMetadata": {
    "sku": "TOM123",
    "weight": "1.2",
    "size": "1kg"
  },
  "userFeedback": [
    {
      "country": "US",
      "originalContent": "These tomatoes are amazing!",
      "translatedContent": "¡Estos tomates son increíbles!"
    },
    {
      "country": "IN",
      "originalContent": "Very fresh and tasty.",
      "translatedContent": "Muy frescos y sabrosos."
    }
  ],
  "qaData": [
    {
      "question": "Are these tomatoes pesticide-free?",
      "answer": "Yes, they are grown organically without any pesticides."
    },
    {
      "question": "What is the shelf life of the tomatoes?",
      "answer": "They stay fresh for up to 7 days when stored in a cool place."
    }
  ]
}
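Once items are retrieved, a common next step is to flatten each record into a row for analysis. The sketch below is one possible approach, assuming the key names from the sample record above; `summarize` and `_to_float` are hypothetical helper names. Because `weight` appears as a string in the sample, numeric fields are converted defensively.

```python
from typing import Any, Dict, Optional


def _to_float(value: Any) -> Optional[float]:
    """Convert to float, returning None for missing or non-numeric values."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def summarize(item: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one product record into a row suitable for a CSV or DataFrame."""
    meta = item.get("productMetadata") or {}
    return {
        "productId": item.get("productId"),
        "name": item.get("productName"),
        "price": _to_float(item.get("productPrice")),
        "currency": item.get("currency"),
        "weight": _to_float(meta.get("weight")),
        "reviewCount": len(item.get("userFeedback") or []),
        "questionCount": len(item.get("qaData") or []),
    }


# Trimmed copy of the sample record, for illustration only.
sample = {
    "productId": "123456",
    "productName": "Fresh Organic Tomatoes",
    "productPrice": 3.99,
    "currency": "USD",
    "productMetadata": {"sku": "TOM123", "weight": "1.2", "size": "1kg"},
    "userFeedback": [{"country": "US"}, {"country": "IN"}],
    "qaData": [{"question": "Are these tomatoes pesticide-free?"}],
}
print(summarize(sample))
```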
You need a Real Data API account to run the program examples. Replace <YOUR_API_TOKEN> in each example with your actor's token. See the Real Data API docs for more on the live APIs.
import { RealdataAPIClient } from 'RealdataAPI-client';

// Initialize the RealdataAPIClient with API token
const client = new RealdataAPIClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare actor input
const input = {
    "productUrls": [
        {
            "url": "https://www.amazon.com/Apple-iPhone-64GB-Midnight-Green/dp/B08BHHSB6M"
        }
    ],
    "maxReviews": 100,
    "proxyConfiguration": {
        "useRealdataAPIProxy": true
    },
    "extendedOutputFunction": ($) => { return {} }
};

(async () => {
    // Run the actor and wait for it to finish
    const run = await client.actor("junglee/amazon-reviews-scraper").call(input);

    // Fetch and print actor results from the run's dataset (if any)
    console.log('Results from dataset');
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    items.forEach((item) => {
        console.dir(item);
    });
})();
from RealdataAPI_client import RealdataAPIClient

# Initialize the RealdataAPIClient with your API token
client = RealdataAPIClient("<YOUR_API_TOKEN>")

# Prepare the actor input
run_input = {
    "productUrls": [{ "url": "https://www.amazon.com/Apple-iPhone-64GB-Midnight-Green/dp/B08BHHSB6M" }],
    "maxReviews": 100,
    "proxyConfiguration": { "useRealdataAPIProxy": True },
    "extendedOutputFunction": "($) => { return {} }",
}

# Run the actor and wait for it to finish
run = client.actor("junglee/amazon-reviews-scraper").call(run_input=run_input)

# Fetch and print actor results from the run's dataset (if there are any)
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
# Set API token (quoted so the shell does not treat < and > as redirections)
API_TOKEN='<YOUR_API_TOKEN>'

# Prepare actor input
cat > input.json <<'EOF'
{
  "productUrls": [
    {
      "url": "https://www.amazon.com/Apple-iPhone-64GB-Midnight-Green/dp/B08BHHSB6M"
    }
  ],
  "maxReviews": 100,
  "proxyConfiguration": {
    "useRealdataAPIProxy": true
  },
  "extendedOutputFunction": "($) => { return {} }"
}
EOF

# Run the actor
curl "https://api.RealdataAPI.com/v2/acts/junglee~amazon-reviews-scraper/runs?token=$API_TOKEN" \
  -X POST \
  -d @input.json \
  -H 'Content-Type: application/json'
productUrls
Required Array
One or more Amazon product URLs to extract.
maxReviews
Optional Integer
The maximum number of reviews to scrape. Leave it blank to scrape all reviews.
includeGdprSensitive
Optional Boolean
Whether to include personal information such as name, ID, or profile picture, which is protected by the GDPR in European countries and by other regulations worldwide. You must not extract personal information without a legitimate legal reason.
sort
Optional String
The sort order used when scraping reviews. The default is Amazon's HELPFUL ordering.
Allowed values: "RECENT", "HELPFUL"
proxyConfiguration
Required Object
You can pin proxy groups to specific countries. Amazon shows the products it can deliver to the location implied by your proxy. If globally shipped products are sufficient for you, the proxy location does not matter.
extendedOutputFunction
Optional String
A function that receives the jQuery handle as its argument and returns custom scraped data. The returned data is merged into the default result.
{
  "productUrls": [
    {
      "url": "https://www.amazon.com/Apple-iPhone-64GB-Midnight-Green/dp/B08BHHSB6M"
    }
  ],
  "maxReviews": 100,
  "includeGdprSensitive": false,
  "sort": "helpful",
  "proxyConfiguration": {
    "useRealdataAPIProxy": true
  },
  "extendedOutputFunction": "($) => { return {} }"
}
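Before submitting a run, the input can be checked locally against the parameter descriptions above (required or optional, and expected type). The following is a sketch under stated assumptions, not an official validator: `validate_input` is a hypothetical helper, and `sort` is compared case-insensitively because the parameter list shows upper-case values while the example uses lower case.

```python
from typing import Any, Dict, List

ALLOWED_SORT = {"RECENT", "HELPFUL"}


def validate_input(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in the input; empty means it looks valid."""
    errors: List[str] = []

    urls = cfg.get("productUrls")
    if not isinstance(urls, list) or not urls:
        errors.append("productUrls is required and must be a non-empty array")
    elif not all(isinstance(u, dict) and isinstance(u.get("url"), str) for u in urls):
        errors.append("each productUrls entry needs a string 'url'")

    if not isinstance(cfg.get("proxyConfiguration"), dict):
        errors.append("proxyConfiguration is required and must be an object")

    max_reviews = cfg.get("maxReviews")
    if max_reviews is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(max_reviews, bool) or not isinstance(max_reviews, int) or max_reviews < 0:
            errors.append("maxReviews must be a non-negative integer")

    gdpr = cfg.get("includeGdprSensitive")
    if gdpr is not None and not isinstance(gdpr, bool):
        errors.append("includeGdprSensitive must be a boolean")

    sort = cfg.get("sort")
    if sort is not None:
        # Assumption: matching is case-insensitive.
        if not isinstance(sort, str) or sort.upper() not in ALLOWED_SORT:
            errors.append("sort must be RECENT or HELPFUL")

    fn = cfg.get("extendedOutputFunction")
    if fn is not None and not isinstance(fn, str):
        errors.append("extendedOutputFunction must be a string")

    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue in one pass.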