A lightweight Node.js module for retrieving and scraping ads from Kijiji.
- Retrieve single ads as JavaScript objects given their URL
- Retrieve the latest ads matching given search criteria

Dependencies:
- Node.js - evented I/O for the backend
- cheerio - jQuery-like API for the server
- node-fetch - Fetch API implementation for Node.js

Installation:

    npm install kijiji-scraper

Quick start: Use `Ad.Get()` to scrape an ad given its URL. Use `search()` to scrape many ads given a set of search parameters. Read on (or CTRL+F) for more detailed information.
This class encapsulates a Kijiji ad and its properties. It also handles retrieving this information from Kijiji.
Property | Type | Description |
---|---|---|
`title` | String | Title of the ad |
`description` | String | Ad description |
`date` | Date | Date the ad was posted |
`image` | String | URL of the ad's primary image |
`images` | String[] | Array of URLs of the ad's images |
`attributes` | Object | Properties specific to the category of the scraped ad |
`url` | String | The ad's URL |

The image URL given in `image` is the featured image for the ad and will be up to 300x300. The image URLs given in `images` are all of the images associated with the ad and each may be up to 1024x1024.
Note: If the ad has not been scraped automatically, some of these properties may be null or empty. This happens when an `Ad` object is created manually using the constructor or is returned by a search with the `scrapeResultDetails` option set to `false`. See the `Ad.isScraped()` and `Ad.scrape()` method documentation below for more information on this.
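
Because unscraped ads can carry null or empty properties, it may help to check which fields are populated before relying on them. The following is a minimal sketch that assumes only the property names from the table above. `missingProperties` is a hypothetical helper, not part of kijiji-scraper, and the plain object below merely stands in for an `Ad`.

```javascript
// Hypothetical helper (not part of kijiji-scraper): lists which of the
// documented Ad properties are still null or empty on an ad-like object.
const AD_PROPERTIES = ["title", "description", "date", "image", "images", "attributes", "url"];

function missingProperties(ad) {
    return AD_PROPERTIES.filter(function(name) {
        const value = ad[name];
        if (value === null || value === undefined) return true;
        if (Array.isArray(value)) return value.length === 0;
        if (value instanceof Date) return false; // a Date counts as populated
        if (typeof value === "object") return Object.keys(value).length === 0;
        return value === "";
    });
}

// Plain object standing in for an Ad that has not been scraped yet
const unscraped = {
    title: null,
    description: null,
    date: null,
    image: null,
    images: [],
    attributes: {},
    url: "<Kijiji ad URL>"
};

console.log(missingProperties(unscraped));
```

For the stand-in object, every property except `url` is reported as missing.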
`Ad.Get(url[, callback])` scrapes the Kijiji ad at `url` and constructs an `Ad` object containing its information.
- `url` - A Kijiji ad URL
- `callback(err, ad)` (optional) - A callback called after the ad has been scraped. If an error occurs during scraping, `err` will not be null. If everything is successful, `ad` will contain an `Ad` object

Returns a `Promise` which resolves to an `Ad` object containing the ad's information.

    const kijiji = require("kijiji-scraper");

    // Scrape using returned promise
    kijiji.Ad.Get("<Kijiji ad URL>").then(function(ad) {
        // Use the ad object
        console.log(ad.title);
    }).catch(console.error);

    // Scrape using optional callback parameter
    kijiji.Ad.Get("<Kijiji ad URL>", function(err, ad) {
        if (!err) {
            // Use the ad object
            console.log(ad.title);
        }
    });

`Ad` constructor. Manually constructs an ad object. You should generally not need to use this, save for a few special cases (e.g., storing ad URLs entered by a user for delayed scraping). `Ad.isScraped()` returns `false` for `Ad` objects constructed in this way unless `scraped` is passed as `true` or they are subsequently scraped by calling `Ad.scrape()`, which causes the scraper to replace the ad's information with what is found at its URL.
- `url` - Ad's URL
- `info` (optional) - Object containing the ad's properties. Only keys in the properties table (above) may be specified. May be omitted (if not specified then `images` will be an empty array, `attributes` will be an empty object, and all other properties will be null)
- `scraped` (optional) - If `true`, causes `Ad.isScraped()` to return `true` regardless of whether or not `Ad.scrape()` has been called

    const kijiji = require("kijiji-scraper");

    // Construct the ad manually; "new" works whether Ad is a class or a function
    let ad = new kijiji.Ad("<Kijiji ad URL>", { date: new Date() });
    console.log(ad.isScraped()); // false
    console.log(ad.date); // current date

    ad.scrape().then(function() {
        // Use the ad object
        console.log(ad.date); // date ad was posted (initial value is overwritten)
    }).catch(console.error);

`Ad.isScraped()` determines whether or not the ad's information has been retrieved from Kijiji.

Returns a boolean indicating whether or not an ad's information has been scraped from the page at its URL. This can be `false` if the `Ad` object was manually created using the constructor or if it was retrieved from a search with the `scrapeResultDetails` option set to `false`. Call `Ad.scrape()` to retrieve the information for such ads.

    const kijiji = require("kijiji-scraper");

    let ad = new kijiji.Ad("<Kijiji ad URL>"); // ad does not get scraped
    console.log(ad.isScraped()); // false

    ad.scrape().then(function() {
        console.log(ad.isScraped()); // true
    }).catch(console.error);

`Ad.scrape([callback])` manually retrieves an `Ad`'s information from its URL. Useful if it was created in a way that does not do this automatically, such as using the constructor or performing a search with the `scrapeResultDetails` option set to `false`.
- `callback(err)` (optional) - A callback called after the ad has been scraped. If an error occurs during scraping, `err` will not be null

Returns a `Promise` which resolves once the ad has been scraped and the object has been updated.

    const kijiji = require("kijiji-scraper");

    let ad = new kijiji.Ad("<Kijiji ad URL>"); // ad does not get scraped
    console.log(ad.isScraped()); // false

    // Scrape using returned promise
    ad.scrape().then(function() {
        // Use the ad object
        console.log(ad.isScraped()); // true
        console.log(ad.title);
    }).catch(console.error);

    // Scrape using optional callback parameter
    ad.scrape(function(err) {
        if (!err) {
            // Use the ad object
            console.log(ad.isScraped()); // true
            console.log(ad.title);
        }
    });

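
When ads come from mixed sources (some scraped, some constructed manually or returned by a search with `scrapeResultDetails` set to `false`), a small wrapper can scrape only the ones that need it. The sketch below relies solely on the documented `isScraped()` and `scrape()` methods. `ensureScraped` and `makeFakeAd` are hypothetical names, and the fake object exists only so the example runs without network access.

```javascript
// Hypothetical helper: scrapes an ad only if it has not been scraped yet.
// Uses only the documented isScraped() and scrape() methods.
function ensureScraped(ad) {
    if (ad.isScraped()) {
        return Promise.resolve(ad);
    }
    return ad.scrape().then(function() {
        return ad;
    });
}

// Stand-in object mimicking the two methods, for illustration only
function makeFakeAd() {
    let scraped = false;
    return {
        title: null,
        scrapeCalls: 0,
        isScraped: function() { return scraped; },
        scrape: function() {
            this.scrapeCalls++;
            scraped = true;
            this.title = "Example title";
            return Promise.resolve();
        }
    };
}

const fakeAd = makeFakeAd();
const demo = ensureScraped(fakeAd)
    .then(ensureScraped) // second call finds the ad already scraped
    .then(function(ad) {
        console.log(ad.title, ad.scrapeCalls);
        return ad;
    });
```

With a real `Ad` object, the same helper could be applied to each search result before its details are read.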
`Ad.toString()` returns a string representation of the ad. This is just meant to be a summary and may omit information for brevity or change format in the future. Access the `Ad`'s properties directly if you need them for comparisons, etc. The current format is as follows:

    [MM/dd/yyyy @ hh:mm] TITLE
    URL
    * property1: value1
    * property2: value2
    ...
    * propertyN: valueN

The date, title, and properties will be absent if the ad has not been scraped (`isScraped() == false`) unless they were manually specified when the object was constructed.

    const kijiji = require("kijiji-scraper");

    kijiji.Ad.Get("<Kijiji ad URL>").then(function(ad) {
        console.log(ad.toString());
    }).catch(console.error);

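
Since the built-in summary format may change, code that needs a stable layout can build its own string from the ad's properties. The following sketch mirrors the format shown above but is not the library's implementation. It assumes a 24-hour clock in local time, that the listed "properties" correspond to the `attributes` object, and that the first line is dropped when both date and title are absent. `summarize` and `pad2` are hypothetical names.

```javascript
// Sketch of the documented summary layout; NOT kijiji-scraper's own code.
function pad2(n) {
    return String(n).padStart(2, "0");
}

function summarize(ad) {
    const lines = [];
    let header = "";
    if (ad.date instanceof Date) {
        const d = ad.date;
        // Assumption: local time, 24-hour clock
        header += "[" + pad2(d.getMonth() + 1) + "/" + pad2(d.getDate()) + "/" + d.getFullYear() +
            " @ " + pad2(d.getHours()) + ":" + pad2(d.getMinutes()) + "] ";
    }
    if (ad.title) {
        header += ad.title;
    }
    if (header) {
        lines.push(header.trim());
    }
    lines.push(ad.url);
    // Assumption: the bulleted entries come from the attributes object
    const attributes = ad.attributes || {};
    Object.keys(attributes).forEach(function(key) {
        lines.push("* " + key + ": " + attributes[key]);
    });
    return lines.join("\n");
}

// Plain object standing in for a scraped Ad
const sampleAd = {
    title: "Coffee table",
    date: new Date(2020, 0, 5, 9, 7),
    url: "<Kijiji ad URL>",
    attributes: { condition: "used" }
};
console.log(summarize(sampleAd));
```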
Searches are performed using the `search()` function:
- `params` - Object containing Kijiji ad search parameters.

  - Mandatory parameters:

    Parameter | Type | Default Value | Description |
    ---|---|---|---|
    `locationId` | Integer/Object | `0` (all of Canada) | Id of the geographical location to search in |
    `categoryId` | Integer/Object | `0` (all categories) | Id of the ad category to search in |

    Values for `locationId` and `categoryId` can be found by performing a search on the Kijiji website and examining the URL that Kijiji redirects to. For example, after setting the location to Ottawa and selecting the "cars & vehicles" category, Kijiji redirects to http://www.kijiji.ca/b-cars-vehicles/ottawa/c27l1700185. The last part of the URL (c27l1700185) is formatted as c[categoryId]l[locationId]. So in this case, `categoryId` is 27 and `locationId` is 1700185.

    For convenience, objects containing all `locationId` and `categoryId` values Kijiji accepts have been defined in `locations.js` and `categories.js`, respectively. These objects are nested in the same way as those in the location and category selectors on the Kijiji website (e.g., the city of Montreal is located under "Quebec > Greater Montreal > City of Montreal"; coffee tables are located under "Buy and Sell > Furniture > Coffee Tables"), so their contents should be familiar.

    For example, instead of setting `locationId` to `1700281` (Montreal) and `categoryId` to `241` (coffee tables), you can set `locationId` to `locations.QUEBEC.GREATER_MONTREAL.CITY_OF_MONTREAL` and `categoryId` to `categories.BUY_AND_SELL.FURNITURE.COFFEE_TABLES`. This way you do not need to know the ids, and you have a quick reference available. Any location/category object along the hierarchy will also work (e.g., `locations.QUEBEC` for all of Quebec, not just Montreal; `categories.BUY_AND_SELL.FURNITURE` for all furniture, not just coffee tables). Location/category objects and `locationId`s/`categoryId`s are interchangeable: the search function will behave identically in either case. See `locations.js` and `categories.js` for all location and category objects.

  - Optional parameters: There are many different search parameters. Some of these can be used in any search (e.g., `keywords`), but most are category-specific. For example, set `params["attributeMap[petsallowed_s]"] = "[1]"` to exclude pet-unfriendly landlords when searching for apartments.

    Parameters can be found by using your browser's developer tools and performing a custom search on the Kijiji website. After submitting your search on Kijiji or updating the filter being applied, use your browser's network monitoring tool to examine the request for `https://www.kijiji.ca/b-search.html`. Any parameter used in the query string for this request can be specified in `params`. A few examples include:

    Parameter | Type | Description |
    ---|---|---|
    `keywords` | String | Search string, with words separated by a '+' |
    `minPrice` | Number | Minimum price of returned items |
    `maxPrice` | Number | Maximum price of returned items |
    `sortByName` | String | Search results ordering (e.g., "dateDesc", "dateAsc", "priceDesc", "priceAsc") |

- `options` (optional) - Contains parameters that control the behavior of the scraper. Can be omitted.

  Option | Type | Default Value | Description |
  ---|---|---|---|
  `scrapeResultDetails` | Boolean | `true` | By default, the details of each query result are scraped in separate, subsequent requests. To suppress this behavior and return only the data retrieved by the initial query, set this option to `false`. Note that ads will lack some information if you do this and `Ad.isScraped()` will return `false` until `Ad.scrape()` is called to retrieve the missing information. |
  `minResults` | Integer | `20` | Minimum number of ads to fetch (if available). Note that Kijiji results are returned in pages of up to 20 ads, so if you set this to something like 29, up to 40 results may be retrieved. |
  `maxResults` | Integer | `-1` | Maximum number of ads to return. This simply removes excess results from the array that is returned (e.g., if `minResults` is 40 and `maxResults` is 7, 40 results will be fetched from Kijiji and the last 33 will be discarded). A negative value indicates no limit. |

- `callback(err, results)` (optional) - A callback called after the search results have been scraped. If an error occurs during scraping, `err` will not be null. If everything is successful, `results` will contain an array of `Ad` objects.

Returns a `Promise` which resolves to an array of search result `Ad` objects.
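
The interaction between `minResults` and `maxResults` described in the options comes down to simple page arithmetic. The sketch below works through it under the stated page size of 20. `expectedCounts` is a hypothetical helper for reasoning about the numbers, not a function the module provides, and the counts are upper bounds because Kijiji may have fewer ads available.

```javascript
// Page size stated in the options description (up to 20 ads per page)
const PAGE_SIZE = 20;

// Hypothetical helper: upper bounds on pages requested, ads fetched,
// and ads returned for a given minResults/maxResults pair.
function expectedCounts(minResults, maxResults) {
    const pages = Math.ceil(minResults / PAGE_SIZE);
    const fetched = pages * PAGE_SIZE;
    // A negative maxResults means "no limit"
    const returned = maxResults < 0 ? fetched : Math.min(fetched, maxResults);
    return { pages: pages, fetched: fetched, returned: returned };
}

console.log(expectedCounts(29, -1)); // { pages: 2, fetched: 40, returned: 40 }
console.log(expectedCounts(40, 7));  // { pages: 2, fetched: 40, returned: 7 }
```

The second call shows why a small `maxResults` does not reduce the number of requests: trimming happens after the pages have been fetched.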

    const kijiji = require("kijiji-scraper");

    let options = {
        minResults: 40
    };

    let params = {
        locationId: 1700185,  // Same as kijiji.locations.ONTARIO.OTTAWA_GATINEAU_AREA.OTTAWA
        categoryId: 27,  // Same as kijiji.categories.CARS_AND_VEHICLES
        sortByName: "priceAsc"  // Show the cheapest listings first
    };

    // Scrape using returned promise
    kijiji.search(params, options).then(function(ads) {
        // Use the ads array
        for (let i = 0; i < ads.length; ++i) {
            console.log(ads[i].title);
        }
    }).catch(console.error);

    // Scrape using optional callback parameter
    function callback(err, ads) {
        if (!err) {
            // Use the ads array
            for (let i = 0; i < ads.length; ++i) {
                console.log(ads[i].title);
            }
        }
    }
    kijiji.search(params, options, callback);

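
The c[categoryId]l[locationId] URL convention described for the mandatory parameters can also be read programmatically. The following sketch extracts both ids from a Kijiji search URL and then adds two of the optional parameters from the examples table. `parseSearchUrl` is a hypothetical helper, not part of kijiji-scraper, and it assumes the ids sit in the last path segment as in the example URL.

```javascript
// Hypothetical helper: extracts categoryId and locationId from a Kijiji
// search URL whose last path segment looks like c[categoryId]l[locationId].
function parseSearchUrl(url) {
    const segments = new URL(url).pathname.split("/").filter(Boolean);
    const lastSegment = segments.pop() || "";
    const match = /^c(\d+)l(\d+)$/.exec(lastSegment);
    if (!match) {
        return null; // URL does not follow the expected pattern
    }
    return {
        categoryId: parseInt(match[1], 10),
        locationId: parseInt(match[2], 10)
    };
}

const parsedParams = parseSearchUrl("http://www.kijiji.ca/b-cars-vehicles/ottawa/c27l1700185");

// Optional parameters; keywords are joined with '+' as the table describes.
// The search terms and price here are made-up illustration values.
parsedParams.keywords = ["honda", "civic"].join("+");
parsedParams.minPrice = 1000;

console.log(parsedParams);
```

The resulting object has the same shape as the `params` argument in the example above, so it could be passed to `search()` directly.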