rvalitov / backlink-checker-php

Validates a predefined list of backlinks on remote websites, confirming that they exist and are correct (for SEO). Both a simple engine and a JavaScript-enabled engine are available for scraping.

License: GNU General Public License v3.0

PHP 97.80% HTML 2.20%
seo-tools backlinks seo seotools

backlink-checker-php's Introduction

The Idea of Backlink Checker

Working with backlinks is a routine SEO task. There are several tools that check or search for backlinks. Unlike those tools, this package does not crawl the Internet or analyze Google Search results to discover backlinks to your website. It only validates a list of backlinks that you already know. You typically obtain such a list in one of the following ways:

  • you buy backlinks and receive the list of donor web pages
  • you generate the backlinks yourself by posting on forums, 3rd party websites, etc.
  • your SEO expert or company works for you and shows you the reports with backlinks as one of the SEO strategies

Once you have a list of donor web pages, you need to confirm that they actually contain the required backlink to your website. You also need to revalidate the list regularly to check that the backlinks still exist and have not been deleted. This package helps you do that. It can check for a fixed backlink, such as https://example.com, or for a search pattern, such as *.example.com and many others, expressed as a regular expression.
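
As a rough illustration of how a fixed backlink and a wildcard such as *.example.com might be written as PCRE patterns, the sketch below tries a few candidate URLs against three patterns using PHP's built-in preg_match. The pattern names, the sample URLs, and the helper matchingPatterns are hypothetical and are not part of this package.

```php
<?php
// Hypothetical patterns for illustration; replace example.com with your own domain.
$patterns = [
    'fixed URL'     => '@^https://example\.com/?$@',
    'any subdomain' => '@^https?://([a-z0-9-]+\.)+example\.com(/.*)?$@i',
    'optional www'  => '@^https?://(www\.)?example\.com(/.*)?$@i',
];

// Returns the names of all patterns that match the given URL.
function matchingPatterns(string $url, array $patterns): array
{
    $names = [];
    foreach ($patterns as $name => $pattern) {
        if (preg_match($pattern, $url) === 1) {
            $names[] = $name;
        }
    }
    return $names;
}

$candidates = [
    'https://example.com',
    'https://blog.example.com/post/1',
    'http://www.example.com/about',
    'https://example.com.evil.test/',
];

foreach ($candidates as $url) {
    $names = matchingPatterns($url, $patterns);
    echo $url, ' => ', $names ? implode(', ', $names) : 'no match', PHP_EOL;
}
```

Anchoring the pattern with ^ and ending the host part with an optional path, as assumed here, keeps look-alike hosts such as example.com.evil.test from matching. Whether you want that strictness depends on your list of donor pages.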

Browser Scraping Modes

Simple

Simple mode does not support JavaScript. It requires minimal dependencies, works fast, and is available on shared hosting. However, it works only for simple or static HTML, for example, pages generated by Joomla, WordPress or Drupal. It will not find backlinks on websites that require a JavaScript-enabled browser, for example, websites made with Laravel, Yii, React, etc.
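
To make the limitation concrete, here is a minimal sketch of scanning static HTML for matching links with PHP's DOM extension. It is not this package's implementation; the function findStaticBacklinks and the sample markup are assumptions made for illustration. A link that exists only after a script runs is never seen, because the HTML is parsed but no JavaScript is executed.

```php
<?php
// Illustrative only: NOT the library's implementation.
// Collects href values of <a> tags that match a PCRE pattern.
function findStaticBacklinks(string $html, string $pattern): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally; real-world markup is rarely valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $matches = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        if ($href !== '' && preg_match($pattern, $href) === 1) {
            $matches[] = $href;
        }
    }
    return $matches;
}

// One link is present in the markup, the other is created by a script.
$html = <<<'HTML'
<html><body>
<a href="https://mywebsite.com/static">Static link</a>
<script>
  var a = document.createElement('a');
  a.href = 'https://mywebsite.com/from-js';
  document.body.appendChild(a);
</script>
</body></html>
HTML;

$found = findStaticBacklinks($html, '@https?://(www\.)?mywebsite\.com.*@');
print_r($found); // Only the static link is reported.
```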

Chromium (JavaScript enabled)

We use Chromium in headless mode for JavaScript-enabled browsing. This approach allows parsing any website and is the recommended mode, but it uses more server resources and takes a little more time to configure.

How to Install

Step 1. Add the Package via Composer

Composer must be installed. Run the following command:

composer require rvalitov/backlink-checker-php:^2.0.0

Here we use version 2.0.0 or later, which supports PHP 8.0 and the latest versions of the dependencies. If you want to use an earlier version, please check the 1.x.x releases.

Some dependencies from version 1.x.x are no longer supported, so I had to switch to community-driven forks. These forks are not published in the Composer package repository and have "dev" status. To use them, add the following two repositories to your composer.json file so that Composer knows where to look for them.

"repositories": [
    {
      "type": "git",
      "url": "https://github.com/zoonru/puphpeteer.git"
    },
    {
      "type": "git",
      "url": "https://github.com/zoonru/rialto.git"
    }
  ]

In addition, add the following config (for example, before or after the "repositories" section) to allow Composer to use "dev" versions of the packages:

"minimum-stability": "dev",
"prefer-stable": true

After that, run the update:

composer update

Step 2. Install Chromium

Note: You can skip this step if you don't need Chromium mode.

First install the following packages, which Chromium needs in order to work.

For Debian/Ubuntu:

apt-get update
apt-get install gconf-service libasound2 libatk1.0-0 libc6 libcairo2 libcups2 libdbus-1-3 libexpat1 libfontconfig1 libgcc1 libgconf-2-4 libgdk-pixbuf2.0-0 libglib2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libpangocairo-1.0-0 libstdc++6 libx11-6 libx11-xcb1 libxcb1 libxcomposite1 libxcursor1 libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 libxss1 libxtst6 ca-certificates fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils wget

Node.js must be installed. If it is not, install it by following the official manual. Then run the following command to install Chromium:

npm install

Step 3. Use Autoload

Include autoload.php in your PHP source file, for example:

<?php
require __DIR__ . '/vendor/autoload.php';

How to Use

First, include the dependencies:

<?php
require __DIR__ . "/vendor/autoload.php";

use Valitov\BacklinkChecker;

Then decide which mode to use. For Chromium mode, use:

$checker = new BacklinkChecker\ChromeBacklinkChecker();

Or, if you want simple mode without JavaScript support, use:

$checker = new BacklinkChecker\SimpleBacklinkChecker();

Scan the desired URL with the desired pattern (use the PCRE pattern syntax):

$url = "https://example.com";
$pattern = "@https?://(www\.)?mywebsite\.com.*@";
$scanBacklinks = true;
$scanHotlinks = false;
$makeScreenshot = true;

try {
    $result = $checker->getBacklinks($url, $pattern, $scanBacklinks, $scanHotlinks, $makeScreenshot);
} catch (RuntimeException $e) {
    die("Error: " . $e->getMessage());
}

The function getBacklinks accepts the following additional options:

  • $scanBacklinks - if set to true, it scans for backlinks (the text of the href attribute of the <a> tag); otherwise this scan is not performed.
  • $scanHotlinks - if set to true, it scans for hotlinks (the text of the src attribute of the <img> tag); otherwise this scan is not performed.
  • $makeScreenshot - if set to true, a screenshot of the viewport is also taken; otherwise no screenshot is made. This option makes sense only for Chromium mode (default viewport size is 800 x 600 px, image format: JPEG, image quality: 90, image encoding: binary); in simple mode this option is ignored.

Now check $result to see whether the function succeeded:

$response = $result->getResponse();
if ($response->getSuccess()) {
    $links = $result->getBacklinks();
    if (count($links) > 0) {
        //Backlinks found
    } else {
        //No backlinks found
    }
} else {
    //Error, usually network error, or server error
    die("Error, HTTP Code " . $response->getStatusCode());
}

The function $result->getBacklinks() returns an array of objects, each describing one backlink. Each object supports the following functions:

  • getBacklink returns string, the backlink: an exact URL that matches the target domain;
  • getTag returns string, the tag that is used for the backlink, can be a or img;
  • getTarget returns string, the contents of the target attribute of the link;
  • getNoFollow returns true if the backlink has the nofollow attribute;
  • getAnchor returns string, the anchor of the link, for example, the inner text of the <a> tag. This text is returned in plain text format; all HTML tags are stripped.
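
The getters above could be combined into a short report along the following lines. To keep the sketch runnable without the package installed, makeFakeBacklink builds a stand-in object that only mimics the documented getters. It, describeBacklink, and the sample data are hypothetical. In real code you would iterate over $result->getBacklinks() instead.

```php
<?php
// Stand-in for one element of $result->getBacklinks(). It only mimics the
// getters documented above; it is NOT the library's own class.
function makeFakeBacklink(string $url, string $tag, string $target, bool $noFollow, string $anchor): object
{
    return new class($url, $tag, $target, $noFollow, $anchor) {
        private $url;
        private $tag;
        private $target;
        private $noFollow;
        private $anchor;

        public function __construct(string $url, string $tag, string $target, bool $noFollow, string $anchor)
        {
            $this->url = $url;
            $this->tag = $tag;
            $this->target = $target;
            $this->noFollow = $noFollow;
            $this->anchor = $anchor;
        }

        public function getBacklink(): string { return $this->url; }
        public function getTag(): string { return $this->tag; }
        public function getTarget(): string { return $this->target; }
        public function getNoFollow(): bool { return $this->noFollow; }
        public function getAnchor(): string { return $this->anchor; }
    };
}

// Formats one backlink object as a single report line.
function describeBacklink(object $link): string
{
    $rel = $link->getNoFollow() ? 'nofollow' : 'follow';
    return sprintf('<%s> %s [%s] anchor="%s"', $link->getTag(), $link->getBacklink(), $rel, $link->getAnchor());
}

$links = [
    makeFakeBacklink('https://mywebsite.com/', 'a', '_blank', false, 'My website'),
    makeFakeBacklink('https://www.mywebsite.com/pricing', 'a', '', true, 'Pricing'),
];

foreach ($links as $link) {
    echo describeBacklink($link), PHP_EOL;
}

// Count the links that do not carry the nofollow attribute.
$followed = count(array_filter($links, function ($link) {
    return !$link->getNoFollow();
}));
echo "Followed backlinks: $followed", PHP_EOL;
```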

The $response object supports the following functions:

  • getUrl returns string, the URL that was analyzed
  • getStatusCode returns int, the HTTP status code, or 0 or -1 if there was a network error.
  • getScreenshot returns string, the screenshot in binary format. If the screenshot was not taken or is not available, the string is empty. To display the screenshot as an image on a web page, either save it to disk and link to it, or encode it as base64 and embed it in the page directly. In the latter case, you can use code like:
$base64_image = "data:image/jpeg;base64," . base64_encode($response->getScreenshot());

Note. If you use function json_encode on an object that contains the screenshot, then this screenshot will be converted to base64 format automatically.
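
Both ways of handling the screenshot can be sketched as follows. The helpers screenshotToDataUri and saveScreenshot are hypothetical names, and the placeholder bytes stand in for the value of $response->getScreenshot(). The empty-string check reflects the behavior described above for a screenshot that was not taken.

```php
<?php
// Builds a data URI for embedding, or returns null when no screenshot exists.
function screenshotToDataUri(string $binary): ?string
{
    if ($binary === '') {
        return null;
    }
    return 'data:image/jpeg;base64,' . base64_encode($binary);
}

// Writes the screenshot bytes to disk; returns false when there is nothing to save.
function saveScreenshot(string $binary, string $path): bool
{
    if ($binary === '') {
        return false;
    }
    return file_put_contents($path, $binary) !== false;
}

// Placeholder bytes; in real code use $response->getScreenshot().
$fakeScreenshot = "\xFF\xD8\xFF\xE0 placeholder, not a real image";

$uri = screenshotToDataUri($fakeScreenshot);
if ($uri !== null) {
    echo '<img src="' . htmlspecialchars($uri) . '" alt="Screenshot">', PHP_EOL;
}

// Hypothetical target path in the system temp directory.
$path = tempnam(sys_get_temp_dir(), 'shot');
$saved = saveScreenshot($fakeScreenshot, $path);
echo $saved ? "Saved to $path" : 'Nothing to save', PHP_EOL;
```

Embedding as base64 avoids managing files but makes the page roughly a third larger per image, so saving to disk may be the better fit when many screenshots are shown.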

Examples

Examples are available in a dedicated project on GitHub. Tests are in the tests folder.

System Requirements

PHP 7.4+ required with the following extensions:

Feedback

Your feedback is greatly appreciated. If you want to see new features in this project, please post your ideas and feature requests in the issue tracker.

Support or Contact

Having trouble? Your problem may already have been reported in the issue tracker. If you don't find it there, please open a new issue.

backlink-checker-php's People

Contributors

dependabot[bot], rvalitov, spekulatius

backlink-checker-php's Issues

Require guzzlehttp/guzzle ^7.5

I use Drupal 10, which requires guzzlehttp/guzzle ^7.5. This causes a conflict.
Is it possible to require guzzlehttp/guzzle ^7.5?

Composer require problem

[email protected]:/var/www/lom $ composer update
Gathering patches for root package.
Loading composer repositories with package information
Info from https://repo.packagist.org: #StandWithUkraine
Updating dependencies
Your requirements could not be resolved to an installable set of packages.

Problem 1
- rvalitov/backlink-checker-php 2.0.0 requires nesk/puphpeteer dev-zoon -> found nesk/puphpeteer[dev-support-all-puppeteer-versions, dev-master, dev-dev, dev-2.0.0-old, 0.1.0, 0.2.0, 0.2.1, 0.2.2, 1.0.0, ..., 1.6.0, 2.0.0] but these do not match your constraint and are therefore not installable. Make sure you either fix the constraint or avoid updating this package to keep the one present in the lock file (nesk/puphpeteer[dev-zoon]).
- drupal/rg_deposit_for_link dev-master requires rvalitov/backlink-checker-php ~2.0.0 -> satisfiable by rvalitov/backlink-checker-php[2.0.0].
- Root composer.json requires drupal/rg_deposit_for_link dev-master -> satisfiable by drupal/rg_deposit_for_link[dev-master].

My composer.json

{
  "name": "drupal/rg_deposit_for_link",
  "type": "drupal-custom-module",
  "authors": [
    {
      "name": "Roman Gudev (super_romeo)",
      "email": "[email protected]"
    }
  ],
  "license": "GPL-2.0+",
  "minimum-stability": "dev",
  "prefer-stable": true,
  "repositories": [
    {
      "type": "git",
      "url": "https://github.com/zoonru/puphpeteer.git"
    },
    {
      "type": "git",
      "url": "https://github.com/zoonru/rialto.git"
    }
  ],
  "require": {
    "rvalitov/backlink-checker-php": "~2.0.0"
  }
}

Error: Cannot find module 'lodash'

C:\xampp\htdocs\backlink>php cli_test.php -u https://classess.page.tl/ -p @^https://(www.)?dubaidance.com.*@
Using mode: javascript
The command "node "C:\xampp\htdocs\backlink\vendor\nesk\rialto\src/node-process/serve.js" C:\xampp\htdocs\backlink\vendor\nesk\puphpeteer\src\PuppeteerConnectionDelegate.js "{""idle_timeout"":60,""log_node_console"":false,""log_browser_console"":false}"" failed.

Exit Code: 1(General error)

Working directory: C:\xampp\htdocs\backlink

Output:

Error Output:

internal/modules/cjs/loader.js:638
throw err;
^

Error: Cannot find module 'lodash'
at Function.Module._resolveFilename (internal/modules/cjs/loader.js:636:15)
at Function.Module._load (internal/modules/cjs/loader.js:562:25)
at Module.require (internal/modules/cjs/loader.js:692:17)
at require (internal/modules/cjs/helpers.js:25:18)
at Object.<anonymous> (C:\xampp\htdocs\backlink\vendor\nesk\rialto\src\node-process\NodeInterceptors\StandardStreamsInterceptor.js:3:11)
at Module._compile (internal/modules/cjs/loader.js:778:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:789:10)
at Module.load (internal/modules/cjs/loader.js:653:32)
at tryModuleLoad (internal/modules/cjs/loader.js:593:12)
at Function.Module._load (internal/modules/cjs/loader.js:585:3)
