Overview
Summarizer is a standalone PHP class that quickly creates a summary of a given text or HTML page. It builds the summary by ranking each sentence by its relevance.
Basic Usage
Here is the basic usage:
Text summary
<?php
require_once(dirname(__FILE__) . '/Summarizer.php');
$text = file_get_contents(dirname(__FILE__) . '/test_files/cap1.txt');
$summarizer = new Summarizer();
$summarizer->loadText($text);
$summary = $summarizer->run();
print_r($summary);
?>
URL summary
<?php
require_once(dirname(__FILE__) . '/Summarizer.php');
$url = 'http://edition.cnn.com/2011/LIVING/02/07/russell.simmons.super.rich/index.html?hpt=C2';
$summarizer = new Summarizer();
$summarizer->loadUrl($url);
$summary = $summarizer->run();
print_r($summary);
?>
Complex usage
<?php
require_once(dirname(dirname(__FILE__)) . '/library/Summarizer.php');
// options for the summarizer
$options = array(
    // minimum sentence length
    Summarizer::OPTION_MIN_SENTENCE_LENGTH => 50,
    // minimum word length
    Summarizer::OPTION_MIN_WORD_LENGTH => 4,
    // threshold (the constant name is spelled OPTION_TRESHOLD)
    Summarizer::OPTION_TRESHOLD => 0.7,
    // first best lines
    Summarizer::OPTION_FIRST_BEST => 10,
    // document is in HTML format
    Summarizer::OPTION_HTML => true,
    // split text into sentences
    Summarizer::OPTION_SPLIT_SENTENCES => true,
);
$text = file_get_contents(dirname(__FILE__) . '/test_files/cap1.txt');
try {
    $summarizer = new Summarizer($options);
    $summarizer->loadText($text);
    $summary = $summarizer->run();
} catch (Exception $ex) {
    echo 'Failed to summarize: ' . $ex->getMessage();
    exit;
}
// get cleaned text
$cleanedText = $summarizer->getText();
$bestWords = $summarizer->getBestWords(10);
$bestSentences = $summarizer->getBestSentences(10);
$sentences = $summarizer->getSentences();
echo 'Summary: ' . PHP_EOL;
print_r($summary);
echo 'Best words:' . PHP_EOL;
print_r($bestWords);
echo 'Extracted sentences:' . PHP_EOL;
print_r($sentences);
echo 'Best sentences:' . PHP_EOL;
print_r($bestSentences);
?>
Requirements
- PHP 5.3+ (PHP 7+ works fine too)
Features
- Summary output with configurable threshold - only the lines with a frequency over the threshold will be returned
- Best words extraction - the most relevant keywords are extracted, ordered by their relevance
- Sentences splitter - the given text is automatically split into sentences
- Common words skip - in order to provide better results, common words are skipped based on a dictionary (a dictionary is provided for English only)
- Minimal dependencies - all you need is PHP 5.3+ to run it (PHP 7+ works fine too)
- Incredibly fast - in most cases, the summary is returned in less than 0.1 seconds
- Low memory usage - with regular articles less than 1MB of memory is used
- Natural language processing - much better results with languages other than English (e.g. Russian, Farsi, Arabic, Chinese)
- Sentence filtering callback - a custom callback can be used to filter sentences before ranking
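The ideas behind several of the features above (sentence splitting, skipping common and short words, and keeping only sentences over a threshold) can be illustrated with a minimal sketch. This is not the Summarizer class's implementation: the function names, the tiny common-word list, the English-only word pattern, and the scoring rule (sum of word frequencies, normalized by the best score) are all assumptions made for illustration.

```php
<?php
// Illustrative sketch only: NOT the Summarizer class's actual algorithm.
// All function names and the common-word list are hypothetical.

function sketch_split_sentences($text)
{
    // Split after ".", "!" or "?" when followed by whitespace.
    return preg_split('/(?<=[.!?])\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
}

function sketch_words($sentence, $minWordLength, array $commonWords)
{
    // Lowercase, extract runs of ASCII letters, drop short and common words.
    preg_match_all('/[a-z]+/', strtolower($sentence), $matches);
    $words = array();
    foreach ($matches[0] as $word) {
        if (strlen($word) >= $minWordLength && !in_array($word, $commonWords, true)) {
            $words[] = $word;
        }
    }
    return $words;
}

function sketch_rank_sentences($text, $threshold, $minWordLength = 4)
{
    $commonWords = array('this', 'that', 'with', 'from'); // assumed sample list
    $sentences = sketch_split_sentences($text);

    // Count how often each remaining word occurs in the whole text.
    $frequencies = array();
    foreach ($sentences as $sentence) {
        foreach (sketch_words($sentence, $minWordLength, $commonWords) as $word) {
            $frequencies[$word] = isset($frequencies[$word]) ? $frequencies[$word] + 1 : 1;
        }
    }

    // Score each sentence as the sum of the frequencies of its words.
    $scores = array();
    foreach ($sentences as $index => $sentence) {
        $scores[$index] = 0;
        foreach (sketch_words($sentence, $minWordLength, $commonWords) as $word) {
            $scores[$index] += $frequencies[$word];
        }
    }

    // Normalize by the best score and keep sentences at or above the threshold,
    // preserving their original order.
    $max = $scores ? max($scores) : 0;
    $summary = array();
    foreach ($scores as $index => $score) {
        if ($max > 0 && $score / $max >= $threshold) {
            $summary[] = $sentences[$index];
        }
    }
    return $summary;
}
```

Calling sketch_rank_sentences($text, 0.7) returns the sentences whose normalized score is at least 0.7. A real summarizer would need a proper common-word dictionary and Unicode-aware word matching for non-English text.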
Sentence filtering callback
To filter sentences, provide a callback that receives an array of sentences as its only parameter and returns an array of the sentences to keep. For example, the following callback keeps only sentences that contain at least one digit:
function filter_include_words(array $sentences)
{
    $filteredSentences = array();
    foreach ($sentences as $sentence) {
        // keep only sentences that contain at least one digit
        if (preg_match('/[0-9]+/', $sentence)) {
            $filteredSentences[] = $sentence;
        }
    }
    return $filteredSentences;
}
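Any function with the same shape (array of sentences in, array of sentences out) can serve as a filter. The sketch below uses a hypothetical filter_min_length function and a hard-coded 20-character limit, both chosen only for illustration. How the callback is handed to the Summarizer is not shown in this section, so the example calls the filter directly on a sample array to check its behaviour in isolation; see the API reference for the actual registration method.

```php
<?php
// Hypothetical filter: keep only sentences of at least 20 characters.
// The name and the 20-character limit are assumptions for this example.
function filter_min_length(array $sentences)
{
    $filtered = array();
    foreach ($sentences as $sentence) {
        if (strlen(trim($sentence)) >= 20) {
            $filtered[] = $sentence;
        }
    }
    return $filtered;
}

// Try the filter on its own before plugging it into the summarizer.
$sample = array('Too short.', 'This sentence is long enough to be kept.');
$kept = call_user_func('filter_min_length', $sample);
print_r($kept);
```

Testing a filter on its own like this makes it easier to confirm that it always returns an array, which the callback contract requires.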
Documentation
To view all of the available class methods take a look at the API reference.
Buy it
You can buy it now from codester.com.
WordPress Widget
You can now add a widget for the Summarizer tool to your WordPress blog! It's easy and it's FREE.