Overview

Summarizer is a standalone PHP class that quickly creates a summary of a given text or HTML page. It builds the summary by ranking each sentence by its relevance.

Basic Usage

Here is the basic usage:

Text summary

<?php

require_once(dirname(__FILE__) . '/Summarizer.php');

// load the text to summarize
$text = file_get_contents(dirname(__FILE__) . '/test_files/cap1.txt');

$summarizer = new Summarizer();
$summarizer->loadText($text);

// run the summarizer and print the result
$summary = $summarizer->run();

print_r($summary);
?>

URL summary

<?php

require_once(dirname(__FILE__) . '/Summarizer.php');

// page to summarize
$url = 'http://edition.cnn.com/2011/LIVING/02/07/russell.simmons.super.rich/index.html?hpt=C2';

$summarizer = new Summarizer();
$summarizer->loadUrl($url);

// run the summarizer and print the result
$summary = $summarizer->run();

print_r($summary);
?>

Complex usage

<?php

require_once(dirname(dirname(__FILE__)) . '/library/Summarizer.php');

// options for the summarizer
$options = array(
    // minimum sentence length
    Summarizer::OPTION_MIN_SENTENCE_LENGTH => 50,
    // minimum word length
    Summarizer::OPTION_MIN_WORD_LENGTH => 4,
    // threshold
    Summarizer::OPTION_TRESHOLD => 0.7,
    // first best lines
    Summarizer::OPTION_FIRST_BEST => 10,
    // document is in HTML format
    Summarizer::OPTION_HTML => true,
    // split text into sentences
    Summarizer::OPTION_SPLIT_SENTENCES => true,
);
  
  
$text = file_get_contents(dirname(__FILE__) . '/test_files/cap1.txt');

try {
    $summarizer = new Summarizer($options);
    $summarizer->loadText($text);
    $summary = $summarizer->run();
}
catch (Exception $ex) {
    // report the failure and stop
    echo 'Failed to summarize: ' . $ex->getMessage();
    exit;
}
  
  
// get cleaned text
$cleanedText = $summarizer->getText();

// best words, best sentences, and all extracted sentences
$bestWords = $summarizer->getBestWords(10);
$bestSentences = $summarizer->getBestSentences(10);
$sentences = $summarizer->getSentences();

echo 'Summary: ' . PHP_EOL;
print_r($summary);

echo 'Best words:' . PHP_EOL;
print_r($bestWords);

echo 'Extracted sentences:' . PHP_EOL;
print_r($sentences);

echo 'Best sentences:' . PHP_EOL;
print_r($bestSentences);
?>

Requirements

  • PHP 5.3+

Features

  • Summary output with configurable threshold - only the lines with a frequency over the threshold are returned
  • Best words extraction - the most relevant keywords are extracted (ordered by relevance)
  • Sentence splitter - the given text is automatically split into sentences
  • Common words skip - to provide better results, common words are skipped based on a dictionary (a dictionary is provided for English only)
  • Minimal dependencies - all you need is PHP 5.3+ to run it
  • Incredibly fast - in most cases, the summary is returned in less than 0.1 seconds
  • Low memory usage - with regular articles less than 1MB of memory is used
  • Natural language processing - much better results for languages written in non-Latin scripts (e.g. Russian, Farsi, Arabic, Chinese)
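
To make the threshold, best words, sentence splitter, and common words features above more concrete, here is a minimal sketch of one way frequency-based sentence ranking can work. It is not the Summarizer class's actual implementation, which this page does not describe. The sketch_* function names, the tiny common-word list, and the scoring rule (sum of word frequencies, normalized by the highest-scoring sentence) are all assumptions made for illustration, and the word matcher handles ASCII letters only.

```php
<?php
// Illustrative sketch only: NOT the Summarizer class's real algorithm.
// Every function name here is hypothetical.

// Split text into sentences after '.', '!' or '?' followed by whitespace.
function sketch_split_sentences($text)
{
    return preg_split('/(?<=[.!?])\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
}

// Return lowercase ASCII words that are long enough and not common words.
function sketch_words($text, array $commonWords, $minLength)
{
    preg_match_all('/[a-z]+/', strtolower($text), $matches);
    $words = array();
    foreach ($matches[0] as $word) {
        if (strlen($word) >= $minLength && !in_array($word, $commonWords, true)) {
            $words[] = $word;
        }
    }
    return $words;
}

// Count how often each remaining word occurs, most frequent first.
function sketch_best_words($text, array $commonWords, $minLength)
{
    $counts = array_count_values(sketch_words($text, $commonWords, $minLength));
    arsort($counts);
    return $counts;
}

// Score each sentence by the summed frequency of its words, divide by the
// best score, and keep sentences at or above the threshold in original order.
function sketch_summarize($text, array $commonWords, $minLength, $threshold)
{
    $frequencies = sketch_best_words($text, $commonWords, $minLength);
    $sentences = sketch_split_sentences($text);

    $scores = array();
    foreach ($sentences as $i => $sentence) {
        $scores[$i] = 0;
        foreach (sketch_words($sentence, $commonWords, $minLength) as $word) {
            $scores[$i] += $frequencies[$word];
        }
    }

    $max = $scores ? max($scores) : 0;
    $summary = array();
    foreach ($sentences as $i => $sentence) {
        if ($max > 0 && $scores[$i] / $max >= $threshold) {
            $summary[] = $sentence;
        }
    }
    return $summary;
}

// Usage with a made-up text and a made-up common-word list.
$text = 'Cats chase mice. Mice fear cats and cats love naps. The weather is nice today.';
$commonWords = array('the', 'and', 'is');

print_r(sketch_best_words($text, $commonWords, 4));
print_r(sketch_summarize($text, $commonWords, 4, 0.7));
```

Summing raw frequencies favors long sentences. Dividing each score by the sentence's word count is a common alternative, at the cost of favoring very short sentences instead.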

Demo

See Tools4noobs summarizer

Documentation

To view all of the available class methods take a look at the API reference.

Buy it

You can buy it now from binpress.com.

WordPress Widget

You can now add a widget for the Summarizer tool to your WordPress blog! It's easy and it's FREE.

Download Summarize Widget

Help me!

Having problems with the Summarize tool? Or perhaps you want to explore its full potential?

Read this quick guide and see how you can improve your results.

Report a bug

We don't like bugs either, so if you spot one, please let us know and we'll do our best to fix it.

Buy script

If you want to buy this script you can see the Summarizer script page for documentation and pricing.
