This part of the documentation covers the installation of newspaper. The first step to using any software package is getting it properly installed.

Distribute & Pip

Installing newspaper is simple with pip. However, you may run into a few fixable issues when installing on Ubuntu.

If you are on Debian / Ubuntu, install using the following:

  • Python development headers, needed for Python.h:

    $ sudo apt-get install python-dev
  • lxml requirements:

    $ sudo apt-get install libxml2-dev libxslt-dev
  • For PIL to recognize .jpg images:

    $ sudo apt-get install libjpeg-dev zlib1g-dev libpng12-dev
  • Install the distribution via pip:

    $ pip install newspaper
  • Download NLP related corpora:

    $ curl | python2.7

If you are on OSX, install using the following. You may use either Homebrew or MacPorts; the commands below use Homebrew:

$ brew install libxml2 libxslt

$ brew install libtiff libjpeg webp little-cms2

$ pip install newspaper

$ curl | python2.7

Otherwise, install with the following:

NOTE: You will most likely still need to install the following libraries via your package manager:

  • PIL: libjpeg-dev zlib1g-dev libpng12-dev
  • lxml: libxml2-dev libxslt-dev
  • Python development headers: python-dev

Note that the Python 3 package is named newspaper3k, while the Python 2 package is named newspaper.

$ pip install newspaper3k

$ curl | python2.7
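The package-name rule above can be sketched in a few lines of Python. The helper name `pip_package_name` is hypothetical and is not part of newspaper; it only encodes the rule stated in this section (newspaper3k for Python 3, newspaper for Python 2).

```python
import sys


def pip_package_name(version_info=sys.version_info):
    """Return the pip package name matching the interpreter's major version.

    Hypothetical helper for illustration only; not part of newspaper.
    """
    # Python 3 uses the newspaper3k package; Python 2 uses newspaper.
    return "newspaper3k" if version_info[0] >= 3 else "newspaper"


if __name__ == "__main__":
    # Print the pip command that matches the running interpreter.
    print("pip install " + pip_package_name())
```

Running the script with the interpreter you intend to use prints the matching pip command, which helps avoid installing the wrong package for your Python version.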

Get the Code

Newspaper is actively developed on GitHub, where the code is always available.

You can clone the public repository:

$ git clone git://

Once you have a copy of the source, you can embed it in your Python package, or install it into your site-packages easily:

$ pip install -r requirements.txt
$ python install
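After installing into site-packages, one way to confirm that the package is visible to your interpreter is a quick import-path check. This is a minimal sketch for Python 3, assuming the installed module is importable as `newspaper`; the helper name `is_installed` is hypothetical and not part of the library.

```python
import importlib.util


def is_installed(module_name="newspaper"):
    """Return True if module_name can be found on the current import path.

    Hypothetical helper; the default module name "newspaper" is an assumption.
    """
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed():
        print("newspaper is importable")
    else:
        print("newspaper was not found; check the install steps above")
```

The check only looks up the module on the import path, so it will not catch missing system libraries such as those needed by lxml or PIL.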

Feel free to give our testing suite a shot:

$ python tests/