  • Throw an error when integers or other non-strings are included in Document metadata dictionaries
  • Added keyword arguments to document searches for pulling a single page, changing the page size and requesting document metadata in the results
  • Temporarily removed SSL from image and text URLs to work around bugs in underlying dependencies
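
The metadata check described in the first item above can be pictured roughly as follows. This is a minimal sketch, not the library's actual code: `validate_metadata` is a hypothetical helper, and raising `TypeError` is an assumption about which exception fits.

```python
def validate_metadata(data):
    """Raise TypeError unless every key and value in `data` is a string.

    Hypothetical helper for illustration only; not part of the library's API.
    """
    if not isinstance(data, dict):
        raise TypeError("metadata must be a dictionary")
    for key, value in data.items():
        # Integers, floats, None and nested containers are all rejected.
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError(
                "metadata keys and values must be strings, got %r: %r"
                % (key, value)
            )
    return data
```

Under this sketch, a caller would convert numbers explicitly, for example `validate_metadata({"boxid": str(123)})`, rather than passing `123` directly.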


  • Encoding bug fix for metadata associated with documents via API


  • URLs to PDFs can now be submitted for upload
  • Refactored tests to be less complex


  • Python 3.4 testing
  • 400MB upload limit to match DocumentCloud’s API restrictions
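
A size check like the upload limit above could be sketched as below. The names `MAX_UPLOAD_BYTES` and `check_upload_size` are hypothetical, and reading "400MB" as 400 * 1024 * 1024 bytes is an assumption; the changelog does not say how the library measures or enforces the limit.

```python
import os

# Assumption: "400MB" is read here as 400 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 400 * 1024 * 1024


def check_upload_size(path, limit=MAX_UPLOAD_BYTES):
    """Return the file size in bytes, or raise ValueError if it exceeds `limit`.

    Hypothetical helper for illustration only; not part of the library's API.
    """
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(
            "%s is %d bytes, which exceeds the %d byte upload limit"
            % (path, size, limit)
        )
    return size
```

Checking the size locally before sending anything avoids transferring a large file only to have the remote API reject it.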


  • Adopted semantic versioning without breaking existing packages on PyPI
  • Fixed bugs with get_page_text
  • Added keyword argument during initialization that allows you to override the BASE_URI and connect with independent clones of DocumentCloud. Contributed by Adi Eyal.
  • Refactored unit tests to run more quickly and require fewer web requests
  • Documentation moved from the gh-pages branch to master and refactored to be published via ReadTheDocs.


  • Python 3 support
  • PEP8 and PyFlakes compliance
  • Coverage reports on testing via


  • Continuous integration testing with TravisCI
  • Fixed bug with empty strings in Document descriptions
  • Raise errors when a user tries to save a data keyword reserved by DocumentCloud
  • Allow all-caps file extensions
  • Retry requests that fail with an increasing backoff delay
  • Fixed a bug in how titles are assigned to a file object
  • Added access checks when retrieving the text, PDF or images of a document
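
The retry item above describes a common pattern: wait after a failed request, then try again with a longer wait each time. A minimal sketch follows, assuming a multiplicative backoff; the function name, parameters and default values are all hypothetical and are not taken from the library.

```python
import time


def retry_with_backoff(func, retries=3, delay=1.0, backoff=2.0,
                       exceptions=(Exception,), sleep=time.sleep):
    """Call func(), retrying on failure with an increasing delay.

    Hypothetical helper for illustration only. After each failure the
    function waits `delay` seconds, then multiplies `delay` by `backoff`.
    Once `retries` retries have failed, the last exception is re-raised.
    """
    attempt = 0
    while True:
        try:
            return func()
        except exceptions:
            attempt += 1
            if attempt > retries:
                raise
            sleep(delay)
            delay *= backoff
```

The `sleep` parameter exists so the waiting can be replaced in tests; with the defaults shown, the waits between attempts would be 1, 2 and 4 seconds.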


  • File objects can now be submitted for uploading
  • Added more support for unicode data thanks to contributions by Shane Shifflet.
  • Smarter lazy loading of Document attributes missing from a search


  • Added data attribute on Document for storing dictionaries of arbitrary metadata
  • Added secure option for Document uploads to prevent data from being sent to OpenCalais
  • Added save alias on Document and Project objects that uses the pre-existing put command
  • Fixes to URL encoding that make the system more unicode friendly
  • Added all Document upload arguments to upload_directory method


  • upload_directory method for documents


  • get_or_create_by_title method for projects
  • Document and project creation methods now return an object, not the new id.
  • Projects can be pulled by id or by title


  • Document search now returns mentions of the keyword in the documents
  • related_url and published_url attributes now more easily accessible
  • Normal-sized images now available