Offline Category Database

One way to license our product is as a locally hosted domain categorization database. It contains our entire domain classification data set, stored in text files: about 60 million domains, taking up roughly 1.5 GB.

Companies usually license it when they have privacy or SLA requirements that the regular classification API can’t meet.

Advantages of using the offline database are:

  • No data is transmitted to our servers
  • The licensee is in full control of SLA compliance, since nothing depends on our servers
  • Supports high-volume workloads; throughput is not limited by bandwidth or by our servers’ speed
  • Can be loaded into the database of your choice
  • You receive updates every 3 months

Disadvantages:

  • Can’t categorize new sites (it can be combined with a locally hosted server or, if privacy compliance allows, defer new sites to the regular API on our servers)
  • Suitable only for medium to large businesses because of the database price and the IT knowledge required to work with it
  • No support for keyword classification (this can also be deferred to a server)
  • It takes up significant storage, so it can’t be used on endpoints and routers
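
The deferral described in the first item above can be sketched roughly as follows. This is only an illustration: the function name `categorize`, the in-memory `local_db` mapping, and the `fallback` callable are all hypothetical, and the sketch does not reflect the actual interface of the classification API or of a locally hosted server.

```python
from typing import Callable, Mapping, Optional, Tuple


def categorize(
    domain: str,
    local_db: Mapping[str, str],
    fallback: Optional[Callable[[str], Optional[str]]] = None,
) -> Tuple[Optional[str], str]:
    """Return (category, source) for a domain.

    local_db is a hypothetical domain -> category mapping built from the
    offline database; fallback is a hypothetical callable that asks a
    server about domains the offline data does not know.
    """
    key = domain.strip().lower()  # normalize before looking up
    category = local_db.get(key)
    if category is not None:
        return category, "offline"
    if fallback is not None:
        # Only reached for domains missing from the offline data.
        return fallback(key), "fallback"
    return None, "unknown"
```

With this shape, a deployment that is not allowed to send data out simply omits `fallback`, and unknown domains come back uncategorized instead of leaving the network.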

The data is organized into 160 directories, one for each classification. Each directory contains a text file listing the domains that belong to that classification. This makes it easy to import into any database, such as MySQL or PostgreSQL.

For example, the ‘News’ directory contains a file called “domains” with entries such as:

cnn.com
foxnews.com
reuters.com
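
As a minimal sketch of importing this layout into a database, the following Python uses the standard-library `sqlite3` module as a stand-in for MySQL or PostgreSQL. The table name `domain_category`, the function names, UTF-8 encoding, and the one-domain-per-line file format are assumptions made for illustration. The composite key allows a domain to appear under more than one classification, which is also an assumption.

```python
import sqlite3
from pathlib import Path


def load_categories(root, conn):
    """Import every <root>/<category>/domains file; return rows inserted."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS domain_category ("
        "domain TEXT NOT NULL, category TEXT NOT NULL, "
        "PRIMARY KEY (domain, category))"
    )
    inserted = 0
    for category_dir in sorted(Path(root).iterdir()):
        domains_file = category_dir / "domains"
        if not domains_file.is_file():
            continue  # skip entries that are not category directories
        with domains_file.open(encoding="utf-8") as fh:
            # Stream the file line by line so large files are not held in memory.
            rows = (
                (line.strip().lower(), category_dir.name)
                for line in fh
                if line.strip()
            )
            cur = conn.executemany(
                "INSERT OR IGNORE INTO domain_category (domain, category) "
                "VALUES (?, ?)",
                rows,
            )
            inserted += cur.rowcount  # duplicates ignored by the key are not counted
    conn.commit()
    return inserted


def categories_for(conn, domain):
    """Return the sorted list of categories recorded for a domain."""
    cur = conn.execute(
        "SELECT category FROM domain_category WHERE domain = ? ORDER BY category",
        (domain.strip().lower(),),
    )
    return [row[0] for row in cur]
```

Usage might look like `conn = sqlite3.connect("categories.db")`, then `load_categories("/path/to/database", conn)`, then `categories_for(conn, "cnn.com")`; the paths here are placeholders. For a production import of the full data set, the bulk-load tooling of the target database would likely be faster than row-by-row inserts.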