What is URL Classification?
URL Categorization, or URL Classification, is the process of identifying the category that a domain or URL belongs to. For example, the classification of the domain cnn.com is “news”.
Who needs URL Classification?
URL Classification is used by vendors of software such as parental control, DLP (Data Leakage Protection), and gateway protection. It can also improve advertising performance through RTB (Real Time Bidding) advertising, brand safety, and segmentation of users’ web-surfing habits for more accurate ad targeting.
Do you classify websites per URL?
There are two ways to classify a URL:
- Return the classification of the domain, regardless of the page’s content. For example, every page on cnn.com is classified as “news”.
- Classify the page itself and return the classification specific to that page, regardless of the domain’s main category. For example, a CNN article about finance is classified as “news,finance”.
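To make the difference concrete, here is a minimal Python sketch of the two modes. The lookup tables, function names, the example article URL, and the fallback from page to domain are all assumptions made for illustration; they are not the service's actual API or data.

```python
from urllib.parse import urlsplit

# Hypothetical sample data; a real deployment would query the classification service.
DOMAIN_CATEGORIES = {"cnn.com": ["news"]}
PAGE_CATEGORIES = {"https://cnn.com/business/markets": ["news", "finance"]}


def host_of(url):
    """Return the host of a URL, without a leading "www."."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def classify_by_domain(url):
    """Domain-level mode: every page inherits the domain's category."""
    return DOMAIN_CATEGORIES.get(host_of(url), [])


def classify_by_page(url):
    """Page-level mode: use the page's own categories when known.

    Falling back to the domain category is an assumption of this sketch.
    """
    return PAGE_CATEGORIES.get(url, classify_by_domain(url))
```

With this sample data, any cnn.com page is “news” in domain-level mode, while the finance article is “news,finance” in page-level mode.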
What kind of per URL classification do you provide?
We provide both options. Page-level classification costs more because we must request each URL in real time and analyze every page separately.
Do you classify web searches?
Yes. The server detects when a URL belongs to a search engine and contains a search query. It then extracts the keyword and classifies it against a database of over 20 million keywords and phrases in 20 languages.
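The extraction step can be sketched in a few lines of Python. This is only an illustration of the general idea: the table of search-engine hosts and query parameters below is an assumption of this sketch, and the server's real detection logic is not documented here.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed mapping of search-engine hosts to the query parameter
# that carries the search terms. Illustrative, not exhaustive.
SEARCH_PARAMS = {
    "www.google.com": "q",
    "www.bing.com": "q",
    "duckduckgo.com": "q",
    "search.yahoo.com": "p",
}


def extract_search_phrase(url):
    """Return the search phrase in a search-engine URL, or None."""
    parts = urlsplit(url)
    param = SEARCH_PARAMS.get(parts.hostname or "")
    if param is None:
        return None  # not a known search engine
    values = parse_qs(parts.query).get(param)
    # parse_qs already decodes "+" and percent-escapes.
    return values[0].strip() if values else None
```

The returned phrase would then be passed to the keyword classifier in place of the URL itself.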
Do you classify keywords or phrases?
Yes, we classify both. The server can accept a single keyword or a phrase for classification.
Do you classify websites manually?
We classify the top sites manually, but our proprietary algorithms classify 99% of our database. This lets us add new categories quickly and have the algorithms update the database.
How do you handle a new website that is not in your classification database?
When we get a request for a new website, the server classifies it on the fly (so you don’t have to wait for someone to review it) and adds it to our sites database.
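Conceptually, this is a lookup that falls back to live classification on a miss and then stores the result. The Python sketch below illustrates that pattern only; `make_lookup` and `classify_live` are hypothetical names, and the in-memory dictionary stands in for the real sites database.

```python
def make_lookup(classify_live, database=None):
    """Build a lookup function backed by a database of known sites.

    classify_live is a hypothetical callable that classifies a site on the fly.
    """
    database = {} if database is None else database

    def lookup(domain):
        key = domain.lower()
        if key not in database:
            # Unknown site: classify it now and remember the result.
            database[key] = classify_live(key)
        return database[key]

    return lookup
```

A second request for the same site is answered from the database without re-classifying it.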
Do you classify images?
No, we don’t. We classify the whole website based on the text content of the site.
It’s important to understand that images appear in a context (HTML, search phrase), and that context is what determines the category.
What is your service coverage?
Our servers are located strategically around the globe to give good coverage for end users from different geographies. In cases where we have a client that has most of its users from one country, we may add a server specifically for that country.
How often do you update your database?
We update our results every 30 days.
Who needs to manage the servers?
We manage our servers, which are accessed by all our clients. You can also request a dedicated server just for your clients; we can manage it for you, or you can manage it yourself.
Can I manage my own server?
Yes. For an additional charge, we can provide the server software and data.
Can you provide me the raw data?
Yes. For an additional charge, we can provide the raw data with or without updates (see our Offline Category Database).
Do you have free URL blacklist?
No, we only provide paid solutions. You can read more about it here.
Can I request a customized scan?
Can you add a custom category?
It’s possible, but we need to discuss it first to see what is required.
Which languages do you support?
We fully support English and have partial support for: German, French, Spanish, Italian, Dutch, Japanese, Chinese, Arabic, Croatian, Czech, Finnish, Greek, Hebrew, Norwegian, Polish, Romanian, Portuguese, Russian, Swedish, and Turkish.
Do you have real time virus detection?
No, we don’t provide a security feed.
What if we need to change a site classification?
If a site is mis-categorized, the client (not the end user) can email us or use the API to request a change. The change will be live within two business days. In an emergency during business hours, it may be possible to make the change within an hour, provided it is a one-time incident.
I want to know more about your API
Custom URL Classification data
We can offer custom data on top of our existing solution, such as custom scans, custom coverage, and more. Contact us for more details.
Offline domain database
How many domains are in your database?
We have two databases, one with 65 million domains, and another with 600 million domains.
How many URLs are in your database?
We have an offline URL database that contains 10 billion URLs.
What is the format of your database?
- The offline domain database is a set of directories, one per category. Each directory holds a text file that lists the domains in that category in cleartext.
- The offline URL database can be delivered as a proprietary solution, as MySQL, or as text files.
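As an illustration of how the domain database layout could be consumed, here is a minimal Python sketch that builds a domain-to-categories map from a directory tree. It assumes one domain per line and UTF-8 text, and it reads every file in each category directory because the file names are not specified here; `load_domain_database` is a hypothetical helper, not part of any delivered software.

```python
from pathlib import Path


def load_domain_database(root):
    """Map each domain to the set of categories whose directory lists it."""
    categories_by_domain = {}
    for category_dir in Path(root).iterdir():
        if not category_dir.is_dir():
            continue
        category = category_dir.name  # directory name is the category
        for list_file in category_dir.iterdir():
            if not list_file.is_file():
                continue
            text = list_file.read_text(encoding="utf-8", errors="replace")
            for line in text.splitlines():
                domain = line.strip().lower()  # assumes one domain per line
                if domain:
                    categories_by_domain.setdefault(domain, set()).add(category)
    return categories_by_domain
```

A domain listed under both the “news” and “finance” directories would map to both categories. For the larger databases, an on-disk index would likely be more practical than an in-memory dictionary.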
How offline is your offline database?
The database is 100% offline. We provide the raw data; there’s no SDK that communicates with our servers, and no component that calls home in any way.
How do you update the database?
We provide a download link every X months, as agreed upon in the commercial terms.
Where are you located?
We are located in Israel.