ClearCode Ltd.
- Address 190 "Tsar Simeon Veliki" blvd., fl. 3, office 8, Stara Zagora, Bulgaria
- Phone +359 2 444 7557
- E-mail contacts@clearcode.bg
Data miner
Manageability
Authentication
Data miner supports the most widespread types of authentication:
- Form-based
- HTTP basic
- Digest
Identity management
When Data miner accesses a given source, the remote system that presents the source usually identifies its client in some way. Once the remote system has established such an identity, it can use it to enforce limits on the client – for example, how many requests the client can make over a given period of time. Data miner can work with an unlimited number of identities, which gives it the ability to circumvent such restrictions that might be imposed by the remote system.
The identity is usually a combination of a browser fingerprint, a session and a source IP address. Since Data miner supports the parallel execution of multiple web browser instances, the browser fingerprint and session can be managed dynamically – i.e. multiple browser instances may share a single identity, or a single browser instance can use multiple identities to make subsequent requests. In order to manage the source IP address, the system can route requests through:
- HTTP proxies
- SOCKS 4/5 proxies
- Tor exit nodes
- Remote hosts, which act as forwarders
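The idea of an identity as a bundle of fingerprint, session and source IP address can be sketched as a small data structure plus a rotation policy. This is only an illustration under stated assumptions: the `Identity` and `IdentityPool` names, the choice of the `User-Agent` header as a stand-in for the browser fingerprint, and the round-robin policy are all hypothetical and not taken from Data miner itself.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Dict, List, Optional


@dataclass
class Identity:
    """One identity: fingerprint (here only a User-Agent), session cookies, exit route."""
    user_agent: str
    proxy_url: Optional[str] = None          # e.g. an HTTP or SOCKS proxy; None = direct
    cookies: Dict[str, str] = field(default_factory=dict)

    def request_settings(self) -> dict:
        """Return the headers and proxy mapping a request made under this identity would use."""
        headers = {"User-Agent": self.user_agent}
        if self.cookies:
            headers["Cookie"] = "; ".join(f"{k}={v}" for k, v in self.cookies.items())
        proxies = {}
        if self.proxy_url:
            proxies = {"http": self.proxy_url, "https": self.proxy_url}
        return {"headers": headers, "proxies": proxies}


class IdentityPool:
    """Hands out identities in round-robin order (an assumed policy)."""

    def __init__(self, identities: List[Identity]) -> None:
        if not identities:
            raise ValueError("at least one identity is required")
        self._cycle = cycle(identities)

    def next_identity(self) -> Identity:
        return next(self._cycle)


# Hypothetical identities: one behind an HTTP proxy, one behind a local SOCKS 5 proxy.
pool = IdentityPool([
    Identity("Mozilla/5.0 (X11; Linux x86_64)", "http://proxy.example:8080"),
    Identity("Mozilla/5.0 (Windows NT 10.0; Win64; x64)", "socks5://127.0.0.1:9050",
             cookies={"session": "abc123"}),
])
```

A single browser instance that calls `pool.next_identity()` before each request uses multiple identities for subsequent requests; several instances holding the same `Identity` object share one identity.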
Scheduling
Data miner allows every data extraction to happen at a specific time. Extractions may be scheduled to happen once at a fixed time, or multiple times at regular time intervals. Extraction parallelism and synchronisation are fully supported in this context.
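The two scheduling modes described above (a single run at a fixed time, or repeated runs at a regular interval) can be sketched as a function that computes the run times. The `run_times` name and its parameters are assumptions for illustration only; they do not reflect Data miner's actual scheduler or configuration format.

```python
from datetime import datetime, timedelta
from typing import Iterator, Optional


def run_times(start: datetime,
              interval: Optional[timedelta] = None,
              count: int = 1) -> Iterator[datetime]:
    """Yield the times at which an extraction should run.

    With no interval, the extraction runs once at `start`.
    With an interval, it runs `count` times, spaced `interval` apart.
    """
    if interval is None:
        yield start
        return
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    for i in range(count):
        yield start + i * interval


# One-off extraction at a fixed time.
once = list(run_times(datetime(2024, 1, 1, 9, 0)))

# Three extractions, six hours apart, starting at the same time.
repeated = list(run_times(datetime(2024, 1, 1, 9, 0), timedelta(hours=6), count=3))
```

Here `once` holds a single timestamp, while `repeated` holds the 09:00, 15:00 and 21:00 runs on the same day.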