What is a Data Source?
A Data Source is any storage location where you have digital content – files, emails, appointments, contacts, social media posts, IM/text messages etc. It can be local (computer hard drive, phone, USB/flash storage device, CD, DVD, camera, MP3 player etc.) or remote (email account, cloud-based file/photo storage account, social media account, web-based calendar, IM/text messaging account etc.).
Which Data Sources are supported?
Currently, the following Data Sources are supported:
- Local Computer Hard Drives: all local hard disks assigned a drive letter by Windows e.g. C:\, D:\ etc.
- Removable Drives: CDs, DVDs, USB connected devices, flash drives, memory cards etc.
- Network Connected Drives: network attached storage (NAS) drives.
- Phones: any phone recognized by Windows when it is connected, including all Android, Apple iOS and Windows based phones. You may need to install vendor specific drivers before Windows can recognize your phone. You do NOT need to change phone settings or enable USB debugging for this to work.
- Other Portable Devices: music players, cameras, camcorders etc.
- Email Accounts: Google Gmail, Apple iCloud mail, Yahoo mail, Windows Hotmail / Live Mail and any other IMAP or POP3 compliant email server. Blob comes pre-configured with about a dozen email providers. You can also enter your email provider’s server information into Blob if it is not already pre-configured (see “How can I add an email Data Source not already listed?”).
- Social Media Accounts: Facebook and Google+.
- Cloud Storage Accounts: Google Drive, Microsoft OneDrive and Dropbox (more are being added).
- Calendar Accounts: Google Calendar, Apple Calendar, Yahoo Calendar and any other CalDAV compliant calendar account.
- Photo Storage Accounts: Google Photos and Picasa Web Albums (more are being added).
- Instant/Text Messaging Accounts: Slack (currently in beta).
More Data Sources are being added on an ongoing basis. If you have a request for a specific Data Source to be added, please Contact Us.
Which content types are supported?
Blob supports all basic content types – files, emails, contacts, calendar appointments, social media posts/comments and IM/text messages. Blob natively supports compound files like ZIP and ISO files (it will automatically index files embedded inside compound files and emails).
The Premium version of Blob supports additional file types like Microsoft Outlook OWA and PST files.
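As a rough illustration of what looking inside a compound file involves, the sketch below lists the files embedded in a ZIP archive using Python's standard zipfile module. It is not Blob's implementation; the helper name list_zip_entries and the fields it returns are assumptions made for this example.

```python
import io
import zipfile


def list_zip_entries(zip_bytes):
    """Return (name, uncompressed size) for every file inside a ZIP archive.

    Hypothetical helper for illustration only; not part of Blob.
    """
    entries = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip folder entries, keep only files
            entries.append((info.filename, info.file_size))
    return entries


# Build a small archive in memory so the example is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("notes/todo.txt", "buy milk")
    archive.writestr("photo.jpg", b"\x00" * 16)

for name, size in list_zip_entries(buffer.getvalue()):
    print(name, size)
```

An indexer could record each embedded entry alongside the archive that contains it, which is one way the contents of a ZIP file can become searchable without being extracted by hand.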
How do I tell Blob to index a Data Source?
Click on the “Add Data Source” button at the top left corner in Blob.
You will then get a screen like the one below that shows the supported Data Sources.
Select the Data Sources you want to add in each category and click on the “Next” button to move to the next category. If you want to view or change the default indexing settings, click on the “Show Settings Pages In Wizard” link. Please see “How can I control the indexing process?” to see how you can change Blob’s default indexing behavior. Click on the “Finish” button to start the indexing process.
How can I add an email Data Source not already listed?
Blob already supports a number of pre-configured email Data Sources, but there are thousands of email providers and Blob cannot possibly pre-configure all of them. However, if you have the server information for your email provider (this is typically published by the provider), you can manually enter this information into Blob and have Blob manage it for you. For this to work, your email provider must support either the IMAP (preferred) or POP3 email protocol, and one of these protocols must be enabled for third party apps like Blob. To tell Blob about your email provider, click on the “Add Data Source” button at the top left corner of Blob.
Navigate to the Email Data Sources by clicking on the email envelope at the bottom left of the “Add Data Source” screen.
If you don’t see your email provider in this list, you can manually enter your provider information. Click on the “New” button at the bottom of this screen to enter new email server information in the screen that opens.
Enter the required server information in this screen and click on the “OK” button.
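The server information a provider publishes usually comes down to a protocol, a host name, a port and whether the connection is encrypted. The sketch below models that as a small Python record. The class name, field names and the host imap.example.com are assumptions for illustration and do not mirror Blob's actual screen. The default ports are the standard ones for each protocol: 993 for IMAP over SSL/TLS, 995 for POP3 over SSL/TLS, and 143 and 110 without encryption.

```python
from dataclasses import dataclass

# Standard well-known ports, keyed by (protocol, use_ssl).
DEFAULT_PORTS = {
    ("IMAP", True): 993,
    ("IMAP", False): 143,
    ("POP3", True): 995,
    ("POP3", False): 110,
}


@dataclass
class EmailServerInfo:
    """Hypothetical record of the details needed to reach a mail server."""

    protocol: str          # "IMAP" (preferred) or "POP3"
    host: str
    use_ssl: bool = True
    port: int = 0          # 0 means "use the standard port"

    def __post_init__(self):
        self.protocol = self.protocol.upper()
        if self.protocol not in ("IMAP", "POP3"):
            raise ValueError("protocol must be IMAP or POP3")
        if not self.host:
            raise ValueError("host is required")
        if self.port == 0:
            self.port = DEFAULT_PORTS[(self.protocol, self.use_ssl)]


info = EmailServerInfo(protocol="imap", host="imap.example.com")
print(info.protocol, info.host, info.port)
```

If your provider publishes a non-standard port, use the published value rather than the default.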
How long does it take to index a Data Source?
A few minutes to several hours – depending on the amount and type of content there and the speed of your connection.
A “typical” local hard disk (e.g. C:\) with a few hundred gigabytes of content should be completely indexed in a couple of hours (more if it has a lot of photos). A “typical” email account with about ten thousand messages should take a couple of hours with a fast internet connection (5 megabits per second or more). Most cloud storage accounts should be indexed in 30 minutes or less.
Note that Blob will make a full inventory of a Data Source only once when you add it for the first time. Subsequent re-indexing events to keep track of added, modified or deleted content are incremental and deal only with the updated content. These will complete much quicker, often within a few minutes.
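To make the difference between a full inventory and an incremental check concrete, the sketch below compares two snapshots of a Data Source and reports only what changed. How Blob actually detects changes is not described here, so treat this as one common approach under stated assumptions: each snapshot is a hypothetical mapping from a path to its last-modified time, and diff_snapshots is a name invented for this example.

```python
def diff_snapshots(previous, current):
    """Compare two {path: modified_time} snapshots of a Data Source.

    Returns three sorted lists: added, modified and deleted paths.
    """
    added = sorted(p for p in current if p not in previous)
    deleted = sorted(p for p in previous if p not in current)
    modified = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    return added, modified, deleted


# Snapshot from the previous run versus what the Data Source reports now.
previous = {"a.txt": 100, "b.txt": 200, "c.txt": 300}
current = {"a.txt": 100, "b.txt": 250, "d.txt": 400}

added, modified, deleted = diff_snapshots(previous, current)
print("added:", added)
print("modified:", modified)
print("deleted:", deleted)
```

Only the three changed entries need further work; the unchanged file is skipped, which is why an incremental pass can finish much sooner than the first full inventory.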
How often are my Data Sources indexed? How can I change the defaults?
You control how often a Data Source is re-indexed to check for content updates. Each Data Source can be indexed at its own schedule independent of other Data Sources. The default re-indexing schedule is as follows:
- Local computer hard disks are checked “Constantly”. That is, Blob will get a notification from Windows immediately after a file is created, deleted or changed. Thus, Blob will become aware of content updates within seconds of the change.
- Removable devices/media like USB devices, phones, cameras, CDs, DVDs, memory cards etc. are indexed every time they are plugged into the computer.
- Network attached drives are not re-indexed automatically.
- Email accounts are checked once every 15 minutes.
- Social media accounts at Facebook are checked every 30 minutes. Accounts at Google+ are checked every 5 hours.
- Cloud Storage accounts are checked every 24 hours.
- Remote calendar accounts are checked every 24 hours.
- Photo Storage accounts are checked every 24 hours.
You can change these default values any time. Select a Data Source and right-click on it to bring up the menu choices as shown below. Click on the “Settings” menu choice.
In the settings screen that comes up, set the values you want for the “Check Data Source for content updates” setting.
You can also manually re-index a Data Source any time by selecting a Data Source, right-clicking on it to bring up the menu choices and selecting “Check For Content Updates”.
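The default schedule listed above can be summarized as a small lookup table. The sketch below is illustrative only: the keys, the DEFAULT_CHECK_INTERVAL name and the is_check_due helper are assumptions for this example and do not correspond to actual Blob settings or files.

```python
from datetime import datetime, timedelta

# Default re-indexing intervals from the list above. None means the Data
# Source is not polled on a timer (it is event-driven or manual instead).
DEFAULT_CHECK_INTERVAL = {
    "local_disk": None,            # "Constantly", via Windows notifications
    "removable_device": None,      # indexed when plugged in
    "network_drive": None,         # not re-indexed automatically
    "email": timedelta(minutes=15),
    "facebook": timedelta(minutes=30),
    "google_plus": timedelta(hours=5),
    "cloud_storage": timedelta(hours=24),
    "calendar": timedelta(hours=24),
    "photo_storage": timedelta(hours=24),
}


def is_check_due(kind, last_checked, now):
    """Return True if a timer-based Data Source is due for a content check."""
    interval = DEFAULT_CHECK_INTERVAL[kind]
    if interval is None:
        return False  # not driven by a timer
    return now - last_checked >= interval


last = datetime(2024, 1, 1, 12, 0)
print(is_check_due("email", last, datetime(2024, 1, 1, 12, 15)))
print(is_check_due("cloud_storage", last, datetime(2024, 1, 1, 18, 0)))
```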
Will I miss changes to my Data Source if I close Blob? Will my backup tasks not run if I close Blob?
No, you can close Blob at any time without losing any functionality.
Blob is only the interface to you, the user. The actual work is done by several background processes: the “Cataloger” actually indexes a Data Source; “FileMonitor” monitors your local drives for changes; “TaskManager” runs the backup or synchronize tasks you create and “Refresher” launches other background tasks and looks for USB and other portable devices being plugged in. These background processes are automatically created when needed and they are all independent of Blob.
Does Blob copy my remote content to local drives?
No. Your content stays at its original location and Blob will not delete, move or modify it on its own. There are specific scenarios where Blob may download your remote content to your local hard drive:
- While indexing, to collect metadata embedded in remote content: During the indexing process, Blob tries to collect useful information that may be embedded inside your files. For example: for digital photos, Blob tries to determine the camera used, the picture date, embedded GPS info and other photo-specific information. For audio files, it tries to find the album name, singer, lyrics, play duration etc. If these files happen to be on a remote Data Source (e.g. as an attachment inside an email on a remote server), Blob may download the files to your local hard drive to collect this embedded information. Such copies are temporary – the downloaded files are deleted once the embedded information is collected.
- To enable quick and offline access to your content: Blob implements a caching policy that downloads remote content to your local hard drive (but does not delete it at its original location). For example, when new email arrives at your email account, Blob may download the email and/or attachments to your local hard drive so that when you open the email to read it, Blob can quickly open the local copy rather than having to fetch it from the remote server first. The caching policy dynamically adjusts to available free space, and cached files are periodically deleted when space is low.
- To back up content from one remote location to another remote location: Blob allows you to set up a task that backs up files from one remote location to another (e.g. back up specific email attachments from your remote Yahoo Mail account to your Google Drive account). These files are first downloaded to your local hard drive and then uploaded to the remote destination. Again, such copies are temporary and the files are deleted after they are successfully copied to the destination.
Blob will NEVER copy or transmit your content, information (metadata) about your content or your account login information to our servers or to any affiliated 3rd party server. It does NOT collect any information about you or your content for any marketing purposes whatsoever.
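The caching behavior described above, where cached files are deleted when space is low, can be sketched as follows. The text only states that the cache adjusts to available free space; the least-recently-used ordering, the evict_until_free name and the tuple layout are assumptions chosen for this example. Only local cached copies are affected, never the content at its original location.

```python
def evict_until_free(cached_files, free_bytes, min_free_bytes):
    """Pick cached files to delete, least recently used first.

    cached_files: list of (name, size_bytes, last_access_time) tuples.
    Stops as soon as the projected free space reaches min_free_bytes.
    Returns the names chosen for deletion, in deletion order.
    """
    to_delete = []
    for name, size, _ in sorted(cached_files, key=lambda f: f[2]):
        if free_bytes >= min_free_bytes:
            break  # enough space has been reclaimed
        to_delete.append(name)
        free_bytes += size
    return to_delete


cache = [
    ("mail-001.eml", 40, 1000),
    ("report.pdf", 300, 3000),
    ("photo.jpg", 200, 2000),
]
print(evict_until_free(cache, free_bytes=100, min_free_bytes=300))
```

In this run the two least recently opened files are selected and the most recently opened one stays cached.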
How is authentication handled?
When Blob needs to access your password protected accounts, it will prompt you to authenticate with the server. Most vendors and servers support a form of authentication called “OAuth”. With “OAuth”, you provide your login information directly to the provider’s server where you have your account (e.g. Google Gmail’s server), NOT to Blob. The provider’s server authenticates you directly and returns only a temporary authorization token that allows Blob to access your content for a limited time (typically 30 to 90 minutes) to index it, back it up etc. Thus, Blob never sees your password when the “OAuth” scheme is used.
However, some vendors/servers do not yet support “OAuth”. For such cases, you have to provide your password to Blob which in turn will transmit it to the server where you have your account. Blob will encrypt your password and will transmit it to the remote server using the most secure channel supported by the server (see “What security mechanisms does Blob use?”). Blob will NEVER transmit your password to our servers under any circumstances whatsoever. Blob will ask you whether it should save your password for subsequent use (e.g. to periodically look for content updates or to run backup tasks you create). If you do agree to save it, Blob will save your account login information encrypted in the Windows Credentials Manager. You can remove your password from there whenever you want.
What information is transmitted to Datamaton Inc. or affiliated 3rd parties?
How can I control the indexing process?
For each Data Source, you can independently control what content is indexed, how often it is re-indexed and where the index files are stored. Most aspects of the indexing process can be changed anytime you want. However, some parameters can only be set the first time a Data Source is added (e.g. location of the index files).
Please leave settings to their default values unless you are sure you understand what the setting does and how a change will impact Blob.
To manage how a Data Source is indexed, select the Data Source from the left pane, right-click on it to open the menu and select the “Settings” menu option.
On the screen that opens up, you will see multiple tabs that let you control the following aspects of the indexing process:
- Which parts of the Data Source should be indexed: You can select whether all or a subset of the content should be indexed. This setting is useful for Data Sources like hard disks which can have hundreds of thousands of system files that are otherwise uninteresting, since they are not your personal content files. Reducing the indexing load on Blob can significantly improve performance in such cases.
- When should the Data Source be re-indexed (also see “How often are my Data Sources indexed?”): Blob will do a “full” index of a Data Source when you add it for the first time, and can then do “incremental” checks for new, modified and deleted content depending on how you have configured it. You can set a Data Source to be indexed:
- Manually: Blob will not automatically check the Data Source for content updates. You can force a manual check for updated content by selecting the Data Source in the left pane, right-clicking on it to open the menu, and selecting “Check For Content Updates”.
- Constantly: Blob will constantly monitor the Data Source for content updates. Any file creation, modification or deletion is typically reflected within seconds. This allows you to create “instant” tasks that back up new or changed files within seconds of the change. This setting is currently supported for local hard disks only.
- Periodically: Blob will check the Data Source for content updates at an interval you specify in minutes, hours, days, weeks, months or years.
- At a specific time of day: Blob will check for content updates at the specified time of day. You can specify whether this should happen every day or every few days.
- When the device is plugged in: Blob will automatically check for content updates each time the device is plugged into the computer. This setting is useful for removable devices like USB/flash disks, CDs/DVDs and portable devices like phones, music players, cameras and camcorders.
- Where the index files should be created: Blob saves a Data Source’s inventory information in index files. This setting lets you decide where these index files should be created. These files are accessed very frequently, so you should create them only on a local hard drive that is always available (e.g. not a removable USB drive). This setting is useful if you have multiple local hard drives and the default C:\ drive is low on space. Blob can create dozens of index files for each Data Source. Their size may vary from a few megabytes to hundreds of megabytes depending on the amount and nature of content being indexed. A very approximate rule-of-thumb is that index files for a Data Source will use about 0.1% of the size of the Data Source (i.e. if the Data Source has 1 terabyte of content, its index files will consume about 1 gigabyte of disk space).
Note that this setting can only be set the first time a Data Source is added. It cannot be changed once the indexing process has started.
- Whether the Data Source’s index should be visible to all users of the computer: By default, index files are visible only to the user that was logged in when that Data Source was first added. If Blob is installed on a computer with multiple user accounts, you can choose to make a Data Source’s index files visible to all such user accounts (including the “Guest” user). In general, you should NOT make the index files visible to all users, since they may contain personal information that may not be appropriate for guest users.
- How much information should be logged when indexing: This setting determines the verbosity level for the errors, warnings or informational messages that Blob logs when it accesses a Data Source. It is actually common for the indexing process to encounter errors. Many errors are either transient (e.g. temporary network problems) or benign/“expected” (e.g. the current user does not have permissions to index a file on the local hard drive). However, you may occasionally encounter persistent fatal errors that prevent a Data Source from being indexed or managed successfully. In such cases, you can set the verbosity level to “Verbose” and view the logs for hints about what might be going wrong. See “How can I view the errors encountered while trying to access a Data Source?” for how you can view the logged messages.
Will Blob see and index all the content from my phone?
This depends on the phone and how it connects to the PC.
When you connect a phone to a PC, the phone decides how to report itself to the PC and what content types to report to the PC. By default, most modern phones report themselves as photo storage devices (a “Picture Transfer Protocol” or PTP compliant device) or as music storage devices (a “Media Transfer Protocol” or MTP compliant device). Such devices only report photos, videos and/or music files to the PC. Thus, Blob (and the Windows File Manager) will only see these content types, not contacts, IMs and other types of files on the phone.
Some phones will let you change how they report themselves to the PC. You may be able to go into the phone’s “Settings” menu option and find the option that controls how it connects to a PC. If you enable “USB Debugging” and connect to the PC as a “Mass Storage Device”, the phone will report itself as a storage disk drive when you connect it to the PC. In this case, Blob (and the Windows File Manager) will see all content types and files on the phone.
Please note that when you create a task to back up your phone, Blob can only copy files that the phone reports (i.e. the files you see when you view its contents in Blob or the Windows File Manager).
How often will I be prompted for passwords?
This depends on the Data Source being accessed and whether you have allowed Blob to save passwords (also see “How is authentication handled?”). Blob and other background processes (e.g. Refresher, Cataloger, TaskManager) will access your Data Sources for the following:
- When you initiate an operation manually – e.g. you manually select and move, copy or delete content on a protected Data Source. This request will be initiated by Blob (not a background process). Blob will not prompt you for this Data Source again as long as it is running, unless the remote server explicitly revokes Blob’s permissions and forces another login. If you close Blob, the cached permissions are lost and Blob will have to prompt you again the next time you run it and access the Data Source (unless you have asked Blob to save the password in the Windows Credentials Manager).
- To check for content updates, if you have enabled the Data Source to be automatically re-indexed (also see “How often are my Data Sources indexed?”). This access will be initiated by the Refresher and Cataloger background processes. How often this happens depends on how often you have configured the Data Source to be checked for content updates.
- To run backup tasks you have created. This request will be initiated by the TaskManager background process. How often this happens depends on how often you have configured the task to run.
For Data Sources that support OAuth (see “How is authentication handled?”), Blob will first try to request a new temporary access token without prompting you. If that succeeds, you will not have to authenticate with the server again.
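The "try silently first, prompt only if needed" behavior described above can be sketched in a few lines. This is a simplified model, not Blob's code: TokenStore, refresh_func and prompt_func are hypothetical names, no real OAuth library is used, and the one-hour token lifetime is just a sample value.

```python
import time


class TokenStore:
    """Hypothetical holder for an OAuth access token and its expiry time."""

    def __init__(self, refresh_func, prompt_func):
        self.refresh_func = refresh_func   # returns (token, lifetime_s) or None
        self.prompt_func = prompt_func     # asks the user to log in at the provider
        self.token = None
        self.expires_at = 0.0

    def get_token(self, now=None):
        now = time.time() if now is None else now
        if self.token is not None and now < self.expires_at:
            return self.token              # cached token is still valid
        result = self.refresh_func()       # try silently first
        if result is None:
            result = self.prompt_func()    # fall back to an interactive login
        self.token, lifetime = result
        self.expires_at = now + lifetime
        return self.token


prompts = []


def silent_refresh():
    return None  # simulate a provider that refuses a silent refresh


def interactive_login():
    prompts.append("login")
    return ("token-A", 3600)


store = TokenStore(silent_refresh, interactive_login)
print(store.get_token(now=0))      # no cached token yet, so the user is prompted
print(store.get_token(now=1800))   # still within the lifetime, served from cache
print("prompts so far:", len(prompts))
```

When the silent refresh succeeds, prompt_func is never called, which matches the case where you do not have to authenticate again.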
Can I add Data Sources later?
How much space do the index files consume?
As a very rough approximation, index files consume about 0.1% of the amount of content on the Data Source. So if you have 1 terabyte of content on a disk drive, its index files will use about 1 gigabyte of disk space on your computer. More space will be used if you have a lot of photos, videos or music files since Blob collects additional information about these file types. Please note this is a rough approximation only, not a guarantee or a limit that is enforced.
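The rule of thumb above works out as follows. The helper below is only a worked example of the arithmetic: the estimate_index_bytes name is made up for illustration, and decimal units (1 terabyte = 10^12 bytes) are assumed.

```python
TB = 10**12  # decimal terabyte, in bytes
GB = 10**9   # decimal gigabyte, in bytes


def estimate_index_bytes(content_bytes):
    """Rough index size: about 0.1% (1/1000) of the content size."""
    return content_bytes // 1000


print(estimate_index_bytes(1 * TB) / GB, "GB")
print(estimate_index_bytes(250 * GB) / GB, "GB")
```

For media-heavy Data Sources, expect the real figure to be higher than this estimate, as noted above.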
What happens if I remove a Data Source from Blob?
Removing a Data Source from Blob only removes it from Blob – nothing is deleted from the Data Source itself. The removed Data Source’s index files and any locally cached content are marked for deletion (and eventually deleted when Blob’s background process to reclaim disk space runs).
Please note that if you select one or more actual content items in a Data Source and manually delete them, that selected content is deleted from the Data Source. Thus, if you select a few emails and delete them, they will be deleted from the email server where they were stored (and also from Blob’s local cache, if they were cached).
You can remove a Data Source from Blob by selecting it, right-clicking on it to see its menu and clicking on the “Remove” menu choice.