What is a Data Source?
A Data Source is any storage location where you have digital content – files, emails, appointments, contacts, social media posts, IM/text messages etc. Some Data Sources are local – e.g. computer hard drives, phones, USB/flash storage devices, CDs, DVDs, cameras, MP3 players etc. Others are remote – e.g. email accounts, cloud based file/photo storage accounts, social media accounts, web based calendars, text messaging accounts etc. Blob can handle local as well as remote Data Sources.
Which Data Sources are supported by all versions of Blob?
Currently, all versions of Blob support the following Data Sources:
- Local Computer Hard Drives: all computer hard drives assigned a drive letter by Windows (e.g. C:\, D:\ etc.). The drive manufacturer and connection type to the computer do not matter.
- Removable Drives: CDs, DVDs, USB connected devices, flash drives, memory cards etc. that are recognized and supported by Windows.
- Network Attached Storage: all network attached storage (NAS) drives that are mapped as drive letters by Windows.
- Phones: any Android or Apple iOS phone recognized by Windows when it is connected to the computer. Some phones may require you to install manufacturer specific drivers before Windows will recognize them. Once Windows recognizes the phone, Blob will too. You do not need to enable USB debugging for this to work.
- Portable Devices: music players, cameras, camcorders etc. that are detected and supported by Windows are also supported by Blob.
- Email Accounts: Most web based email providers support industry standard email protocols IMAP and/or POP3 and SMTP. Blob supports them too, so hundreds of web based email providers should just work. The key is to configure email provider specific information like server address, protocol used etc.
Blob’s “main” email page comes pre-configured with server settings for Google Gmail, Apple iCloud mail, Yahoo mail and web based Windows Hotmail/Live Mail. An “overflow” page contains a few more pre-configured email providers (e.g. Comcast, AT&T, Zoho, GoDaddy, AOL…). However, it is not possible to pre-configure the hundreds of email providers that’ll just work with Blob. If your web based email provider is not on the pre-configured list, you can do an internet search to get the server settings and add them on your own – see “How can I add an email Data Source not already listed?”.
- Social Media Accounts: Blob nominally supports Facebook, but Facebook now provides very (very) minimal account data to non-Facebook applications. Google has discontinued Google+, so newer versions of Blob don’t support it anymore.
- Cloud Storage Accounts: Google Drive, Microsoft OneDrive and Dropbox are currently supported. More cloud storage providers are being actively added.
- Calendar Accounts: Blob supports Google Calendar, Apple iCloud Calendar and Yahoo Calendar. It also supports the industry standard calendar protocol CalDAV. Many web based calendar accounts support CalDAV, so they should just work with Blob. You just need the CalDAV calendar server address from your provider, and Blob will automatically try to discover the calendars hosted there.
In addition, Blob creates a local calendar on your computer’s hard drive. You can save appointments there if you don’t want to save them to a web based calendar.
- Photo Storage Accounts: Blob supports Google Photos (Google has discontinued Picasa Web Albums, so newer versions of Blob no longer support Picasa).
Please note that the list of supported Data Sources is constantly evolving. Contact us if you have suggestions for what we should add or prioritize.
Does the Premium version of Blob@Home support any additional Data Sources?
Which Data Sources does Blob@Work support?
- Instant/Text Messaging Accounts: Slack.
- Email Accounts: email accounts hosted on Microsoft Exchange Servers. On-premise as well as off-premise Exchange Mail servers are supported.
- Cloud Storage Accounts: FTP servers. When you add an FTP server, Blob will try to use the most secure SFTP (FTP over SSH2) protocol first. If the server does not support this, Blob will try the next most secure FTPS (FTP over SSL) protocol. If the server does not support FTPS either, Blob will try to use the plain old FTP protocol.
- Calendar Accounts: calendar accounts hosted on Microsoft Exchange Servers. Blob supports on-premise as well as off-premise Exchange Calendar servers.
Please note that the list of supported Data Sources is constantly evolving. Contact us if you have suggestions for what we should add or prioritize.
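The FTP protocol fallback described above (SFTP first, then FTPS, then plain FTP) can be sketched as a simple ordered search. This is only an illustration of the ordering, not Blob's actual code: the `pick_protocol` function and the `try_connect` callback are hypothetical names, and a real implementation would attempt actual network connections.

```python
from typing import Callable, Optional, Sequence

# Protocols in the order the text describes: most secure first.
PROTOCOL_ORDER: Sequence[str] = ("sftp", "ftps", "ftp")


def pick_protocol(try_connect: Callable[[str], bool],
                  order: Sequence[str] = PROTOCOL_ORDER) -> Optional[str]:
    """Return the first protocol for which try_connect succeeds, else None.

    try_connect is a hypothetical callback: it receives a protocol name and
    returns True if the server accepted a connection using that protocol.
    """
    for protocol in order:
        if try_connect(protocol):
            return protocol
    return None


# Example: a pretend server that only speaks FTPS and plain FTP.
supported = {"ftps", "ftp"}
chosen = pick_protocol(lambda protocol: protocol in supported)
print(chosen)
```

Because the order is fixed, plain FTP is only ever chosen when both secure options have already failed.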
Which content types are supported by all versions of Blob?
All versions of Blob support the basic content types – files, emails, contacts, calendar appointments, social media posts/comments and IM/text messages. Blob natively supports “compound” emails (i.e. emails with attachments) and compound files that have other files embedded inside them (e.g. ZIP, ISO, CAB… files). This means it can automatically index files embedded inside other content if it is configured to do so.
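To make the idea of a “compound” file concrete, here is a minimal sketch that lists every file inside a ZIP archive, including files inside nested ZIPs. It uses only Python's standard `zipfile` module; the `iter_embedded` function, the `!/` path separator and the sample file names are assumptions made for illustration and say nothing about how Blob itself walks archives.

```python
import io
import zipfile
from typing import Dict, Iterator


def iter_embedded(data: bytes, prefix: str = "") -> Iterator[str]:
    """Yield the path of every file in a ZIP archive, descending into nested ZIPs."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            path = prefix + info.filename
            yield path
            payload = archive.read(info)
            # A member that is itself a ZIP archive is treated as compound too.
            if zipfile.is_zipfile(io.BytesIO(payload)):
                yield from iter_embedded(payload, prefix=path + "!/")


def make_zip(files: Dict[str, bytes]) -> bytes:
    """Build a small in-memory ZIP archive for the demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


inner = make_zip({"notes.txt": b"hello"})
outer = make_zip({"report.txt": b"q3", "attachments.zip": inner})
found = list(iter_embedded(outer))
print(found)
```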
Blob uses industry standard file formats/extensions for the content types it supports. For example, when you save an email using Blob, the file is saved as a “.eml” file. Many 3rd party email programs can open and read that file. Similarly, you can load an “.ical” appointment file you received in an email into the Blob calendar or save your calendars as an “.ics” iCalendar file for backup. You can save Blob contacts as standard “.vcf” contact files, text messages as plain text files etc.
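As an illustration of why a standard format matters, the sketch below builds a message, serializes it the way a “.eml” file is stored, and reads it back with Python's standard `email` package, with no Blob involved. The addresses, subject and attachment name are made-up sample values.

```python
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser

# Build a sample message with one attachment (all values are made up).
message = EmailMessage()
message["From"] = "alice@example.com"
message["To"] = "bob@example.com"
message["Subject"] = "Quarterly report"
message.set_content("The report is attached.")
message.add_attachment(b"col1,col2\n1,2\n",
                       maintype="application", subtype="octet-stream",
                       filename="report.csv")

# These bytes are what would be written to a ".eml" file.
eml_bytes = message.as_bytes()

# Any standards-aware reader can parse them back.
parsed = BytesParser(policy=policy.default).parsebytes(eml_bytes)
subject = str(parsed["Subject"])
attachment_names = [part.get_filename() for part in parsed.iter_attachments()]
print(subject, attachment_names)
```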
Blob backups will copy your files in their original format and non-files (emails, appointments…) in industry standard formats. If you ask a Blob backup task to encrypt/compress content at the destination, it will save it as (encrypted) ZIP files. Not using a proprietary format means your backups can be accessed using many other programs, not just Blob.
How do I tell Blob to index a Data Source?
Click on the “Add Data Source” button at the top left corner in Blob.
You will then get a screen like the one below that shows the supported Data Sources.
Select the Data Sources you want to add in each category and click on the “Next” button to move to the next category. If you want to view or change the default indexing settings, click on the “Show Settings Pages In Wizard” link. Please see “How can I control the indexing process?” to see how you can change Blob’s default indexing behavior. Click on the “Finish” button to start the indexing process.
How can I add an email Data Source not already listed?
Blob already supports a number of pre-configured email Data Sources. However, there are thousands of email providers and Blob cannot possibly pre-configure all of them. Web based email providers that support industry standard IMAP or POP3 protocols typically publish the server settings needed to access your mail using any 3rd party email app like Blob. If your email provider supports IMAP, use that instead of POP3. Once you have the server values, you can enter them into Blob. Click on the “Add Data Source” button at the top left corner of Blob.
Navigate to the Email Data Sources by clicking on the email envelope at the bottom left of the “Add Data Source” screen.
Click on “Other Web Based Email” to add a non-default email Data Source. You will see a new “overflow” screen with a few more pre-configured email servers. If you see your email provider in this list, just select it by clicking on its selection box at the left edge of its row (under the column titled “Use?”).
If you don’t see your email provider in this list, you can manually enter your provider information. Click on the “New” button at the bottom of this screen. Enter your provider’s email server information in the screen that opens.
Enter the required server information in this screen and click on the “OK” button.
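The “server information” a provider publishes usually boils down to a protocol, a host name, a port and whether the connection is encrypted. The sketch below models that and applies the “prefer IMAP over POP3” advice from above. The `IncomingServer` class, the `prefer_imap` helper and the `example.com` host names are hypothetical; the port numbers are the standard ones for IMAP and POP3 with and without TLS, but your provider's published values always take precedence.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Standard ports for each protocol, keyed by (protocol, uses implicit TLS).
STANDARD_PORTS = {
    ("imap", True): 993,
    ("imap", False): 143,
    ("pop3", True): 995,
    ("pop3", False): 110,
}


@dataclass(frozen=True)
class IncomingServer:
    protocol: str          # "imap" or "pop3"
    host: str              # as published by the email provider
    use_tls: bool = True
    port: int = 0          # 0 means "use the standard port"

    def resolved_port(self) -> int:
        """Return the explicit port, or the standard one for this protocol."""
        if self.port:
            return self.port
        return STANDARD_PORTS[(self.protocol, self.use_tls)]


def prefer_imap(candidates: Iterable[IncomingServer]) -> Optional[IncomingServer]:
    """Pick the IMAP entry when one exists, otherwise fall back to POP3."""
    by_protocol = {server.protocol: server for server in candidates}
    return by_protocol.get("imap") or by_protocol.get("pop3")


# Placeholder settings for a provider that offers both protocols.
offered = [
    IncomingServer("pop3", "pop.example.com"),
    IncomingServer("imap", "imap.example.com"),
]
selected = prefer_imap(offered)
print(selected.protocol, selected.host, selected.resolved_port())
```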
How long does it take to index a Data Source?
A few minutes to several hours – depending on the amount and type of content there and the speed of your connection.
A “typical” local hard disk (e.g. C:\) with a few hundred gigabytes of content should be completely indexed in a couple of hours (more if it has a lot of photos). A “typical” email account with about ten thousand messages should take a couple of hours with a fast internet connection (5 megabits per second or more). Most cloud storage accounts should be indexed within an hour or two. Indexing will take longer if you have a lot of photos, videos etc. since they need additional processing.
To view real-time status about which Data Source was indexed when, select any Data Source and right-click on it to bring up the menu. Click on the “View Index Status” menu item.
This will display a window that shows when each Data Source was last checked for content updates.
Note that Blob will make a full inventory of a Data Source only once when you add it for the first time. Subsequent re-indexing events to keep track of added, modified or deleted content are incremental and deal only with the updated content. These will finish much quicker, often within a few minutes.
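The difference between a full inventory and an incremental update can be sketched by comparing two snapshots of a Data Source. This is a simplified illustration, assuming each snapshot is a mapping from path to last-modified timestamp; the `diff_snapshots` function and the `Changes` type are hypothetical and do not describe Blob's index format.

```python
from typing import Dict, NamedTuple, Set


class Changes(NamedTuple):
    added: Set[str]
    modified: Set[str]
    deleted: Set[str]


def diff_snapshots(previous: Dict[str, float], current: Dict[str, float]) -> Changes:
    """Compare two {path: last_modified_timestamp} snapshots."""
    added = set(current.keys() - previous.keys())
    deleted = set(previous.keys() - current.keys())
    modified = {path for path in current.keys() & previous.keys()
                if current[path] != previous[path]}
    return Changes(added, modified, deleted)


before = {"a.txt": 100.0, "b.txt": 100.0, "c.txt": 100.0}
after = {"a.txt": 100.0, "b.txt": 250.0, "d.txt": 300.0}
changes = diff_snapshots(before, after)
print(changes)
```

Only the three changed paths need further processing; the unchanged `a.txt` is skipped, which is why incremental runs finish so much faster than the first full pass.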
How often are my Data Sources indexed? How can I change the defaults?
You control how often a Data Source is re-indexed to check for content updates. Each Data Source can be indexed at its own schedule independent of other Data Sources. The default re-indexing schedule is as follows:
- Local computer hard disks are checked “Constantly”. That is, Blob will get a notification from Windows immediately after a file is created, deleted or changed. Blob will update its index files right away, typically within seconds of the change.
- Removable devices/media like USB devices, phones, cameras, CDs, DVDs, memory cards etc. are indexed every time they are plugged into the computer.
- Network attached drives are not re-indexed automatically.
- Email accounts are checked “Constantly” if the email server supports this. Blob subscribes to push notifications from the server, which notifies Blob when new email arrives in your Inbox. If the server does not support push notifications, Blob will check for content updates once every 15 minutes.
- Text messaging and social media accounts like Slack and Facebook are checked every 30 minutes.
- Cloud Storage accounts are checked every 24 hours.
- Web based calendar accounts are checked every 24 hours.
- Photo Storage accounts are checked every 24 hours.
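The default schedule above can be summarized as a small lookup table. The sketch below is only a restatement of the list in code form; the category keys, the `effective_schedule` function and the `supports_push` flag are hypothetical names, not Blob settings.

```python
from datetime import timedelta

CONSTANT = "constantly"
ON_PLUG_IN = "when plugged in"
MANUAL = "manually"

# Default re-indexing schedule per Data Source category, as listed above.
DEFAULT_SCHEDULE = {
    "local_disk": CONSTANT,
    "removable": ON_PLUG_IN,
    "network_drive": MANUAL,
    "email": CONSTANT,
    "messaging_social": timedelta(minutes=30),
    "cloud_storage": timedelta(hours=24),
    "calendar": timedelta(hours=24),
    "photo_storage": timedelta(hours=24),
}

# Polling interval used for email when the server cannot push notifications.
EMAIL_POLL_FALLBACK = timedelta(minutes=15)


def effective_schedule(category: str, supports_push: bool = True):
    """Return the default schedule, applying the email polling fallback."""
    if category == "email" and not supports_push:
        return EMAIL_POLL_FALLBACK
    return DEFAULT_SCHEDULE[category]


print(effective_schedule("email"))
print(effective_schedule("email", supports_push=False))
print(effective_schedule("cloud_storage"))
```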
You can change these default values any time. Select a Data Source and right-click on it to bring up the menu choices as shown below. Click on the “Settings” menu choice.
In the settings screen that comes up, set the values you want for the “Check Data Source for content updates” setting.
You can also manually re-index a Data Source any time by selecting a Data Source, right-clicking on it to bring up the menu choices and selecting “Check For Content Updates”.
Will I miss changes to my Data Source if I close Blob? Will my backup tasks not run if I close Blob?
No, you can close Blob at any time without losing any functionality.
Blob is only the interface to you, the user. The actual work is done by several background processes. The “Cataloger” background process actually indexes a Data Source. The “FileMonitor” background process monitors your local drives for changes. The “ImapMonitor” background process monitors new email arriving at the email server. The “TaskManager” process runs the backup or synchronize tasks you create. The “Refresher” launches other background tasks and looks for USB and other portable devices being plugged in. These background processes are automatically created when needed and they are all independent of Blob. See How Blob works for more information.
Does Blob copy my remote content to local drives?
Blob will leave your content at its original location and will not delete, move or modify it on its own. There are specific scenarios where Blob may download your remote content to your local hard drive:
- While indexing, to collect metadata embedded in remote content. During the indexing process, Blob tries to collect useful information that may be embedded inside your files. For example: for digital photos, Blob tries to determine the camera used, the picture date, embedded GPS info and other photo-specific information. For audio files, it tries to find the album name, singer, lyrics, play duration etc. If these files happen to be on a remote Data Source (e.g. an email attachment on a web based email account), Blob may download them to your local hard drive to probe them for such embedded information. Such copies are temporary – the downloaded files are deleted once they are indexed.
- To enable quick and offline access to your content. Blob implements a caching policy that downloads remote content to your local hard drive (without deleting it at its original location). For example, when new email arrives at your email account, Blob may download the email and/or attachments to your local hard drive so that when you open the email to read it, Blob can quickly open the local copy rather than having to fetch it from the remote server first. The caching policy dynamically adjusts to available free space, and cached files are periodically deleted when space is low (a separate background process “SpaceManager” does this).
- To back up content from one remote location to another remote location. Blob allows you to set up a task that backs up files from one remote location to another (e.g. backup email attachments from your Yahoo Mail account to your Google Drive account). These files are first downloaded to your local hard drive and then uploaded to the remote destination. Again, such copies are temporary and the files are deleted after they are successfully copied to the destination.
Blob will NEVER copy or transmit your content, information about your content (metadata) or your account login information to our servers or to any affiliated 3rd party server.
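The space-aware caching behavior described above can be illustrated with a short sketch. The text only says that cached files are deleted when space is low; the least-recently-used ordering, the `files_to_evict` function and the byte thresholds below are assumptions made for illustration, not a description of the “SpaceManager” process.

```python
from typing import List, Tuple

# Each cached entry: (path, size in bytes, last access time as a timestamp).
CacheEntry = Tuple[str, int, float]


def files_to_evict(cached: List[CacheEntry], free_bytes: int,
                   min_free_bytes: int) -> List[str]:
    """Pick least recently used files until the free-space target is met.

    free_bytes could come from shutil.disk_usage(path).free on a real system;
    here it is passed in so the example stays deterministic.
    """
    evicted = []
    for path, size, _last_access in sorted(cached, key=lambda entry: entry[2]):
        if free_bytes >= min_free_bytes:
            break
        evicted.append(path)
        free_bytes += size
    return evicted


cache = [("a.eml", 400, 10.0), ("b.jpg", 700, 5.0), ("c.pdf", 300, 20.0)]
to_delete = files_to_evict(cache, free_bytes=500, min_free_bytes=1500)
print(to_delete)
```

Only the local cached copies are candidates for deletion here; the originals on the remote Data Source are never touched.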
How is authentication handled?
When Blob needs to access your password protected accounts, it will prompt you to authenticate with the server. Most vendors and servers support a form of authentication called “OAuth”. With this scheme, you provide your login information directly to the server where you have your account (e.g. Google Gmail), not to Blob. The server authenticates you directly and returns only a temporary authorization token that allows Blob to access your content for a limited time (typically 30 to 90 minutes) to index it, back it up etc. Thus, Blob never sees your password when the “OAuth” scheme is used.
However, some vendors/servers do not yet support “OAuth”. For such cases, you have to provide your password to Blob to transmit it to the server where you have your account. Blob will transmit it using the most secure channel supported by the server (see “What security mechanisms does Blob use?” ). Blob will not automatically save your password, but will give you an option to do so. Saving your password allows Blob to periodically re-index your account, run any backup tasks you may have created there etc. without prompting you each time. If you decide to let Blob save your password, it will save it encrypted in the Windows Credentials Manager. You can remove your password from there whenever you want.
Blob will NEVER transmit your password to our servers under any circumstances whatsoever.
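The time-limited nature of an OAuth authorization token can be sketched as follows. This is a generic illustration of checking whether a token has expired; the `AccessToken` class, its fields and the one-minute safety margin are hypothetical and are not taken from Blob or from any specific provider's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AccessToken:
    value: str              # opaque string issued by the provider, never a password
    issued_at: datetime
    lifetime: timedelta

    def is_expired(self, now: datetime,
                   safety_margin: timedelta = timedelta(minutes=1)) -> bool:
        """Treat the token as expired slightly before its real expiry time."""
        return now >= self.issued_at + self.lifetime - safety_margin


issued = datetime(2024, 1, 10, 9, 0, tzinfo=timezone.utc)
token = AccessToken("opaque-token-value", issued, timedelta(minutes=60))

print(token.is_expired(issued + timedelta(minutes=30)))
print(token.is_expired(issued + timedelta(minutes=59, seconds=30)))
```

Once a token is expired, the application has to obtain a fresh one from the provider; the password itself is still never handled by the application.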
What information is transmitted to Datamaton Inc. or affiliated 3rd parties?
How can I control the indexing process?
For each Data Source, you can independently control what content is indexed, how often it is re-indexed and where the index files are stored. You can change most aspects of the indexing process anytime you want. However, some parameters (e.g. location of the index files) can only be set the first time a Data Source is added.
Please leave settings to their default values unless you are sure you understand what the setting does and how a change will impact Blob.
To manage how a Data Source is indexed, select the Data Source from the left pane, right-click on it to open the menu and select the “Settings” menu option.
On the screen that opens up, you will see multiple tabs that let you control different aspects of the indexing process for this specific Data Source.
- Which parts of the Data Source should be indexed. This setting is available under the “Inventory” tab of the settings screen. You can select whether all or a subset of the content should be indexed. This is useful for Data Sources like hard disks which can have hundreds of thousands of system files that are otherwise uninteresting. Skipping indexing them significantly reduces the performance load Blob places on your computer.
- How often should the Data Source be re-indexed (also see “How often are my Data Sources indexed?”). This setting is available under the “General” tab of the settings screen.
Blob will do a “full” index of a Data Source when you add it for the first time, and can then do “incremental” checks for new, modified and deleted content depending on how you have configured it. You can set a Data Source to be indexed:
- Manually: Blob will not automatically check the Data Source for content updates. You can force a manual check for updated content by selecting the Data Source in the left pane, right-clicking on it to open the menu, and selecting “Check For Content Updates”.
- Constantly: Blob will constantly monitor the Data Source for content updates. Any file creation, modification or deletion is typically reflected within seconds. With this, you can create “instant” tasks that back up new or changed files within seconds of the change. Blob supports this setting only for Data Sources that it can constantly monitor.
- Periodically: Blob will check the Data Source for content updates at the interval you specify, in minutes, hours, days, weeks, months or years.
- At a specific time of day: Blob will check for content updates at the specified time of day. You can specify whether this should happen every day or every few days.
- When the device is plugged in: Blob will automatically check for content updates each time the device is plugged into the computer. This setting is useful for removable devices like USB/flash disks, CDs/DVDs and portable devices like phones, music players, cameras and camcorders.
- Where the index files should be created. This setting lets you decide where Blob’s index files will be created. These files are accessed very frequently, so you should create them only on your computer’s hard drive, not on a removable USB drive or network attached storage. This setting is useful if you have multiple local hard drives and the default C:\ drive is low on space. Blob can create dozens of index files for each Data Source. Their size may vary from a few megabytes to hundreds of megabytes depending on the amount and nature of content being indexed.
This setting can only be set the first time a Data Source is added. It cannot be changed once the indexing process has started.
- How much information should be logged when indexing. This setting determines whether Blob logs just errors or also warnings and informational messages when it accesses the Data Source. It is actually common for the indexing process to encounter errors. Many errors are either transient (e.g. temporary network problems) or expected (e.g. the current user does not have permissions to index this file). However, you may occasionally encounter persistent fatal errors that prevent a Data Source from being indexed or managed successfully. In such cases, you can set the logging level to “Verbose” and view the logs for hints about what might be going wrong. See “How can I view the errors encountered while trying to access a Data Source?” for how you can view the logged messages.
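Two of the re-indexing options listed above, “Periodically” and “At a specific time of day”, amount to computing when the next check is due. The sketch below shows one plausible way to do that with Python's standard `datetime` module; the function names are hypothetical and the daily case ignores the “every few days” variant for brevity.

```python
from datetime import datetime, time, timedelta


def next_periodic(last_check: datetime, interval: timedelta) -> datetime:
    """'Periodically': the next check is one interval after the last one."""
    return last_check + interval


def next_daily(now: datetime, at: time) -> datetime:
    """'At a specific time of day': today if that time is still ahead, else tomorrow."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


morning = datetime(2024, 1, 10, 9, 0)
late_evening = datetime(2024, 1, 10, 23, 0)

print(next_periodic(morning, timedelta(hours=24)))
print(next_daily(morning, time(22, 30)))
print(next_daily(late_evening, time(22, 30)))
```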
Will Blob see and index all the content from my phone?
This depends on the phone and how it connects to your computer.
When you connect a phone to your computer, the phone decides how to report itself to the PC. Most phones report themselves as photo storage devices (a “Picture Transfer Protocol” or PTP compliant device) while some phones present themselves as music storage devices (a “Media Transfer Protocol” or MTP compliant device). Such devices only report photos, videos and/or music files to the PC. Thus, Blob (and Windows) will only see these content types, not contacts, text messages and other types of files on the phone.
Some phones will let you change how they report themselves to the PC. You may be able to go into the phone’s “Settings” menu and find the option that controls how it connects to a PC. If you enable “USB Debugging” and connect to the PC as a “Mass Storage Device”, the phone will report itself as a storage disk drive when you connect it to the PC. In this case, Blob (and Windows) will see all content types and files on the phone.
Please note that when you create a task to back up your phone, Blob can only copy files that the phone reports and that Blob indexes. Thus, if Blob cannot see/index text messages, it won’t be able to back them up.
How often will I be prompted for passwords?
This depends on the Data Source being accessed and whether you have allowed Blob to save passwords (also see “How is authentication handled?”).
Blob will need to access your Data Sources for the following reasons:
- To index or re-index the Data Source. How often this happens depends on how you’ve configured Blob to check the Data Source for content updates (see “How often are my Data Sources indexed?”).
- To run backup tasks you’ve created. How often this happens depends on how often you have configured the task to run.
- When you manually access the Data Source using Blob. This happens when you read or respond to an email or text message, create an appointment, move, copy, download or upload content etc. using Blob.
If a Data Source uses password based authentication and you’ve asked Blob to save the password, Blob will prompt you for a password only once – when you add the Data Source. For a Data Source that supports OAuth based authentication, how often Blob will have to re-authenticate depends on the account’s provider. For example, Google supports OAuth based authentication and typically requires a re-authentication every few days for Gmail access but much less frequently for Google Drive and Google Photos. Other providers like Slack typically do not require a re-authentication for several weeks or even months.
Can I add Data Sources later?
How much space do the index files consume?
As a very rough approximation, index files consume about 0.2% of the amount of content on the Data Source. So if you have 100 gigabytes of actual indexed content on a disk drive, its index files will use about 200 megabytes of disk space on your computer. More space will be used if you have a lot of photos, videos or music files since Blob collects additional information about these file types. Please note this is a rough approximation only, not a guarantee or a limit that is enforced.
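The rule of thumb above is easy to apply to your own numbers. The short sketch below simply multiplies the amount of content by 0.2%; the `estimate_index_bytes` function is a hypothetical helper, decimal units are assumed (1 gigabyte = 1000 megabytes), and the result is the same rough approximation the text describes, not a guarantee.

```python
GB = 10**9
MB = 10**6

INDEX_RATIO = 0.002  # 0.2% of the indexed content, per the rough rule above


def estimate_index_bytes(content_bytes: int, ratio: float = INDEX_RATIO) -> int:
    """Rough estimate of index size for a given amount of indexed content."""
    return round(content_bytes * ratio)


estimate = estimate_index_bytes(100 * GB)
print(f"{estimate / MB:.0f} MB")
```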
What happens if I remove a Data Source from Blob?
Removing a Data Source from Blob only removes it from Blob – nothing is deleted from the Data Source itself. Blob will not re-index the removed Data Source again and will delete its index files and any locally cached content from your computer.
Please note that if you select one or more actual content items in a Data Source and manually delete them, that selected content is deleted from the Data Source. For example, if you select a few emails and delete them, they will be deleted from the email server where they were stored.
To remove a Data Source, select it in the left panel and right-click on it to see the menu. Click on the “Remove” menu choice to remove it.