Exchange Online Archive (EOA): a view from the trenches – part 2

A bit later than expected, here is the follow-up to the first article about Exchange Online Archiving, which I wrote a while ago.

Exchange Online Archives and Outlook

How does Outlook connect to the online archive? Essentially, it’s the same process as with an on-premises archive. The client receives the archive information during the initial Autodiscover process. If you take a look at the response, you will see something similar to this in the output:

image

Based on the SMTP address, the Outlook client will now make a second Autodiscover call to retrieve the connection settings for the archive, after which it will try to connect to it. What happens then is exactly the same as how Outlook would connect to a regular mailbox in Office 365. Because Exchange Online is configured to use basic authentication for Outlook, the user will be prompted to enter their credentials. It’s particularly important to point this out to your users, as the credential window gives no indication of what it’s used for. If you have deployed SSO, users will have to use their UPN (and not domain\username!) in the user field.
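To make the first step a bit more tangible, here is a minimal Python sketch that pulls the archive entry out of an Autodiscover response. The XML is a hand-written, heavily trimmed sample with made-up values, and the element names (AlternativeMailbox, Type, SmtpAddress) reflect my understanding of the Outlook Autodiscover response format, so treat them as assumptions and compare against a real response from your own environment. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Outlook Autodiscover response (assumption, see lead-in).
NS = {"o": "http://schemas.microsoft.com/exchange/autodiscover/outlook/responseschema/2006a"}

# Hand-written, trimmed sample. All names and addresses are made up.
SAMPLE_RESPONSE = """<Autodiscover xmlns="http://schemas.microsoft.com/exchange/autodiscover/responseschema/2006">
  <Response xmlns="http://schemas.microsoft.com/exchange/autodiscover/outlook/responseschema/2006a">
    <User>
      <DisplayName>Test User</DisplayName>
    </User>
    <Account>
      <AccountType>email</AccountType>
      <AlternativeMailbox>
        <Type>Archive</Type>
        <DisplayName>Online Archive - Test User</DisplayName>
        <SmtpAddress>archive-guid@tenant.mail.onmicrosoft.com</SmtpAddress>
      </AlternativeMailbox>
    </Account>
  </Response>
</Autodiscover>"""


def find_archive_mailboxes(response_xml):
    """Return display name and SMTP address of every archive entry in the response."""
    root = ET.fromstring(response_xml)
    archives = []
    for alt in root.iterfind(".//o:AlternativeMailbox", NS):
        # Only entries of type "Archive" are of interest here.
        if alt.findtext("o:Type", default="", namespaces=NS) == "Archive":
            archives.append({
                "display_name": alt.findtext("o:DisplayName", default="", namespaces=NS),
                "smtp_address": alt.findtext("o:SmtpAddress", default="", namespaces=NS),
            })
    return archives


if __name__ == "__main__":
    for archive in find_archive_mailboxes(SAMPLE_RESPONSE):
        print(archive["display_name"], archive["smtp_address"])
```

The SMTP address found this way is what the client would use for the second Autodiscover call described above.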

Experiences

So far we have covered what Exchange Online Archiving is all about, what the prerequisites are to make it work and how things come together in e.g. Outlook. Now, it’s time to stir things up a little and talk about how things are actually perceived in real life.

First, let me start by pointing out that this feature actually works great, IF you are willing to accept some of the particularities inherent to the solution. What do I mean by particularities?

Latency

Unlike on-premises archives, your archives are now stored ‘in the cloud’, which means that the only way to access them is over the internet. Depending on where you are connecting from, this could be an advantage or a disadvantage. I’ve noticed that connectivity to the archive, and therefore the user experience, is highly dependent on the internet access you have. Rule of thumb: the more bandwidth and the lower the latency, the better it gets. This shouldn’t be a surprise, but it is easily forgotten. I have found on-premises archives to be more responsive in terms of initial connectivity and retrieval of content. This brings me to the second point: speed.

Speed

As you are connecting over the internet, the speed of fetching content is highly dependent on the speed of your internet connection (you see a similarity here?). The bigger the message/attachment you want to download is, the longer it will take. Truth be told, you’ll have the same experience while accessing your on-premises archive from a remote location, so it’s not something exclusive to Office 365.
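As a back-of-the-envelope illustration of that relationship, the following Python sketch estimates how long it takes to fetch an item of a given size. The function name, the parameters and the simple "payload time plus a number of round trips" model are my own assumptions; real-world numbers will be worse because of protocol overhead, throttling and other traffic on the line.

```python
def estimate_transfer_seconds(size_mb, bandwidth_mbps, round_trip_ms=0.0, round_trips=1):
    """Rough lower bound for transferring `size_mb` megabytes.

    size_mb        -- item size in megabytes (MB)
    bandwidth_mbps -- usable bandwidth in megabits per second (Mbit/s)
    round_trip_ms  -- latency of one round trip in milliseconds
    round_trips    -- assumed number of round trips needed for the operation
    """
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    payload_seconds = (size_mb * 8) / bandwidth_mbps  # megabytes -> megabits
    latency_seconds = (round_trip_ms / 1000.0) * round_trips
    return payload_seconds + latency_seconds


if __name__ == "__main__":
    # A 4 MB message over a line with 2 Mbit/s of usable bandwidth.
    print(round(estimate_transfer_seconds(4, 2, round_trip_ms=80, round_trips=10), 1), "seconds")
```

Even this optimistic model shows why a few megabytes over a modest or busy connection quickly add up to a noticeable wait.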

Outlook

To be honest, Outlook does a relatively good job of working with the archive – at least when you deal with it the way it was designed. If you let Exchange sync expired items to your archive using the Managed Folder Assistant, your life will be great! However, if you dare to manually drag & drop messages from your primary mailbox into the archive, you’ll be in for a surprise. Outlook treats such an operation as a “foreground” action, which means that you will have to wait for this action to complete before you can do anything else in Outlook. The problem here is that if you choose to manually move a 4 MB message to the archive, it could take as long as 20 – 30 seconds (depending on your internet connection) before the action completes. To make things worse: during this operation Outlook freezes, and if you try clicking something it’ll (temporarily) go into a “Not Responding…” state until the operation completes. According to Microsoft’s support, this is by design. So, as a precaution: advise your users NOT to drag & drop messages and to just let Exchange take care of it; something it does marvelously, by the way.

I have found that proper end-user education is also key here. If users are well informed about how the archive works and have had some training on how to use retention tags, they’ll be on their way in no time!

Provisioning

Related to the problem I described above, the initial provisioning process can be troublesome. When you first enable an archive, chances are that a lot of items will be moved to the archive. Although this process is handled by the Managed Folder Assistant (MFA), if your mailbox is open whilst the MFA processes it, Outlook might become unresponsive or extremely slow at the least – this is because the changes are happening and Outlook needs to sync those changes to the client’s OST file (when running in cached mode at least). Instead, it’s better to provision the archive on-premises, let the MFA do its work and then move the archive to Office 365. The latter approach works like a charm and doesn’t burden the user with an unresponsive Outlook client. If you are going to provision archives on-premises first, you might find it useful to estimate the size of an archive before diving in head first.

Search

This is a short one. Search is great. Because Outlook and Exchange can do cross-premises searches, you will be able to search both your primary mailbox and archive mailbox at once. I didn’t have many issues here. So: thumbs up!

Other Tips & Tricks

General (best) practices

Other than the particularities above, you shouldn’t do anything differently compared to ‘regular’ on-premises archives. Try not to overwhelm your users with a ginormous amount of retention tags. Instead, offer them a few tags they can use and – if necessary – adapt based on user feedback.

Autodiscover

Given that both Outlook and Exchange depend on Autodiscover to make the online archive work, you should make sure that Autodiscover is working for your Exchange deployment AND that your Exchange servers are able to query Office 365’s Autodiscover service successfully as well.

This is especially important if you are using Outlook Web App (OWA) to access your online archive. In this case, it’s not Outlook but Exchange that will perform an Autodiscover lookup and connect to the archive. If your internet connection isn’t working properly or you have some sort of forward authenticating proxy server in between, things might not work, or work only intermittently.

Implement it gradually

As described above, it’s a bad idea to give everyone a new cloud-based archive at once. It will not only put a heavy load on your internet connection, but it will also affect your users. Instead, try to gradually implement the solution and request feedback from your users. Start with on-premises archives and move them to the cloud in batches, for instance.
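One way to think about such batches is to group archives so that each batch stays below an amount of data you are comfortable pushing through your internet connection in one go. The Python sketch below is only an illustration of that idea: the function name, the per-batch cap and the sample sizes are all made up, and the greedy "largest first" grouping is just one simple strategy, not something Exchange does for you.

```python
def plan_batches(archive_sizes_mb, max_batch_mb):
    """Group mailboxes into move batches that stay under `max_batch_mb`.

    archive_sizes_mb -- dict of mailbox identifier -> estimated archive size in MB
    max_batch_mb     -- maximum amount of data per batch in MB

    Archives are handled largest first. An archive that is bigger than the
    cap on its own still gets a batch, but shares it with nothing else.
    """
    batches = []
    current, current_total = [], 0.0
    ordered = sorted(archive_sizes_mb.items(), key=lambda kv: kv[1], reverse=True)
    for user, size in ordered:
        # Close the current batch if adding this archive would exceed the cap.
        if current and current_total + size > max_batch_mb:
            batches.append(current)
            current, current_total = [], 0.0
        current.append(user)
        current_total += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # Hypothetical estimates, in MB.
    sizes = {"alice": 500, "bob": 300, "carol": 200, "dave": 100}
    for number, batch in enumerate(plan_batches(sizes, max_batch_mb=600), start=1):
        print("Batch", number, batch)
```

The size estimates themselves can come from any source you trust; the cap is something you would derive from your available bandwidth and the time window you have for the moves.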

DirSync is utterly important!

As described in the prerequisites section, DirSync is very important to online archives. So make sure that you closely monitor how it’s doing. If you have issues with DirSync, you will inevitably also have issues with creating archives. Issues with DirSync won’t interfere with archives that have already been provisioned though.

Conclusion

Is Exchange Online Archiving going to solve all your issues? Probably not. Is it a good solution? Yes, absolutely! I have been using Exchange Online Archiving for quite a while and I’m quite happy with it. I rarely encounter any issues, but I have also learnt to live with some of the particularities I mentioned earlier. Also, I treat my archive as a real archive. The stuff that’s in there is usually stuff I don’t need all that often. So the little latency overhead that I experience whilst browsing/searching my archive is something I’m not bothered by. However, if I had to work with items from my archive day in, day out, I’d probably have a lot more trouble adjusting to the fact that it’s less snappy than an on-premises archive.

So remember, set your (or your customer’s) expectations straight and you’ll enjoy the ride. If not, there might be some bumpy roads ahead!

Blog Exchange 2013 Office 365

Estimating the size of an Exchange (online) Archive

As part of some of the (archiving) projects I have worked on, I frequently get asked if there is an easy way to determine what the size of the archive will be once it’s been activated. Although the question may seem a bit odd at first, there are actually many good reasons why you’d want to know how big an archive will be.

First of all, determining the archive size allows you to better size (or plan for) the storage required for the archives. While there are also other ways to do this, knowing how big an archive will be when enabled is very helpful.

Secondly, if you’re using Exchange Online Archiving (EOA), it allows you to determine the amount of data that will pass through your internet connection for a specific mailbox. If the amount of data is large enough (compared to the available bandwidth), I personally prefer to provision an archive on-premises, after which I can move it to Office 365 using MRS. But that’s another discussion. Especially for this scenario it can be useful to know how much archive data you can (temporarily) host on-premises before sending it off to Office 365 and freeing up disk space again.

In order to calculate how big an archive would be, I’ve created a script which will go through all the items in one (or more) mailbox(es) and calculate the total size of all the items that will expire. When an item expires (and thus is eligible to be moved to the archive) depends on the Retention Policy you assign to a mailbox and what retention policy tags are included in that policy.

As the name of the script suggests, it’s important to understand that it’s an estimation of the archive size. There are situations in which the results of the script will differ from the real world. This could be the case when you enabled the archive and a user assigned personal tags to items before the Managed Folder Assistant has processed the mailbox. In such a scenario, items with a retention tag whose age limit differs from the AgeLimit defined in the script will be calculated incorrectly. Then again, the script is meant to be run before an archive is created.

Secondly, the script will go through all the folders in a mailbox. If you disabled archiving of calendar items, these items will be wrongly included in the calculation as well. I will try to build this into the script in future releases, but this has a lower priority as the script was built to provide a pretty good estimation, not a 100% correct number.
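To illustrate the calculation itself (and only that), here is a small Python sketch of the logic described above: add up the size of every item that is older than the age limit. This is not the actual script, which is written in PowerShell and reads the items through EWS. The item structure, the field names and the optional folder exclusion are assumptions I made for the example.

```python
from datetime import datetime, timedelta


def estimate_archive_size_mb(items, age_limit_days, now=None, excluded_folders=()):
    """Sum the size (in MB) of all items older than `age_limit_days`.

    items            -- iterable of dicts with 'folder', 'received' (datetime)
                        and 'size_bytes' keys (hypothetical structure)
    age_limit_days   -- retention age in days, comparable to AgeLimit
    now              -- reference date, defaults to the current date and time
    excluded_folders -- folder names to skip, e.g. ("Calendar",)
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=age_limit_days)
    total_bytes = sum(
        item["size_bytes"]
        for item in items
        if item["received"] < cutoff and item["folder"] not in excluded_folders
    )
    return total_bytes / (1024 * 1024)


if __name__ == "__main__":
    mb = 1024 * 1024
    # Made-up sample items.
    sample_items = [
        {"folder": "Inbox", "received": datetime(2013, 10, 1), "size_bytes": 2 * mb},
        {"folder": "Inbox", "received": datetime(2013, 12, 15), "size_bytes": 1 * mb},
        {"folder": "Calendar", "received": datetime(2013, 1, 1), "size_bytes": 3 * mb},
    ]
    reference = datetime(2014, 1, 1)
    print(estimate_archive_size_mb(sample_items, 60, now=reference))
    print(estimate_archive_size_mb(sample_items, 60, now=reference, excluded_folders=("Calendar",)))
```

The second call shows the effect of the calendar caveat: leaving out a folder that will never be archived brings the estimate down accordingly.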

The script, which you can download here, accepts multiple parameters:

UserPrimarySMTPAddresses: The Primary SMTP Address of the mailbox for which you want to estimate the archive size.
Report: Full file path to a txt file which contains the archive sizes.
AgeLimit: The retention time (in days) against which items should be probed. If you have a 60 day retention before items get moved to the archive, enter 60.
Server: Used for connecting with EWS. Optional. Can be used if Autodiscover is unable to determine the connection URI.
Credentials: The credentials of an account that has the ApplicationImpersonation Management Role assigned to it.

 

The output of the script will be an object that contains the user’s Primary SMTP Address and the size of the archive in MB (TotalPRMessageSize).

Credit where credit is due! I would like to thank Michel de Rooij for his immensely insane PowerShell scripting skills and for helping me with cleaning up this script to its current form. Before I sent it off to Michel, the code was pretty inefficient [but hey! it was working]. What you’ll download has been cleaned up and greatly enhanced: you now get cleaner code, additional error handling and some more parameters than in my original script [see parameters above].

I hope you’ll enjoy the script and find it useful. I’ve used it in multiple projects so far and it really helped me with planning the provisioning of the archives.

Note: To run the script, you’ll need to have the Exchange Web Services (EWS) Managed API installed and run it with an account that has the ApplicationImpersonation Management Role assigned to it.

Cheers,

Michael

Blog Exchange

Exchange Online Archiving (EOA): a view from the trenches – part 1

What is Exchange Online Archiving?

I’ve been meaning to write this article for quite a while now, so I’m glad it’s finally “ready”. First, let me start by introducing what Exchange Online Archiving (EOA in short) actually is.
This feature, which comes with an Exchange hybrid deployment, allows you to provision a cloud-based archive for an on-premises mailbox. While having an Exchange archive isn’t something new, at least not since Exchange 2010, the fact that the archive doesn’t have to be hosted within your own organization is pretty interesting.

Archives can be useful in many ways. One of the primary reasons why archives are used is to keep historical data for a longer period of time without cluttering a user’s primary mailbox. This could, for instance, be the case when you have to meet some compliance requirements which e.g. state that corporate data should be kept for 5 years. Although Exchange doesn’t have a problem with handling very large mailboxes including a high item count per folder, it’s usually the human component that cannot handle the overload of information that comes with having large amounts of data – at least that’s my experience. Keeping email inherently means that you’ll have to increase disk space to support the sometimes huge amounts of data that are involved. Although disk space has become quite cheap and Exchange 2013 is a great candidate to be used in combination with those cheap disks, there’s still a significant overhead involved in keeping that additional piece of infrastructure up and running.

This is where Exchange Online Archives could come in handy. First of all, there is no feature difference between an on-premises archive and a cloud-based (Office 365) archive. From a user’s point of view they both act and look the same. In fact, you are only offloading the task of storing archives to Office 365. The Exchange Online Plan 2 subscription automatically includes the right to provision unlimited-sized archives for your users. Although I don’t expect many people to run into the issue of filling up the initial 100GB, which you get provisioned to start with, any time soon, it’s very hard to match that offer for only $8 per user per month… If you are only interested in EOA, there are specific EOA licenses as well which cost only a fraction of the full Exchange Online license. Of course, this license will only allow you to use EOA and nothing more.

How does it work?

As briefly touched upon earlier, being able to use Exchange Online Archives is a by-product of having a hybrid Exchange deployment. A hybrid deployment, as the name suggests, is the process of ‘pairing’ your on-premises Exchange organization with Office 365; essentially creating one large “virtual Exchange organization”. As a result, having a (fully functional) hybrid deployment is the first requirement to meet… Technically speaking, it would be possible to set up a sort of minimalistic hybrid deployment in which you leave out functionalities that you do not necessarily need to make Online Archives work (like e.g. cross-premises mail flow). Nonetheless, I strongly encourage you to still set up the full monty. It might save you some time afterwards if you decide to deploy cloud-based mailboxes anyway.

A very important part of the setup is set aside for DirSync. As you might remember, if you tick the “Hybrid Deployment” checkbox during DirSync setup, you allow it to write back some attributes into your on-premises organization. One of these attributes is the msExchArchiveStatus attribute. This attribute is a flag telling the on-premises organization whether an online archive has been provisioned or not. As we will see later in this section, this attribute is particularly important during the creation of an archive.

One of the questions I get asked regularly is whether you are required to deploy ADFS when setting up a hybrid deployment. The short answer is no. On the other hand, there are many good reasons why you would want to deploy ADFS, or rather: there are many good reasons why you would want to have some sort of single/same sign-on. One reason I can think of is to simplify using online archives from an end user’s perspective. That way they won’t need to manage another set of credentials. Of course this isn’t only valid for online archives, it’s the same for each cloud-based workload in Office 365. ADFS can be one way of providing SSO, Password Sync is another. Both are valid options; neither is required, and they won’t be discussed here.

From a functional point-of-view, Online Archives have the exact same requirements as on-premises archives. You at least need Office 2007 SP3 Professional edition or later. Since we are running archives from Office 365, you also need to make sure to be up to speed with the latest required updates. For more information on what updates are needed, have a look at the following web page: http://office.microsoft.com/en-us/office365-suite-help/software-requirements-for-office-365-for-business-HA102817357.aspx

Now that we have the prerequisites covered, let’s have a look at how the provisioning process works from a high-level perspective:

image

As you can derive from the image above, there are two DirSync operations needed. The first one is used to “tell” Office 365 to create an archive for user “X”. The second DirSync operation is used to sync back the msExchArchiveStatus attribute which will now have a value of 1 instead of 0. This is to tell the on-premises organization the archive has been created. A good way to verify whether this process has completed is to run the Get-Mailbox | fl *arch* command:

image

Here you can see that the archive was created successfully (ArchiveStatus = Active). However, we are missing a part of the information. This is because the on-premises organization cannot provide the information from Office 365 (which is essentially another Exchange organization). To fetch the missing information, you’ll have to open up a remote PowerShell session to Exchange Online and run the Get-MailUser | fl *arch* command:

image

Conclusion

This is it for part one of this article.
In the following part, I will talk about some of the gotchas, do’s and don’ts. Stay tuned!

Exchange Exchange 2013 Hybrid Exchange Office 365

Update: Disappearing (online) archives after moving your mailbox to Office 365

Update

After a few weeks of mailing back and forth with Microsoft’s support, I was today (finally) able to confirm that the issue which I described below is now solved.

It seems that Microsoft rolled out a hotfix/code change for their Exchange Online service. Although, at first, I thought the issue was related to a bug in EMC for not correctly issuing all parameters when initiating a remote mailbox move, it seems there was more to the issue than that. Basically, what happened is that when MRS moved the mailbox from the on-premises environment to Office 365, it wouldn’t keep the link to the already-existing archive. This caused a new (empty) archive to be created and could possibly cause data loss.

I’m happy to see what time and effort Microsoft has put into solving this issue. It proves that Microsoft is concerned about the quality of their product / service. In fact, it would surprise me if they weren’t. A bug that could cause data-loss is not really something you’d want to carry around for a long time!

Thanks to everyone involved and kudos to Philippe Phan Cao Bach (Sr. Escalation Engineer) who was working with me on this case.

Original Post

Office 365 offers great ways to enhance the functionalities of your on-premises deployment. By running the Hybrid Configuration Wizard (which Steve Goodman explains in this article) you can configure both environments to act as one; allowing you to make use of features such as e.g. Online Archives (EOA).

With Exchange Online Archives, your primary mailbox stays on your on-premises Exchange server, whereas the archive will – as the name might have given away – be hosted in Office 365. If you’re interested in finding out more about Online Archives, I suggest that you take a look at Bharat Suneja’s session at TechEd this year: “Archiving in the cloud with Exchange Online Archiving”.

The problem

To me, one of the most interesting things about a hybrid deployment is the flexibility it offers. You can put a few mailboxes in Office 365, try them out and move more to the service if you like it.

If you are looking to take that approach, this information might be interesting for you!

Imagine the following: you are trying out Office 365 and decide to use Online Archives to start with. You provision the archives and life is great! After a while you decide you want to use more and you decide to move some mailboxes to Office 365. However, after your users have been moved to Office 365 they start complaining that their archive is empty.

It seems that – although this scenario is supported – there are some issues with the provisioning process when you move a user to Office 365 that previously already had an Online Archive: it gets “wiped”. At least, that’s what it looks like.

At first, I thought the data would reappear after a while, so I made sure that I waited long enough. Unfortunately, even after a few days, the archive was still empty.

I decided to do some tests, to make sure this wasn’t an isolated case. Perhaps something went wrong during the move. To my surprise, tests confirmed what was going on: although the archives contained items prior to the move, they were empty afterwards.

To explain what happens, let me describe the process I used to reproduce this issue.

This first screenshot shows the details of the on-premises mailbox that has a cloud-based archive (EOA) enabled. This archive contains 4 (test) items:

image

Afterwards, I moved the mailbox through the Exchange Management Console using the “New Remote Move Request”-wizard.

Because only a mailbox exists on-premises, you don’t have the option to move an archive (which is normal):
image

The move completed successfully and, after having waited long enough (DirSync etc.), I verified the mailbox’s settings:

image

The interesting part here is that the archive, although having the same GUID, appears to have been moved to the same database as the mailbox. Before the move, the archive resided in database “EURPRD04DG032-db055” whereas now it’s in “EURPRD04DG030-db041”.

To make sure that this wasn’t causing problems, I decided to do another test. When executing the MoveRequest, I specified which database the archive should be moved to. I made sure that the target database of the Online Archive was set to the database it was already residing in before moving the mailbox:

New-MoveRequest "Testmivh5" -RemoteHostName "hostname.company.com" -TargetDeliveryDomain "tenant.mail.onmicrosoft.com" -ArchiveTargetDatabase "EURPRD04DG032-db055" -RemoteCredential (Get-Credential)

Note: this cmdlet was executed from a remote PowerShell connection to Exchange Online.

After the move completed (successfully btw), I – again – waited long enough for DirSync/replication/provisioning to occur. I deliberately didn’t force DirSync to ensure that wasn’t causing any issues either. But alas, none of that helped: the archive was again empty.

A quick look at the object’s attributes revealed that – although a target database parameter was provided – the archive still got moved to the same database as the user:

image

Then, I was thinking that the ‘old’ archive perhaps got disabled and that a new one was created. Although this would be strange since the GUID of the archive remains the same, I thought it was worth a try. Again: no joy! No disconnected mailboxes were to be found.

After all this testing, I had reason enough to call Microsoft Support. After a few calls back and forth, they recently came back to me confirming that this is a known issue and that they’re currently working on it.

Until today I’m still not sure what the cause of the problem is. I haven’t received any feedback yet either. Of course, I will keep you posted as soon as I find out more!

Temporary workaround

It might sound too obvious, but the workaround is simple: either create both archive and mailbox in the cloud or create them (both!) on-premises first and move them together to the cloud. Both cases work just fine!

Conclusion

Although the last thing you’d want to experience is data-loss, I’m well aware that only a few customers, world-wide, would try this scenario. Nevertheless, it’s an issue that should be addressed quickly.

In our case, we have lost only a single archive worth a few hundred megabytes of emails. I can imagine that losing the wrong kind of emails might be a really big issue for some companies. I haven’t asked, but I’m pretty confident that – even though the emails seem lost – Microsoft can somehow recover the data so that you don’t really “lose” anything. I honestly cannot imagine otherwise.

Does this mean that I discourage using features like EOA? Absolutely not. I still have my hosted archive and I am pretty happy with it. Apart from some inconveniences which I will write about another time, it provides me with everything I need. Furthermore, it allows us to give everyone a relatively large archive without having to bear the costs of additional storage.

Until later!

Blog Exchange Hybrid Exchange Office 365
