Aug 11 2013
 

Over the past week, I’ve been attempting to get around to some cyber-housecleaning.  As you may know from previous articles, I have a network-attached storage (NAS) device from manufacturer Thecus, which has 5x 2 TB drives configured in a RAID-5 array.

I’m always outage/data-loss conscious, so I implement a somewhat rigorous backup process from time to time – usually bi-annually.  This year, I’ve managed to offload data (by burning to Blu-ray discs and then moving it to an external HDD) and create nearly 3 TB of free space.  Using the impressive WinDirStat utility (which is free), I’m able to get a graphical view of storage allocation:

[Screenshot: WinDirStat]

It is very nice and somewhat rewarding to see that massive grey box full of unallocated storage!  Now if only I could keep my office as neat and tidy…

Here’s a Visio diagram representing the current allocation of storage:

[Visio diagram: current storage allocation]

Sep 13 2012
 

So, you’re building apps that span multiple devices and you’re curious about what the cloud can offer. Is it possible to deploy scalable web apps and services on Windows Azure? How about storing data in the cloud? Is it possible to use the cloud for push notifications to the device? In this session, learn how to build Windows Phone, Android and even iOS apps that are backed by scalable cloud services with the Windows Azure platform.

Presented by Wade Wegner

Disclaimer: These are conference session notes I compiled during various sessions at Microsoft Tech Ed 2012, September 11-14, 2012.  The majority of the content comprises notes taken from the presentation slides accompanied, occasionally, by my own narration.  Some of the content may be freehand.  Enjoy… Rob

Follow me on Twitter @ausrob

Entered late to the session… This is a session on building mobile applications backed by Windows Azure.

Storage Options

Storage Secrets – Authentication

Azure Storage – storage account name & key
SQL Azure – username & password (note: now named Windows Azure SQL Database)

How to avoid storing storage credentials in a client app?

  1. Proxy the requests (via a WCF service, for example)
  2. Shared access signature (generates a sort of one-time password)
    1. Device addresses storage directly

[Demo]  Using Shared Access Signature

  • Using the NuGet Azure Storage package
  • Can query with a REST-based request
  • Can return XML or JSON
  • Returns a shared access signature; the client then requests data directly from Azure storage (see the sketch below)
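
As a rough illustration of the SAS approach, here’s a minimal sketch of a service-side method that hands out a short-lived, read-only signature using the WindowsAzure.Storage client library.  The connection string, container and blob names are placeholders, and class names vary slightly between SDK versions, so treat this as indicative rather than exact:

// Server side: issue a short-lived, read-only SAS instead of exposing the account key.
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

public class SasIssuer
{
    public string GetReadOnlySasUri(string connectionString, string containerName, string blobName)
    {
        var account = CloudStorageAccount.Parse(connectionString);
        var container = account.CreateCloudBlobClient().GetContainerReference(containerName);
        var blob = container.GetBlockBlobReference(blobName);

        var policy = new SharedAccessBlobPolicy
        {
            Permissions = SharedAccessBlobPermissions.Read,
            SharedAccessExpiryTime = DateTime.UtcNow.AddMinutes(15)   // the "one-time password" window
        };

        // The device appends this signature to the blob URI and talks to storage directly.
        return blob.Uri + blob.GetSharedAccessSignature(policy);
    }
}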

Switched to the Windows Phone 7 application (client).

  • Using NuGet Phone.Storage package
  • Target .Net 4.0 today unless you ship the 4.5 Framework
  • Windows Azure Storage Proxy (Cloud Services) NuGet Package
  • Storage Initializer is included.  Use it to change how storage is resolved (local or Azure)
  • Storage Service can be configured to point to the Azure Storage (using standard connection configuration)

Identity Provider Options

  • Create your own (e.g. ASP.net membership)
    • Additional scope/effort/testing
  • Use existing identity system (Facebook, LiveID etc)
    • APIs change, needs to be managed
  • Outsource identity management (Access Control Service)
    • Extra cost?
    • Allows larger base of federated identity providers
    • Claims!

Many factors to consider – management, attack vector, etc

  • Windows Identity Foundation provides claims aware capability.

[Demo] Using Access Control Service  (ACS) from Windows Phone using NuGet.

  • The demo shows how to pass an OAuth token to an Azure service
  • Get access to the Access Control Service via the Azure Portal (Preview)
  • Create a service namespace
  • Set Identity Providers (i.e. federated identity providers, e.g. Google, Yahoo etc)
  • Add relying party/realm details (sets routes, supported token types) and the signing key
  • Add rules (map claims from provider to claims in an application)
    • Can be done by hand, or generated as default rules (e.g. email address –> email address)
  • Done.. consume away!

Service side – using a Delegating Handler (System.Net.Http) to validate the token before further requests are made.  It looks at the incoming request header, validates the token and verifies it against the ACS.
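
A minimal sketch of what such a handler might look like – the handler class and the ValidateSwtToken helper are illustrative stand-ins, not the exact code from the demo:

// A System.Net.Http DelegatingHandler that rejects requests without a valid token
// before they reach the service implementation.
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public class AcsTokenHandler : DelegatingHandler
{
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var auth = request.Headers.Authorization;
        if (auth == null || !ValidateSwtToken(auth.Parameter))
        {
            return Task.FromResult(new HttpResponseMessage(HttpStatusCode.Unauthorized));
        }
        return base.SendAsync(request, cancellationToken);
    }

    private bool ValidateSwtToken(string token)
    {
        // Placeholder: verify the HMAC signature against the ACS signing key and check
        // the audience and expiry, as described below.
        return !string.IsNullOrEmpty(token);
    }
}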

Library: “Simple Web Token” (open source – NuGet package?) can be used to validate OAuth simple web tokens against the ACS.  You need the signing key to verify OAuth tokens against the ACS (using the previously mentioned library).

The signing key produces the HMAC hash embedded in the incoming OAuth token, FTW!  Incoming messages now have their headers checked for the OAuth token.  The client application now needs to pass along the token provided by the Access Control Service.

Using the ‘ACS Control for Windows Phone’ (NuGet).  Get the token from Application.Current.Resources.
Add it to the request header.  Simple!
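
Roughly, the client side looks like the sketch below.  The resource key, service URL and the “OAuth” header scheme are assumptions for illustration (they depend on how the ACS control stores the token and how the service-side handler parses it); HttpClient is used for brevity, and on Windows Phone 7 the same header would be set on a WebClient or HttpWebRequest.

// Client side: attach the ACS-issued simple web token to outgoing service calls.
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;
using System.Windows;   // Application.Current on the phone/Silverlight stack

public static class AuthorizedCalls
{
    public static async Task<string> GetDataAsync()
    {
        // Token previously stored by the ACS sign-in control (resource key is illustrative).
        var swtToken = (string)Application.Current.Resources["swtToken"];

        var http = new HttpClient();
        http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("OAuth", swtToken);

        // The server-side DelegatingHandler shown earlier validates this header.
        return await http.GetStringAsync(new Uri("https://myservice.cloudapp.net/api/data"));
    }
}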

Communications

  • Device-initiated (pull)
  • Cloud-initiated (push)

Device:  Wire format choices (SOAP, JSON, POX)
Cloud: Notification (Toast, Tile) and Raw – note: no guarantee of delivery

Subscribing to Push

  1. Device requests a channel
  2. NS returns channel (register device)
  3. Channel URI is stored in cloud
  4. Use channel URI to push message to device (via notification service)

Data is limited (payload can be small)
Cloud initiated message can act as a pointer to direct devices to larger data

Web Role -> NS -> Device -> Cloud/Service
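
On the device, steps 1–3 look roughly like the sketch below on Windows Phone 7 (Microsoft.Phone.Notification).  The channel name and the call that registers the channel URI with your own cloud service are illustrative:

// Device side: request a channel, then hand the channel URI to the cloud so it can push back.
using System;
using Microsoft.Phone.Notification;

public class PushSubscriber
{
    public void Subscribe()
    {
        var channel = HttpNotificationChannel.Find("MyAppChannel");
        if (channel == null)
        {
            channel = new HttpNotificationChannel("MyAppChannel");
            channel.ChannelUriUpdated += (s, e) => RegisterWithCloud(e.ChannelUri);
            channel.Open();               // steps 1-2: ask the notification service for a channel
            channel.BindToShellToast();   // opt in to toast notifications
        }
        else
        {
            RegisterWithCloud(channel.ChannelUri);
        }
    }

    private void RegisterWithCloud(Uri channelUri)
    {
        // Step 3: store the channel URI in the cloud (hypothetical call to your own service)
        // so the web role can later push toast/tile/raw messages through the notification service.
    }
}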

There are a number of Push Notification Services available, e.g. For Windows 8: Windows Push Notification Service (WNS)

[Demo]

Platform Services

“Application Building Blocks”

  • Windows Azure Traffic Manager (Global Traffic Manager, load balance)
    • at the DNS level,
    • route request to closest PoP
  • Distributed In-Memory cache
  • Messaging (queues), e.g. worker roles with async updates
  • Identity
  • Windows Azure Media Services (cloud transcoding as a service)
  • Content Delivery Network (static content, geo-located)
  • On-premise access

Tools

  • NuGet Packages
  • iOS Tools (GitHub)
  • Android (Eclipse project)

Summary

Reviewed Storage, Identity and Application Blocks.  That is all.

Sep 12 2012
 

There are numerous options for data storage available on Windows Azure, and it can be very difficult to pick the right one for a given application profile.

This session will evaluate many of the various storage options available to the Azure developer in terms of their ease of use, real-world performance, cost and features.  The session will also explore the benefits of tiered storage and review patterns that the developer can use to get the most out of a few key storage options.

Speaker Richard Laxton

Disclaimer: These are conference session notes I compiled during various sessions at Microsoft Tech Ed 2012, September 11-14, 2012.  The majority of the content comprises notes taken from the presentation slides accompanied, occasionally, by my own narration.  Some of the content may be freehand.  Enjoy… Rob

Introduction

This session is around deciding on a storage approach for Azure applications.  On the agenda – looking at kinds of storage available, a look at specific technologies, performance (in brief) and processing data.

How do we look at storage?

What does the data look like – relational (SQL)?  structured? unstructured (file system)?  What is the lifecycle – permanent/transient?

Features?

API?  Open or proprietary interface, language API?  What kind of access mechanism do you require?  Random or sequential access?

How will it scale?

Level? Horizontal or vertical scale?  Ease of implementation (and testing?)  What kind of performance needs to be met?

Cost?

Size based (capacity) or transactions (bandwidth)?

Additional considerations – schema management, transactional support… management and monitoring.. access controls?  auditing?

Limitations?

Record type/size/total size/recovery?

Azure Storage Options

1. SQL Azure

Pros

Ticks a lot of the boxes.  Structured, API access, random access with good scalability, but manual horizontal and vertical scaling.  Cost is based on database size.

Cons

Throttling can be inconsistent, backups don’t work the same way as on-premises SQL Server, throttling thresholds under load can be unpredictable and the feature set is not identical to SQL Server.  You need to actively implement retry logic and manage outages (see the sketch below).
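
A minimal retry sketch for transient SQL Azure errors – the attempt count, back-off policy and the idea of checking specific error numbers (e.g. 40501, “service busy”) are illustrative rather than a recommended policy:

// Retry a command a few times with exponential back-off when SQL Azure throttles the connection.
using System;
using System.Data.SqlClient;
using System.Threading;

public static class SqlRetry
{
    public static void ExecuteWithRetry(string connectionString, string commandText, int maxAttempts = 3)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                using (var connection = new SqlConnection(connectionString))
                using (var command = new SqlCommand(commandText, connection))
                {
                    connection.Open();
                    command.ExecuteNonQuery();
                    return;
                }
            }
            catch (SqlException)
            {
                if (attempt >= maxAttempts) throw;
                // Real code should only retry on known transient error numbers (e.g. 40501).
                Thread.Sleep(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
            }
        }
    }
}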

2. Table storage (NoSQL)

Pros

Structured, permanent, web service API (plus managed APIs – many languages), random access, programmatic vertical and horizontal scale (easy to arrange data).  You need to understand how to design for it.  No relationships (key and column based).  Data is flat, like an indexable CSV file.  Cost is based on size/transactions.

Cons

A record is identified by table name, partition key and row key (typed columns, very flat).  No secondary indexes.
Records are limited to around 1 MB, with single columns at 64 KB each.  Limited data types (mainly primitives).  Data modelling could be difficult compared to relational modelling.  How is a domain model mapped into rows and columns, with no relationships?
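
For a feel of the flat, key-based model, here is a sketch of an entity and a point query using the WindowsAzure.Storage table client (the entity, table name and connection string are illustrative, and class names vary between SDK versions):

// A flat, primitive-typed entity addressed by partition key + row key.
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

public class OrderEntity : TableEntity
{
    public OrderEntity() { }

    public OrderEntity(string customerId, string orderId)
    {
        PartitionKey = customerId;   // groups a customer's orders on one partition
        RowKey = orderId;            // must be unique within the partition
    }

    public double Total { get; set; }
}

public static class OrderStore
{
    public static OrderEntity Load(string connectionString, string customerId, string orderId)
    {
        var table = CloudStorageAccount.Parse(connectionString)
                                       .CreateCloudTableClient()
                                       .GetTableReference("orders");

        // Point query: the fastest access path, since there are no secondary indexes.
        var result = table.Execute(TableOperation.Retrieve<OrderEntity>(customerId, orderId));
        return (OrderEntity)result.Result;
    }
}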

3. Blob Storage

Pros

Unstructured, permanent, random access, web service API, a bunch of data with metadata, automatic scaling (H & V) and cost is based on size and transactions.  Could store, for example, a serialized object into BLOB storage.

Two types: block blobs & page blobs.
Block: Up to 200 GB, sequential write, not easy to update.

Page: Up to 1 TB, individually addressable 512-byte pages.
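
As a quick sketch of the API shape, here’s a block blob upload and download with the WindowsAzure.Storage client (container, blob and file names are illustrative; method names differ slightly between SDK versions):

// Upload a local file as a block blob and read it back.
using System.IO;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

public static class BlobExample
{
    public static void UploadAndDownload(string connectionString, string localPath)
    {
        var container = CloudStorageAccount.Parse(connectionString)
                                           .CreateCloudBlobClient()
                                           .GetContainerReference("backups");
        container.CreateIfNotExists();

        var blob = container.GetBlockBlobReference("archive.zip");
        using (var stream = File.OpenRead(localPath))
        {
            blob.UploadFromStream(stream);     // sequential write; updating means re-uploading blocks
        }

        using (var output = File.Create(localPath + ".copy"))
        {
            blob.DownloadToStream(output);
        }
    }
}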

4. Queues

Pros

Distributing workload, unstructured, permanent, sequential ordered access (FIFO), web service based, managed API and cost based on size/transactions.  Great failure recovery support: separate dequeue and delete operations, so a failure to delete will see the message return (sketched below).  Multiple readers/writers; cost is based on transactions.

Cons

No notification mechanism, but supports polling.  Be wary of excessive polling (cost is transaction based).  Performance is not brilliant (especially enqueuing), and messages are small (~64 KB).  Can dequeue 32 messages at a time.  FIFO behaviour is not guaranteed.  Larger messages need a pointer to a blob.
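
The dequeue/process/delete pattern might look like the sketch below (queue name and payload handling are illustrative).  If DeleteMessage is never called – say the worker crashes mid-processing – the message becomes visible again after its visibility timeout and another worker can pick it up.

// One unit of work: dequeue (which hides the message), process, then delete.
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

public static class QueueWorker
{
    public static void ProcessOne(string connectionString)
    {
        var queue = CloudStorageAccount.Parse(connectionString)
                                       .CreateCloudQueueClient()
                                       .GetQueueReference("work-items");

        // GetMessage hides the message for the visibility timeout rather than removing it.
        CloudQueueMessage message = queue.GetMessage(TimeSpan.FromMinutes(5));
        if (message == null) return;   // nothing queued - back off before polling again

        DoWork(message.AsString);      // for large payloads this would be a pointer to a blob

        queue.DeleteMessage(message);  // only now is the message really gone
    }

    private static void DoWork(string payload) { /* application-specific processing */ }
}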

5. Cache

Transient, unstructured (key/value store), web service API, .Net SDK, auto or manual scale, shared and dedicated options available, cost based on size (128 MB–4 GB).  Roughly the equivalent of memcached.

Stored in-memory, with a local in-memory copy if required, a distributed notification model (to invalidate local copies), automatic purging when the quota is reached, and limitations on bandwidth and connections.
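
A cache-aside sketch using the Azure Caching (AppFabric) client – DataCache.Get/Put are the core calls, while the product type and the SQL fallback loader are hypothetical:

// Check the cache first; on a miss, load from durable storage and populate the cache.
using System;
using Microsoft.ApplicationServer.Caching;

public class ProductCache
{
    private readonly DataCache _cache = new DataCacheFactory().GetDefaultCache();

    public Product GetProduct(string productId)
    {
        var cached = _cache.Get(productId) as Product;   // transient key/value lookup
        if (cached != null) return cached;

        var product = LoadProductFromSql(productId);     // cache miss: fall back to SQL Azure
        _cache.Put(productId, product, TimeSpan.FromMinutes(10));
        return product;
    }

    private Product LoadProductFromSql(string productId)
    {
        // Hypothetical: query SQL Azure here.
        return new Product { Id = productId, Name = "example" };
    }
}

[Serializable]
public class Product
{
    public string Id { get; set; }
    public string Name { get; set; }
}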

6. Content Delivery Network (CDN)

Geo-distributed cache, as offered by the likes of Akamai; not strictly part of Azure.  It is the only way to deliver to a targeted geography.  Mirrors HTTP(S) content, availability is controlled by HTTP headers, and it is generally cheaper than delivering content through Azure.  The Microsoft CDN is perhaps easier to use.  Can connect BLOB storage to the CDN.  Transient storage.

7. Apache Hadoop

Java-based distributed processing of large data sets.  Highly scalable.  There is an option to deploy a Hadoop cluster from Azure.  Reliable computation, suitable for structured and unstructured data.  Provides both storage and processing capability.

8. Virtual Machine

Install anything you need (MySQL, memcache, Oracle?).  Why would you resort to a VM?  Any legacy dependencies, or use of MySQL (and others).  Possibly not the right approach for new greenfield apps.

Performance

Emulator environments are not a reliable measure of performance.  Test on real Azure.
The platform is dynamic, so perform additional testing.  Make sure high volume situations are tested.  Test beyond read/write scenarios.  Test common scenarios and keep an eye on edge cases.

Testing Methodologies

A sample test plan: build a simple application, control API, use multiple workers and test at different levels of load.  Stress testing the platform is OK.

Test small/large objects
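
A bare-bones version of such a test harness might look like this – WriteTestBlob is a hypothetical stand-in for whichever storage call is being profiled, and you would vary the payload size and worker count to cover small and large objects:

// Fire a batch of parallel writes against a storage operation and time them.
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class StorageLoadTest
{
    public static void Run(int workers, int sizeInBytes)
    {
        var payload = new byte[sizeInBytes];
        var stopwatch = Stopwatch.StartNew();

        var tasks = Enumerable.Range(0, workers)
                              .Select(i => Task.Run(() => WriteTestBlob("item-" + i, payload)))
                              .ToArray();
        Task.WaitAll(tasks);

        Console.WriteLine("{0} workers x {1} bytes took {2} ms",
                          workers, sizeInBytes, stopwatch.ElapsedMilliseconds);
    }

    private static void WriteTestBlob(string name, byte[] payload) { /* storage call under test */ }
}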


Results… for mid-sized batches, table storage seems to be the winner.  It depends on your own specific application, so you should profile multiple storage options.  SQL needs to be designed with sharding or caching to keep load manageable.

Patterns for Performance

Tiered storage, output caching, queued updates…  Use the right storage for the right data.

[Local Cache] (Transient)
[Azure Cache]
[Table Storage]
[SQL Azure] (Structured)

Architecturally challenging – performance, instrumentation, development support, etc.  Table storage allows for denormalization (multiple copies of data); queued updates can help.  Output caching can be of benefit – cache generated JSON etc. at the presentation tier, locally (IIS), but beware stale data.  CDN edge caching reduces load on servers and delivers geographically targeted content.
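
As an example of caching at the presentation tier, here is an ASP.NET MVC output-caching sketch (the controller and its data source are illustrative).  The rendered result is cached on the web server for 60 seconds, so repeated requests never touch storage – at the cost of potentially stale data:

// Cache the JSON response per category for 60 seconds at the web tier.
using System.Web.Mvc;

public class ProductsController : Controller
{
    [OutputCache(Duration = 60, VaryByParam = "category")]
    public ActionResult List(string category)
    {
        var products = LoadFromTableStorage(category);   // only runs on a cache miss
        return Json(products, JsonRequestBehavior.AllowGet);
    }

    private object LoadFromTableStorage(string category)
    {
        // Hypothetical: query table storage (or another tier) here.
        return new[] { new { Category = category, Name = "example" } };
    }
}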

How to choose?

Understand your design needs, resources and environment.  Ensure the design is proportional to scale needs.  Determine data complexity needs/requirements.

Examples:

Line of business – transactional, small user base (<100), developers generally SQL-experienced, hybrid applications (online/offline) and very complex data.  Uses SQL Azure and Azure Cache for acceleration (simple – keep transactions/size down = lower cost).

Internet scale application – No transactions, read optimized, data partitionable, often simple data model.  Use Azure Table Storage, Apache Hadoop for analysis, Azure Cache for acceleration.

Conclusion

No single answer.  Consider all options, can use more than one strategy. 

Mar 19 2012
 

Recently I bought a brand new Network Attached Storage (NAS) device, as I’ve wanted (for a while) to redo my approach to storage capacity and – more specifically – add some fault tolerance (by way of redundant disks).

I did some research and looked into a number of NAS suppliers.  A friend of mine bought a Drobo Storage Device last year, and I was very interested, although what I wanted was out of my price range.  In the end I settled on something less expansive (feature-wise) and decided on a Thecus N5200XXX 5 Bay Desktop NAS.

Originally I was going to fill it with five 3 TB drives, but when I went to buy disks only 2 TB versions (WD Black) were available, so I made do with a 10 TB base.  I wanted a RAID 5 configuration, which left me with about 7.5 TB of RAID capacity.  I’d like to write more about it, but it’s still formatting, so I’ll have to post an update later this week about how it functions.
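
For those wondering where the usable figure comes from, the RAID 5 arithmetic works out roughly like this (the gap to the advertised capacity is largely the decimal-vs-binary unit difference plus formatting overhead):

usable ≈ (number of disks − 1) × disk size = (5 − 1) × 2 TB = 8 TB
8 TB = 8 × 10^12 bytes ≈ 7.3 TiB as reported by most operating systems

That is in the same ballpark as the roughly 7.5 TB figure above.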

Setting it up, though, was quite easy.  It was simplicity itself to screw the drives into the drive caddies and insert them into the device.  Each bay can be locked with a key, and fits perfectly into the chassis.  The information panel at the bottom displays status information, which can be quite handy.

Most of the grunt work is done via the built in web interface, and most of the configuration is user settable.  Until my RAID array is properly formatted, I won’t get to play with too many of the settings, but there’s an enticing array of functionality.

All I can say at the moment is that the unit looks good, isn’t too noisy and was simplicity itself to start the configuration.  There seems to be a lot of great functionality, and I can’t wait to benchmark the I/O performance.

Disks:  5x WD 3.5" Black 2TB WD2002FAEX SATA 3 7200 rpm
Disk Specs: http://wdc.com/global/products/specs/?driveID=899&language=1

Model#: WD2002FAEX
Interface: SATA 6 Gb/s
Form Factor: 3.5 Inch
RPM: 7200
Capacity: 2 TB
Cache: 64 MB

NAS Features, from the official site:

Features
  • Extreme Speed
    With a power-house of a processor in the Intel Atom D525 running at 1.8GHz, the N5200XXX runs circles around the competition. Combined with 1GB of super fast DDR3 RAM, that means incredible transfer speeds and less time spent waiting.

  • Extreme Data Backup
    Secure your data with the sophisticated features and refined simplicity that only Thecus can offer. Incremental backups and recover data with Acronis’s True Image software, take and revert back to system snapshots at your leisure, and remotely backup to anywhere in the world with native Rsync support.

  • Extreme Power Management
    N5200XXX supports scheduled power on/off. With this feature, users can set what time the system turns on or off. This feature is a big plus for people who want to conserve energy. Wake-On-LAN enables users to remotely turn the system on without leaving their seat.

  • Extreme Protection
    Put safety first with AES256bit RAID volume encryption and USB Key functionality. An impenetrable wall of protection is at your disposal to make sure only those you want can access your data, and no one else. Simply set up a USB flash drive key that unlocks your data with no hassle and maximum protection.

  • iSCSI Thin Provisioning Support
    Get the most out of your storage space with the extreme speed of iSCSI and the efficiency of iSCSI thin provisioning. Connect through iSCSI for the fastest data transfer speeds available and make wasted disk space a thing of the past with thin provisioning’s flexible storage functionality.

  • Online RAID Volume Management
    Managing RAID volumes has never been easier thanks to the N5200XXX’s Online RAID Volume Management. Administrators can easily expand or migrate RAID volumes without having to power down the system, eliminating costly downtime.

Other Reviews

http://www.thecus.com/media_news_page.php?NEWS_ID=4309

http://www.tweak.dk/review/Thecus_N5200XXX_5_disk_NAS_server/1333/5/1
http://www.pcpro.co.uk/reviews/storage-appliances/370804/thecus-n5200xxx