May 24, 2014 9:47 AM by Daniel Chambers (last modified on May 24, 2014 9:58 AM)
This week I published the 1.0 stable release of FSharp.Azure on NuGet. Compared to the beta, it has one extra feature I slipped in: support for using option types on your record fields.
For those unfamiliar with F#, the option type is an F# construct that allows you to express the presence or absence of a value. It is similar to Nullable<T>, however it works for all types, instead of just value types. C# developers are used to using null as the “absence” value for reference types, as all references are nullable by default in the CLR. However, when writing F# any type you define is not allowed to be nullable by default, regardless of the fact that it may be a CLR reference type under the covers. This is where the F# option type comes in; when you want to allow the absence of something, you must be explicit and describe that fact using the option type. This is great because it means that you are much less likely to be passed null (i.e. the “absence”) when you don’t expect it and get an error such as the irritating NullReferenceException. You can basically view the option type as opt-in nullability.
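To make the idea concrete, here is a minimal sketch in plain F# (no FSharp.Azure involved). The Person type and displayName function are hypothetical names invented for illustration; the point is only that the compiler makes you handle both the Some and None cases.

```fsharp
// A record with one mandatory field and one optional field.
type Person = { Name : string; Nickname : string option }

// Pattern matching makes the "absent" case explicit.
let displayName (person : Person) : string =
    match person.Nickname with
    | Some nickname -> sprintf "%s (%s)" person.Name nickname
    | None -> person.Name

let withNickname = { Name = "Grace"; Nickname = Some "Amazing Grace" }
let withoutNickname = { Name = "Ada"; Nickname = None }

printfn "%s" (displayName withNickname)
printfn "%s" (displayName withoutNickname)
```

A plain string field such as Name can never be None, which is what makes the nullability opt-in.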
Here’s an example record type for use with FSharp.Azure that uses option types:
    type Game =
        { [<PartitionKey>] Developer : string
          [<RowKey>] Name : string
          ReleaseYear : int option
          Notes : string option }
You could insert one of these records into table storage like this:
    let game =
        { Developer = "343 Industries"
          Name = "Halo 5"
          ReleaseYear = None
          Notes = None }

    let result = game |> Insert |> inGameTable
The inGameTable function is a FSharp.Azure helper function and you can see how it was defined in this previous post.
Note the use of the None value for ReleaseYear and Notes. We’re explicitly saying we’re omitting a value for those two fields. When translated to Azure table storage, this means those two properties will not be defined on the row inserted for this record. Remember that in table storage, unlike relational databases, not all rows in a table need have the same properties.
If we later want to update that record in table storage and provide values for ReleaseYear and Notes, we can:
    let modifiedGame =
        { game with
            ReleaseYear = Some 2015
            Notes = Some "Has yet to be released." }

    let result = modifiedGame |> ForceReplace |> inGameTable
Another nice ability that using option types with FSharp.Azure provides us is being able to use Merge to update the row in table storage with only the properties that are either not option types or are option typed and have Some value (i.e. are not None). For example:
    let modifiedGame =
        { game with
            ReleaseYear = None
            Notes = Some "Will be bigger than Halo 4!" }

    let result = modifiedGame |> ForceMerge |> inGameTable
Because we’re using a Merge operation, the above will change the Notes property in table storage, but will not change the existing ReleaseYear value.
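The difference between merging and replacing with None values can be sketched with a small in-memory model. This is not how FSharp.Azure is implemented; the Row type and the merge and replace functions below are hypothetical, and a row is simplified to a map from property name to string value.

```fsharp
// Hypothetical in-memory model of a table row: property name -> value.
type Row = Map<string, string>

// Apply one property of an update: Some overwrites, None leaves the row alone.
let applyProperty (row : Row) (name : string, value : string option) : Row =
    match value with
    | Some v -> Map.add name v row
    | None -> row

// Merge: start from the stored row and apply only the Some-valued properties.
let merge (stored : Row) (update : (string * string option) list) : Row =
    List.fold applyProperty stored update

// Replace: the stored row is discarded, so None-valued properties end up absent.
let replace (update : (string * string option) list) : Row =
    update
    |> List.choose (fun (name, value) -> value |> Option.map (fun v -> (name, v)))
    |> Map.ofList

let stored : Row =
    Map.ofList [ ("ReleaseYear", "2015"); ("Notes", "Has yet to be released.") ]

let update =
    [ ("ReleaseYear", None); ("Notes", Some "Will be bigger than Halo 4!") ]

let merged = merge stored update
let replaced = replace update
```

In this model, merged still carries the stored ReleaseYear, while replaced has no ReleaseYear property at all.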
To play with FSharp.Azure, use NuGet to install the package “FSharp.Azure”. The source code is available on GitHub.
May 11, 2014 12:06 PM by Daniel Chambers (last modified on May 18, 2014 2:40 PM)
In my last post, I showed how to use FSharp.Azure to modify data in Azure table storage. FSharp.Azure is a library I recently released that allows F# developers to write idiomatic F# code to talk to Azure table storage. In this post, we’ll look at the opposite of data modification: data querying.
To use FSharp.Azure, install the NuGet package: FSharp.Azure. At the time of writing the package is marked as beta, so you will need to include pre-releases by using the checkbox on the UI, or using the -Pre flag on the console. (v1.0.0 has been released!)
Once you’ve installed the package, you need to open the TableStorage module to use the table storage functions:
open DigitallyCreated.FSharp.Azure.TableStorage
To provide an idiomatic F# experience when querying table storage, FSharp.Azure supports the use of record types when querying. For example, the following record type would be used to read a table with columns that match the field names:
    type Game =
        { Name : string
          Developer : string
          HasMultiplayer : bool
          Notes : string }
We will use this record type in the examples below. We will also assume, for the sake of these examples, that the Developer field is also used as the PartitionKey and the Name field is used as the RowKey.
FSharp.Azure also supports querying class types that implement the Microsoft.WindowsAzure.Storage.Table.ITableEntity interface.
The easiest way to use the FSharp.Azure API is to define a quick helper function that allows you to query for rows from a particular table:
    open Microsoft.WindowsAzure.Storage
    open Microsoft.WindowsAzure.Storage.Table

    let account = CloudStorageAccount.Parse "UseDevelopmentStorage=true;" //Or your connection string here
    let tableClient = account.CreateCloudTableClient()

    let fromGameTable q = fromTable tableClient "Games" q
The fromGameTable function fixes the tableClient and table name parameters of the fromTable function, so you don't have to keep passing them. This technique is very common when using the FSharp.Azure API.
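Fixing the leading parameters like this is ordinary F# partial application. As a minimal sketch that runs without any Azure dependency, the describeQuery function below is a hypothetical stand-in (it is not part of FSharp.Azure) that only mirrors the "client, table name, query" parameter order:

```fsharp
// Hypothetical stand-in: takes a client, a table name and a query, in that order.
let describeQuery (client : string) (tableName : string) (query : string) : string =
    sprintf "%s/%s?%s" client tableName query

// Fix the first two parameters once; callers only supply the query.
let fromGames q = describeQuery "dev-client" "Games" q

let description = fromGames "all"
printfn "%s" description
```

Because the query is the last parameter, the helper slots neatly onto the end of a pipeline with the |> operator.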
Here's how we'd query for all rows in the "Games" table:
let games = Query.all<Game> |> fromGameTable
games above is of type seq<Game * EntityMetadata>. The EntityMetadata type contains the Etag and Timestamp of each Game. Here's how you might work with that:
    let gameRecords = games |> Seq.map fst
    let etags = games |> Seq.map (fun (game, metadata) -> metadata.Etag)
The etags in particular are useful when updating those records in table storage, because they allow you to utilise Azure Table Storage's optimistic concurrency protection to ensure nothing else has changed the record since you queried for it.
The Query.where function allows you to use an F# quotation of a lambda to specify what conditions you want to filter by. The lambda you specify must be of type:
'T -> SystemProperties -> bool
The SystemProperties type allows you to construct filters against system properties such as the Partition Key and Row Key, which are the only two properties that are indexed by Table Storage, and therefore the ones over which you will most likely be performing filtering.
For example, this is how we'd get an individual record by PartitionKey and RowKey:
    let halo4, metadata =
        Query.all<Game>
        |> Query.where <@ fun g s -> s.PartitionKey = "343 Industries" && s.RowKey = "Halo 4" @>
        |> fromGameTable
        |> Seq.head
You can, however, query over properties on your record type too. Be aware that queries over those properties are not indexed by Table Storage and as such will suffer performance penalties.
For example, if we wanted to find all multiplayer games made by Valve, we could write:
    let multiplayerValveGames =
        Query.all<Game>
        |> Query.where <@ fun g s -> s.PartitionKey = "Valve" && g.HasMultiplayer @>
        |> fromGameTable
The following operators/functions are supported for use inside the where lambda:
The =, <>, <, <=, >, >= operators
The not function
Table storage allows you to limit the query results to be only the first 'n' results it finds. Naturally, FSharp.Azure supports this.
Here's an example query that limits the results to the first 5 multiplayer games made by Valve:
    let multiplayerValveGames =
        Query.all<Game>
        |> Query.where <@ fun g s -> s.PartitionKey = "Valve" && g.HasMultiplayer @>
        |> Query.take 5
        |> fromGameTable
Azure table storage may not return all the results that match the query in one go. Instead it may split the results over multiple segments, each of which must be queried for separately and sequentially. According to MSDN, table storage will start segmenting results if:
FSharp.Azure supports handling query segmentation manually as well as automatically. The fromTable function we used in the previous examples returns a seq that will automatically query for additional segments as you iterate.
If you want to handle segmentation manually, you can use the fromTableSegmented function instead of fromTable. First, define a helper function:
let fromGameTableSegmented c q = fromTableSegmented tableClient "Games" c q
The fromGameTableSegmented function will have the type:
TableContinuationToken option -> EntityQuery<'T> -> List<'T * EntityMetadata> * TableContinuationToken option
This means it takes an optional continuation token and the query, and returns the list of results in that segment, and optionally the continuation token used to access the next segment, if any.
Here's an example that gets the first two segments of query results:
    let query = Query.all<Game>

    let games1, segmentToken1 =
        query |> fromGameTableSegmented None //None means querying for the first segment (ie. no continuation)

    //We're making the assumption segmentToken1 here is not None and therefore
    //there is another segment to read. In practice, this is a very poor assumption
    //to make, since segmentation is performed arbitrarily by table storage
    if segmentToken1.IsNone then failwith "No segment 2!"

    let games2, segmentToken2 =
        query |> fromGameTableSegmented segmentToken1
In practice, you'd probably write a recursive function or a loop to iterate through the segments until a certain condition.
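Such a recursive function might look like the following sketch. To keep it runnable without table storage, fetchSegment is a hypothetical stand-in that pages through an in-memory list and uses an integer offset as its continuation token; readAllSegments is likewise an invented name. It only assumes the general "optional token in, results plus optional next token out" shape described above.

```fsharp
// Hypothetical data source: eight rows served three at a time.
let segmentSize = 3
let allRows = [ 1 .. 8 ]

// Stand-in for a segmented query: the token is simply the offset of the next row.
let fetchSegment (token : int option) : int list * int option =
    let offset = defaultArg token 0
    let segment = allRows |> List.skip offset |> List.truncate segmentSize
    let next = offset + segmentSize
    let nextToken = if next < List.length allRows then Some next else None
    (segment, nextToken)

// Keep requesting segments until no continuation token comes back.
let readAllSegments (fetch : 'Token option -> 'T list * 'Token option) : 'T list =
    let rec loop token acc =
        let (segment, nextToken) = fetch token
        let acc = segment :: acc
        match nextToken with
        | Some _ -> loop nextToken acc
        | None -> acc |> List.rev |> List.concat
    loop None []

let rows = readAllSegments fetchSegment
printfn "%A" rows
```

The accumulator collects segments in reverse and flattens them once at the end, which avoids repeatedly appending lists. A real loop could also stop early, for example once enough rows have been gathered.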
FSharp.Azure also supports asynchronous equivalents of fromTable and fromTableSegmented. To use them, you would first create your helper functions:
    let fromGameTableAsync q = fromTableAsync tableClient "Games" q
    let fromGameTableSegmentedAsync c q = fromTableSegmentedAsync tableClient "Games" c q
fromTableAsync automatically and asynchronously makes requests for all the segments and returns all the results in a single seq. Note that unlike fromTable, all segments are queried for during the asynchronous operation, not during sequence iteration. (This is because seq doesn't support asynchronous iteration.)
Here's an example of using fromTableAsync:
    let valveGames =
        Query.all<Game>
        |> Query.where <@ fun g s -> s.PartitionKey = "Valve" @>
        |> fromGameTableAsync
        |> Async.RunSynchronously
And finally, an example using the asynchronous segmentation variant:
    let asyncOp = async {
        let query = Query.all<Game>

        let! games1, segmentToken1 =
            query |> fromGameTableSegmentedAsync None //None means querying for the first segment (ie. no continuation)

        //We're making the assumption segmentToken1 here is not None and therefore
        //there is another segment to read. In practice, this is a very poor assumption
        //to make, since segmentation is performed arbitrarily by table storage
        if segmentToken1.IsNone then failwith "No segment 2!"

        let! games2, segmentToken2 =
            query |> fromGameTableSegmentedAsync segmentToken1

        return games1 @ games2
    }

    let games = asyncOp |> Async.RunSynchronously
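Rather than assuming a fixed number of segments, an asynchronous loop can follow continuation tokens until none is returned. The sketch below is self-contained and makes no calls into FSharp.Azure: fetchSegmentAsync is a hypothetical stand-in that pages through an in-memory list with an integer offset as its token, and readAllSegmentsAsync is an invented name.

```fsharp
// Hypothetical data source: eight rows served three at a time.
let segmentSize = 3
let allRows = [ 1 .. 8 ]

// Stand-in for an asynchronous segmented query; the token is the next offset.
let fetchSegmentAsync (token : int option) : Async<int list * int option> =
    async {
        let offset = defaultArg token 0
        let segment = allRows |> List.skip offset |> List.truncate segmentSize
        let next = offset + segmentSize
        let nextToken = if next < List.length allRows then Some next else None
        return (segment, nextToken)
    }

// Asynchronously request segments one after another until the token is None.
let readAllSegmentsAsync (fetch : 'Token option -> Async<'T list * 'Token option>) : Async<'T list> =
    let rec loop token acc =
        async {
            let! (segment, nextToken) = fetch token
            let acc = segment :: acc
            match nextToken with
            | Some _ -> return! loop nextToken acc
            | None -> return acc |> List.rev |> List.concat
        }
    loop None []

let rows = readAllSegmentsAsync fetchSegmentAsync |> Async.RunSynchronously
printfn "%A" rows
```

Segments still have to be fetched sequentially, because each request needs the token from the previous one; the asynchrony only frees the calling thread while waiting.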
In this post, we’ve covered the nitty gritty details of querying with FSharp.Azure. Hopefully you find this series of posts and the library itself useful; if you have, please do leave a comment or tweet to me at @danielchmbrs.
May 08, 2014 1:23 PM by Daniel Chambers (last modified on May 24, 2014 9:53 AM)
In my previous post I gave a quick taster of how to modify data in Azure table storage using FSharp.Azure, but I didn’t go into detail. FSharp.Azure is the new F# library that I’ve recently released that lets you talk to Azure table storage using an idiomatic F# API surface. In this post, we’re going to go into deep detail about all the features FSharp.Azure provides for modifying data in table storage.
To use FSharp.Azure, install the NuGet package: FSharp.Azure. At the time of writing the package is marked as beta, so you will need to include pre-releases by using the checkbox on the UI, or using the -Pre flag on the console. (v1.0.0 has been released!)
Once you’ve installed the package, you need to open the TableStorage module to use the table storage functions:
open DigitallyCreated.FSharp.Azure.TableStorage
In order to provide an idiomatic F# experience when talking to Azure table storage, FSharp.Azure supports the use of record types. For example, this is a record you could store in table storage:
    type Game =
        { Name : string
          Developer : string
          HasMultiplayer : bool
          Notes : string }
Note that the record fields must be of types that Azure table storage supports; that is:
string
int
int64
bool
double
Guid
DateTimeOffset
byte[]
In addition to record types, you can also use classes that implement the standard Microsoft.WindowsAzure.Storage.Table.ITableEntity interface.
For the remainder of this post however, we will focus on using record types.
One of the design goals of the FSharp.Azure API is to ensure that your record types are persistence independent. This is unlike the standard ITableEntity interface, which forces you to implement the PartitionKey and RowKey properties. (And therefore if you're using that interface, you don't need to do any of the things in this section.)
However, FSharp.Azure still needs to be able to derive a Partition Key and Row Key from your record type in order to be able to insert it (etc) into table storage. There are three ways of setting this up:
You can use attributes to specify which of your record fields are the PartitionKey and RowKey fields. Here's an example:
    type Game =
        { [<RowKey>] Name : string
          [<PartitionKey>] Developer : string
          HasMultiplayer : bool
          Notes : string }
Sometimes you need to be able to have more control over the values of the Partition Key and Row Key. For example, if we add a Platform field to the Game record type, we will need to change the RowKey, or else we would be unable to store two Games with the same Name and Developer, but different Platforms.
To cope with this situation, you can implement an interface on the record type:
    type Game =
        { Name: string
          Developer : string
          Platform: string
          HasMultiplayer : bool
          Notes : string }
        interface IEntityIdentifiable with
            member g.GetIdentifier() =
                { PartitionKey = g.Developer; RowKey = sprintf "%s-%s" g.Name g.Platform }
In the above example, we've derived the Row Key from both the Name and Platform fields.
For those purists who don't want to dirty their types with interfaces and attributes, there is the option of replacing a statically stored function with a different implementation. For example:
    let getGameIdentifier g =
        { PartitionKey = g.Developer; RowKey = sprintf "%s-%s" g.Name g.Platform }

    EntityIdentiferReader.GetIdentifier <- getGameIdentifier
The type of GetIdentifier is:
'T -> EntityIdentifier
The first thing to do is define a helper function inGameTable that will allow us to persist records to table storage into an existing table called "Games".
    open Microsoft.WindowsAzure.Storage
    open Microsoft.WindowsAzure.Storage.Table

    let account = CloudStorageAccount.Parse "UseDevelopmentStorage=true;" //Or your connection string here
    let tableClient = account.CreateCloudTableClient()

    let inGameTable game = inTable tableClient "Games" game
This technique of taking a library function and fixing the tableClient and table name parameters is very common when using FSharp.Azure's API, and you can do it to other similar library functions.
FSharp.Azure supports all the different Azure table storage modification operations and describes them in the Operation discriminated union:
    type Operation<'T> =
        | Insert of entity : 'T
        | InsertOrMerge of entity : 'T
        | InsertOrReplace of entity : 'T
        | Replace of entity : 'T * etag : string
        | ForceReplace of entity : 'T
        | Merge of entity : 'T * etag : string
        | ForceMerge of entity : 'T
        | Delete of entity : 'T * etag : string
        | ForceDelete of entity : 'T
The Operation discriminated union is used to wrap your record instance and describes the modification operation, but doesn't actually perform it. You act upon the Operation by passing it to our inGameTable helper function (which calls the inTable library function). See below for examples for all the different types of operations.
In order to insert a row into table storage we wrap our record using Insert and pass it to our helper function, like so:
    let game =
        { Name = "Halo 4"
          Platform = "Xbox 360"
          Developer = "343 Industries"
          HasMultiplayer = true
          Notes = "Finished the game in Legendary difficulty." }

    let result = game |> Insert |> inGameTable
result is of type OperationResult:
    type OperationResult =
        { HttpStatusCode : int
          Etag : string }
The other variations of Insert (InsertOrMerge and InsertOrReplace) can be used in a similar fashion:
    let result = game |> InsertOrMerge |> inGameTable
    let result = game |> InsertOrReplace |> inGameTable
Replacing a record in table storage can be done similarly to inserting, with one caveat. Azure table storage provides optimistic concurrency protection using etags, so when replacing an existing record you also need to pass the etag that matches the row in table storage. For example:
    let game =
        { Name = "Halo 4"
          Platform = "Xbox 360"
          Developer = "343 Industries"
          HasMultiplayer = true
          Notes = "Finished the game in Legendary difficulty." }

    let originalResult = game |> Insert |> inGameTable

    let gameChanged =
        { game with
            Notes = "Finished the game in Legendary and Heroic difficulty." }

    let result = (gameChanged, originalResult.Etag) |> Replace |> inGameTable
If you want to bypass the optimistic concurrency protection and just replace the row anyway, you can use ForceReplace instead of Replace:
let result = gameChanged |> ForceReplace |> inGameTable
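To illustrate what the etag check protects against, here is a small in-memory sketch. It is a simplified model, not how table storage or FSharp.Azure work internally: real etags are opaque strings, whereas this hypothetical StoredRow uses an integer that increases on every write, and replaceRow and forceReplaceRow are invented names.

```fsharp
// Hypothetical stored row with a simplified integer etag.
type StoredRow = { Value : string; Etag : int }

type ReplaceResult =
    | Replaced of StoredRow
    | Conflict

// Replace succeeds only when the caller's etag matches the stored one.
let replaceRow (stored : StoredRow) (newValue : string) (etag : int) : ReplaceResult =
    if stored.Etag = etag then
        Replaced { Value = newValue; Etag = stored.Etag + 1 }
    else
        Conflict

// The forced variant ignores the etag and always overwrites.
let forceReplaceRow (stored : StoredRow) (newValue : string) : StoredRow =
    { Value = newValue; Etag = stored.Etag + 1 }

let original = { Value = "v1"; Etag = 1 }

// Someone else overwrites the row after we read it, which changes its etag.
let changedByOther = forceReplaceRow original "v2"

// Our etag is now stale, so a normal replace is rejected.
let staleAttempt = replaceRow changedByOther "v3" original.Etag

// With the current etag the replace goes through.
let freshAttempt = replaceRow changedByOther "v3" changedByOther.Etag
```

In this model the stale attempt yields Conflict instead of silently discarding the other writer's change, which is the behaviour the forced variants opt out of.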
Merging is handled similarly to replacing, in that it requires the use of an etag. Merging can be used when you want to modify a subset of properties on a row in table storage, or a different set of properties on the same row, without affecting the other existing properties on the row.
As a demonstration, we'll define a new GameSummary record that omits the Notes field, so we can update the row without touching the Notes property at all.
    type GameSummary =
        { Name : string
          Developer : string
          Platform : string
          HasMultiplayer : bool }
        interface IEntityIdentifiable with
            member g.GetIdentifier() =
                { PartitionKey = g.Developer; RowKey = sprintf "%s-%s" g.Name g.Platform }
Now we'll use Merge to update an inserted row:
    let game =
        { Name = "Halo 4"
          Platform = "Xbox 360"
          Developer = "343 Industries"
          HasMultiplayer = true
          Notes = "Finished the game in Legendary difficulty." }

    let originalResult = game |> Insert |> inGameTable

    let gameSummary =
        { GameSummary.Name = game.Name
          Platform = game.Platform
          Developer = game.Developer
          HasMultiplayer = false } //Change HasMultiplayer

    let result = (gameSummary, originalResult.Etag) |> Merge |> inGameTable
Like Replace, Merge has a ForceMerge variant that ignores the optimistic concurrency protection:
let result = gameSummary |> ForceMerge |> inGameTable
Deleting is handled similarly to Replace and Merge and requires an etag.
    let game =
        { Name = "Halo 4"
          Platform = "Xbox 360"
          Developer = "343 Industries"
          HasMultiplayer = true
          Notes = "Finished the game in Legendary difficulty." }

    let originalResult = game |> Insert |> inGameTable
    let result = (game, originalResult.Etag) |> Delete |> inGameTable
A ForceDelete variant exists for deleting even if the row has changed:
let result = game |> ForceDelete |> inGameTable
Often you want to be able to delete a row without actually loading it first. You can do this easily by using the EntityIdentifier record type, which just lets you specify the Partition Key and Row Key of the row you want to delete:
let result = { EntityIdentifier.PartitionKey = "343 Industries"; RowKey = "Halo 4-Xbox 360" } |> ForceDelete |> inGameTable
The inGameTable helper function we've been using uses the inTable library function, which means that operations are processed synchronously when inTable is called. Sometimes you want to be able to process operations asynchronously.
To do this we'll define a new helper function that will use inTableAsync instead:
let inGameTableAsync game = inTableAsync tableClient "Games" game
Then we can use that in a similar fashion:
    let game =
        { Name = "Halo 4"
          Platform = "Xbox 360"
          Developer = "343 Industries"
          HasMultiplayer = true
          Notes = "Finished the game in Legendary difficulty." }

    let result = game |> Insert |> inGameTableAsync |> Async.RunSynchronously
One obvious advantage of asynchrony is that we can very easily start performing operations in parallel. Here's an example where we insert two records in parallel:
    let games =
        [ { Name = "Halo 4"
            Platform = "Xbox 360"
            Developer = "343 Industries"
            HasMultiplayer = true
            Notes = "Finished the game in Legendary difficulty." }
          { Name = "Halo 5"
            Platform = "Xbox One"
            Developer = "343 Industries"
            HasMultiplayer = true
            Notes = "Haven't played yet." } ]

    let results =
        games
        |> Seq.map (Insert >> inGameTableAsync)
        |> Async.Parallel
        |> Async.RunSynchronously
Azure table storage provides the ability to take multiple operations and submit them to be processed all together in one go. There are many reasons why you might want to batch up operations, such as
However, there are some restrictions on what can go into a batch. They are:
FSharp.Azure provides functions to make batching easy. First we'll define a batching helper function:
let inGameTableAsBatch game = inTableAsBatch tableClient "Games" game
Now let's generate 150 Halo games and 50 Portal games, batch them up and insert them into table storage:
    let games =
        [seq { for i in 1 .. 50 ->
                { Developer = "Valve"; Name = sprintf "Portal %i" i; Platform = "PC"; HasMultiplayer = true; Notes = "" } };
         seq { for i in 1 .. 150 ->
                { Developer = "343 Industries"; Name = sprintf "Halo %i" i; Platform = "Xbox One"; HasMultiplayer = true; Notes = "" } }]
        |> Seq.concat
        |> Seq.toList

    let results =
        games
        |> Seq.map Insert
        |> autobatch
        |> List.map inGameTableAsBatch
The autobatch function splits the games by Partition Key and then into groups of 100. This means we will have created three batches, one with 50 Portals, one with 100 Halos, and another with the final 50 Halo games. Each batch is then sequentially submitted to table storage.
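The grouping behaviour described above can be sketched in plain F#. This is an illustration of the idea only, not the library's actual autobatch implementation; the Item type and the batchByPartition function are hypothetical names, and the limit of 100 is taken from the description above.

```fsharp
// Hypothetical item carrying just the two keys that matter for batching.
type Item = { PartitionKey : string; RowKey : string }

let maxBatchSize = 100

// Group items by partition key, then cut each group into chunks of at most 100.
let batchByPartition (items : Item list) : Item list list =
    items
    |> List.groupBy (fun item -> item.PartitionKey)
    |> List.collect (fun (_, group) -> List.chunkBySize maxBatchSize group)

let portals =
    [ for i in 1 .. 50 -> { PartitionKey = "Valve"; RowKey = sprintf "Portal %i" i } ]

let halos =
    [ for i in 1 .. 150 -> { PartitionKey = "343 Industries"; RowKey = sprintf "Halo %i" i } ]

let batches = batchByPartition (portals @ halos)
let batchSizes = batches |> List.map List.length

printfn "%A" batchSizes
```

Every resulting batch holds items from a single partition and never more than 100 of them, matching the three batches described above.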
If we wanted to do this asynchronously and in parallel, we could first define another helper function:
let inGameTableAsBatchAsync game = inTableAsBatchAsync tableClient "Games" game
Then use it:
let results = games |> Seq.map Insert |> autobatch |> List.map inGameTableAsBatchAsync |> Async.Parallel |> Async.RunSynchronously
In this post, we’ve gone into gory detail about how to modify data in Azure table storage using FSharp.Azure. In a future post, I’ll do a similar deep dive into the opposite side: how to query data from table storage.
May 06, 2014 12:03 PM by Daniel Chambers (last modified on May 18, 2014 2:41 PM)
Over the last few months I’ve been learning F#, .NET’s functional programming language. One of the first things I started fiddling with was using F# to read and write to Azure table storage. Being a .NET language, F# can of course use the regular Microsoft WindowsAzure.Storage API to work with table storage, however that forces you to write F# in a very non-functional way. For example, the standard storage API forces you to use mutable classes as your table storage entities, and mutability is a functional programming no no.
There’s an existing open-source library called Fog, which provides an F# API to talk to the Azure storage services, but its table storage support is a thin wrapper over an old version of the WindowsAzure.Storage API. Unfortunately this means you still have to deal with mutable objects (boo hiss!). Also, Fog doesn’t support querying table storage.
So with my fledgling F# skills, I decided to see if I could do better; FSharp.Azure was born. In this first version, it only supports table storage, but I hope to in the future expand it to cover the other Azure storage services too.
FSharp.Azure, like Fog, is a wrapper over WindowsAzure.Storage, however it exposes a much more idiomatic F# API surface. This means you can talk to table storage by composing together F# functions, querying using F# quotations, and do it all using immutable F# record types.
But enough blathering, let’s have a quick taste test!
Imagine we had a record type that we wanted to save into table storage:
    open DigitallyCreated.FSharp.Azure.TableStorage

    type Game =
        { [<PartitionKey>] Developer: string
          [<RowKey>] Name: string
          HasMultiplayer: bool }
Note the attributes on the record fields that mark which fields are used as the PartitionKey and RowKey properties for table storage.
We’ll first define a helper function inGameTable that will allow us to persist these Game records to table storage into an existing table called "Games":
    open Microsoft.WindowsAzure.Storage
    open Microsoft.WindowsAzure.Storage.Table

    let account = CloudStorageAccount.Parse "UseDevelopmentStorage=true;" //Or your connection string here
    let tableClient = account.CreateCloudTableClient()

    let inGameTable game = inTable tableClient "Games" game
Now that the set up ceremony is done, let's insert a new Game into table storage:
    let game = { Developer = "343 Industries"; Name = "Halo 4"; HasMultiplayer = true }
    let result = game |> Insert |> inGameTable
Let's say we want to modify this game and update it in table storage:
    let modifiedGame = { game with HasMultiplayer = false }
    let result2 = (modifiedGame, result.Etag) |> Replace |> inGameTable
Want more detail about modifying data in table storage? Check out this post.
First we need to set up a little helper function for querying from the "Games" table:
let fromGameTable q = fromTable tableClient "Games" q
Here's how we'd query for an individual record by PartitionKey and RowKey:
    let halo4, metadata =
        Query.all<Game>
        |> Query.where <@ fun g s -> s.PartitionKey = "343 Industries" && s.RowKey = "Halo 4" @>
        |> fromGameTable
        |> Seq.head
If we wanted to find all multiplayer games made by Valve:
    let multiplayerValveGames =
        Query.all<Game>
        |> Query.where <@ fun g s -> s.PartitionKey = "Valve" && g.HasMultiplayer @>
        |> fromGameTable
For more detail about querying, check out this post.
To get FSharp.Azure, use NuGet to install “FSharp.Azure”. At the time of writing, it’s still in beta so you’ll need to include pre-releases (tick the box on the GUI, or use the -Pre flag from the console). (v1.0.0 has been released!)
In future blog posts, I’ll go into more detail about modification operations, further querying features, asynchronous support and all the other bits FSharp.Azure supports. In the meantime, please visit GitHub for more information and to see the source code.