Explaining Distributed Data Consistency to IT novices? Well, …

Greek Shepherd

it’s all Greek to me. Bruce Stidston cited a post on Google+ where Yonatan Zunger, Chief Architect of Google+, tried to explain Data Consistency by way of Greeks enacting laws onto statute books on disparate islands. Very long post here. It highlights the challenges of maintaining data consistency when pieces of your data are distributed over many locations, and the logistics of trying to keep them all in sync – in a way that should be understandable to the lay – albeit patient – reader.

The treatise missed out the concept of two-phase commit, which is a way of doing handshakes between two (identical) copies of a database to ensure a transaction gets played successfully on both the master and the replica sited elsewhere on a network. So, if you get some sort of failure mid-transaction, both sides get returned to a consistent state without anything falling through the cracks. Important if that data represents monetary balance transfers between bank accounts, for example.
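
For the technically curious, here is a minimal Python sketch of that prepare/commit handshake – a conceptual illustration only, with toy Participant objects standing in for the master and replica rather than any real database engine:

```python
# Minimal two-phase commit sketch: a coordinator asks every participant to
# "prepare" (vote), and only if all vote yes does it tell them to commit;
# otherwise everyone rolls back and both copies stay consistent.

class Participant:
    def __init__(self, name):
        self.name = name
        self.staged = None
        self.committed = []

    def prepare(self, txn):
        """Stage the transaction and vote on whether it can be committed."""
        self.staged = txn
        return True  # a real database would check constraints, disk space, etc.

    def commit(self):
        self.committed.append(self.staged)
        self.staged = None

    def rollback(self):
        self.staged = None


def two_phase_commit(participants, txn):
    # Phase 1: everyone must vote "yes" before anything becomes permanent.
    votes = [p.prepare(txn) for p in participants]
    if all(votes):
        # Phase 2a: all voted yes, so apply the change on every copy.
        for p in participants:
            p.commit()
        return "committed"
    # Phase 2b: any failure mid-transaction means every copy rolls back.
    for p in participants:
        p.rollback()
    return "rolled back"


if __name__ == "__main__":
    master, replica = Participant("master"), Participant("replica")
    print(two_phase_commit([master, replica],
                           {"from": "A", "to": "B", "amount": 100}))
```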

The thing that impressed me most – and which I’d largely taken for granted – is how MongoDB (the most popular Open Source NoSQL database in the world) can handle virtually all the use cases cited in the article out of the box, with no add-ons. You can specify “happy go lucky”, majority or all replicas consistent before a write is confirmed as complete. And if the definitive “Tyrant” fails, there’s an automatic vote among the surviving instances for which secondary copy becomes the new primary (and on rejoining, the changes are journaled back to consistency). And those instances can be distributed in different locations on the internet.
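
For illustration, here is roughly what those tunable write concerns look like from the pymongo driver, assuming a hypothetical three-member replica set (the hostnames, database and collection names are made up):

```python
# Sketch of MongoDB's tunable write concerns using pymongo, assuming a
# three-member replica set is reachable at the (hypothetical) URI below.
from pymongo import MongoClient, WriteConcern

client = MongoClient("mongodb://host1,host2,host3/?replicaSet=rs0")
db = client.ledger

# "Happy go lucky": acknowledge as soon as the primary has the write.
fast = db.get_collection("transfers", write_concern=WriteConcern(w=1))

# Majority: don't confirm until most replica set members have it.
safe = db.get_collection("transfers", write_concern=WriteConcern(w="majority"))

# All three members must confirm before the write completes.
paranoid = db.get_collection("transfers", write_concern=WriteConcern(w=3))

safe.insert_one({"from": "A", "to": "B", "amount": 100})
```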

Bruce contended that Google may not like its blocking mechanics (which slow down access while data is written) to retain consistency on its own search database. However, I think Google’s workload will be very read heavy, and it won’t usually be a disaster if changes only reach its readers’ search results a little later. No money falls through the cracks in their case; any changes just appear the next time you run the same search – one very big moving target.

Ensuring money doesn’t fall through the cracks is what Blockchains design out (a majority votes to accept a transaction, and further attempts to change it are declined once that’s achieved). That’s why it can take up to 10 minutes for a Bitcoin transaction to get verified. I wrote introductory pieces about Bitcoin and potential Blockchain applications some time back if those are of interest.
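
To make the “declined after acceptance” idea concrete, here is a toy hash chain in Python – a deliberately simplified illustration, not Bitcoin’s actual implementation – showing why rewriting an already-accepted transaction is immediately detectable:

```python
# Toy hash chain: each block carries the hash of its predecessor, so once a
# block has been accepted, any later attempt to alter an earlier transaction
# changes every subsequent hash and is trivially rejected.
import hashlib
import json

def block_hash(block):
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def add_block(chain, transactions):
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev": prev, "transactions": transactions})
    return chain

def verify(chain):
    return all(chain[i]["prev"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

chain = []
add_block(chain, [{"from": "A", "to": "B", "amount": 5}])
add_block(chain, [{"from": "B", "to": "C", "amount": 2}])
print(verify(chain))                          # True
chain[0]["transactions"][0]["amount"] = 500   # try to rewrite history
print(verify(chain))                          # False
```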

So, I’m sure there must be a pithier summary someone could draw, but it would add blockchains to the discussion, and probably relate some of the artistry behind hashes and Git/GitHub to managing large, multi-user, multi-location code, data and writing projects. However, that’s for the IT guys. They should know this stuff, and know what to apply in any given business context.
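
As a small taster of that artistry: Git identifies every piece of content by a hash of the content itself plus a short header, which is what lets distributed copies agree on exactly what they hold. The snippet below reproduces, in Python, the ID that `git hash-object` prints for the same bytes:

```python
# Git's object IDs are content-addressed: a SHA-1 over "blob <length>\0<content>".
import hashlib

def git_blob_id(content: bytes) -> str:
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()

print(git_blob_id(b"hello world\n"))
# -> 3b18e512dba79e4c8300dd08aeb37f8e728b8dad, the same ID `git hash-object` prints
```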

Footnote: I’ve referenced MongoDB as that is the one NoSQL database I have accreditations in, having completed two excellent online courses with them (while I’m typically a senior manager, I like to dip into new technologies to understand their capabilities – and to act as a bullshit repellent!). Details of said courses here. The same functionality may well be available with other NoSQL databases.

Starting with the end in mind: IT Management Heat vs Light

A very good place to start

One source of constant bemusement to me is the habit intelligent people have of peeing in the industry market research bathwater, and then paying handsomely to drink a hybrid mix of the result collected across their peers.

This is perhaps coloured by an early experience of one research company coming in to present to the management of the vendor I was working at, and finding in the rehearsal their conjecture that sales of specific machine sizes had dipped badly in the preceding quarter. Except they hadn’t; we’d had the biggest growth in sales of the highlighted machines in our history in that timeframe. When I mentioned my concern, the appropriate slides were corrected in short order, and no doubt the receiving audience was impressed by the skill of an analysis that built a forecast on top of an amazingly accurate, perceptive (and otherwise publicly unreported) recent history.

I’ve been doubly nervous ever since – always relating back to the old “Deep Throat” hint given in “All the President’s Men”: in every case, “follow the money”.

Earlier today, I was having some banter on one of the boards of “The Motley Fool” which referenced the ways certain institutions impose measures on staff that sit well away from any useful business purpose – one that positively supports better results for their customers. Well, except for providing sound bites to politicians. I can sense that in Education, in some elements of Health provision, and rather fundamentally in the Police service. I even did a drains-up some time ago that reflected on the way UK Police are measured, and tried to trace the rationale back to source – which was a senior politician imploring them to reduce crime; blog post here. The subtlety of this was rather lost; the only control placed in their hands was that of compiling the associated statistics, and of aligning their behaviours on the ground to support that data collection, rather than going back to core principles of why they were there and what their customers wanted of them.

Jeff Bezos (CEO of Amazon) has the right idea: everything they do aligns with the ultimate end customer, and everything else works back from there. Competition is something to be conscious of, but only to the extent of understanding how you can serve your own customers better. It’s also the central model that W. Edwards Deming used to help transform Japanese industry: being disciplined about methodically improving “the system” without unnecessary distractions – distractions which are extremely apparent to anyone who’s been subjected to his “Red Beads” experiment. But the central task is always to start with the end in mind.

With that, I saw a post by Simon Wardley today where Gartner released the results of a survey on the “Top 10 Challenges for I&O Leaders”, which I guess is some analogue of what used to be referred to as CIOs. Most of it felt to me like herd mentality – and divorced from the sort of issues I’d have expected to be present. In fact, a complete re-enactment of the sort of dialogue Simon had mentioned before.

Simon then cited the first five things he thought they should be focussed on (around Corrective Action), leaving the remaining “Positive Action” points to be mapped based on what appeared upon that foundation. This on the assumption that those actions would likely be unique to each organisation performing the initial framing exercise.

Simon’s excellent blog post is My list vs Gartner, shortly followed by On Capabilities. I think it’s a great read. My only regret is that, while I understand his model (I think!), I’ve not yet had to work on the final piece: getting from his finished strategic map (for any business I’m active in) to a pithy, prioritised list of actions based on the diagram created. And I wish he’d get the bandwidth to turn his Wardley Maps into a book.

Until then, I recommend his Bits & Pieces blog; it’s a quality read that deserves good prominence on every IT Manager’s (and IT vendor’s!) RSS feed.

CloudKit – now that’s how to do a secure Database for users

Data Breach Hand Brick Wall Computer

One of the big controversies here relates to the appetite of the current UK government to release personal data with only the most basic understanding of what constitutes personally identifiable information. The lessons are there in history but, without knowing the context of the infamous AOL data leak, I fear we are destined to repeat it. With that data goes personal information we typically hold close to our chests, and whose release may cause personal, social or (in the final analysis) financial prejudice.

When plans were first announced to release NHS records to third parties, and in the absence of what I thought were appropriate controls, I sought (with a heavy heart) to opt out of sharing my medical history with any third party – and instructed my GP accordingly. I’d gladly share everything with satisfactory controls in place (medical research is really important and should be encouraged), but I felt that insufficient care was being exercised. That said, we’re more than happy for my wife’s Genome to be stored in the USA by 23andMe – a company that demonstrably satisfied our privacy concerns.

It therefore came as quite a shock to find that a report highlighting which third parties had already been granted access to health data, with Government-mandated approval, ran to a total of 459 data releases to 160 organisations (last time I looked, that was 47 pages of PDF). See this and the associated PDFs on that page. Given the level of controls in place, I felt this was outrageous. Likewise the plans to release HMRC-related personal financial data, again with soothing words from ministers who, given the NHS data precedent, appear to have no empathy for the gross injustices likely to result from their actions.

The simple fact is that what constitutes individually identifiable information needs to be framed not only by which data fields are shared with a third party, but also by how the receiving party will apply that data. Not least if there is any suggestion that the data is to be combined with other data sources, which could in turn triangulate back to make seemingly “anonymous” records traceable to a specific individual – which is precisely what happened in the AOL data leak example cited.

With that, and on a somewhat unrelated technical/programmer-orientated journey, I set out to learn how Apple had architected its new CloudKit API, announced this last week. This articulates the way in which applications running on your iPhone, iPad or Mac have a trusted way of accessing personal data stored (and synchronised between all of a user’s Apple devices) “in the Cloud”.

The central identifier that Apple associate with you, as a customer, is your Apple ID – typically an email address. In the Cloud, they give you access to two databases on their cloud infrastructure: one public, the other private. However, the second you try to create or access a table in either, the API takes your iCloud identity and hands back a hash unique to the combination of your identity and the application asking to process that data. Different application, different hash. And since everyone’s data is in there, that design immediately prevents any triangulation of disparate data that could trace back to uniquely identify a single user.
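
Conceptually – and this is my own illustration, not Apple’s actual scheme – the effect is similar to deriving the record owner’s identifier from both the user identity and the requesting app, so two apps can never join their datasets on a common key:

```python
# Conceptual illustration only (not Apple's real implementation): deriving a
# record owner ID from the user's identity *and* the requesting app means two
# apps see different, unlinkable IDs for the same person.
import hashlib

def scoped_user_id(icloud_id: str, app_bundle_id: str) -> str:
    return hashlib.sha256(f"{icloud_id}:{app_bundle_id}".encode()).hexdigest()

print(scoped_user_id("jane@example.com", "com.example.healthapp"))
print(scoped_user_id("jane@example.com", "com.example.notesapp"))
# Same user, different apps -> different opaque IDs, so datasets can't be joined.
```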

Apple take this one stage further: any application that asks for any personally identifiable data (like an email address, age, postcode, etc.) from any table has to have access to that information specifically approved by the handset’s owner. No explicit permission (on a per-application basis), no data.

The data maintained by Apple – personal information, health data (with HealthKit), details of the home automation kit in your house (with HomeKit), and not least the credit card details stored to buy Music, Books and Apps – makes full use of this security model. And they’ve dogfooded it so that third-party application providers use exactly the same model and the same back-end infrastructure, which is also very, very inexpensive (data volumes go into petabytes before you spend much money).

There are still some nuances I need to work out. I’m used to SQL databases and to some NoSQL database structures (I’m MongoDB certified), but it’s not clear, based on looking at the way the database works, which engine is being used behind the scenes. It appears to be a key:value store with some garbage collection mechanics that look like a hybrid file system. It also has the capability to store “subscriptions”, so if specific criteria appear in the data store, specific messages can be dispatched to the user’s devices over the network automatically. Hence things like new diary appointments in a calendar can be synced across a user’s iPhone, iPad and Mac transparently, without the need for each device to waste battery power polling the large database on the server, waiting for events that are likely to arrive infrequently.

The final piece of the puzzle I’ve not worked out yet is, if you have a large database already (say of the calories, carbs, protein, fat and weights of thousands of foods in a nutrition database), how you’d get that loaded into an instance of the public database in Apple’s Cloud. Other than writing custom loading code, of course!

That apart, I’m really impressed by how Apple have designed the datastore to ensure the security of users’ personal data, and to rule out triangulation between information stored by different applications. If any personally identifiable data is requested by an application, the user of the handset has to specifically authorise its disclosure for that application only. And the app cannot sense whether the data is actually present at all ahead of that release permission (so, for example, if a health app wants access to your blood sampling data, it doesn’t know whether that data even exists before the permission is given – so it can’t infer that you probably have diabetes, which would be possible if it could deduce that you were recording glucose readings at all).

In summary, an impressive design and a model that deserves our total respect. The more difficult job will be to get the same mindset into the folks looking to release the most personal data that we shared privately with our public sector servants. They owe us nothing less.

For Enterprise Sales, nothing sells itself…

Trusted Advisor

I saw a great blog post published on the Andreessen Horowitz (A16Z) web site asking why Software as a Service offerings don’t sell themselves, here. A lot of it stems from a misunderstanding of what a good salesperson does (and I’ve been blessed to work alongside many good ones throughout my career).

The most successful ones I’ve worked with tend to work their way into an organisation and to suss out the challenges that the key executives are driving as key business priorities. To understand how all the levers get pulled from top to bottom of the org chart, and to put themselves in the position of “trusted advisor”. To be able to communicate ideas that align with the strategic intent, to suggest approaches that may assist, and to have references ready that demonstrate how the company the salesperson represents has solved similar challenges for other organisations. And at all times, to know who the customer references and respects across their own industry.

Above all, to have a thorough and detailed execution plan (or set of checklists) that they follow to understand the people, their processes and their aspirations – with enough situational awareness to know who or what could positively, or negatively, affect the customer’s propensity to spend money. Not least to avoid the biggest competitor of all: the impression that “no decision”, or a project stall, will leave them in a more comfortable position than enacting a needed change.

When someone reaches board level, their reference points tend to be folks in the same position at other companies. Knowing the people networks, both inside and outside the company, is key.

The folks I regard as the best salespeople I’ve ever worked with tend to be straightforward, honest, well organised, articulate, planned, respectful of competitors and adept at working an org chart. They also know when to bring in the technical people and senior management to help their engagements along.

The antithesis is the “wham bam thank you ma’am” type: competitors killed at all costs, and incessant quoting of speeds and feeds. For those, I’d recommend reading a copy of “The Trusted Advisor” by Maister, Green and Galford.

Trust is a prize asset, and the book describes well how it is obtained and maintained in an Enterprise selling environment. Also useful to folks like me who tend to work behind the scenes to ensure salespeople succeed; it gives some excellent insight into the sort of material that your sales teams can carry into their customers and which is valued by the folks they engage with.

Being trusted and a source of unique, valuable insights is a very strong position for your salespeople to find themselves in. You owe it to them to be a great source of insights and ideas, either from your own work or curated from other sources – and to keep customers informed and happy at all costs. Simplicity sells.

Sometimes a picture is “How on earth did you do that?”

IBM 3270 terminal running ALL-IN-1

People often remember a startling or surprising first impression. Riverdance when they first appeared in the voting interval at Eurovision 1994. 16-year-old Everton substitute Wayne Rooney being put on the pitch against a long-unbeaten Arsenal side, and scoring the winner. A young David Beckham doing likewise against Wimbledon from the halfway line. Or Doug Flutie, quarterback for Boston College, throwing the winning touchdown against Miami from an incredible distance with no time left on the clock. There is even a road near Boston called “Flutie Pass”, named in memory of that sensational Hail Mary throw.

There are always lots of pressures on IT Managers and their staff, with tightening budgets, constrained resources and a precious shortage of time. We used to have a task to try and minimise the friction these folks had in buying Enterprise IT products and services from us or our reseller channels. A salesperson or vendor was normally the last person they wanted to have a dependency on for basic, routine “stuff”, especially for items they should be able to work out for themselves – at least if given the right information in lucid, concise form, free of surprises and immediately available at their fingertips.

The picture was one of the ones we put in the DECdirect Software Catalogue. It shows an IBM 3278 terminal, hooked up to an IBM mainframe, with Digital’s VAX-based ALL-IN-1 Office Automation Suite running on it. At the time, this was a startling revelation; the usual method for joining an IBM system to a DEC one was to make the DEC machine look like a remotely connected IBM 2780 card reader. The two double-page spreads following that picture showed how to piece this, and other forms of connection to IBM mainframes, together.

The DECdirect Software Catalogue had an aim of being able to spit out all the configuration rules, needed part numbers and matching purchase prices in a minimal, simple and concise read. Our target for our channel salesforce(s) was to enable them to extract a correct part number and price for any of our 550 products – each across between 20 and 48 different pricing tiers – within their normal attention span, which we assumed was 30 seconds. Given appropriate focus, predictability, consistency and the removal of potential surprises can be designed in.

In the event, that business (in which I was the first employee, working alongside 8 shared telesellers and 2 tech support staff) went from $0 to $100m in 18 months, with over 90% of the order volume coming in directly from customers, correctly priced at source. That got me a two-level promotion and, as a result, responsibility for running the UK Software Products Business, with 16 staff and the country software P&L.

One of my colleagues in DEC Finland did a similar document for hardware options, entitled “Golden Eggs”. Everything in one place, with all the connections on the back of each system nicely documented, and any constraints right in front of you. A work of great beauty, and still maintained to this day for a wide range of other systems and options. The nearest I’ve seen more recently are the sample architecture diagrams published by Amazon Web Services – though the basics for IT Managers seeing AWS (or other public cloud vendors’ offerings) for the first time are not yet apparent to me.

Things in the Enterprise IT world are still unnecessarily complicated, and the ability to stand in the end user’s shoes for a limited time bears real fruit. I’ve repeated that in several places before and since, with pretty spectacular results; it typically takes only a handful of things done well to liberate end users, and to make resellers and other supply channels insanely productive. All focus is then directed at keeping customers happy and their objectives delivered on time and, more often than not, under budget.

One of my friends (who works at senior level in Central Government) lamented to me today that “The (traditional vendor) big players are all trying to convince the world of their cloudy goodness, unfortunately using their existing big contract corporate teams who could not sell life to a dying man”.

I’m sure some of the Public Cloud vendors would be more than capable of arming people like him appropriately. I’d love to help a market-leading one do it.

Footnote: I did a previous post on what Vendors, Distributors and Resellers want here.

Officially Certified: AWS Business Professional

AWS Business Professional Certification

That’s added another badge, albeit the primary reason was to understand AWS’s products and services in order to suss out how to build volumes for them via resellers – just in case I get the opportunity to be asked how I’d do it. However, looking over the fence at some of the technical accreditation exams, I appear to know around half of the answers already – but I need to work through those properly and take notes before attempting them.

(One of my old party tricks used to be that I could make it past the entrance exam required for entry into technical streams at Linux-related conferences – a rare thing for a senior manager running large Software Business Operations or Product Marketing teams. Being an ex-programmer who occasionally fiddles under the bonnet of modern development tools is a useful thing – not least in being able to spot bullshit from quite a distance.)

The only AWS module I had any difficulty with was the pricing. One of the things most managers value is simplicity and predictability, but a lot of the pricing of core services has dependencies where you need to know data sizes, I/O rates or the way your demand moves through peaks and troughs in order to arrive at an approximate monthly price. While most of the case studies amply demonstrate that you do make significant savings compared to running workloads on your own in-house infrastructure, I think typical values for common use cases would be useful. For example, if I’m running a SAP installation of specific data and access dimensions, what are the typical operational running costs – without needing to insert probes all over a running example to estimate them using the provided calculator?
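
To show the sort of inputs that calculator needs before it can give you a number, here is a back-of-envelope monthly estimate in Python; the rates below are purely illustrative placeholders of my own, not actual AWS prices:

```python
# Back-of-envelope monthly estimate of the kind the pricing calculator walks
# you through. The rates are illustrative placeholders, NOT real AWS prices;
# the point is the shape of the inputs you need to know up front.
HOURS_PER_MONTH = 730

instance_rate_per_hour = 0.50   # hypothetical on-demand rate for the chosen instance
storage_gb = 500                # hypothetical data size
storage_rate_per_gb = 0.10      # hypothetical
data_out_gb = 200               # hypothetical monthly transfer out
transfer_rate_per_gb = 0.09     # hypothetical

monthly = (instance_rate_per_hour * HOURS_PER_MONTH
           + storage_gb * storage_rate_per_gb
           + data_out_gb * transfer_rate_per_gb)
print(f"Estimated monthly cost: ${monthly:,.2f}")  # $433.00 with these placeholder rates
```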

I’d come back from a 7am gym session fairly tired and made the mistake of stepping through the pricing slides without making copious notes. I duly did all that module again and did things properly the next time around – and passed it to complete my certification.

The Lego bricks you snap together to design an application infrastructure are simple in principle and loosely coupled, and what Amazon have built is very impressive. The only thing not provided out of the box is the sort of simple developer bundle of an EC2 instance, some S3 and an EBS-backed MySQL database, plus some open source AMIs preconfigured to run WordPress, Joomla, Node.js, LAMP or similar – with a simple weekly automatic backup. That’s what Digital Ocean provide for a virtual machine instance, with specific storage and high Internet Transfer Out limits, for a fixed price per month. In the case of the WordPress network on which my customers and this blog run, that’s a 2-CPU server instance, 40GB of disk space and 4TB/month of data traffic for $20/month all in. That sort of simplicity is why many startup developers have done an exit stage left from Rackspace and their ilk and moved to Digital Ocean in their thousands; it’s predictable and good enough as an experimental sandpit.

The ceiling at AWS is much higher when the application slips into production – which is probably reason enough to put the development work there in the first place.

I have deployed an Amazon Workspace to complete my 12 years of nutrition data analytics work using the Windows-only Tableau Desktop Professional – in an environment where I have no Windows PCs available to me. I’ve just used it on my MacBook Air and on my iPad Mini to good effect. That will cost me just north of £21 ($35) for the month.

I think there’s a lot that can be done to accelerate adoption rates of AWS services in Enterprise IT shops, both in terms of direct engagement and with channels to market properly engaged. My real challenge is getting air time with anyone to show them how – and in the interim, getting some examples ready in case I can make it in to do so.

That said, I recommend the AWS training to anyone. There is some training made available on the other side of applying to be a member of the Amazon Partner Network, but there are equally some great technical courses that anyone can take online. See http://aws.amazon.com/training/ for further details.

SaaS Valuations, and the death of Rubber Price Books and Golf Courses

Software Services Road Signs

Questions appear to be being asked in VC circles about sky-high Software-as-a-Service company valuations – including one suggestion I’ve seen that valuation should be based on customer acquisition cost (something I think is insane – acquisition costs are far higher than I’d ever feel comfortable with at the moment). One lead article is this one from Andreessen Horowitz (A16Z), which followed similar content to that presented on their podcast last week.

There are a couple of observations here. One is that they have the ‘normal’ Enterprise software business model misrepresented. If a new license costs $1,000, then each subsequent year’s maintenance is typically in the range of 20-23% of license cost; the average life of a licensed product is reckoned to be around 5 years. My own analogue for a business ticking along nicely was having license revenue from new licenses and support revenue from maintenance (i.e. 20% of license cost, for 5 years) roughly balanced. Traditionally, all profit is on the support revenue; most large-scale enterprise software vendors, in my experience, assume that the license cost (less the first year’s maintenance revenue) represents cost of sales. That’s why CA, IBM and Oracle salespeople drive around in nice cars.
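
Working those figures through (using the $1,000 license, 20% maintenance and 5-year product life quoted above):

```python
# A $1,000 license with maintenance at 20% of license cost over a five-year
# product life brings in as much support revenue as the original license fee -
# which is where the profit traditionally sits.
license_fee = 1000
maintenance_rate = 0.20
product_life_years = 5

support_revenue = license_fee * maintenance_rate * product_life_years
print(support_revenue)                 # 1000.0 - balanced against the license fee
print(support_revenue / license_fee)   # 1.0
```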

You will also find vendors routinely increasing maintenance costs by the retail price index as well.

The other characteristic, for SaaS companies with a “money in this financial year” mindset, is how important it is to garner as many sales as humanly possible at the start of a year; a sale made in month 1 will give you 12 months of income in the current financial year, whereas the same sale in month 12 will put only 1 month’s revenue into the current fiscal year. That said, you can normally see the benefits scheduled to arrive by looking at the deferred revenue on the balance sheet.
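
A quick illustration of that timing effect, assuming a hypothetical $12,000 annual subscription recognised evenly month by month:

```python
# A 12-month subscription recognised monthly: sold in month 1 it all lands in
# this fiscal year; sold in month 12 only one month's worth does, with the
# rest sitting in deferred revenue for next year.
annual_value = 12000
monthly_value = annual_value / 12

def recognised_this_year(sale_month: int) -> float:
    months_remaining = 12 - sale_month + 1
    return monthly_value * months_remaining

for month in (1, 12):
    in_year = recognised_this_year(month)
    print(f"Sold in month {month}: {in_year:.0f} recognised, "
          f"{annual_value - in_year:.0f} deferred")
# Sold in month 1: 12000 recognised, 0 deferred
# Sold in month 12: 1000 recognised, 11000 deferred
```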

Done correctly, the cost of sales of a SaaS vendor should be markedly less than that of a licensed software vendor – largely due to the ability to run free trials (at virtually zero marginal cost) and to allow customers to design a SaaS product in as part of a feasibility study, then provision immediately if it suits the business need. The same is true of open source software; you don’t pay until you need support turned on for a production application.

There is also a single-minded focus on minimising churn. I know that when I was running the Individual Customer Unit at Demon (responsible for all Consumer and SME connectivity sales), I donated £68,000 of the marketing budget one month to pay for software that measured the performance of the connectivity customers experienced – from their end. Statistics on all connectivity issues were fed back the next time a successful connection was made and, aggregated over several tens of thousands of customers, we could isolate and remove root causes – and hence improve the customer experience. There really is no point wasting marketing spend on a service that doesn’t do a great job for its existing users, long before you even consider chasing recruitment of new ones.

The cost per customer acquired was £30, or £20 net of churn, for customers who were spending £120/year on our service.
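
On those numbers alone, the acquisition cost is recovered within a few months of revenue (a proper payback calculation would use gross margin, which I haven’t quoted here):

```python
# Using only the figures quoted above: £30 to acquire a customer spending
# £120/year means acquisition cost equals roughly three months of revenue.
cac = 30.0
cac_net_of_churn = 20.0
annual_revenue_per_customer = 120.0
monthly_revenue = annual_revenue_per_customer / 12

print(cac / monthly_revenue)               # 3.0 months of revenue
print(cac_net_of_churn / monthly_revenue)  # 2.0 months, net of churn
```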

The more interesting development is if someone can finally break the assumption that to sell Enterprise software, you need to spend so much on customer acquisition. That’s a rubber price book and golf course game, and I think the future shift to Public Cloud Services – when costs will go over a cliff and way down from today’s levels – will be its death. In its place, a much greater focus on customer satisfaction at all times, which is really what it should have been all along.

Having been doing my AWS accreditations today, I have plenty of ideas for simplifying things to fire up adoption in Enterprise clients. Big potential there.

Customer, Customer, Customer…

Jeff Bezos Quote

I’ve been internalising some of the Leadership principles that Amazon expect to see in every employee, as documented here. All of these explain a lot about Amazon’s worldview, but even the very first one is quite unique in the IT industry. It probably serves as a lesson that most other IT vendors should be more fixated on than I usually experience.

In the days when I looked after two Enterprise Systems vendors, it was a never-ending source of amusement that no marketing plan would be considered complete without at least one quarterly “competitive attack” campaign. Typically, HP, IBM and Sun (as was; Oracle these days) would each expect to fund at least one campaign aimed squarely at the customer bases of the other vendors (and the reseller channels that served them), typically pushing superior speeds and feeds – usually selling their own proprietary, margin-rich servers and storage into their own base, while tossing low-margin x86 servers running Linux at attempts to unseat the proprietary products of the other vendors. I don’t recall a single one working, nor one customer that switched as a result of these efforts.

One thing that DEC used to do well was, when a member of staff from a competitor moved to a job inside the company, to make it a capital offence for anyone to try to seek inside knowledge from that person. The corporate edict was to rely on publicly available data only, and to sell on your own strengths. The final piece was to ensure you satisfied your existing customers before ever trying to chase new ones.

Microsoft running “Scroogled” campaigns (while Steve Ballmer was in charge) was a symptom of losing that focus. I met Bill Gates in 1983, and he was a walking encyclopedia of what worked well – and not so well – in the competitive PC software products of the day. He could keep going for 20 minutes describing the decision-making process behind going for a two-button mouse for Windows, and the various traps other vendors had hit with one- or three-button equivalents. At the time, this followed through into Microsoft’s internal sales briefing material – selling on their own strengths, and acknowledging competitors with a very balanced commentary. In time, things loosened up and tripping up competitors became part of their playbook, something I felt a degree of sadness to see develop.

Amazon are much more specific. Start with the customer and work back from there.

Unfortunately, I still see server vendor announcements piling into technologies like “OpenStack” and “Software Defined Networking” where the word “differentiation” features heavily in the announcement text. This suggests to me that the focus is on competitive vendor positioning, trying to justify the margins required to sustain their current business model, and detached from a specific focus on how a customer’s needs (and their business environment) are likely to evolve in the future.

With that, I suspect organisations with a laser-like focus on the end customer, and who realise which parts of the stack are commoditising (and follow that to its ultimate conclusion), are much more likely to be present when the cost to serve steps off the clifftop and heads down. The real battle is over the higher-order entities running on the commodity base.

I saw an announcement from Arrow ECS in Computer Reseller News this week that suggested a downturn in their Datacentre Server and Storage Product orders in the previous quarter. I wonder if this is the first sign of the inevitable downward pricing trend across the IT industry switching into gear, especially for its current brand-name systems and storage vendors.

IT hardware vendors clinging onto “Private” and “Hybrid” cloud strategies are, I suspect, making a final attempt to hold onto their existing business models and margins while the world migrates to commodity, public equivalents (see my previous post, “Enterprise IT and the Hall of Marbles”).

I also suspect that, given their relentless focus on end customer needs and working back from there, Amazon Web Services will still be the market leader as that new landscape unfolds. They certainly show little sign of slowing down.

AWS Summit 2014, London. Impressed.

Amazon Web Services Logo

Having been to the Google equivalent a few weeks ago, I went to the 2014 AWS Summit in London today. Around 2,000 of us managed to steer around the RMT tube strike and, overall, I was very impressed.

AWS have a “Windows Desktop as a Service” offering arriving real soon now, giving you both a Windows Server 2008 R2 server and client software (for Windows, Mac, iOS and Android) for circa $30/month/user. That increases to between $50 and $70/user/month with Windows and Office in place. I can see major opportunities for them at that pricing, not least as they appear to have solved the issues around driving high-performance graphics remotely, and have also got things like keypads available in the tablet implementations of the client. You can side-load apps into the mix either directly or using Active Directory.

So, I will shortly have the ability to run up a PC and do a 30-day trial of the current Windows-only Tableau Desktop Professional for around £20 – so I can at last finish off the storytelling end-piece of my 12-year-long weight/nutrition analysis without having to buy a Windows PC. I just need to be able to throw trend lines through a few different filtered scatter plots now (something I couldn’t do with Google Fusion Tables).

There are also several traditional Licensed Software providers offering server implementations of their products as instances you only pay for when active, with no long-term commitments – Jaspersoft and Tableau Server being two such examples (there are many more). Amazon are also offering assistance to other software providers to provide more products on this basis, including helping to drive free 30-day trials.

There was much else to be very impressed by, and the differences between AWS, Digital Ocean and Google Cloud Services are fairly stand-out – to me at least. I think I’d know what I’d do to fire up Enterprise volumes for either AWS or Google, but the things I’d do are very different based on what I’ve now learnt.

The most populous stand appeared to be that of Splunk, who were one of my 3-4 “bets for the future” when I was at Computacenter. Talking to them, it now looks like IT Security is their biggest application area, followed by e-commerce infrastructure flows, and lastly by their traditional log file (and associated performance) analysis business. The product now appears to have plugins for virtually every data centre, storage and network device vendor’s log files, and relationships in place with all the key large-brand vendors – and of course links into AWS infrastructure as well now.

I didn’t win a Kindle HDX, the iPhone 5S that Telecity were raffling, nor either of the two drones. But I learnt a lot, and will be applying the learnings over the next few weeks.

Help available to keep malicious users away from your good work

Picture of a Stack of Tins of Spam Meat

One thing that still routinely shocks me is the sheer quantity of malicious activity that goes on behind the scenes of any web site I’ve put up. When we were building Internet Vulnerability Testing Services at BT, around 7 new exploits or attack vectors were emerging every 24 hours. Fortunately, for those of us who use Open Source software, the protections have usually been inherent in the good design of the code, and most exploits (OpenSSL Heartbleed excepted) have had no real impact with good planning. It all starts with closing off ports, and restricting access to some key ones to known fixed IP addresses only (that’s the first thing I did when I first provisioned our servers in Digital Ocean Amsterdam – I’m just surprised they don’t give you a template to work from; fortunately I keep my own default rules to apply immediately).

With WordPress, it’s required an investment in a number of plugins to stem the tide. Basic ones like Comment Control, which can lock down pages, posts, images and attachments from having comments added to them (by default, a spammer’s paradise). Where you do allow comments, you install the WordPress-provided Akismet, which at least classifies 99% of the SPAM attempts and sticks them in the spam folder straight away. For my part, I choose to moderate any comment from someone I’ve not approved content from before, and am totally ruthless with any attempt at social engineering; the latter because if they get something posted successfully, with approval, a couple of times, their later comment spam with unwanted links gets onto the web site immediately, until I notice and take it down. I prefer never to let them get to that stage in the first place.

I’ve been setting up a web site on our network for my daughter-in-law to allow her to blog about Mental Health issues for Children, including ADHD, Asperger’s and related afflictions. For that, I installed BuddyPress to give her user community a discussion forum, and went to bed knowing I hadn’t even put her domain name up – it was just another set of deep links into my WordPress network at the time.

By the morning: 4 user registrations, 3 of them with spoof addresses. Duly removed, and the ability to register usernames turned off completely while I fix things. I’m going to install WP-FB-Connect to allow Facebook users to work on the site using their Facebook login credentials, and to install WangGuard to stop the “splogger” bots. That is free for us at the volume of usage we expect (and given the commercial dimensions of the site – namely non-profit and charitable), and appears to do a great job of sharing data on who and where these attempts come from. I just need to check that turning these on doesn’t throw up a login request if users touch any of the other sites in the WordPress network we run on our servers, whose user communities don’t need to log on at any time, at all.

Unfortunately, progress was rather slowed down over the weekend by a reviewer from Kenya who published a list of the 10 best add-ons to BuddyPress, #1 of which was a social network login product that could authenticate with Facebook or Twitter. Lots of “Great article, thanks” replies. In reality, it didn’t work with BuddyPress at all! I duly posted back to warn others – if indeed he lets news of his incompetence in that instance get back to his readers.

As it is, a lot of WordPress plugins (there are circa 157 of them to do social site authentication alone) are of variable quality. I tend to judge them by the number of support requests that have been resolved quickly in the previous few weeks – one nice feature of the plugin listings provided. I also have formal support contracts in place with Cyberchimps (for some of their themes) and with WPMU Dev (for some of their excellent Multisite add-ons).

That aside, we now have the network running with all the right tools, and things seem to be working reliably. I’ve just added all the page hooks for Google Analytics and Bing Webmaster Tools to feed from, and all is okay at this stage. The only thing I’d like to invest in is something to watch all the various log files on the server and to give me notifications if anything awry is happening (like MySQL claiming an inability to connect to the WordPress database, or Apache spawning multiple instances and running out of memory – something I had in the early days when the Google bot was hitting specific web pages, since fixed).
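
Something like the following minimal sketch is all I have in mind – tail the log files and flag lines matching known trouble patterns; the file path and patterns shown are examples only, and would need adjusting for your own server layout:

```python
# Minimal sketch of a log watcher: follow a log file and flag lines matching
# known trouble patterns. Path and patterns are examples only.
import re
import time

LOG_FILE = "/var/log/apache2/error.log"          # assumed location
PATTERNS = [
    re.compile(r"Error establishing a database connection", re.I),
    re.compile(r"Cannot allocate memory", re.I),
]

def follow(path):
    with open(path) as handle:
        handle.seek(0, 2)                         # start at the end of the file
        while True:
            line = handle.readline()
            if not line:
                time.sleep(1.0)
                continue
            yield line

if __name__ == "__main__":
    for line in follow(LOG_FILE):
        if any(p.search(line) for p in PATTERNS):
            print(f"ALERT: {line.strip()}")       # swap for email/Slack/etc.
```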

It’s just a shame that there are still so many malicious link spammers out there; they waste 30 minutes of my day, every day, just clearing their useless gunk out. But thank God that Google are now penalising these very effectively; long may that continue, and hopefully the realisation of the error of their ways will lead them to become more useful members of the worldwide community going forward.