Posts filed under ‘PatternBuilders Technology’

Video: Big Data Made Easy. Sticky – see below for latest posts.

November 15, 2011 at 9:49 am 5 comments

All Together Now: All You Need is a Text Box!

By Terence Craig

All you need is text, Text is all you need (sing to the tune of The Beatles’ All you need is love).   If you are one of our regular readers you will remember that several months ago I wrote a manifesto on what the perfect analytics system would look like.  One of the last points was:

It must be as accessible as Excel (still the number one analytics tool in the world).

I was wrong – Excel is the number one non-specialized analytics tool in the world, but in terms of usage it is dwarfed by a very well-known specialized analytics toolkit. The creator of this tool is a little company that you may have heard of: it does no evil and analyzes the Internet to bring you back everything on the web based on a simple text query. But behind that simple text box, Google has one of the most sophisticated analytics infrastructures in the world:

  • It can deduce your interests.
  • It can give you the most relevant results.
  • It can show you appropriate information based on those interests, as well as bring back highly personalized ads.

Google is not only the largest big data analytics company in the world, but it also has the easiest to use tools—proof that text is all you really need!

(more…)

October 14, 2011 at 3:22 pm 4 comments

No-SQL – Going All The Way

We have recently made a big architectural change concerning our storage back-end and I wanted to talk about it.

Storage is key to any Big Data problem. As we’ve mentioned in prior posts, most of our performance bottlenecks and optimizations have to do with storage performance and architecture, as opposed to computation. Our architecture for the last few years has consisted of a hybrid approach with “no-SQL” analytics storage using MongoDB and “non-transactional” data stored in a traditional RDBMS, primarily SQL Server. There were a couple of reasons for this architecture. First, we started off entirely in RDBMS-land, because our initial design was done before no-SQL systems were really at a production level of maturity. Second, most of our customers and prospects had traditional schemas and data organization – making integration easier if we could just use the same object model. (more…)

September 28, 2011 at 4:18 pm 1 comment

Maps: Lessons Learned

Recently we’ve been adding new user-friendly features to our platform and I’d like to talk about our map view. In particular, I want to discuss the lessons we learned from the map in the first version of the PAF (PatternBuilders Analytics Framework) versus the one in our new Silverlight client.

You may have already seen some screenshots of the map in our AJAX web client – when we released the first versions of PAF, we integrated with Google Maps to help users see their data on a map for quick comparisons and analysis. It’s always been a helpful tool, but suffered from a learning curve for new users and could potentially confuse people due to the way it displayed data.

The AJAX Client Map View

Showing time series data on a map is a tricky proposition – the map is already two-dimensional, and adding the two dimensions of time series analytics takes it into the fourth dimension. As exciting as it would be to see a four-dimensional map view (we’d definitely be the only company doing it!), I don’t think most human beings would be able to understand it.

(more…)

September 6, 2011 at 8:04 am Leave a comment

It’s About Time: Series Data, Streaming, & Architecture

In previous posts, we have talked a lot about the PatternBuilders Analytics platform and streaming analytics. This platform is able to scale for huge amounts of data and stream results to the user as they are processed in real time. As mentioned before, we can do this because we have focused on time series analytics, making optimizations to our architecture that beat generalized MapReduce types of solutions by orders of magnitude. I’d like to discuss this focus and how it came about.

Why time series data?

Time series data is ubiquitous. It’s actually more difficult to think of an analytics question a user would be interested in that doesn’t involve time in some capacity. Even a non-numeric query like “Order the list of products by units sold” is almost useless without specifying a time period for which to sort. (more…)
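To make the point about time periods concrete, here is a minimal sketch in Python. The sales records, the product names, and the `rank_by_units_sold` helper are all made up for illustration and are not part of PAF; the only thing the sketch is meant to show is that the same "order by units sold" question gives different answers depending on the window you ask about.

```python
from datetime import date

# Hypothetical sales records: (product, sale date, units sold).
SALES = [
    ("widget", date(2011, 1, 15), 120),
    ("gadget", date(2011, 1, 20), 80),
    ("widget", date(2011, 6, 3), 10),
    ("gadget", date(2011, 6, 9), 200),
]


def rank_by_units_sold(sales, start, end):
    """Return product names ordered by units sold within [start, end], highest first."""
    totals = {}
    for product, sold_on, units in sales:
        if start <= sold_on <= end:
            totals[product] = totals.get(product, 0) + units
    return sorted(totals, key=totals.get, reverse=True)


january = rank_by_units_sold(SALES, date(2011, 1, 1), date(2011, 1, 31))
june = rank_by_units_sold(SALES, date(2011, 6, 1), date(2011, 6, 30))
print(january)  # ['widget', 'gadget']
print(june)     # ['gadget', 'widget']
```

Without the `start` and `end` arguments there is no single "right" ordering to return, which is why even this non-numeric query is really a time series question.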

July 14, 2011 at 6:20 pm 7 comments

Speaking at MongoSF Conference: Building a Streaming Analytics System with Mongo

By Terence Craig

I am excited to announce that I will be speaking at MongoSF 2011 with my fellow data wrangler, Tim.  Our talk will cover how we used Mongo to build the PatternBuilders Analytics Framework. The official title for our talk is: Building a Streaming Analytics System with Mongo.

In a previous post, I talked about the impact our Social Media Analytics solution had on our deployment choices. Briefly, we wanted to make a beta version of our solution publicly available on the web and to do that, we needed to ensure sufficient capacity.  Since we did not want to make a massive investment in the infrastructure to support it, we investigated the state of cloud servers. Long story short, as part of our move from our colo to the cloud we made a significant change in architecture, fully embraced some of MongoDB’s more advanced capabilities, and created a radically improved product – although the previous version was pretty cool too! (more…)

May 16, 2011 at 7:14 pm 5 comments

To Cloud Or Not To Cloud

By Terence Craig

When we started PatternBuilders, we made what was then an unusual decision: to avoid multi-tenancy as I talked about here.  However, we also decided to avoid the cloud because we wanted to have predictable costs and felt that given the high level of expertise we had internally with managing data centers, we would be better off investing in top tier colocation facilities. This made a lot of sense given the security sensitivities of our initial target markets: internal IT at the Fortune 500, large retail suppliers, and hospital groups.  It was also an economically viable choice because our business model provisions hardware and bandwidth for each customer after the sale to manage cash flow.  We also knew that we would be able to reduce both the cost and maintenance headaches of separate customer provisioning by aggressive use of virtualization technology, much like the cloud server vendors Rackspace, Amazon, and others do today.
(more…)

April 25, 2011 at 7:12 am 6 comments

PatternBuilders Analytic Framework (PAF) Correlation Video

Terence’s last post was about correlation – a powerful tool that requires an easy-to-use UI to be effective along with some serious number crunching backing it up. I made a quick demo of our recently upgraded “insight discovery” interface showing off how to analyze correlations with the PAF. Click below to watch. Let us know what you think of it.

New Insight Discovery Interface

March 3, 2011 at 3:02 pm 3 comments

A Deep Dive Into PatternBuilders Analytics Framework (PAF)

Today one of our Server Engineers is going to give you a deep dive on our architecture.  As always on our blog, all of the data is simulated and all trademarks are the property of their respective owners.

Hello everyone! I am going to get fairly technical in this post and go over how PatternBuilders Analytics Framework (PAF) does what it does so well. As Terence has said in the past couple of posts, we have a new architecture that’s based around scalability, streaming, and ease of use. That’s not quite the whole story though; the development of this architecture was in fact driven primarily by performance. (more…)

February 2, 2011 at 1:30 pm 6 comments

The Perfect Fit for Analytics

By Terence Craig

In my last post, I gave an overview of the difference between batch and streaming analytics approaches.  It was a very popular post and was mentioned on the excellent MyNOSQL blog, which was really appreciated.  Their able proprietor, Alex Popescu, had the following comment:

“I cannot put my finger on it right now, but I don’t think stream processing can cover exactly the same wide range of computations available in batch processing:

While I haven’t had the chance to play with real big data, I believe it is not a matter of either or. An ideal system would need to support:

  • piping incoming data through a combination of filters, preprocessors/transformers, and calculators/extractors
  • preserve (all/relevant) data for later computation
  • allow processing of stored data in either streams or batches”
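As a rough illustration of the three requirements in the comment above, here is a small Python sketch built from generators. Everything in it (the `parse`, `keep_positive`, `preserve`, and `running_total` functions, the `name,value` line format, and the in-memory `store` list) is an assumption made for the example; it is not how PAF or any particular product is implemented.

```python
def parse(lines):
    """Transformer: turn raw 'name,value' lines into (name, float) pairs."""
    for line in lines:
        name, value = line.split(",")
        yield name.strip(), float(value)


def keep_positive(records):
    """Filter: drop records whose value is not positive."""
    for name, value in records:
        if value > 0:
            yield name, value


def preserve(records, store):
    """Append each record to a store for later computation, then pass it on."""
    for record in records:
        store.append(record)
        yield record


def running_total(records):
    """Calculator: emit the cumulative sum as each record arrives."""
    total = 0.0
    for _, value in records:
        total += value
        yield total


incoming = ["a, 2", "b, -1", "c, 3"]  # hypothetical incoming data
store = []                            # stands in for real storage

# 1. Pipe incoming data through a transformer, a filter, and a calculator,
# 2. preserving the relevant records along the way.
streamed = list(running_total(preserve(keep_positive(parse(incoming)), store)))

# 3. Process the stored data later, either as a batch or as a replayed stream.
batch_total = sum(value for _, value in store)
replayed = list(running_total(store))

print(streamed)     # [2.0, 5.0]
print(batch_total)  # 5.0
print(replayed)     # [2.0, 5.0]
```

The same `running_total` stage works on live input and on the stored records, which is the sense in which the choice between streams and batches need not be either/or.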

(more…)

January 31, 2011 at 3:02 pm 2 comments

