Author Archive

Even Geniuses Pass Away

By Terence Craig

Today, I got the sad news that a dear friend and an early contributor to PatternBuilders passed away.

Andrew (Andrei) Leman was a gruff, kind and generous man who will be deeply missed.  Andrei was also a very talented mathematician and software engineer who created some of the fundamental theories around the mathematics of graphs.  His papers on that subject are still heavily cited.

More importantly, Andrei was a loving husband to his wife Elena and a great friend and mentor to many, many folks.

He will be missed but his work and the respect and affection he engendered will endure.

May the earth rest lightly on you, my friend.

November 8, 2012 at 6:19 pm

“Hadoopla”

© Marqin Cook

By Terence Craig

I had to miss Strata due to a family emergency. Mary picked up the slack for me at our privacy session and, by all reports, did her usual outstanding job, but I also had to cancel a Tuesday night Strata session sponsored by 10gen on how PatternBuilders has used Mongo and Azure to create a next-generation big data analytics system.  The good news is that I should have some time to catch up on my writing this week, so look for a version of what would have been my 10gen talk shortly. In the meantime, to get me back in the groove, here is a very short post inspired by a Forbes post written by Dan Everett of SAP on “Hadoopla.”

As the CEO of a real-time big data analytics company that occasionally competes with parts of the Hadoop ecosystem, I may have some biases (you think?).  But I certainly agree that there is too much Hadoopla (a great term).  If our goal as an industry is to move Big Data out of the lab and into mainstream use by anyone other than the companies that thrive on, and have the staff to support, high-maintenance and very high-skill technologies, Hadoop is not the answer – it has too many moving parts and is simply too complex.

To quote from a blog post I wrote a year ago:

“Hadoop is a nifty technology that offers one of the best distributed batch processing frameworks available, although there are other very good ones that don’t get nearly as much press, including Condor and Globus.  All of these systems fit broadly into the High Performance, Parallel, or Grid computing categories and all have been or are currently used to perform analytics on large data sets (as well as other types of problems that can benefit from bringing the power of multiple computers to bear on a problem). The SETI project is probably the best known (and IMHO, the coolest) application of these technologies outside of that little company in Mountain View indexing the Internet. But just because a system can be used for analytics doesn’t make it an analytics system…”

Why is the industry so focused on Hadoop? Given the huge amount of venture capital that has been poured into various members of the Hadoop ecosystem and that ecosystem’s failure to find a breakout business model that isn’t hampered by Hadoop’s intrinsic complexity, there is ample incentive for a lot of very savvy folks to attempt to market around these limitations.  But no amount of marketing can change the fact that Hadoop is a tool for companies with elite programmers and top-of-the-line computing infrastructures. And in that niche, it excels.  But it was not designed for, and in my opinion will never see, broad adoption outside of that niche despite the seemingly endless growth of Hadoopla.

October 24, 2012 at 1:39 pm 1 comment

Black Founders Conference

I am thrilled to be a mentor at the Black Founders Conference in San Francisco.  The event is sponsored by Black Founders, a group that is attacking the digital divide by promoting entrepreneurship. With luminary speakers such as Mitch Kapor (Lotus), Steve Blank (E.piphany) and Charles Hudson (SoftTech VC), it is sure to be a great event.

September 5, 2012 at 3:20 pm

Speaking on Inman Connect Panel on Real Estate and Big Data

By Terence Craig

I apologize for falling behind on blogging, but between several new hires,  major partnerships, and the industry finally starting to understand the need for product-driven (instead of project-driven) big data, things have been very hectic. Good, but hectic.

I did want to pull my head off my keyboard for a minute to tell you about participating in the big data & real estate panel this Thursday at Connect San Francisco.  Our panel will be moderated by industry luminary Brad Inman @bradInman.

Real estate has always been a data-driven business and is relying more and more on the insights and operational nimbleness provided by big data.  For those of you who are scratching your heads and going, “Huh, Real Estate and big data?” – think about it for a minute.  The real estate industry is “using” big data to do all kinds of things and drive all kinds of business models, such as:

  • Commercial landlords using smart thermostats and smart windows adjusted in real-time to save energy.
  • Capturing real-time parking meter data to make real-time decisions about how long to leave a retail location open.
  • Using real-time video analysis to stop vandalism before it happens.
  • Offering sophisticated analytics – see consumer-facing sites like Trulia and Zillow.
  • Risk Modeling – check out RMS. Like most of the PatternBuilders team, they were “doing” Big Data before the term was invented.

If you are attending the show, stop by and say hi. If you are interested in Big Data & Real Estate, look for our post-Connect blog next week. In it, we will talk about some great insights about the New York real estate market derived from a ton of data we grabbed from the NYC public data market which was then spun up in the PatternBuilders framework on our brand spanking new Microsoft Azure cloud beta release.

August 1, 2012 at 9:37 pm

Big Data Tools Need to Get Out of the Stone Age: Business Users and Data Scientists Need Applications, Not Technology Stacks

By Terence Craig

Things have been crazy at PatternBuilders recently. The excitement and positive reactions to FinancePBI, our Financial Services big data analytics solution, from media, analysts, venture folks, cloud infrastructure partners, and users have been amazing.  Our new cross-industry graphical big data correlation mashups are generating a lot of excitement as well; we like to call this feature Google Correlate on steroids. Check out how our newest partner, the analytics consultancy InsightVoices, has used it to find relationships between stock prices and traffic sensor data.

Mary’s recent post on Strata West 2012 provides a great overview of how hot the hype cycle around big data has become (while managing to work in a plug for her favorite gory TV series as well). In case you’re still not convinced, here are some additional nuggets:

  • The market for big data technology worldwide is expected to grow from $3.2 billion in 2010 to $16.9 billion in 2015, a compound annual growth rate (CAGR) of 40% (hat tip to IDC).
  • The amount of big data being generated continues to grow exponentially and is now expected to double in two years. This is largely driven by social networks, smartphones, and really cool IP-enabled devices like the Fitbit and this iPhone-based brain scanning device by our new Strata buddy Tan Le at Emotiv Lifesciences. Yes, she is much smarter than us but we like her anyway!
  • The White House is even doing its share, investing $200 million a year in access and funding to help propel big data sets, techniques, and technologies while giving a shout out to our friends at Data Without Borders.
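
As a quick sanity check on the IDC growth figure quoted above, here is a minimal Python sketch of the standard compound annual growth rate formula. The function name `cagr` and its parameters are illustrative assumptions, not taken from IDC or PatternBuilders; the only inputs are the two market sizes and the five-year span from the bullet.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: the constant yearly rate that
    turns start_value into end_value over the given number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the bullet above: $3.2 billion (2010) to $16.9 billion (2015).
rate = cagr(3.2, 16.9, 2015 - 2010)
print(f"Implied CAGR: {rate:.1%}")
```

Run on those figures, the formula gives a rate just under 40%, which is consistent with the rounded number cited.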

(more…)

April 18, 2012 at 2:34 pm 5 comments

Big Data and Cloud not a fit? Comments on InfoWorld Article

By Terence Craig

Since Disqus seems to have completely eaten (bleh) my comment on @davidlinthicum’s very interesting InfoWorld post, “Big data and the cloud: A far from perfect fit,” I decided to just expand my comments and make a short blog post out of it. IMHO the problems that David is describing are more a reflection of problems with batch-oriented technologies like Hadoop (more on my take on Hadoop here) in the cloud than a general problem for cloud-based big data solutions.

Computing always has, and probably always will have, a bias towards creating batch-focused technologies at the beginning of any large paradigm shift.  But as new technologies are absorbed, understood, and move from early adopter to more mainstream use, the batch paradigm will inevitably start to shift to streaming and real-time. We have seen this again and again (from punch cards to touch-sensitive tablets, downloaded media to streaming media, DOM to SAX parsers, HTML to Ajax, paper maps to real-time GPS). The reason this evolution almost always occurs is simple: humans live and think in real time, and when our tools do as well we are more productive and happier.  So why do we have this bias for batch processing in our first-generation computational technologies? Simply put, because batch processing is a lot easier.
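
The contrast between batch and streaming can be seen even in a toy calculation. The following is a minimal, illustrative Python sketch with hypothetical function names; it is not taken from PatternBuilders, Hadoop, or any other system mentioned here. The batch version needs the complete data set before it can answer, while the streaming version has to carry state between items and produce an up-to-date answer after each one.

```python
from typing import Iterable, Iterator, List


def batch_mean(values: List[float]) -> float:
    """Batch style: wait until every value has arrived, then compute once."""
    return sum(values) / len(values)


def streaming_mean(values: Iterable[float]) -> Iterator[float]:
    """Streaming style: keep running state and emit the current mean
    after each value arrives, without storing the whole data set."""
    count = 0
    total = 0.0
    for value in values:
        count += 1
        total += value
        yield total / count


readings = [1.0, 2.0, 3.0, 4.0]
final_batch = batch_mean(readings)
running = list(streaming_mean(readings))
print(final_batch, running)
```

The batch function is a one-liner; the streaming one has to manage state explicitly, and in a real system it would also have to cope with late, missing, or out-of-order data, which is a large part of why batch tends to come first.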

(more…)

February 23, 2012 at 3:01 pm

FinancePBI Begins its Shakeout Flight in the Cloud

By Terence Craig

I have been a little quiet on the blogging front recently as I and the rest of the PatternBuilders team have been focused on getting ready to launch our new financial services application: FinancePBI. It is the first cloud-based analytical platform for the Financial Services market.  While this is our first public announcement of our entry into the market, behind the scenes we have been gearing the company up for a big splash for several months:

  • We have partnered with ActiveFinancial, one of the premier real-time stock ticker vendors in the world.  Look for more data partnerships shortly.
  • We have added Doug Jeffrey to our board of advisors and board of directors.  Doug is an executive with deep Wall Street and startup expertise who has already done outstanding things in the short time he has been with us.
  • We have also partnered with the University of Sydney to use our technology to examine the influence of primary sources (NY Times, etc.) and secondary social media (Twitter, etc.) content on a company’s stock price over a 12-month period. This project will be done exclusively in the cloud, and our hope is that we will be able to convince our commercial partners to allow this PatternBuilders instance to be available to the general public. Of course, this would happen after the research is published. (more…)

February 7, 2012 at 8:04 pm 1 comment

A quick thought on #BlackoutSopa day.

By Terence Craig

In our book Privacy & Big Data, which was written pre-SOPA, Mary and I spent a fair amount of time looking at the ways that big media interests are pushing both technical and legislative solutions that are inimical to both privacy and free speech. On this day when the Internet is raising its collective voice against one of the most ill-conceived laws of the Internet age, I thought it would be a great time to quote from the conclusion of Chapter 4 – The Stakeholders.

“Powerful groups, like the MPAA and RIAA and their international counterparts, have borrowed from advertising’s playbook and extended it to every device we own. Today, it’s not just about tracking our online behavior; it’s about tracking what we do within the “four walls” of any device that we own and being able to remotely control them without our permission. These technologies and policies could end up delivering a mortal blow to privacy as well as cede to the government and IP holders unprecedented control over what media we are allowed to consume and share. However you look at this, it’s a pretty high price to pay to support an old business model that is unable to adapt to new technology.”

Tell your congressperson – SOPA/PIPA is bad for the Internet, bad for free speech and bad for due process and should be rejected! More info on the law here.

January 18, 2012 at 2:30 pm 4 comments

No, Hadoop Doesn’t Own Big Data Analytics!

By Terence Craig

A number of folks have asked me if I was concerned about Microsoft’s recent announcement that they would be partnering with Hortonworks and abandoning their own distributed processing technology for Hadoop.  While I thought this was an unfortunate choice on Microsoft’s part (the Dryad project’s implementation of multi-server LINQ was pretty compelling), since HPC is a small part of Microsoft’s business, it probably made sense from a business standpoint.  In any case, we (as in all of us at PatternBuilders) are not concerned and just to be clear: we don’t believe that this announcement (or any other) means that the many Hadoop ecosystem players own the still-forming big data analytics market.

That is not to say that the announcement isn’t proof of the strength of the Hadoop ecosystem. Hadoop is a nifty technology that offers one of the best distributed batch processing frameworks available, although there are other very good ones that don’t get nearly as much press, including Condor and Globus.  All of these systems fit broadly into the High Performance, Parallel, or Grid computing categories and all have been or are currently used to perform analytics on large data sets (as well as other types of problems that can benefit from bringing the power of multiple computers to bear on a problem). The SETI project is probably the best known (and IMHO, the coolest) application of these technologies outside of that little company in Mountain View indexing the Internet. (more…)

December 12, 2011 at 1:41 pm 3 comments

PII Venture Forum: Session Videos Available

By Terence Craig

Mary and I had the distinct pleasure of presenting at the PII Venture Forum on the players and business models forming around PII.  As usual for a PII event, we learned more than we taught and had a great time meeting the other speakers and participants.

One presentation tip learned from the day: following a panel that includes uber VCs Fred Wilson and Roger McNamee, our friend Jim Adler from Intelius, David Glazer from Google, and All Things Digital’s Kara Swisher is like being the house band that is asked to follow The Who when Daltrey still had the best voice in rock and roll!

But Mary and I managed to survive and were happy with the results. You can see the video of our presentation here.

The blog is shutting its doors for the Thanksgiving holiday here in the US, but we will be back next week.  We wish you and your families all the best, whether you are celebrating Thanksgiving or not!

November 22, 2011 at 10:33 am 1 comment
