
Friday, 27 June 2014

Why winning with "Big Data" is like a visit to Whitefellah Burrows

Like prospecting for opals, mining text data can leave you with an awful lot of mullock to shift...

A recent tweet by Carla Gentry put me onto a post suggesting that mining of text data could be the “killer application” for “Big Data”.

Maybe. 

Then again, maybe not...


Another of our stop-offs during our trip around Australia was in the South Australian township of Coober Pedy – an Outback community of about 4000 people in the very middlest of the Middle-of-Nowhere.
The town’s name – Kuppa Piti in the original Aboriginal dialect of the region – literally means “Whitefellah Burrows”. And the whole town is effectively just a series of great big holes in the ground, dug by prospectors seeking the elusive treasure of opals, a delicate gemstone formed from translucent deposits of hydrated silica. Indeed, as well as working underground, many of the locals actually live there to escape the blistering heat of the day and the bitter cold of the night. It’s a fascinating place, if not exactly pretty!


Coober Pedy and the surrounding region are the source of approximately 80% of the total world supply of opals, and are mainly associated with white or “milky” opal, where the whole stone is extracted, shaped and polished. A smaller supply of boulder opal, where a backing of ironstone is kept with the opalescent layer, comes from outback Queensland in the area around Winton, while small amounts of black opal are mined in New South Wales. (Overall, Australia supplies 97% of the world’s opal.) There is much debate within the opal community as to whether the white or the boulder type of opal is best! (Beauty is in the eye of the beholder, I suppose; my wife Kylie has always hated opals, yet we opted to invest in a rare red-hued boulder opal from Bruno in Winton, so you can guess where our thoughts are on this one!)



Interestingly, every opal claim in Australia – whether in Coober Pedy or Queensland – is still staked out by hand and granted to an individual person. Each claim area is 100m by 50m – no more, no less – and anyone so minded can turn up, stake out an unclaimed spot and start digging (subject to paying the local government your $66 per month claim fee). So whereas Australian gold, copper, tin and diamond mining are now all carried out pretty much exclusively on an industrial scale, this method of granting opal mining licenses means that there are no “Big Mining” opal interests. Opal mining is very much still a cottage industry full of character (and characters). This makes for a unique, vibrant and very human – if somewhat anachronistic – community of miners.
The challenge for the opal miners is to actually find the stuff. There’s really not very much of it around in comparison to the tons and tons and tons of worthless dirt spoil (or “mullock”) that surrounds the vibrant slices of valuable gemstone. The landscape around Coober Pedy is littered with thousands of mounds of mine workings, with spoil piled up pretty much everywhere as if giant moles had infested the area. As if that wasn’t enough to deal with, gem opal is hugely outweighed by the worthless, colourless “potch” opal that predominates. Some choose to make their living prospecting for opal by “noodling” (searching by hand through the loose surface rubble – also known as “fossicking”). Most opal miners get down-and-dirty in the many kilometres of underground tunnels (most dug by hand, although machine mining is now part of the modern-day mix). Great care is also needed when extracting opals from the surrounding rock, due to the fragility of the valuable stones. However you approach it, opal mining is hard and painstaking work, with no guarantee of success, though the miners will tell you that there’s no feeling quite like uncovering a new vein of gem opal. Love, rather than money, keeps many of the miners going.

Opal mining is not all just blind luck, though. There are strong geological indicators that the miners can look for that will give a clue to the right places to dig. As well as following the trails of potch, miners watch for fault lines in the surrounding rock, which can strongly suggest that there may be a seam of valuable gem opal waiting to be discovered.

So in summary, to get to the good stuff, the prospectors combine hope, intuition, systematic methods and a lot of hard work.

Which brings us back to the beginning of this post. Is data mining of unstructured data the killer application for “Big Data”?

Well, I’d say that mining text data is pretty much the same story as mining for opals. To start with, there’s not much point to it unless you know what you’re looking for and why you’re looking for it. You’ll be a lot more successful if you begin digging in the right areas, which means doing your survey to ensure that there are at least basic indicators of some value to be extracted. Even then, there’s going to be an awful lot of data mullock to sift through before you can happen upon any insight of any value. And even with all that in place, the current state of “Big Data” tools means that it’s still more of a craft than an industrial process, at least for the time being – so be prepared to dig by hand.
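
For the hands-on reader, here is a minimal sketch of what a "data survey" might look like in Python. It is an illustration only, not a recommended tool: the function name `survey`, the indicator terms and the sample size are all hypothetical choices of mine. The idea is simply to sample some documents and check what share of them mention any of the terms you care about before you commit to digging through the whole pile.

```python
import random
import re


def survey(documents, indicator_terms, sample_size=100, seed=0):
    """Estimate the share of documents mentioning at least one indicator term.

    `documents` is a list of strings; `indicator_terms` is a list of words or
    phrases assumed (by you, the prospector) to hint at something of value.
    """
    if not documents or not indicator_terms:
        return 0.0

    # Take a reproducible random sample rather than reading every document.
    rng = random.Random(seed)
    sample = rng.sample(documents, min(sample_size, len(documents)))

    # Match whole words or phrases only, ignoring case.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in indicator_terms) + r")\b",
        re.IGNORECASE,
    )

    hits = sum(1 for doc in sample if pattern.search(doc))
    return hits / len(sample)


if __name__ == "__main__":
    docs = [
        "The refund was late",
        "Great service",
        "I want a REFUND now",
        "hello",
    ]
    print(survey(docs, ["refund"]))
```

A low share doesn't prove the ground is barren, and a high one doesn't guarantee gem-quality insight, but it is a cheap first look before the heavy machinery comes out.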


So do your data survey first, know the value of what you're looking for, and look for the signs that you're in the right area. And even then, please be careful - it's all too easy to get hurt in the process!

Then, with a bit of luck, your “Big Data” applications might just help you shift some dirt.

2 comments:

  1. Great post Alan! Still deciding whether I like the travelog or the content more :-)

  2. Thanks Martin - I'm not sure how much more travelog stuff I've got in me - I'm running out of notable incidents from the Australia tour!

    I reckon I've got plenty more opinionated opinion in me though...
