Tech News

Big data is useful, scary, and more subjective than you know

Big data helps us understand our customers, but it also helps budding companies sell information about you (or TO you), and is more subjective than you may know – it takes a human touch to determine what info is important.

Big data is here and unavoidable

For years, we’ve written about big data and showcased the progression of business intelligence now available to brands of every size. In fact, most businesses already have a feel for this type of data – open a spreadsheet of your sales data and you already know it’s just a bunch of numbers unless they are analyzed and filtered. Today, I want to review what big data is, how it is currently being used, what this means for the future, and most importantly, how it can be cherry-picked and why it can upset entire industries.

“Big data” is defined as data sets too large to be managed with the simple, common software normally used to capture and process data, and it typically consists of at least dozens of terabytes in a single data set. The challenges of big data are really big. Gartner analyst Doug Laney describes it as three-dimensional, i.e. increasing volume (amount of data), velocity (speed of data in/out), and variety (range of data types, sources).

Let’s talk about how BIG this really is

Let me illustrate. The University of Nebraska physics department has 1.6 petabytes of data – that’s 1.6 million gigabytes in one department at one school. Boeing jet engines can produce 10 terabytes of operational information for every 30 minutes they turn. As of 2012, the average smartphone user has 736 pieces of personal data collected every day, stored for one to five years by service providers.
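To make those figures concrete, here is a minimal arithmetic sketch. It assumes decimal (SI) units, where one petabyte is one million gigabytes, and the six-hour flight is a hypothetical duration chosen only for illustration; the function names are ours, not from any library.

```python
# Decimal (SI) storage units: 1 PB = 1,000 TB = 1,000,000 GB.
GB_PER_PB = 1_000_000


def petabytes_to_gigabytes(petabytes: float) -> float:
    """Convert petabytes to gigabytes using decimal units."""
    return petabytes * GB_PER_PB


def engine_terabytes(hours: float, tb_per_half_hour: float = 10.0) -> float:
    """Data produced by one engine at the quoted rate of 10 TB per 30 minutes."""
    half_hours = hours * 2
    return half_hours * tb_per_half_hour


# The physics department figure quoted above.
department_gb = petabytes_to_gigabytes(1.6)

# Hypothetical six-hour flight, one engine, at the quoted rate.
flight_tb = engine_terabytes(6)

print(f"1.6 PB is about {department_gb:,.0f} GB")
print(f"One engine over six hours: about {flight_tb:,.0f} TB")
```

At the quoted rate, a single engine on a long flight would produce a meaningful fraction of what an entire physics department stores.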

IBM’s chief executive, Virginia Rometty, said, “By one estimate there will be 5,200 gigabytes of data for every human on the planet by 2020. And powerful new computing systems can store and make sense of it nearly instantaneously.” It has also been predicted that in the coming years, over 200,000 big data specialists will be required to make sense of the barrage of data being collected.

Big data is already being used today in a big way

Big data is a big deal and it’s not just because there’s a lot of it. In fact, today alone, SumAll raised $4 million and DataSift raised a whopping $42 million to help businesses make sense of their data as it relates to social media.

The retail industry already uses big data in amazing ways: analyzing shoppers’ height and size as they walk in the door to determine age, gender, and more, and even using cameras to analyze facial expressions while you’re shopping to gauge your experience. If that doesn’t impress you, there’s already a seasoned company tracking “visual mentions” online, so if you share a picture of your Starbucks cup on Instagram, even if you don’t say Starbucks or use GPS, Starbucks can see that its logo, even if curved, was used on a social network.

Predicting the future with big data

But it’s not just that data is having a tremendous impact on life today; it is still a young sector, with many startups yet to pop up to solve the data conundrums. SiftScience fights fraud using machine learning that recognizes patterns of fraudulent behavior based on past examples, Hadoop helps companies analyze the massive amounts of data they generate about user behavior and their own operations, and Recorded Future uses algorithms that unlock predictive signals from web chatter to help a brand anticipate risks and capitalize on opportunities.

There are already projects in the works that would allow forecasters to predict weather up to 42 days in advance, potentially saving lives and billions of dollars a year. Intel is working on a big data project that allows cars to communicate so drivers will be able to see three cars in front of, behind, and to the left and right – simultaneously. Ford is developing vehicle-to-vehicle and vehicle-to-infrastructure systems to warn drivers of potentially hazardous traffic events, like cars going through red lights.

But big data has some really big problems

First, and least upsetting, is that there are big problems with demographics, leaving brands with a lot of data that doesn’t yet mean much. Why? Incomplete self-reporting is a huge issue: brands are still focused on using social networking profile data to gather intelligence on their site users, fans, and the like, but the people behind those profiles may not be completely truthful (they may say they are 32, but they’re 12, and so forth). Additionally, privacy settings do protect users to a certain extent, blocking intelligence gathering by brands. Lastly, data is still largely inconsistent and unconnected – you may have a Twitter account and a Facebook account, but a third party doesn’t know that unless (a) you use the same username consistently or (b) you grant access to both accounts through that third party.

While other problems exist (like how will we ever store all of this data, disseminate it, and make sense of it, and does it all really matter?), the biggest one we see is the potential for cherry picking, because when you look at a data set, it still takes a human to actually determine what is important to garner from that data set.

Industry expert Nassim Taleb opined in February, “With big data, researchers have brought cherry-picking to an industrial level. Modernity provides too many variables, but too little data per variable. So the spurious relationships grow much, much faster than real information. In other words: Big data may mean more information, but it also means more false information.”
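Taleb’s point about “too many variables, but too little data per variable” can be illustrated with a small simulation. The sketch below is our own illustration, not Taleb’s method: it generates columns of pure random noise, so every relationship between columns is spurious by construction, then counts how many pairs still look strongly correlated. The 0.5 threshold, the 20 samples per variable, and the function names are assumptions chosen for the example.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def count_spurious(num_variables, samples_per_variable, threshold=0.5, seed=0):
    """Count pairs of unrelated noise variables whose |correlation| >= threshold."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    columns = [
        [rng.gauss(0, 1) for _ in range(samples_per_variable)]
        for _ in range(num_variables)
    ]
    return sum(
        1 for a, b in combinations(columns, 2) if abs(pearson(a, b)) >= threshold
    )


# More variables with the same small sample size means more pairs to compare,
# and therefore more chances for noise to look like a relationship.
for num_variables in (10, 50, 200):
    pairs = num_variables * (num_variables - 1) // 2
    hits = count_spurious(num_variables, samples_per_variable=20)
    print(f"{num_variables} variables, {pairs} pairs, {hits} strong-looking correlations")
```

The number of pairs grows roughly with the square of the number of variables, so the count of strong-looking but meaningless correlations climbs quickly unless the amount of data per variable grows too.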

Taleb’s point could lead one to think that big data is faulty and bad, but perhaps he is really pointing out the human judgment still required to analyze big data – and because most people would not typically question a researcher or their methods, analysis in this young phase remains subjective.

Chris Treadaway, CEO and founder of Polygraph Media, which is famous for data-driven analytics, said, “To analyze big data, you have to know when you have enough data, know that you’re looking at the right data, and know how and when to draw conclusions from the data using methods developed from statistics theory and data science. That’s the great irony of ‘big data’ – it’s as much of an art as a science, which is why the best efforts are multidisciplinary.”

“Big data can find tremendous hidden relationships,” Treadaway continued, “but you have to make sure your bias isn’t to find conclusions that don’t exist. Bias can cause the situation Taleb describes, and will cause disinformation as he says. If you’re cautious, discerning, and careful, you can make the most of big data. But there are pitfalls for the careless.”

And the coup de grâce

The coup de grâce is that professionals are being threatened by new ways big data is being used, but they are not recognizing it as a big data issue.

Several industries are seeing data about them individually, their performance, their company, their finances, all analyzed and repackaged for public consumption or monetization.

Imagine a site launches tomorrow based on publicly available data and you’re a social media consultant. Let’s say that this new site looks at who has recommended you on LinkedIn, Yelp, Angie’s List and so on, and has determined that the people recommending you are clients of yours, based on the assumption that it is the only reason they’d recommend you or review you. The new site also analyzes words and pictures used in your online bios to determine characteristics about you.

Then, they take those reviews and characteristics and boil you down to a score, giving you more points if someone from Coca-Cola reviewed you than if the local dentist reviewed you, implying that you’re a higher quality consultant if you’ve worked with a major brand like Coca-Cola than if you worked with a local dentist (God forbid you specialize in social media for independent medical professionals).

Then, Google gets interested in this new site and invests, and later, it wants to use that data to populate your Google+ profile, so now you, the social media consultant, have a score next to your face to determine how good you are at your job.

What’s wrong with that?

Data is subjective, even when raw – it takes humans to determine what data points in the sea of data are relevant, and those choices don’t always take into account the context surrounding that data. You, the social media consultant, could have taken a two-year sabbatical to execute social media strategies pro bono for three tiny charities, four local restaurants, two African orphanages, and a spa, earning a reputation for your high quality of work and compassion that can’t possibly be quantified by a computer.
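The scoring scenario is easy to sketch, and doing so shows where the subjectivity creeps in. Everything below is hypothetical: the consultants, the reviewer categories, and both sets of weights are invented for illustration and do not describe any real site. The data never changes between the two rankings; only the human-chosen weights do.

```python
# Hypothetical reviewers for two hypothetical consultants.
REVIEWS = {
    "consultant_a": ["major_brand"],
    "consultant_b": ["local_business", "local_business", "charity", "charity"],
}

# Two weightings a site designer might plausibly pick (both are assumptions).
BRAND_HEAVY = {"major_brand": 10, "local_business": 1, "charity": 1}
VOLUME_HEAVY = {"major_brand": 3, "local_business": 2, "charity": 2}


def score(reviewers, weights):
    """Sum the points awarded for each reviewer category."""
    return sum(weights[kind] for kind in reviewers)


def rank(reviews, weights):
    """Return consultant names ordered from highest score to lowest."""
    return sorted(
        reviews, key=lambda name: score(reviews[name], weights), reverse=True
    )


print("Brand-heavy weights: ", rank(REVIEWS, BRAND_HEAVY))
print("Volume-heavy weights:", rank(REVIEWS, VOLUME_HEAVY))
```

With the first weighting the consultant with one big-brand review comes out on top; with the second, the consultant with four smaller clients does. Neither ranking is more “true” than the other, which is the problem with putting a single score next to someone’s face.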

This scenario is fake. For now. But with every human generating billions of data points every year, evaluations are just the first of many steps in what is to come with big data – the data is now generated, and it is a race to see what can be displayed about you and your business so that companies can sell to you or repackage your data and sell it to someone else. Even your brand will be using big data to gain insights into your customers so you can better serve them.

There are pros and cons to big data, but the reality is that it is unavoidable, even if you ignore it or misunderstand it. Consumers need to begin to recognize big data when they see it, and understand that what they see may not reflect the true context of that data, as it is rife with humans’ decisions regarding what is important about a data set. This is just the beginning.

Lani is the Chief Operating Officer at The American Genius and has been named in the Inman 100 Most Influential Real Estate Leaders several times, co-authored a book, co-founded BASHH and Austin Digital Jobs, and is a seasoned business writer and editorialist with a penchant for the irreverent.

4 Comments

  1. Patrick Gallagher

    December 6, 2013 at 12:43 pm

    100% spot on conclusion: “…as it is ripe with humans’ decisions regarding what is important about a data set.”

    I like how Rory Sutherland (sometimes with Taleb or speaking about his work) kicks these ideas around. As soon as you pick certain data points and make them *the* metrics to follow the data becomes skewed and meaningless. You changed it just by looking at it so hard.

    Good stuff.

  2. Hank Miller

    December 9, 2013 at 7:35 am

    We are drowning in data and it can lead to paralysis by analysis.

    Watching videos and researching how to make shrimp scampi, set a broken wrist or install a hard drive does not mean that you can do it. Somewhere along the line a human with experience in the appropriate field has to provide guidance and identify the key points.

    Piles of data are just that – without someone with the ability to effectively apply the appropriate parts to the specific question at hand there is nothing. Nothing but confusion.

  3. Pingback: Top venture capitalist explains how tech startups can stand out when seeking funding - The American Genius

  4. Pingback: Big data is watching you - some will panic, others will rejoice - The American Genius

Tech News

Silicon Valley created tech for your family that’s too addictive for theirs

(TECHNOLOGY) Tech inventors are big on innovating and advancing tools, but a growing parenting trend in tech circles seems hypocritical.

I consider myself an older Millennial. I was slowly but surely introduced to technologies as they became mass-marketable, but they didn’t affect every moment of my day-to-day life. I learned how to use computers in elementary school, I chatted on AOL as a preteen, and when I was 16, my parents gave me my own cell phone “for emergencies.” I promptly dropped it under the car seat, where it remained for a year, before I or my parents even noticed that it was missing.

In less than a generation, our relationship to cell phones has transformed completely. For one thing, my first cell phone didn’t have a touch screen. It didn’t have an internet connection. Hell, for an entire year, I didn’t even use the damn thing.

Fast forward to 2018, when your children can learn to use an iPad at the same time that they learn to use a toilet.

Interestingly, the tech whizzes who designed much of the technology that now pervades nearly every moment of our lives seem wariest of the negative impact screen time might have on kids. The NYT reports that the trend amongst Silicon Valley parents is to severely limit or even ban cell phone use by their children.

Parents in all echelons of the tech industry are limiting their kids’ exposure. Steve Jobs kept iPads out of the hands of his young children. The Gates offspring didn’t receive cell phones until high school (just like me, in 2001), and Tim Cook discourages his nephew from using social networks.

These concerned parents describe the addictive potential and negative consequences of screen time in increasingly pessimistic terms.

Athena Chavarria, a former Facebook employee, believes that “the devil lives in our phones and is wreaking havoc on our children.”

Chris Anderson (yes, that Chris Anderson), former editor of Wired and founder of GeekDad, says that when it comes to screens, “On a scale between candy and crack cocaine, it’s closer to crack cocaine.”

Parents are even making contractual agreements to make sure their kids don’t use screens while under the supervision of their nanny or babysitter.

Like basically every human idea or invention ever, connected, screened devices reveal that our ability to create new technologies far outpaces our ability to understand the consequences – positive or negative – of that tech.

Those closest to the situation – the inventors themselves – are often the first ones to sound the alarm when they realize that their hard-won advancements may not have been such a great idea after all.

Said Chris Anderson of the addictive nature of cell phones, “We thought we could control it. And this is beyond our power to control.”

Tech News

Amazingly fun tech toys that are secretly educational

(TECHNOLOGY) STEM toys for children are fun *and* educational – here are some that have caught our eye.

There’s a new trend amongst startups – and amongst kids’ toys: educational playthings that teach your little ones STEM skills like programming and coding.

Toys that double as learning tools are nothing new, but digital, connected technology still is, and so is the idea that your toddler can get a leg up in the tech industry by getting an early start.

Parents, universities, and economists seem concerned that acquiring STEM skills will soon be the only way to guarantee a good job, despite reports from the U.S. Census Bureau that 3 out of 4 STEM majors end up in non-STEM fields anyway.

So if your kid is more into, say, baseball or dancing than computers, you might be wasting the pretty pennies these high-powered educational toys will cost you.

Kids, with their alarmingly short attention spans, are as likely to toss these toys back into the toybox as any other. But if your wee one seems to have a knack for all things technical – or if you’d just rather see them learn how to build a device than passively stare at one all day – then check out TC’s guide to STEM toys.

Even though these toys are marketed towards the younger set, I found myself a little envious, wishing I could take a few for a test drive – especially since many of them are modern, high-tech reboots of old standbys from my childhood.

Lego’s Boost Creative Toolbox uses the same classic Lego blocks, but allows you to animate and program your creations.

Several products cross-market with some of my childhood favorites; Dash Robotics has teamed up with Mattel to make Jurassic World robots, and Kano makes a Harry Potter Coding Kit that teaches kids to program a wand that can interact with digital content. There’s even Electro Dough which is basically electrically-conductive Play-Doh that can light up and make sounds. I want!

In fact, a lot of the toys combine arts ‘n’ crafts with STEM lessons. Adafruit makes a marker with electrically conductive ink that can light up circuits and interact with computer programs, and an electronic pencil that synthesizes music. Root Robotics’ little bot can draw pictures and compose songs.

For the more straightforward tech nerds, Makeblock, Evo, Robo Wunderkind, and Wonder Workshop all make programmable robots – a big step up from the “artificially intelligent” Furbys of my childhood. Sphero’s Bolt is a ball-shaped robot, while Airblock makes a programmable hovercraft.

There’s the Pi-top Modular Laptop that teaches kids coding, and there are even opportunities for kids to build their own electronics; Kano offers a build-it-yourself computer.

The holidays are just around the corner – but whether STEM educational toys will be the next Tickle Me Elmo remains to be seen.

Tech News

A deepfakes creator for text so realistic it can’t be made public yet

(TECHNOLOGY) You know about video deepfakes, but the technology exists for doing convincing deepfakes for text. It’s so good that they aren’t ready to release it to the public yet…

Artificial intelligence is being used to complete more and more human tasks. But as of right now, news stories you read online – including all the articles here on American Genius – have been written by real human beings.

Until recently, even the most intelligent computers couldn’t be trained to recreate the complex rules and stylistic subtleties of language. AI-generated text would often wander off topic, mix up syntax, and lack context or analysis.

However, a non-profit called OpenAI says they have developed a text generator that can simulate human writing with remarkable accuracy.

The program is called GPT2. When fed any amount of text, from a few words to a page, it can complete the story, whether it be a news story or a fictional one.

You already know about video deepfakes, but these “deepfakes for text” stay on subject and match the style of the original text. For example, when fed the first line of George Orwell’s 1984, GPT2 created a science-fiction story set in a futuristic China.

This improved text generator is much better at simulating human writing because it has learned from a dataset that is “15 times bigger and broader” than its predecessor’s, according to OpenAI research director Dario Amodei.

Usually researchers are eager to share their creations with the world – but in this case, the Elon Musk-backed organization has, at least for the time being, withheld GPT2 from the public out of fear of what criminals and other malicious users might do with it.

Jack Clark, OpenAI’s head of policy, says that the organization needs more time to experiment with GPT2’s capabilities so that they can anticipate malicious uses. “If you can’t anticipate all the abilities of a model, you have to prod it to see what it can do,” he says. “There are many more people than us who are better at thinking what it can do maliciously.”

Some potential malicious uses of GPT2 could include generating fake positive reviews for products (or fake negative reviews of competitors’ products); generating spam messages; writing fake news stories that would be indistinguishable from real news stories; and spreading conspiracy theories.

Furthermore, because GPT2 learns from the internet, it wouldn’t be hard to program GPT2 to produce hate speech and other offensive messages.

As a writer, I can’t think of very many good reasons to use an AI story generator that doesn’t put me out of a job. So I appreciate that the researchers at OpenAI are taking time to fully think through the implications before making this Pandora’s box of technology available to the general public.

Says Clark, “We are trying to develop more rigorous thinking here. We’re trying to build the road as we travel across it.”
