About The Bitvore 22
The Bitvore 22 is a series describing a day in the life(time) at Bitvore and working on the bleeding edge of Artificial Intelligence (AI). Today we get to know Greg Bolcer, Chief Data Officer at Bitvore, as he answers our 22 questions.
A long time. They say that staying at a startup company too long is death to your career, but Bitvore keeps exceeding everyone’s expectations year after year. Bitvore’s not so much a job or a career, it’s an avocation. I want to see this thing through to wherever it ends up, and, so far, the sky is still the limit. How long is that? Before I was hired, I was helping them work out some technical issues they had with a customer, so you might say that is even before Bitvore was Bitvore—5+ years.
My first task at Bitvore was to figure out how to scale the initial proof of concept. The first iteration of the product was a 3D visualization of social media conversations and correlating that to participants and topics. While that was interesting, the textual analysis and unique, scalable analysis platform ended up being what found traction in the marketplace, but for other uses. As a multi-time Chief Technology Officer and VP of Engineering, the software development, operation, and design came naturally to me. It turns out that the data analysis and data operations (“Data Ops”) are the hardest part. My current role has shifted to developing a deep understanding of how the data is captured, processed, analyzed, and packaged. As the Chief Data Officer, I also spend a lot of time validating how we use the data and experimenting with how we can use it better through new analysis, tools, integrations, or insights.
Initially my role included technical team building, system design, and product strategy for creating something unique and defensible to take to market. Now my role focuses on how best to take advantage of the vast amount of unique, analyzed data. There are infinite ways we can package the data—every single one more valuable than the next. The main shift in my thinking over the course of my tenure at Bitvore is that I now strongly believe: “it’s the data, stupid”. All the software products are simply ways to enhance how we get the analyzed data to customers in the right way that maximizes the value.
There is nothing more gratifying than building and delivering a product that people actually want, need, and use. We have an amazing number of customers who simply love the product. I used to worry about the system cracking at the edges or not being able to sustain the quality of data that users expect and demand. Now that we have the technology and content processes ironed out, we get to focus on continuous improvement and new products. I like having the freedom to do that in conjunction with learning from others at the company who have successfully been through those product and company stages before.
A long time ago, a famous racecar driver was asked about his thoughts on speed. Rather than say 100 or 200 miles per hour, he said, “Speed is going 32 miles per hour around a 30 mile per hour curve.” Keeping with the analogy, Bitvore is solving a 1,000 mph problem, going 1,200 mph while everyone else is stuck in stop-and-go traffic during rush hour on the 405 freeway.
Our main products center around intelligence collected from sources of world news and information. We cast a very wide net to find content that is relevant to what our customers want, process it through an army of artificial intelligence and machine learning tools, and re-assemble it in a way that makes it tractable for people, teams, and companies to use.
Explaining that to non-technical people is fairly easy: We have a computer read 100% of the English news in the world as fast as it is available, determine what individual content is really about, and then figure out when something important happened that people care about.
The hard part for non-technical people is the “why?” It’s hard to imagine there’s a whole world of critical information below the waterline when you think in terms of what sites you read, how much you trust them, how often you visit, what terms you use to track them, and what shows up on the first page of results from a search engine query.
There is an industry called “search engine optimization,” or SEO. SEO is the exact opposite of what we do. SEO lets sites (in our case, news sites) optimize how they present their content so that they show up regularly in the top results. Given the scale of specific things and events that people need to keep track of, users cannot be expected to know which optimized terms to search for on any given day to get the best results, how often they have to search, how often things change, or how to be confident that there is no other information out there that they might need.
Before Bitvore, users thought of the problem in terms of what sites they track and what search results provided. Bitvore removes that abstraction and does “results optimization” and “information completeness” such that customers will see everything important and never miss something that may affect their business.
To me, the biggest promise of AI is that it allows people to think in terms of abstractions and higher-level concepts to accomplish useful tasks without having to micro-manage the individual details. One of my favorite quotes is from General George S. Patton: “Never tell people how to do things. Tell them what to do and they will surprise you with their ingenuity.” I think AI is at the point that we will be surprised by the ingenuity that it brings to a lot of human activities, thus allowing us, in turn, to apply our own ingenuity at a much higher level of understanding and abstraction.
The biggest threat of AI is that some deep learning algorithm and model will think it knows you better than you know yourself. It de-humanizes people into groups, discounts your own decisions, takes away choices, and, if handled improperly, can create a vicious system of rewards and punishments that reinforces its own model. The risk is that people will not even know it is happening. They will just get better discounts on products they should buy, consume the content that is best suited for them, and, as Huxley warned in “Brave New World,” drown themselves in a sea of irrelevance without really thinking. As AI integrates deeper into our culture, I think it is important to highlight fairness, accountability, and transparency. That includes “truth-in-advertising”, educational programs beyond computer science, legal issues like “ethical AI” and whistleblowing, public interest beyond consumer models, and tracking governmental and large-corporate uses like facial recognition, purchasing habits, and privacy.
I do not think we are anywhere near that type of dystopian future, but I think these are issues that need to be kept top of mind.
You are going to laugh, but individual websites or resources don’t actually hold the same level of importance when you have access to tens of thousands of them that you can algorithmically analyze and compare. It’s hard to go back to understanding the world from what sources you read after using Bitvore to do it at a higher level. Obviously, I don’t use our production system as my own personal playground. Much.
That said, I have my favorites. Facebook has an academic-oriented AI and Deep Learning forum of about 80,000 users. It’s extremely well curated. MIT Tech Review has a set of newsletters called “The Algorithm”, “The Download”, and “The Chain Letter”. Archive.org is also one of the best sites on the Internet as I am a big fan of the history of the WWW, though I do not use it daily. I also take advantage of the network of colleagues and friends I’ve built up over my career. A lot of them are very senior technologists at large companies, venture capital firms, government agencies, and media companies. I am on quite a few private email lists that deal with everything from entrepreneurship, information security, WWW, big data, cutting edge technology, social aspects of computing, complexity theory, economics, computer visualization, and a few others that help keep me up to date on issues relevant to work in specific and the content industry in general. The privacy of these lists creates a shared understanding beyond what any person can keep track of themselves. A lot of the public websites, news and discussions do not always start from the same level of understanding.
Same reaction as the last question. When I was a kid, I used to read dictionaries and encyclopedias for fun. One of my hobbies as an adult is actually browsing and reading the Google patent site. I love understanding how people think as captured in the bizarre legalese of patent filings. It’s fun to try to figure out what technologists are really doing versus what their patent lawyers put in a filing.
Second, I never stop rolling up my sleeves. Though I don’t always contribute to the day-to-day production, this includes scanning and tracking every issue filed for our code, understanding every system warning, and following all the design and architecture discussions and plans. Recently online, it has become common to share “workbooks” of AI and machine learning research. A workbook contains everything you need to run the same experiment yourself that the publisher did. That means you have the code, a snapshot of their trained AI model, and some data. In the current environment where science has a lot of “reproducibility” problems, there is a feeling that there should be a “code, or it didn’t happen” effect. I love that AI researchers publish their code, algorithms, data, and models. Every time I see something interesting online, the first thing I do is roll up my sleeves and see if there is a way to replicate their experiment in order to apply it to something I’m currently doing at Bitvore.
I’ve been at a lot of startups. The biggest issue is trust. You have to trust the people you work with are making the right decisions in the absence of any customer validation or in some cases the wrong customer validation. In early stage startups, which thankfully Bitvore is well past, there are multiple right answers, multiple wrong answers, and far too many questions for any individual to make well-informed decisions about by themselves. You have to trust your colleagues know what they’re doing. We have a great team of people at Bitvore and I trust every single one of them.
Let me state the obvious, Orange County is not Silicon Valley, nor should it be. One of the greatest altruistic engines in the world is having a local, entrepreneurial ecosystem that gives back. Orange County and Southern California have a deep bench of technology, ingenuity, and entrepreneurship. One of the things I like to do is help entrepreneurs understand what it takes to found a company and get it to the point it can take off. I still volunteer on startup judging panels at UC Irvine and our local angel funding groups, but my biggest contribution is introductions. In a world where people are inundated with communication, a few good, well thought out, meaningful introductions give a lot back to all involved.
My favorite tech products right now are Nvidia-accelerated TensorFlow, Natural Language Processing (NLP) part-of-speech analysis, and AutoML. AutoML, as a generic term, stands for automated machine learning. It allows non-technical content and image users to put examples in a folder in an “I’ll know it when I see it” manner. Simply by identifying the data, they can train a sophisticated deep learning algorithm on how to find other examples. That has never been automated in the history of machine learning at the scale it is being done today. You now have the ability for a non-technical person to evaluate precision (how many of the items a model flags are actually correct) and recall (how many of the relevant items in a universe of data the model finds rather than misses). AutoML is the first step to giving non-technical users the ability to distinguish things they’ve never been able to before without the help of a team of specialists.
Other than that, Amazon AWS, Google Cloud, and Microsoft Azure probably are three of the most amazing products ever launched in the history of computing. The details, scalability, and features of each have enabled a whole generation of innovation. We use their APIs and cloud-compute services every day in a manner that pushes the limits.
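To make the precision and recall measures mentioned in this answer concrete, here is a minimal Python sketch. It is only an illustration: the function name `precision_recall` and the article IDs are hypothetical, and nothing here reflects how Bitvore or any AutoML product actually computes these numbers.

```python
def precision_recall(predicted, relevant):
    """Return (precision, recall) for a set of flagged items.

    predicted: items the model flagged as matches (hypothetical example data)
    relevant:  items that truly are matches
    """
    predicted = set(predicted)
    relevant = set(relevant)
    true_positives = len(predicted & relevant)

    # Precision: of everything the model flagged, what fraction was right?
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: of everything that should have been found, what fraction was?
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up article IDs, purely for illustration.
flagged = {"a1", "a2", "a3", "a4"}
important = {"a2", "a3", "a5", "a6", "a7"}

p, r = precision_recall(flagged, important)
print(f"precision={p:.2f} recall={r:.2f}")  # prints "precision=0.50 recall=0.40"
```

In this toy case the model flags four articles, two of which are truly important (precision 0.50), while three important articles are missed (recall 0.40).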
There is nothing I hate more than products based on limited bandwidth. Products that price or tier based on volume of network traffic seem to be disingenuous. I constantly think in terms of how the world would be if we had all the bandwidth needed at the edges of any network, anywhere. I understand there are economic costs, but if you can assume that as a starting point, the types of applications you could build would be limitless. Products that rely on artificial restrictions definitely are my least favorite.
I use all. Macs are limited for gaming, overclocking, and hardware upgrades, so using one as a daily workstation doesn’t make sense for me. I use Ubuntu for my workstation at the office as it’s closest to what we use in production on our cluster. At home I use a Windows x64 PC for gaming, coding, writing, media, and overclocking.
Mobile is a little different. I have an iPhone for app and parental control reasons, but we’re a multi-device family. I won’t even get into how many smart TVs, gaming consoles, IoT, and home security devices I have. At some point you have to maintain all of them, keep them secure, updated, and useful. There’s an old quote about “what you own eventually ends up owning you”. It basically means that individually some of these things are useful, but collectively, they start to become high-maintenance.
The last, non-computing project I worked on was a vintage model rocket.
Back in the 1970s, they were coming up with all sorts of experimental designs for faster travel to the furthest reaches of the globe. The Concorde made its first trans-Atlantic flight in 1973. The Space Shuttle Enterprise was rolled out in 1976. Estes, a model rocket vendor, released a model called the Scissor-Wing Transport. It was a passenger jet that was launched upright, but after reaching a certain altitude near space, would swivel the wing and glide down to its eventual destination. They’ve never built one in real life, but Richard Branson’s Virgin Galactic space flight earlier this December was built on the same idea. After a very difficult build that took the better part of a year, I took the kids out after Thanksgiving and launched it about half a dozen times. It was a lot of fun.
I typically don’t give presentations too often anymore. The last presentation I gave was to the Harvard Business Angels and the Tech Coast Angels as part of our seed round of fundraising a few years ago. I know this isn’t the question, but the largest audience I’ve ever been in front of was close to 3,500 people on a stage in Vegas. I got to do demo-support for the then CEO of Intel who took 5 minutes of his CTIA’2001 keynote speech to demonstrate a product I had built. The month before Intel’s then CTO had shown off the same product in front of 5,000 developers in San Jose to use two mobile devices to sign an electronic NDA between Intel and our company.
I don’t binge watch or watch broadcast TV. I’ve heard of most of the popular shows and know what they are about, but I’m more into movies, video games, and puzzles.
The last puzzle I did was a 5,000 piece, color-spectrum one.
The one before that was a changing color one.
I’m sure the family was binge-watching something while I was working on it!
This is one of those phrases that has always bugged me. When you are an entrepreneur, it doesn’t exist. Work-life balance to me means having a regular job working for someone else. If you aren’t passionately pursuing work you love as part of your life, then you aren’t doing it right.
Wife and two boys. Both of our families live locally as we grew up in Orange County. My wife works at UC Irvine in the School of Computer Science. Both of my kids like soccer, school, technology, and video games.
I spend almost all of my non-Bitvore time driving kids to soccer, robotics, basketball, football, practices, games, parties, events, family get-togethers, etc. It’s called having “kids of a certain age”. Otherwise, I’m still on the computer playing games, tinkering with programs, or reading the tech news.
I’m a large dog person, but we’re in between dogs. My two most recent passed away after 15 years. Before that, I had a white German Shepherd when I lived down at the beach. My roommates were Huntington Beach City lifeguards, so my dog became their unofficial mascot. Where we lived has since become an official dog beach, too. I like to think of it as his legacy.
For me personally, I think being able to demonstrate some of the predictive capabilities of the massive amounts of data we have would really open people’s eyes to how valuable some of the stuff is. That is longer term. In the short term, it’s just helping support proof-points that we can apply the technology to other specific markets that we’ve decided to pursue—the faster the better!