Artificial intelligence is rapidly changing the world around us, but are we truly prepared for the consequences?
Join us for a candid conversation with T.D., an AI expert who offers a balanced perspective on the promises and perils of this transformative technology. Explore the ethical dilemmas, practical limitations, and surprising impacts of AI on everything from healthcare to the environment and the future of work.
*This transcript has been lightly edited for readability. Reading time: 14 minutes.*
Arham: Today, we’re diving into a topic that’s generating a lot of buzz: artificial intelligence, specifically Generative AI. It’s a complex topic, with a huge wave of advancements happening right now. We’re going to take a step back and reflect on some of the implications of this wave. To kick things off, what comes to mind when you think about artificial intelligence and ethics?
T.D.: That is a huge discussion. If I had to sum it up in one word, it would be ‘care’. Like any powerful tool, AI has incredible potential for good, but also for damage. And we’ve already seen some of that damage happen, and it’s a hot topic that’s being heavily discussed.
Arham: This isn’t the first time AI has been hyped up. These waves have come and gone. Do you feel this AI wave will die out like previous ones? You know, there’s a big breakthrough, then people realize the implications, the consequences, even the costs. Is it something unique this time around, or are we headed for another AI slump?
T.D.: In the past, AI itself hasn’t died down, but the hype around it has, because AI was over-promised. People got caught up in the excitement, let their imaginations run wild, and expected more out of it than it could deliver. But those past breakthroughs have continued to march forward and be adopted into everyday life. Hopefully, we’ll see a similar pattern with generative AI. Initial excitement might fade as people realize it’s not the be-all and end-all — not the general-purpose AI some imagine that can go out and make scientific breakthroughs all by itself. It’s not like we can say, “Hey, go invent the cure for cancer,” and then, two weeks later, it spits something out. And then we say, “Okay, how do we build it?” It spits something out again. And then we build the machine it tells us to, and off we go. I don’t think generative AI is going to be that. If we ever do reach something like that general AI, generative AI will be a component of it. The current research and technology, focused on bringing generative AI into our everyday lives, could contribute to that future.
But at the same time, I don’t want to downplay its impact. Generative AI could very well be as big a breakthrough as getting to the moon, if you think about it. The training sets needed for some of these big generative AI models, and the annotation and work involved, are incredibly impressive. I don’t want to diminish that in any way, because it’s wonderful work. I do think this one’s going to have a bit more staying power, because it’s not just returning numbers; it’s returning something you can read through, and it looks a lot like something a person would write.
I was also struck by this: I downloaded Llama, version two or three, through Ollama, and it’s eight gigs on your computer. You run it on your laptop, it spits things out, and you can ask it pretty broad questions. While it has factual inaccuracies, it’s also surprisingly accurate on some topics. Think of it as a compression algorithm: they’ve taken a massive amount of internet data and compressed it into something that’s about eight gigs. This gives you a pretty good approximation of what you’d find online, even without an internet connection. From that perspective, it’s a very impressive lossy compression technology.
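*For the technically curious, here is a minimal sketch of what “asking a local model broad questions” looks like in practice, using Ollama’s local HTTP API. It assumes Ollama is running on its default port and that a model has already been pulled; the model name below is only an example.*

```python
# A minimal sketch: query a locally downloaded model through Ollama's HTTP API.
# Assumes Ollama is running on its default port (11434) and a model such as
# "llama3" has already been pulled; the model name is an example, not a requirement.
import json
import urllib.request

def ask_local_model(prompt: str, model: str = "llama3") -> str:
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one complete response instead of a token stream
    }).encode("utf-8")
    request = urllib.request.Request(
        "http://localhost:11434/api/generate",  # Ollama's default local endpoint
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]

# Everything runs offline on your own machine: the "compressed internet" idea.
print(ask_local_model("In one paragraph, what is lossy compression?"))
```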
Arham: I was watching a TED Talk by Yejin Choi, who is an AI expert, and she made a statement about AI that I really want to dive into: ‘AI is unbelievably intelligent yet shockingly stupid.’ Let’s focus on the ‘shockingly stupid’ part.
T.D.: (Laughing) That’s a beautifully succinct way of putting it. We’ve definitely seen cases where AI told people to add glue to pizza sauce to keep the cheese from sliding off. But it says it with such confidence that we think it’s very intelligent. We start to realize how much of human interaction is a confidence game: if you’re really confident, then maybe you know what you’re talking about. We see that with AI, and that’s dangerous.
Anyone who’s generated source code with it knows it often doesn’t run. I’ve had it generate Postgres statements with syntax errors. After small changes, it runs, but not right off the bat. And when I was a TA in college, your code had to run. 50% of your grade was based on whether your code ran without error. Generative AI often fails that first test. Or, I guess, the first test would be “Did you put your name on it?” (Laughing)
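*A common guardrail for exactly this problem is to dry-run generated SQL inside a transaction that is always rolled back, so syntax errors surface before anything is committed. A minimal sketch, assuming psycopg2 is installed and the connection string points at a scratch database (both are placeholders):*

```python
# A sketch of the "does it even run?" test for AI-generated SQL: execute it in a
# transaction that is always rolled back. The DSN below is a placeholder for a
# scratch or staging database; never point this at production data.
import psycopg2

def sql_runs_cleanly(generated_sql: str, dsn: str = "dbname=scratch user=dev") -> bool:
    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(generated_sql)  # syntax errors raise psycopg2.Error here
        return True
    except psycopg2.Error as exc:
        print(f"Generated SQL failed the first test: {exc}")
        return False
    finally:
        conn.rollback()  # never commit; this is only a dry run
        conn.close()
```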
Arham: The missing factor is what we call common sense. You know, you don’t put glue on a pizza. Can AI develop that?
T.D.: Well, there’s a saying: “Common sense isn’t that common.” I’ve seen people do things and wondered, “What were you thinking?” So it might be a high bar to ask AI to achieve, when people do things like that all the time. Sometimes it’s low risk. How many times have you cracked an egg straight onto the stove, forgetting the pan? Little things like, “I’ve done this a hundred times, but this time…” Is that a lack of common sense, or just not paying attention? I’m sure AI won’t achieve what everyone will agree is common sense. We can’t ask everyone, “Does this AI have common sense?” Someone will disagree. That’s far too subjective.
I think there will be a point when it’ll convince a lot of people it does have common sense. Some models look like they have it, then fall apart with hallucinations when you ask them for facts. Like, “What were the top 10 grossing movies in the United States in 2016?” Those are easily verifiable things. And that’s not even common sense, that’s trivia. And IBM’s Watson won Jeopardy, so we’ve established that they’re good at matching things to trivia facts.
Arham: That makes me wonder if AI is only as good as the data behind it. More data doesn’t necessarily mean better data. With stronger data privacy laws and more AI regulation, will AI hit a saturation point? How far are we from that point?
T.D.: There’s discussion about a “data desert,” that is, running out of data to train new models. But there’s a bigger issue we’re approaching—identifying AI-generated content. AI “goes off the rails” if it’s trained on AI-generated data, and we now have a “poisoned” dataset online. I’m not sure they could build a dataset like GPT’s today, because there’s so much AI-generated content, and I’m not sure they could filter it out.
It’s like “forever chemicals.” I heard of a study that tried to look at human blood, but the samples all had Teflon in them. They had to use samples from before the Korean War, when Teflon was developed, because we all have it in our blood now. Generative AI is creating a similar problem in data, making clean large training sets difficult to build. That’s a bigger concern than running out of human-generated data.
Arham: A few years down the line, we might have too much data for AI to even hold.
T.D.: We’ve just made reproducing the work that’s already been done much harder.
Arham: With regulation, and with AI unable to identify its own creations, it seems only large organizations can afford to build these models. That puts large datasets in the hands of a very few players. How does that make you feel?
T.D.: Overall, we’re seeing reduced access to this research, which happens with any technology breakthrough. You start out, all the low-hanging fruit is picked, and then as you get higher, it takes greater effort. This follows a pattern we’ve seen in a lot of things. I’m reading about Rudolf Diesel; many diesel engine manufacturers started out happy to make a 30-horsepower engine or other very small engines. Now, far fewer companies make diesel engines. We’ll see the same with generative AI.
There’s some work being done to try and put these datasets out in public so everyone can use them, but once you have the training set, there’s only so much you can do. Optimization continues, but it’s been ongoing since before the ’90s.
My bigger concern is the amount of energy needed to run a generative AI model. Google is considering nuclear reactors. I remember when they were all about green energy; now it’s “We need nuclear reactors. We need more power.” And that’s much more concerning to me than losing access to datasets. Because as the models grow, they’re harder to work with. Not everyone has a computer with half a terabyte of RAM and 18 GPUs, or the power to run it. Musk’s machine learning data center full of Nvidia H100 processors must have massive power consumption, and that concerns me environmentally.
Arham: I’m glad you brought that up. That’s something we shove to the back of our minds. We saw the same with blockchain, when Elon Musk went out and made a statement, and it’s bizarre how much energy these systems consume. You wonder whether AI is even becoming accessible. Is there a certain kind of AI that’s accessible to people who can’t pay millions for it?
T.D.: The nice part is those who can pay millions are mostly renting out access to their systems. Some are only offering rentals. Google has their in-house TPU processor, and while you can buy smaller TPU processors like the Coral for your Raspberry Pi, it’s nowhere near as powerful as Google’s big TPU systems. Those aren’t for sale; you can only rent them. So, we’re seeing a concentration of this.
Between Nvidia and Google’s hardware—and Apple’s AI cores, which are for inference, not training—the TPU and Nvidia chips are designed for training. I saw Nvidia’s H100 cards going for 20 to 80 grand a pop, depending on features. That’s not something a consumer or a small company will put in their systems.
That being said, most companies don’t need their own data centers or server racks anymore, thanks to the cloud. So, we’re seeing a concentration, but that concentration is being leased out to companies that can’t afford to build their own. Even standing up a single server is too much for many startups. It’s good and bad. I’ll be really interested to see what happens with TPU core development in the next five years.
Arham: What about transparency in AI training data?
T.D.: One of the big problems with these models is tracing back which facts led to an output—it’s almost impossible. I have a feeling there might be a mathematical solution to that. During training, backpropagation computes gradients to optimize the model, but understanding which training data led to a specific output is a bigger mathematical question. I feel like there should be a way, but I can’t prove it. A breakthrough would be if, after asking a model to write a cover letter, it gave you the specific excerpts that influenced it. While all of the training data has some impact, identifying the “10% most impactful” data would be a huge advance.
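*What that could look like, in miniature: a toy, TracIn-style illustration of scoring which training examples most influenced a particular output, using a tiny linear model. This is not how attribution would be done for a large generative model (that remains an open research problem); it only sketches the gradient-based idea.*

```python
# Toy illustration of training-data attribution: score each training example by
# the dot product between its loss gradient and the test query. Linear model only;
# large generative models are a much harder, open problem.
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))            # 100 toy training examples, 5 features
true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y_train = X_train @ true_w + rng.normal(scale=0.1, size=100)

# Fit a linear model (closed form, standing in for "training").
w = np.linalg.lstsq(X_train, y_train, rcond=None)[0]

x_test = rng.normal(size=5)                    # the "query" whose output we want to explain

# Per-example gradient of the squared-error loss: 2 * (x.w - y) * x
train_grads = 2 * (X_train @ w - y_train)[:, None] * X_train
# Gradient of the test prediction with respect to the weights is just x_test.
influence = train_grads @ x_test

top10 = np.argsort(-np.abs(influence))[:10]
print("Most influential training examples:", top10)
```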
We’re doing something similar with Retrieval Augmented Generation (RAG): you set up a RAG pipeline on top of any of the generative AI models. But RAG only goes so far; it doesn’t help with the training set. The training set is vital for things like author attribution. If you’re including a number of copyrighted works, and a copyrighted work influences an output, should the author receive royalties? Right now, we can’t do that, but I could see a royalty schedule built on identifying which subsection of the training data generated a response. That would be very interesting.
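*For contrast with training-set attribution, here is a minimal sketch of the retrieval half of RAG, with a toy word-overlap score standing in for a real embedding model and made-up document names. The point is that retrieved passages can be cited in the prompt, while whatever the base model absorbed during training cannot.*

```python
# Minimal retrieval sketch: rank documents against a question and build a prompt
# that cites them. Word-count cosine similarity is a stand-in for real embeddings.
from collections import Counter
import math

documents = {
    "handbook.txt": "Employees accrue fifteen vacation days per year.",
    "faq.txt": "Vacation requests must be approved by a manager in advance.",
    "roadmap.txt": "The Q3 roadmap focuses on the reporting dashboard.",
}

def similarity(a: str, b: str) -> float:
    """Cosine similarity over simple word counts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def build_prompt(question: str, top_k: int = 2) -> str:
    ranked = sorted(documents, key=lambda name: similarity(question, documents[name]), reverse=True)
    context = "\n".join(f"[{name}] {documents[name]}" for name in ranked[:top_k])
    return f"Answer using only the sources below and cite them.\n{context}\n\nQuestion: {question}"

# The assembled prompt would then be sent to whatever generative model you use.
print(build_prompt("How many vacation days do I get?"))
```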
Arham: I’m wondering how you balance transparency with data security.
T.D.: The easiest way is to exclude from training any data that shouldn’t be exposed to the person asking questions. I believe the best way to ensure data security is to train models on a company’s own data only. No other companies’ data. Segregate and partition the data.
Right now, we can’t prove there won’t be data leakage back and forth. There are a number of things you can do with anonymizing and changing the data, but then you’re using modified data for training, and that raises concerns about the outcome.
I think you’d have to ask a researcher specializing in this, but it’s concerning. The safest approach is naive: “If this person can’t know about this data, they can’t use a model trained with it.”
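*A sketch of what that naive-but-safe partitioning can look like in code: every tenant is routed only to a model trained on its own data, with no shared fallback. The registry layout and names here are hypothetical.*

```python
# Hypothetical per-tenant routing: each company only ever reaches the model
# trained on its own data. A missing entry is refused rather than falling back
# to a shared model trained on mixed data.
MODEL_REGISTRY = {
    "acme-corp": "models/acme-corp-finetune",
    "globex": "models/globex-finetune",
}

def resolve_model(requesting_company: str) -> str:
    """Return the model trained only on the requester's own data."""
    try:
        return MODEL_REGISTRY[requesting_company]
    except KeyError:
        raise PermissionError(f"No model provisioned for {requesting_company}; refusing to serve.")

print(resolve_model("acme-corp"))
```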
Arham: How has AI affected you personally and professionally in your role at Boostr?
T.D.: AI’s non-deterministic nature makes my role more difficult, because we want to use AI, but we grapple with questions like, “How often is it accurate? What is accuracy? Does it give you the right answer 100% of the time? How do we know what the right answer is?” Everyone wants to use generative AI, but then they ask, “How do we know it’s only giving the correct answer?” You don’t. You need a human in the loop. It’s a new technology; you have to research and understand it. I also have to give accurate and honest evaluations to stakeholders when they say, “I want to use AI.”
On the personal side, it’s interesting. My wife is a social worker, and we did a talk on AI in social work. They have HIPAA compliance requirements, and their business side wants them to use AI to write case notes. They’re dealing with mental health and sensitive information, and the AI is listening and helping write case notes. Is that a good idea? What happens to the data used to generate case notes from rough notes? You don’t know unless your company has contracts with the AI provider ensuring proper data safeguarding and HIPAA compliance. Very few of them knew that. Unfortunately, for many therapists, their licenses are on the line, even though their company wants them to use this.
This is concerning from a non-technical, professional standpoint. Suddenly, your employer wants you to use AI. You could be a doctor, and they want you to use AI for charting to speed up patient visits. If you’re typing in information, whose fault is a data breach? The AI provider? The hospital? The doctor? It’s a legal mess. All we could do in our talk was help them with the questions they should be asking: “Do we have a contract in place with this provider? Are they HIPAA compliant?”
So, from a personal standpoint, it’s scary to see the excitement and people trying to use AI everywhere. I’m not sure it has a place in certain areas. If a mental health provider’s notes run through an AI, and the AI says, “This person is a danger to themselves or others,” leading to a psychiatric hold, does the therapist agree? Can the AI make that call? Can AI deprive someone of their rights? These are scary ethical issues. I think they’re being handled well now, with the mental health provider ultimately deciding. But these are concerns.
Arham: For anyone working with AI, what practical steps would you suggest to ensure they’re at least trying to be ethical?
T.D.: This requires imagination. You have to stop and ask, “What damage could this cause? What’s the worst that could happen? Is that plausible?” There’s a case where a parent is suing an AI provider because their teenage son used a virtual friend, and the AI allegedly told him to kill himself, which he did. It’s tragic. So you have to think about that. They spent a lot of time building in safeguards, but you have to ask yourself: what’s the worst that can happen? Is this AI writing emails to customers? What’s the worst that can happen there? Maybe it writes something addressed to a competitor. In that case, strongly advise users to proofread everything.
In a really crazy situation, let’s say we go all in on AI and have it automatically dose patients with drugs in a hospital. Horrible idea, I think we agree. Then, it’s intravenously pumping someone full of whatever drug it decides, and we don’t know what that will be. That’s relatively easy to imagine happening with a system automatically administering drugs without human oversight.
You have to look at how the AI-generated information is used. Are people using it to make life-or-death decisions? Or is it to decide what to cook for dinner? If you break your diet because the AI suggests potatoes au gratin, it’s not the end of the world. With all tools, we have to look at what could go wrong.
Arham: My final question is: how do you counter the argument that AI might put us in life-threatening positions, like with self-driving cars? Humans are more likely to cause fatal car crashes than self-driving cars are. Wouldn’t self-driving cars ultimately result in less loss of life? Is the problem only that we don’t know who to blame when it’s AI?
T.D.: Ultimately, you have to look at a reduced-harm model. We understand harm will happen; bad things will happen. So you have to ask, “Which outcome has the least harm?” A good example is someone hooked on heroin and cigarettes. Both are bad, but you generally tell them to stop using heroin and say, “Keep smoking cigarettes for now.” That’s harm reduction. No one’s going to quit everything all at once.
So, we need to really look at the current situation and understand that AI is going to cause problems. AI is going to make mistakes. We are going to have accidents. Bad things are going to happen because of it. And then we ask ourselves, “Is that better than how things are now?” That’s probably a good way to look at it.
Arham: T.D., it was a pleasure. I’ve learned a lot today. I’m definitely not going to sleep tonight, but it was great talking to you, and I hope you all enjoyed the episode.
T.D.: Great to talk with you too!