Distinguished Colloquium Speaker Series

Conversations with public health luminaries

The Distinguished Colloquium Speaker Series brings national leaders to the IU School of Public Health-Bloomington throughout the academic year. These events, which are free and open to the public, highlight key topics and contemporary issues in public health.

2025–2026 Speakers

Judith W. Dexheimer, Ph.D.
Associate Professor
UC Department of Pediatrics
Cincinnati Children's Hospital

"Predicting the Future of Small Patients: How ML and NLP are Transforming Healthcare"

March 31, 2026
11 a.m.
Innovation Center 105

Full video available here

Description of the video:

Welcome. Thank you for coming to our Distinguished Speaker Series. I have the pleasure of introducing Dr. Judith Dexheimer. She is currently an associate professor in the UC Department of Pediatrics at Cincinnati Children's Hospital. I had the pleasure of getting to know her last night over dinner and talking about her fascinating research, which is focused on biomedical informatics, health data analytics, and using machine learning and natural language processing to integrate with healthcare systems and improve patient care. She uses electronic health data, algorithms, and clinical decision systems to improve outcomes for pediatric populations. Her work spans pediatric emergency care, chronic diseases, and precision medicine, enhancing healthcare outcomes and efficiency. In 2017, she was a nominee for the Presidential Early Career Award for Scientists and Engineers, one of the highest honors for early career scientists in the U.S., and she has been invited as a speaker at several national conferences. So we are very excited to hear about her work today. Join me, please, in welcoming Dr. Judith Dexheimer to Open the Graduate School.
Thank you. So I plan to talk about machine learning, a little bit about large language models, and then I have one of my projects to talk about. I am happy to be interrupted. I am happy to be told something is not of interest to the group, because actually I am overprepared and I have two different projects to talk about. There's no way I'm getting to both of them, but I couldn't decide, so there are a whole lot of slides there. I'm going to start with an "about me": how did I get here today? This is going to be our timeline. This is going to start, fascinatingly, without a year, and then I'm going to step through each one of them. So I am from Florida. I am from the Gulf Coast of Florida. I'm from the Panhandle.
And then I went to the University of Central Florida and I double majored in computer science and microbiology. And the real answer for people who are trying to decide what to do is it just kind of snowballed. I was going to do a minor in computer science. And then I was like, well, it's only a few more classes and whatever. So I'm hoping my own child doesn't do that. That was 2003. I don't know why I left off the year. And then I went to graduate school. This is my son who is now significantly older than this picture. This is my former Great Dane. And that sad little thing is Gabe, who is still our dog and still does not like it when I put clothes on him. And I went to Vanderbilt for graduate school, and I got a master's in biomedical informatics, and then I got a PhD in biomedical informatics, and I did my dissertation work in the pediatric emergency department, and that's how I ended up at Cincinnati Children's. It became a really good fit. And so while in Cincinnati, I decided I would go back to school, and I got an MBA in healthcare operations. I've done a lot of stupid races. This one's the escape from Alcatraz. The short answer in my personal life is if it's a really bad idea, I'm going to say yes. And then I went back to school in a different way and I became a CrossFit coach. And I bring this up because what my friend said was, I didn't know you could get a degree in doing your exercise. So I managed to do that, too. And now here I am. And so what I'm going to talk about is I primarily do machine learning, large language models now, natural language processing. I have unabashed love for the images you make with things like ChatGPT. You are going to see a lot of them, and I am going to do my best to always give you my prompt and sometimes the date that I made them. 
This one I made March 5th on GPT 5.3, and I said, "Make an image for a resource slide for using large language models and AI for research." I'm going to have some older ones for you too, just to point out the way things have changed. What I want to say is it's pretty good. It's fine. There's nothing wrong with this particular slide. It made me a nice-looking image. It made a slide. This is a good use for it. In the old days, we would have all sat and Googled until we found something that we thought was appropriate and okay to share.
So just a couple of overview slides on artificial intelligence, etc. Artificial intelligence is not new. It started in the 50s. Machine learning is a subset of artificial intelligence. Deep learning is a subset of machine learning. And then generative AI, which is what we all talk about currently all the time, which is those large language models, ChatGPT, Google Gemini, etc., is a subset of deep learning. As a definition of machine learning, it's an application of artificial intelligence that provides systems the ability to learn and improve from experience without being explicitly programmed. So I don't have to say, "If you see A, B, and C, do E." I can give it some data, and then I can give it brand new data it's never seen and see if it can figure out to do something new. That's the way the algorithms are working. They adopt from those past patterns, and then they try to adapt to new patterns.
You see it everywhere, so you have used it in your daily life for some things that you probably realize and some things that you probably don't. Machine learning is roughly everywhere. It's in your text messages and those little predictive text lines down at the bottom. It is in grammar correction for them. It is in whatever you have: if you have something in your home for a smart home, if you have a car that you use, self-driving cars. And it is fabulously used and maintained in the financial sector.
And it's in your recommendations for any major company, anything you would like to buy, something like that. They are all taking your data from everywhere they can get it, and then they're using machine learning to try to find patterns. This is the Kroger approach: if you go to Kroger and use your Kroger card, what you're actually doing is giving them your data, and then they can later say, "Hey, you bought this. Do you think you want to buy this too?" And it's a benefit to them.
So when would you use it? Some of this is for the research side, some of this is for kind of the world side. The prompt for this particular image is "when you use machine learning." I didn't even do a good job with my grammar, but this is what I got. You would use it if you have very large amounts of data, because it can be very good for large amounts of data. You would use it if you want to try to find some patterns, because it's very good at that. You would use it if you want to predict something, especially something new. I have had debates about this next one in life: if your outcome is rare. It can't be so rare that it's never seen it before, because then it can't find something brand new. It has to have seen it at least once. But if your outcome is rare, it can do a good job of finding it. Or you need to process unstructured data. And there's actually two images that it came up with. Neither one is different. Although I will point out, this is an older version. So you can see that the operation happening on the bottom left-hand side, I think, is only on legs. People tend to have the right numbers of fingers and things now.
So because it's my particular interest, and because it leads to large language models, I'm going to talk a little bit about natural language processing. Basically, it's the design and analysis of algorithms and representations for processing human language. What you really want to know is: can you make unstructured text into something structured for the machine?
The classic example that everyone learns when they're learning natural language processing is "Time flies like an arrow; fruit flies like a banana." You as a human have parsed it out into two different things. In one, "flies" is a noun; in the other, "flies" is a verb. The machine doesn't necessarily know that unless you teach it and then you give it a lot of examples. NLP is everywhere, every day. It is in your predictive text. It's in your grammar correction. Autocorrect is actually NLP. That's why you can't just plop something into Google and straight translate. It's in your spell check. It's in all the chatbots. It's in all of your smart home pieces.
So, your large language models. They are deep learning based. We saw them down at the very bottom. They generate text in a human-like fashion, and they use recurrent neural networks to predict the next word of text. That's why what they put out is pleasant to read. They are giving you what you think the next word of text should be. GPT 5.2, ChatGPT, is made by OpenAI. There's no reason I'm using it as an example, except that it's just the one I happen to primarily use. They're all kind of equal. I'm happy to talk about which ones I like to use for which things, but it's just an example in this case. It's trained on large-scale text data sets, including books, articles, and publicly available web content. They're all trained on everything they can get their hands on. It learns patterns and statistical relationships across words, phrases, and sentences, and then it takes those relationships and gives them back to you.
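As a rough illustration of "predicting the next word" from patterns seen in training text (not anything shown in the talk), here is a minimal Python sketch. It is a toy count-based bigram model, which is far simpler than the neural networks behind a real large language model; the function names `train_bigram_model` and `predict_next` and the tiny corpus are assumptions made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(model, word):
    """Return the most frequently seen follower of `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus built from the classic NLP example sentence.
corpus = "time flies like an arrow fruit flies like a banana"
model = train_bigram_model(corpus)

print(predict_next(model, "flies"))   # the follower seen most often after "flies"
print(predict_next(model, "banana"))  # nothing ever followed "banana" in the corpus
```

The sketch can only return words it has already seen following the input word, which mirrors the point that these models give back relationships learned from their training text.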
This is most of what it does. This is all of what it does. These are really fabulous ways, and I'm going to go more into it, to find information and to condense information. They're not going to give you anything new. They're only going to give you things they have seen before. They're just machine learning.
To give you an example, and this is already a little bit out of date because GPT-4 is not used anymore and we're in the fives: there are a lot of them. Every major company has their own. Microsoft has Copilot. Copilot tends to be used by institutions because you can put your own Copilot on your own server. Therefore, it's never touching the internet and you could put PHI in. OpenAI is the biggest. They did ChatGPT. They still manage to be the biggest. I don't have it on here, but Anthropic split off from OpenAI. That's what turned into Claude. Google originally created PaLM, and now it's Gemini. There are a thousand. You can find them. I have an 18-year-old child. He is in high school, and he told me that they blocked all the large language models from the school computers. And I said, "Really?" And he said, "Yeah." And I said, "What about Claude.ai?" And he said, "I don't know." And we found a new one they hadn't blocked. I'm like, there's always going to be one. And for full disclosure, my child doesn't use them. It's not an "I won't let him"; he's like, "Why would I bother?" And I say, "Okay, as long as you can hold out, man."
So what I did here is I made a slide in 2024, and then I asked ChatGPT to update it. Here's what's on the slide from 2024: that GPT is a GPT-4 model; ChatGPT, generative pre-trained transformer, that's what GPT actually stands for; and then that the conversations may be saved and used to help improve future models. So I put it all in in 2025, and I asked it to give me updated information. I'm just going to show you where it updated it. It updated it to say that it was a five-series model now instead of four.
It mentions now that internet access is available through optional tools but is not automatically built into the model, and that whether or not it saves your query now depends on privacy settings. So you can now ask it in the form of an incognito mode, and in theory, it doesn't save your query. You are not students, but I'm going to give you the student lecture: never put any personal health information into one of these models. Never put in an unpublished paper. Never put a grant into one of these, because it saves everything and it learns on everything, and you are giving up your IP. That is why people have their own Copilot at their own institution, because then you can do that, because it's protected. Otherwise, you're giving OpenAI your data to play with. That's part of what you agree to.
So this image is just to tell you that they aren't new. In 2020, GPT-3 was released to the world. It actually goes back before this, but this is when they first started coming out. And November 2023 is where this stops. Everything below the line here is proprietary. Everything above the line is open source. And because I stopped in 2023 with this image, there's actually a whole lot more. There's four, there's five. I think between January and now, 5.2 went to 5.3. There's always going to be a new version. You can access old versions. If you want to play with these, there's lots to play with.
So like I said, what I really love is image generation, and I use it for presentations. I do not use it for grants because they've never come out correctly, but it's a good way to get a first step. So on 1-25-26 I asked OpenAI, through DALL-E, to make me an image of a whitewater rafting trip with dogs paddling. I did it on DALL-E. 5.2 can do it, and some of my later images are 5.2. Here's what it gave me. It gave me two images. It gave me this one. And it gave me this one. And so, I mean, it was kind of a stupid query in terms of this isn't going to give you something real anyway.
But there are dogs. There are species of dogs. Most of them have paddles. It's good and bad between how some of them are using the paddles. One of those golden retrievers looks like they broke their paddle in half. So it's okay. This is a fun way to do things.
All right. So then I told it I wanted to create a Scantron. What I am envisioning is a Scantron sheet. These are what it gave me. And technically I can't fault it. It is a Scantron. But it's not what I was expecting. And the purpose of this is to say you had something in mind when you thought of a Scantron, and that doesn't mean it's what the AI had in mind. This is a DALL-E example. And you can tell that partially because, while it's got letters, it doesn't all... well, they look okay. I don't know what "tets" means, but that's okay. We're close enough. So you can see what you want to have. And so I was like, you know what, I made a Scantron, now I'm going to add in a fish. And to me, I will 100 percent admit that I envisioned marking my bubbles in the shape of a fish. This is what I got instead. Again, it's not wrong. This is what I asked for. But it wasn't at all what I was envisioning on the way to do it. And so I updated this and I said, you know what, make me a Scantron sheet, add in a fish. And this is to show you that this January I made a new set, and so all of these are always evolving. Here's my new Scantron. You can argue it's better or worse. I don't know why one of them is like forced perspective, but it is still a Scantron. It is still the idea. And now I asked for a fish in the updated version, and this is what I got. So again, I got a fish. I still didn't get what I was thinking.
And at no point, by the way, did I redo this query and say, "Please make a Scantron, make the bubbles in the shape of a fish, here are all the things I want." By the time you're doing all of that, that's your prompt engineering to get exactly what you want out of it. If there's no other take-home message besides the research side, I guess it is this: prompt engineering is basically the new Google search. Back in my day, me and the dinosaurs got really good at our Google searches so we could find what we wanted, knowing how to sift through the websites. And now it's relearning that with prompt engineering.
Okay. So a good question to ask is: is machine learning biased? The short version is, if there are biased data going in, there are going to be biased data coming out. And there's plenty of evidence in the world to suggest that there are a lot of biased data in the world. Some purposely, some totally unpurposely, but there's going to be a lot of it. I actually asked GPT 5.2 to make images for artificial intelligence and biased data. It gave me this one. I don't know where CV screening came from that's up there at the top, or appraisal monitoring, but I told it, great, thanks so much. It gave me this one, which was very infographic-like and pointed out some of the places that it might have biased data. It gave me this one, which was very straightforward and to the point. And then it gave me this one, which is an AI bias pipeline and how data bias leads to unfair AI outcomes. Again, it's like everything. It's pretty good. It's okay. This is a great way to get a first draft, especially if you're stuck.
So here is something I have been doing for several years. One of my friends suggested this prompt, and it turns out this is one of the best things I've ever done. But really, it's keeping the record that's the best thing. So I took two different models. This is a very old Google Gemini, so please be nice to Google Gemini in this case.
And I said, "Draw a line drawing of a doctor." Here's what Google Gemini created at the time, which is correct. It's an ASCII version, but we have one. And then I asked ChatGPT at the time, and this is what I got. And I said, okay, well, that one came from DALL-E. It came from one of the paid subscriptions back then. And I said, that one's clearly a male. And I said, all right, I'll repeat it. So I waited about a year, and on May 5th, 2024, I repeated it. I asked for a line drawing. By then, Google Gemini was pay-for for making images, and I didn't pay. So Gemini's out. They just get to stay because of the ASCII. And so I asked for a new one in 2024, and it gave me line drawings, which are correct but are still male. So then I asked for another one. It gave me other line drawings. They were correct but still male. And then I said, okay, well, draw me a pediatrician, because 56 percent of pediatricians are women. I got a male treating a baby. But that is a pediatrician. I mean, they are treating a baby.
And then in 2026, again, I asked for a drawing of a physician, and this is what it gives now. So what I haven't shown you is that for all of these, I've just said, draw a doctor, draw a physician, draw something, kind of a neutral term. So I said, draw a pediatrician, and I got this one. And then I asked for another physician. I got this one line drawing. We're in 2026 now. I got an updated one of a pediatrician. And then I said, draw a female physician, because then I got tired of all of this. And then I got one. I got two, because sometimes it gives me two, sometimes it gives me one. I had to ask for a female in the end. Sometimes you get one, sometimes you don't. When you ask, you absolutely get it. But it has made the assumption that asking for a physician, a pediatrician, any of those, is going to be male. And I say this, just like with all the others, as a word of warning: remember that there may be a bias that you aren't thinking about as well that could be in those data.
I did not recreate it last night, although, true to form for everyone giving a presentation, I should have, because I was indeed still fiddling last night with everything. So I'm not going to say that everything in the world isn't biased or that everything is, but we ought to think about ways we can reduce the bias. One is to identify the potential sources. Another is setting guidelines and rules for eliminating it. Identify accurate, representative data. Document and share how data are selected and cleaned. And these are really way more important in terms of academics and sharing your work. Evaluate model performance and select the least biased, if you think any of your algorithms might be biased, again, intentionally or totally unintentionally. And then monitor and review models in operation. So you should always be double-checking them in case something is happening.
I have about three slides, I do, for how you could use AI in your own writing, because I find that this comes up a lot. Making a picture is a fun game, but it is not my research. It does not give me anything in my research, but it's a great demonstration. I used to have it make something about me. And if you've never done that before, that's always fun, because even though they're connected to the internet, if you say, "Hey, make a biography about Judith Dexheimer," the first question is like, "Well, is she real or not?" And I'm like, "I am." And they're like, "Okay, what does she do?" And I'm like, okay, I could have done this.
So how can you use it so that it's okay? One of them is you could use it to help draft your presentation. I would use this for a presentation outline, anything you want to think of. I don't mean make slides. I mean ask it for an outline of "what should I cover about X," and then you can look at its headings and say, oh, these seem like a good idea, or not. Another says idea generation and conceptual framing.
I want you to know that that particular idea, I should have marked it, came straight from the AI. I would not use it for idea generation because it can't come up with anything truly new. Only you can do that. Outlining and structural organization, that's fine. If you need help, that's a good way to do it. Editing for clarity and style, you can absolutely do this. I do not personally recommend it, and I give the example of one of my students who does not have great grammar when he writes, and he knows he doesn't. So he takes what he writes, he writes it all, and then he runs it through the institutional Copilot to correct grammar. He says, "Correct the grammar on this paragraph." And the grammar is correct when it comes out, but it is also like AI writing now. It's now very bland and very boring. And I've told him, I'd rather sit and correct all your grammar, and then we can work on it before you submit it somewhere, because now it looks like you used AI. You used it as a tool, but it looks like you asked it to write things.
Summarization and literature synthesis: this is a good way, as long as your AI is connected to the internet, to say, you know what, give me the five latest articles on using AI in pediatrics. This is a great way to start. You can't replicate it, so I wouldn't try to do a systematic review that way, but you could go find the top five or ten articles and then from those start searching others. You always run the danger of a made-up article by the AI. They look great. They look like they are the perfect article for what you need. And if you go put it in Google, you will never find it. You will find that title somewhere sometimes. You will find those authors. Sometimes they have published in that journal. Sometimes they have not. But it looks so good. And then you have to check them. So when you ask it for five articles, go make sure they all exist.
And when it gives them to you, like if you say search PubMed, it will give you a link for each one of them too. It's not trying to deceive you. It's just trying to give you what it predicts you want. And then iterative revision and argument strengthening. Again, I don't think it's ever going to be a big help with this. But I will say that if there's something I'm writing, like I have an aim and I hate the way it's worded, I will ask it to reword my aim. I will not ask it to reword my aims page. I will ask it to reword my aim. I think I have this a little later. Think of it kind of like a colleague. Who might you ask a question about? Yeah.
Yeah, it does want to please you. No, I think that's an excellent analogy. Oh, crap, I'm on the spot. No, I think that's good. So a good example of that one is why, when you ask any query, it's like, "That's a really great idea. Let me give you some examples of how this might help you." You can also get into an infinite loop. It'll be like, "Hey, did you want this in outline form or did you want this in picture form?" And you're like, "Oh, I want an outline." And then it makes an outline, and it's like, "Well, would you like references in your outline, or would you like..." It will go on forever. It will never get bored. So I think the golden retriever, and judging by my own golden retriever, the golden retriever with a ball: she will go get that ball and bring it back and be like, "Let's do it again."
I use it in terms of the colleague approach. Like, I might say, "Could you read this one aim? Does it make sense?" Or, because I am truly terrible at titles, I might say, "My project is about blah, blah, blah. What's a good title?" It's never come out with a title that I've kept, but it has come out with titles I have started with and then edited into something I wanted to keep.
That is what brings up the colleague side, because I have a couple of colleagues who are really fabulous at titles, and they are who I call: "Hey, I need a title for a presentation about X." And they're like, "Okay, what about this?" And I'm like, "Oh, thanks." It's a tool. It should be a tool in your life. It should be a tool in your research life. If you are going to implement it, which I will talk about in a minute, it should be a tool for that too. It should never be the end-all, be-all for anything.
In 2024, we gave a presentation at the Pediatric Academic Societies all about ChatGPT and large language models and some of the ethics. And so I've taken two slides from that talk. Basically, the question is, can it be an author on my paper? And the answer is no. It cannot approve the final draft; therefore, it is not an author. I have taken out all the examples, because there are some published papers out there now that have examples right in the paper that are like, "As a large language model, I cannot..." And so those slipped through all the cracks. Those shouldn't be there. It shouldn't be drafting your paper. But again, if you want to say, "Hey, I have to start a lit review for a paper on AI and pediatrics, can you give me five articles?", that's a good way to start. But then you're going to have to keep doing the legwork.
So the answer in life is, if you're going to use it, disclose that you used it and describe when and how you used it. So in the methods, if you used it for data collection, analysis, or figure generation, say that. And then in the acknowledgements, if it helped you with writing, say it there. If you feel like you need to use it, you need to tell people you used it, because it kind of levels the playing field in life. It's just like I would reference something else. If I used a writing center, I would reference the writing center. I would reference AI. So one of the things I do in class is I have my students do a project with AI.
And then the goal is for me to be able to tell whether or not they wrote a paragraph with a large language model. One of the things that they are consistently surprised by is when I can tell. And so some rules for you to consider in your daily life: AI tends to have overly general language. There's nothing specific. It's very vague. It's pleasant to read. The grammar is correct, but it's pretty general. It uses em dashes. It uses so many em dashes. I don't know why. If you have a sentence you had AI write, just take those out. As soon as I see more than one or two sets of em dashes, and if I'm reviewing a grant, I'm like, "Oh, no," because now I suspect AI, even though they're a perfectly normal thing to use in the grammar world.
Again, ways to think about using it: when putting in your own draft, be careful about sharing your actual intellectual property. My students like to use it as a critic. My students like to put in a paragraph from their specific aims and then say, "Review this paragraph like you're an NIH reviewer." And they tell me that if they just put in the paragraph and say, "How's my paragraph?", it does the "It's really great, great job, this clearly explains whatever," and that if they tell it to critique, it will give them critiques. I've never done it, still, so I can't tell you how great it is, but it's a good way to get another set of critiques, if you trust it. And then it's a tool like Google. It's not an actual collaborator. It's not actually giving you good, brand new information. It's a tool you're using to help you with various things.
And so then my last "when to use it": process tasks like grammar, if you absolutely have to for any reason. Don't use it for knowledge generation. Use it when you would use Google. It's actually a better Google, because it looks at the sources kind of more comprehensively, and it's trained on all the sources, and so far, so good, you can't pay to have your link at the top yet. When you're stuck: I like to think of this as the blank page approach, because it's so much easier to edit something than to start from scratch. So if you're like, "I need an opening paragraph about X," take it. Use that as your start. There's nothing wrong with that. When you need an image: if you need an image for a presentation that someone isn't going to own, if you want something realistic, if you want an infographic, if you want something to add interest. And then, "Use AI to improve the expression of ideas, not the ownership of ideas." That quote came from ChatGPT, which is why it's in quotes. So I'll ask if there are questions, and then I'll step through one of my research projects. Are there any questions?
Okay, so first I'm going to give you a guardrail reminder. AI has guardrails. I run into the guardrails an embarrassing amount of the time. So last night I said, you know what, make me a picture for school violence detection. And this is what it gave me. It gave me a school hallway confrontation captured on CCTV. And I was like, oh, okay, that's not at all what I meant, but I guess that's true. And I said, okay, never mind. I said, make it an infographic. And it thought for a really long time. And then it told me that the image may violate their guardrails around acceptable depictions of teens and children. So I couldn't have an infographic. And I said, okay, all right, I get it. "Create an infographic that shows ways to detect and stop school violence." And it thought for a really long time.
And it told me I had once again hit the guardrails and violated their guardrails around acceptable depictions of teens and children: "If you think we've got it wrong, please retry or edit your prompt." And I said, okay, all right. Then I said, "Create an infographic that lists the states with the highest rates of violence and aggression in schools, highlights key areas for improvement, and demonstrates ways to improve detection." It gave me this list, which is so long I did not take enough screenshots. It was enormous. And at the end, it was like, "Do you want me to make an infographic?" And I said, yeah, I do. Please do. And I got this. So in the end, I got what I wanted, but it took way longer than I would have ever thought, because I thought my infographic request was never going to hit a guardrail.
And I hit these guardrails all the time, because I am like, we need an image for a presentation of a pediatric patient who has cancer, and we just want an image. I was going to say Mr. X, but it's a kid, so, you know, Johnny Smith. Here is our example patient, here's what's happening. And it'll be like, "I'm sorry, I can't do that." And I'm like, oh, okay, make them smiling. And it'll say okay, and then it'll be a kid with a nasal cannula that's always like here or here or here. And I'm like, good enough. I'm not after perfection. But this is a reminder for you. Again, this infographic is okay. It's not great. It's a great first start.
Yeah. Yeah, okay. Yeah, no, they're not. I do not know. I don't know what a lot of those states are, actually. Next to Utah, which is not Utah, which I think might be "O2," and then next to it is "DT," and above it is "MX." Yeah, no. So it's okay. It's pretty, but that's all you've got. Yeah. Yeah.
Is that something that you have done? Are there strengths and weaknesses? We just recently did some image analysis using AI tools that was shockingly good.
I have never done image analysis with it. I have asked it to write code for me.
And I used to have an example where we gave it, you know, a little Excel sheet and said, you know, deaths at hospital A and B, do an analysis and tell us what we should think about. I think it's an okay first pass, but I think no matter what it does, ask for its code, because it'll still give you the code in the end. Okay. So I'm going to talk about, which I think you can tell, the detection of violence and aggression in youth at schools, so in adolescents. Another NLP reminder, because this one has examples. You take a piece of text, and the first thing you do is tokenization and lemmatization, which is you break it into its individual little units. And then you do a negation detector. So in this case, "I never fight with my classmates" turns into "negative fight." You give it these terms when you're creating them. Then you do something like feature extraction. Sometimes these are annotations in NLP. These are all examples, mostly examples from the project. So "violent acts or thoughts of subjects": these are things we've annotated. They're the words "shoot," "kill someone," "murder," "to harm," "really want to hurt," etc. And then you do like an n-gram extraction. So, you know, a unigram would be one word, a bigram would be two, a trigram three, etc. Ours are five. The example I have is: "data" is one, "great information" is two, "I am fine" is three. Then you do model development. And the image in the bottom right here is what I use for natural language processing, machine learning, all of it. This is all of machine learning, pictorially. In this case, you have two different states. They're red dots and blue dots. You want to find a way to separate them. So you move things into feature space. You draw a line that's the best possible line you could have. This one's 3D. You can have as many dimensions as you need. Ta-da, machine learning, done. Think about that little phi there as math, and then you're done. And then for us, we do model evaluation.
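The preprocessing steps described above (tokenization, lemmatization, negation detection, n-gram extraction) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the cue list `NEGATION_CUES`, the tiny lemma table `LEMMAS`, the `NEG_` prefix, and the one-word negation window are all hypothetical choices made for the example.

```python
import re

# Hypothetical, deliberately tiny resources for illustration only.
NEGATION_CUES = {"no", "not", "never", "deny", "without"}
LEMMAS = {"classmates": "classmate", "fights": "fight",
          "fighting": "fight", "denies": "deny"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lemmatize(tokens):
    """Map each token to its base form using the small lookup table."""
    return [LEMMAS.get(tok, tok) for tok in tokens]


def mark_negation(tokens, window=1):
    """Drop each negation cue and prefix the next `window` tokens with NEG_."""
    marked = []
    remaining = 0
    for tok in tokens:
        if tok in NEGATION_CUES:
            remaining = window
            continue
        if remaining > 0:
            marked.append("NEG_" + tok)
            remaining -= 1
        else:
            marked.append(tok)
    return marked


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


processed = mark_negation(lemmatize(tokenize("I never fight with my classmates")))
bigrams = ngrams(processed, 2)
trigrams = ngrams(tokenize("I am fine"), 3)
```

Here `processed` carries the "negative fight" idea from the talk as the single token `NEG_fight`, so a downstream model sees a negated fight as a different feature from a plain one. Real negation detectors use richer cue lists and scope rules than a fixed window.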
We want to know if the kids are low risk or high risk or moderate risk. I will go ahead and say that we do not give children or their caregivers a risk level at any point in time. We don't say your kid is high risk. The model may have to say that in the back end, but we don't pass that on. So why is NLP difficult? Like, why do we have an entire field and an entire presentation about part of it? One is sarcasm. One is syntactic ambiguity, the "time flies like an arrow, fruit flies like a banana" problem. Spelling: there is so much misspelling in a medical record, so much in an interview and everything. Another is negation. The negation example is here. Example one is, "No issues other than her indicating she might," well, this was from an old suicide project, "No issues other than her indicating that she might commit suicide." So the "no issues" is the negative. And then below that, "He continues to deny any suicidal thoughts and is happy to come," et cetera. Which of those is a true negative and which one's a true positive? They're hard to tell, and you have to know how people are going to say things all the time. Slang, of course, is a problem. Domain-specific language, bias in your training data. There's all kinds of things. So as a background, we have the news. We know that there is a problem and a concern with any kind of violence in schools, and what makes the news are the school shootings. That's not our focus. Our focus is violence and aggression, usually peer-to-peer, for kids 10 to 17 or 18 years of age. However, there's a lot of that. It's continued to increase over the last decade, and about 20 percent of adolescents report being bullied on school property during a school year. The schools with the highest rates of violence report higher dropouts, lower attendance, and poor scholastic achievement. How do we fix it?
And the way we fix it is we give timely interventions to those kids with an elevated risk of violence, and then that ultimately decreases any kind of crime. And so an effective way to identify these kids could improve school safety, because we could help them earlier on. Clinical scales exist. They're used in the hospital. They cost an average of $1,000 to $2,500 each time. They still rely on a subjective impression by the provider. They rely on the clinical interpretation. They don't rely on the linguistics. So it's pretty clear which side I fall on. All right. So we have actually done a lot of work on this, between me and Dr. Barzman. Drew Barzman is my co-PI and is a psychiatrist. And so what we want to do is find some of these warning markers. We want to find out if our algorithm is biased. We want to see what happens. It's great in research: you know you can sit down with someone for 30 minutes. You're going to have a good, clean interview. You're going to record the interview. It's going to be fabulous. What happens in the real world when that can't happen? So what we wanted to do originally was develop natural language processing to automate the risk assessment process. We want to look at the linguistics of the adolescent. And then we want to make a data-driven prediction and decision. And then our psychiatrist is our gold standard. And your model, whatever your model is going to be, is never going to be better than your gold standard, because it can't create new information. And then we want to develop an accurate and scalable automated risk assessment system to screen and prevent the violence. So this is only going to include a small sample of our interviews. We have about a thousand patients now. We're coming up on the tail end of an R01 where we've been recruiting patients for five years. But these patients, which are going to be, I think, 400 or so, have already been published on.
So Cincinnati Children's: we are the major psychiatric provider for pediatric and adolescent patients in Cincinnati. Something like 97% of kids in Cincinnati, in our entire catchment area, will touch Children's at some point in time. We see almost every single live birth in some form or another, as long as it's within the Cincinnati catchment area. So we have about 30,000 psychiatric inpatient admissions, these are all kids roughly 18 and under, and about 52,000 outpatient visits. So we took kids 10 to 18, enrolled in schools, not currently in state custody, those with and without behavioral changes, and we tried for an equal number of males and females. And then we recruited from people in schools in our general area. I mean, a lot of Indiana, a lot of Tennessee, are not in the Cincinnati Children's catchment area; those were referred to us for our psychiatric care. And then we recruited people from inpatient and outpatient. So here is what happens in an interview. You start with these little people on the left, the student population. Then we recruit, or we have referrals for, patients. And we're using the green, yellow, and red to denote someone who might be low, moderate, or high risk. And then you have a clinical interview, and we record it and transcribe it. That takes about 30 minutes. There are two clinical scales. One is called the BRACHA. It is the Brief Rating of Aggression by Children and Adolescents. That's the one they use in the emergency department now. The other one is called the School Safety Scale. They each have about 14 questions. The questions overlap quite a bit, and you'll find that, as an adolescent, they'll give you slightly different answers. "Have you ever been in a fight at school?" "No." "Tell me about a time you had an altercation at school." And then it'll be like, "Well, this one time on the bus." And as the informaticist, I'm like, oh my God.
But our clinical research coordinators are truly fabulous at getting the kids to open up, so that the kids feel comfortable sharing, whatever. They don't care. They're not upset. Like, I might want to be like, I'm not going to tell them about the time I hit someone. These kids are, nope, it's cool. So we get that from them. And then it goes two ways. One is clinical judgment: it goes to the psychiatrist, who then reads through the interviews (the BRACHA creates a score), scores the interviews, and determines their risk level. And then the other one is our natural language processing. And that goes into here; that's ARIA. And ARIA is what our system is called, our algorithm, that looks at, you know, the linguistics of the kid, because the questions are standard. We annotate our interviews, but it turns out we strip out everything and then we just make them into n-grams, and that will tell us enough. And then we have this low-risk, high-risk label, which is what also comes from the provider. So we transcribe everything, we tokenize them, all of those steps I just went through on the picture. And then we look at the Linguistic Inquiry and Word Count dictionary. There are 51 predefined categories: positive, negative, perception, personal, cognitive. We clustered words into 100 categories that are unsupervised, and then we look at those n-gram features. It turns out an n-gram of five performs the best for us, but features that occurred fewer than five times were totally excluded, because we assumed those to be kind of one-offs. There weren't enough of them. And then we do cross-validation on our data, and then we do the machine learning, which is logistic regression. If there are statisticians in the room, we can debate whether that's statistics or machine learning. It's okay, they're all really the same. We do support vector machines, we do a random forest, and we do artificial neural networks.
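The n-gram feature step described above (n-grams of up to five words, with rare features dropped) might look something like the following sketch. It is not the ARIA implementation. Two details are assumptions made for the example because the talk does not specify them: that "an n-gram of five" means all lengths from one to five, and that the five-occurrence cutoff is applied to total counts across all transcripts. The names `build_vocabulary` and `vectorize` are hypothetical.

```python
from collections import Counter


def ngrams_up_to(tokens, max_n=5):
    """Yield every n-gram of length 1..max_n from a token list."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def build_vocabulary(docs, max_n=5, min_count=5):
    """Keep n-grams seen at least min_count times across all documents."""
    totals = Counter()
    for tokens in docs:
        totals.update(ngrams_up_to(tokens, max_n))
    return sorted(gram for gram, count in totals.items() if count >= min_count)


def vectorize(tokens, vocabulary, max_n=5):
    """Turn one tokenized transcript into a count vector over the vocabulary."""
    counts = Counter(ngrams_up_to(tokens, max_n))
    return [counts[gram] for gram in vocabulary]


# Toy corpus: one phrase repeated in five transcripts, plus a one-off.
docs = [["i", "hit", "him"] for _ in range(5)] + [["one", "off", "phrase"]]
vocabulary = build_vocabulary(docs)
features = vectorize(["i", "hit", "him", "again"], vocabulary)
```

In this toy corpus the one-off phrase never reaches the cutoff, so none of its n-grams enter `vocabulary`. The resulting count vectors are the kind of input that a logistic regression, support vector machine, random forest, or neural network classifier would then be cross-validated on.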
So we have taken our data, we have processed it, we have gotten it to where we can say, you know, you're low, medium, high risk, and then we're running it through the machine learning. So we're doing that little phi step into feature space. The way we're going to evaluate it is the standard outputs: positive predictive value, negative predictive value, sensitivity, specificity, and area under the receiver operating characteristic curve. The way it looks is here. I am highlighting for you how many features from the n-grams we found. There are 14 questions in the BRACHA, 14 questions in the School Safety Scale. There were 5,270 n-grams in the BRACHA; there were 11,450 in the School Safety Scale. The PERC is a scale on this slide because it's in the paper, but we don't use the PERC anymore. PERC is used entirely clinically only, so we used it as an extra backup. This is for 131 subjects total. Looking at our training data, you can see that logistic regression, maybe you can't see, performed at 90.8%, so it did a great job. The SVM was 92.7 and 90.9. So we agree with the providers most of the time on just linguistics alone. When you look at our test data, so we kept a holdout set so that we could evaluate this, you can see that the logistic regression performs with an AUC of 94.6. So it does a really great job of agreeing with the provider just on the linguistics in this interview. So we feel okay about our algorithm and its prediction model. Without using the interview information, our AUC is about 70%. Adding in all those semantics improves it; now it's about 94%. So what we are trying to do is find children who have a risk of becoming violent or aggressive. We are trying to find the kids who need the earlier intervention to keep them, their classmates, their families safe. Because what happens now is, if you are deemed high risk, I call it the "I'm having a really bad day" problem.
If you are a kid who has done something that has gotten you pulled aside, and they're like, "I think you might need a psychiatric assessment," you're sent to the pediatric ED and you sit there for several hours waiting for the psychiatrist, the social worker, everyone else. Our goal is to come up with a system that gives you, like a guidance counselor, a great second opinion that says, "Actually, they're pretty low risk. Why don't you have them follow up with psychiatry? Don't rip them out of school. Don't send them here. Don't have them marked as someone who is a problem child when they're not." And so feature selection did a really good job of helping to identify those warning markers. And those are really good for human understanding. So when the system says to the guidance counselor, like, "This kid has violent thoughts or actions," that's something they understand. Because you have to trust whatever our algorithm is telling you before you want to do what it might recommend. So then we did a couple of other things. One is we looked at a bias analysis. And I want you to know that I went back to the paper last night, because I said this is the most boring slide I've ever created. And this is how we published it in the paper: black and white.
So it does stay kind of boring. These are the number of subjects that belong to the true positive or false positive groups as predicted by ARIA, the NLP system, based on race. That's in the top left. The number of subjects that are true positive or false positive based on sex is the next one across. And then true negative, false negative based on race is bottom left; true negative, false negative based on sex is bottom right. And the answer is that whatever ARIA predicts does not depend on whether we include race or sex in the model. So, so far so good. We've done a good job having a model that seems pretty unbiased. Now, it's a research sample, and so it may not match, in prevalence, the sample that's going to come in when it's used every day. So the other thing we wanted to do was evaluate what happens if you don't get to complete the interview. So I said there are 28 questions, 14 and 14. We figure in the real world you're going to run out of time. Something is going to happen, and you have to stop. So we used 412 of our interviews here. The blue dots, which you can kind of see, are the BRACHA questions alone, the first 14. The red are the School Safety Scale, and then the green are both of them together. The answer is the AUC is over 0.9 after only about 10 questions. We just censored the interviews by these steps and then redid the analysis. We have enough information by about 10 questions, so that if you took it to a school and they couldn't get through everything, it would be okay. You could still pretty much trust the outcome. You wouldn't want to trust it under 10 questions. So you can't have just one question. But if you can get through almost half the interview, you'll be okay. Hey, we can give you a good example of what we think will happen. One of the other things we looked at was the predictive utility on predicting gun violence, because I know that what happens in the news is a school shooting. It is not a fight.
It is not an altercation on the bus. It is something truly terrible. What this tells you from our study is, we have a question that is about weapons. And it is basically, do you have positive or negative attitudes about weapons? Any kind, it doesn't matter. Most of the students who had a positive attitude about weaponry, or access to weaponry, were identified as high risk for violence or aggression. The kind of access, we have found, doesn't matter at all, whether it's, oh my God, someone leaves a gun on the nightstand, or, yeah, my family hunts and we have, you know, the gun safe with everything locked up. It doesn't matter. If they're a high-risk kid and there are any weapons in the home, that counts as access. They're more likely to have a positive attitude towards weaponry if they have access, and they're more likely to be high risk. That is for those we interviewed; it does not mean someone in the community is. And so from there, we wanted to look at the predictive utility on prospective gun violence. We did logistic regression. For the kids who were high risk, we had an AUC of 0.88 using demographic variables and responses to the gun violence question alone, as binary yes/no features. So we suggest that we may be able to say, here's someone who has more of an affinity towards gun violence. This does not predict anybody. I cannot stress enough how we do not tell the children or their caregivers or the guidance counselors, "This is a high-risk kiddo." What we say is, here are the risk factors, here are the protective factors based on their interview, you know, here is our recommendation. Because we don't want someone labeled by a piece of AI, because it may not be beneficial to them in the future. So how could this be used? The risk score could inform clinicians about the need for intervention. It could be used to aid in screening. Like I said, it's a second opinion. It could be used in hospitals to help in psychiatric screening.
It could give feedback eventually to patients and caregivers, but not in the way that we're looking at it in the research term. They could get the risk and protective effects as well. A risk factor could be weapons in the house. Maybe we can do something about that if you have a child who needs some help. All right. So NLP and machine learning could detect violence risks for adolescent subjects, 10 to 18. Linguistics were better than just demographics. The classifier agrees with the psychiatrist. And our goal, again, is an accurate and scalable screening service. A very important lesson before I get to the question slide. We have a lot of really wonderful people working with us, and I keep making the same mistake. If you use people's badge pictures, it ends up looking a little bit like mug shots. All of us, all the time. I don't know what we do wrong, because when I look at a badge, it looks fine. But all of them together, we look kind of rough. And then I made... I ought to put us all in and make us like action figures. And then I have my query: I wanted a question image for a presentation about machine learning, including the discussion of epilepsy and school violence. I got an infographic, and I asked again, and then I got this. So I got back to the kids beating each other up. I don't know why that one's okay, but that's it. So are there any questions? Also, I don't know what this reference is that it gave me on the bottom of this infographic. It's very likely that's a fake reference, so that might be very nice to look up, because I made this last night and it gave me this. So, all right, any questions? Or concerns? I'm open for all of them. Yeah? I just hear a lot about the environmental effects of AI and how that's harming the environment. As somebody whose job it is to study AI, could you share your perspective on that?
I think I can give you a non-political answer, in that I am confident that my generation of an infographic makes no true impact on the servers running. Not compared to me making a Google query for an infographic that I can fill in. They are. There are a lot of servers. There are a lot of places. Originally, all the servers were up in Norway. Your cloud lived, like, in Norway, because it's colder, and they were underground to keep it colder. And now they're trying to find even more spaces for it. And there's a real problem cooling it all, because it's warming things up because it's running too much. I think that's a problem. AI is being used as the culprit, and I don't think it's really the only one. But it's a potential issue in life. But I freely admit it is not going to stop me. So Dean Biden's research example, the results show that it's a very fast, whereas the data set is quite small. That means the evidence is in transfer learning, to use the bigger psychological data sets. That's a great question, because we haven't used transfer learning yet. But in our larger data set, as we add more patients in, our support vector machine performs much better than the logistic regression. Usually we've made it too complicated, so logistic regression is fine. But we should look at the transfer learning too. The other thing is we don't actually add in the annotations right now. We just do them. And so I think the more patients we get, the lower the prevalence is going to be of people who might become violent or aggressive, and we're going to need to include all of those things to make it better. Yeah? Do you have any patterns on the agreement between the algorithm and the interviewer? Because I assume that you have several interviewers working with patients, and you said that it tends to agree 90-something percent. So were there any trends about interviewers being even more agreeable?
So we had two psychiatrists who were reviewing them, and we always took whatever the forensic psychiatrist said as gold standard. The two psychiatrists, I think, never disagreed, which actually sounds crazy now that I say that out loud. I don't see a difference based on who interviews the child, but actually, I don't know the answer, now that I'm dancing around it. So I think that's a good question. And I think I ought to go find out the answer, to see what we did, because when it's algorithm versus human, human always wins. The other thing we're trying to do right now is predict who might become violent or aggressive and have something happen. So all the people we've collected data on, we've done three months of follow-up with, and we're going to put all our data in and see if we can predict who's going to have something, now that we have it. And that will give us an even better gold standard than just what the psychiatrist says. Yes, Mr. Scott. How can we prevent these tools from becoming something sci-fi type, like Minority Report? You make me recall Minority Report, all of that. How can we prevent this kind of labeling by the wrong companies? Yes. So we have, as an institution, talked to several companies that want our algorithm, and I am very, very cautious about allowing anybody to have access to it, because my deep fear is that some company is going to say, "I can predict that you're going to become violent, therefore this is the truth," and that's not at all what we are predicting, ever.
What we are doing now is, I mean, I won't let it out, is the way I'm doing it, which is not scalable in life. It is trying to create a generated list of risk and protective factors to provide instead of that high-risk, low-risk number, so that the number stays in the back with us, but then we just say, here are the risk factors. So that it's less of a, you know, worry that you might be labeled as a bad kid, versus whatever. And the other thing we stress is that it's a snapshot of today, and tomorrow maybe it won't be like this. A true psychiatric concern is a month or less, so the three-month follow-up is really much longer than it needs to be. And so you shouldn't label a child, because a month from now they may be totally different in terms of their psychiatric concerns and issues. And so it should be a snapshot for "What do we need to do for you today?" not "This is something for you for the future, forever." Yeah, Minority Report was on my list when I started the project. All right, we're a little over time, sorry. Any other questions? You know, she'll be here. You can come up to her and talk to her. But thank you so much for your presentation. We have a little gift for you. Oh, thank you. Yeah, and just thank you. Thank you very much. Yes, you have to open it. Open it up. And yeah, we're gonna be safe. Yeah, open the box. This is like a little, it's like a little trophy. Yeah, we give this to our distinguished speakers, and it has a little quote in the back. It says, "The only phrase I've ever disliked is, 'Why, we've always done it that way.' I always tell young people, go ahead and do it. You can always apologize later." Grace Hopper. Thank you very much.