Big Data & AI Architecture: Apache Iceberg via Spark and LLMs Pratik Patel
Schedule
Thu Oct 09 2025 at 06:30 pm to 08:30 pm
UTC-05:00Location
Improving | Houston, TX

About this Event
Can't attend in person?
BUT DON'T RSVP!
We'll start the streaming presentation at 7 PM.
We'll have pizza at 6:30 for those attending in-person!
This presentation delves into the potential of integrating LLMs with Apache Spark and Apache Iceberg as part of a Big Data to AI foundational architecture. In this session we’ll explore the potential of combining Iceberg, Spark and LLMs to give you a real world AI architecture that uses your data.We'll build an AI application that allows users to perform data queries and extract insights from massive datasets using natural language. We'll start with understanding the structure and architecture of a large dataset. Then we'll look at options for querying the dataset using Apache Spark and Trino. Finally, we'll use an LLM to query the dataset using natural language. We'll also look at other uses of LLMs as part of an overall solution, and explore the differences between different LLMs.
We’ll also discuss where event streaming (Kafka and Flink) fit into this architecture. The design of this architecture is meant to be flexible and give your dev team the ability to choose different technologies for the processing and querying. I’ll leave you with a CONCRETE example that you can run on your laptop and explore the possibilities. Again, this will be an example of a real-world application; the dataset used will be for home sales data for the last 15 years.
We will use these technologies:
* Apache Iceberg
* Apache Spark
* Trino
* LM Studio for running your own LLM
About Pratik Patel:
Pratik Patel is a Java Champion and developer advocate at Azul Systems and has written 3 books on programming (Java, Cloud and OSS). An all around software and hardware nerd with experience in the healthcare, telecom, financial services, and startup sectors. He's also a co-organizer of the Atlanta Java User Group and conference co-chair for Devnexus, frequent speaker at tech events, and master builder of nachos.
Sponsored by
Improving is a complete IT services firm, offering training, consulting, recruiting, and project services. Our innovative solutions and processes have helped hundreds of clients across the globe realize their tactical and strategic business objectives. As a result, our 1,000 employees have accumulated extensive technology and management experience in several industries, including financial services, energy, travel, retail, government, and several others.
Our culture encourages both the inspiration and motivation to achieve amazing things. We are constantly striving to live out our values of Excellence, Dedication, and Involvement through the foundation of trust.
Does the Houston Java User Group have a Web Site?
We sure do! You can find it right here: https://hjug.org
Thank you to our Door Prize Sponsors!
Hello2Morrow.com - 1 year SonarGraph license
JetBrains.com - 1 year IntelliJ Ultimate license
Webucator.com - voucher for one online self-paced course.
Want to present at a Houston Java User Group meeting? Contact Scott at scott AT KeepCalmAndRefactor DOT com.
https://groups.google.com/g/hjug
https://www.linkedin.com/groups/41555/
https://www.facebook.com/groups/49387404684
Where is it happening?
Improving, 10111 Richmond Avenue, Houston, United StatesEvent Location & Nearby Stays:
USD 0.00
